Madhan Mohan • 8 months ago
Issue Running GPT-OSS 20B on 12GB RAM – Need Guidance on Using Smaller Models
My laptop only has 12GB RAM, but GPT-OSS 20B needs about 16GB, so it won't run. Can I develop with a smaller model (like LLaMA-7B) first and switch to the required one later? Or is it okay to just stick with another model such as LLaMA-7B?

4 comments
Mousuf Nayon • 8 months ago
Someone asked a similar question, you can find the answer here: https://openai.devpost.com/forum_topics/41708-can-i-use-other-ai-models
Private user • 8 months ago
I'm having the same issue. It's impossible for me to run the model due to hardware limitations, as in this discussion: https://openai.devpost.com/forum_topics/41708-can-i-use-other-ai-models. I think they're running another AI model on top of GPT-OSS. Have you found any alternatives yet?
Shawni Devpost Manager • 8 months ago
Hi everyone,
Apologies for the delay. I've asked the OpenAI Team and they said "You will have to use gpt-oss but you could use one of the hosted providers as a stop gap. For the local agent category proving that it can run fully locally is required." Good luck!
Victor Pimshin • 8 months ago
I'm running GPT-OSS 20B on 12GB VRAM (4070).
You just need to reduce the number of layers offloaded to the GPU with the --ngl flag.
If you set --ngl lower (like --ngl 10 or even --ngl 5), fewer layers are offloaded to the GPU, so VRAM usage drops and the rest runs on the CPU using system RAM (slower, but it fits).
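For anyone trying this, here's a minimal sketch assuming you're running a GGUF build of the model with llama.cpp; the model filename and layer count are illustrative, and the flag is also spelled `-ngl` / `--n-gpu-layers`:

```shell
# Offload only 10 transformer layers to the GPU; the remaining
# layers run on CPU, so VRAM usage drops while system RAM use rises.
./llama-cli -m gpt-oss-20b.gguf --n-gpu-layers 10 -p "Hello"
```

If it still runs out of VRAM, lower the layer count further; if you have headroom, raise it for more speed.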