•   8 months ago

Issue Running GPT-OSS 20B on 12GB RAM – Need Guidance on Using Smaller Models

My laptop only has 12GB of RAM, but GPT-OSS 20B needs 16GB, so it won't run. Can I use a smaller model (like LLaMA-7B) for development first and then switch to the required one later? Or is it okay to stick with another model like LLaMA-7B?

  • 4 comments

  •   8 months ago

    Someone asked a similar question, you can find the answer here: https://openai.devpost.com/forum_topics/41708-can-i-use-other-ai-models

  • Private user   •   8 months ago

    I'm having the same issue. It's impossible for me to run the model due to hardware limitations, as discussed here: https://openai.devpost.com/forum_topics/41708-can-i-use-other-ai-models. I think they're running another AI model on top of GPT-OSS. Have you found any alternatives yet?

  • Manager   •   8 months ago

    Hi everyone,

    Apologies for the delay. I've asked the OpenAI Team and they said "You will have to use gpt-oss but you could use one of the hosted providers as a stop gap. For the local agent category proving that it can run fully locally is required." Good luck!

  •   8 months ago

    I'm running GPT-OSS 20B on 12GB of VRAM (RTX 4070).
    You just need to reduce the number of layers offloaded to the GPU with the --ngl flag.
    If you set --ngl lower (like --ngl 10 or even --ngl 5), fewer layers are offloaded to the GPU, so VRAM usage drops and the remaining layers run on the CPU using system RAM.
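    For reference, a minimal sketch of what that looks like with llama.cpp's llama-cli (the GGUF filename below is a placeholder, not an actual release artifact; tune --ngl to whatever fits your VRAM):

    ```shell
    # Offload only 10 transformer layers to the GPU; the rest run on CPU/system RAM.
    # Replace the model path with the quantized gpt-oss-20b GGUF file you downloaded.
    ./llama-cli -m ./models/gpt-oss-20b.gguf --ngl 10 -p "Hello"
    ```

    Start high and lower --ngl until the model loads without an out-of-memory error; more layers on the GPU generally means faster generation.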

Comments are closed.