It's important to note that the number of parameters (7B, 30B, etc.) does not always translate to model size, because models can be quantized, that is, saved at reduced precision, to save space. These files are often labeled with something like "Q4_K" or "Q2_XS." Reduced precision costs some quality: generally, a 4-bit quantized model gives acceptable, coherent responses at roughly a quarter of the full-precision (FP16) size, while an 8-bit quantization is about half the size and performs almost identically to the original.
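If you want a rough sense of the math, here's a back-of-envelope sketch. The bits-per-weight figures are approximations for common GGUF quant types (real files also carry scales and metadata), so treat the numbers as ballpark, not exact:

```python
# Rough back-of-envelope estimate of quantized model file sizes.
# Bits-per-weight values are approximations for common GGUF quant types;
# actual files also store scales and metadata, so real sizes differ a bit.

QUANT_BITS_PER_WEIGHT = {
    "FP16": 16.0,
    "Q8_0": 8.5,
    "Q4_K": 4.5,
    "Q2_XS": 2.3,
}

def est_size_gb(params_billions: float, quant: str) -> float:
    """Approximate on-disk size in GB for a given parameter count and quant."""
    bits = QUANT_BITS_PER_WEIGHT[quant]
    return params_billions * 1e9 * bits / 8 / 1e9

if __name__ == "__main__":
    for quant in QUANT_BITS_PER_WEIGHT:
        print(f"34B at {quant}: ~{est_size_gb(34, quant):.1f} GB")
```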
On my 4090 with 24 GB of VRAM, I can safely run a model like CodeLlama-34B at 4-bit quantization with about 4 GB of headroom left over. The results are generally pretty good, but I prefer a coding model with fewer parameters at a higher-precision quantization. I can also just barely fit Llama-3-70B at 1-bit quantization, but at that point it can barely craft a coherent sentence. The relationship between model perplexity (roughly, how well the model predicts the "reasonable" or "obvious" next token; lower is better) and quantization is illustrated nicely here.
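To make the fit question concrete, here's a hypothetical helper for checking whether a quantized model should squeeze into a given VRAM budget. The overhead figure for KV cache and activations is a rough guess, not a measured number, and the function name is my own:

```python
# Hypothetical VRAM-fit check: do the weights plus estimated overhead fit?
# The overhead_gb guess covers KV cache and activations and varies a lot
# with context length and backend, so treat this as a sanity check only.

def fits_in_vram(params_billions: float, bits_per_weight: float,
                 vram_gb: float, overhead_gb: float = 2.0) -> bool:
    """Return True if weights + estimated overhead fit in the VRAM budget."""
    weights_gb = params_billions * 1e9 * bits_per_weight / 8 / 1e9
    return weights_gb + overhead_gb <= vram_gb

# Example: ~4.5 bits/weight on a 24 GB card.
print(fits_in_vram(34, 4.5, 24))   # ~19 GB of weights -> True
print(fits_in_vram(70, 4.5, 24))   # ~39 GB of weights -> False
```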
Edit: I also forgot to mention that parameter count alone does not make a better model. Bard is something like 540B parameters, yet its responses are not substantially better than those of the best open-source models. The quality of the training data and its fitness for specific tasks matter much more than model size. So basically: don't assume you can't run a high-parameter model, and don't assume you need a high-parameter model to get good results.