It’s unfair to compare standalone LLMs with GPT-4, which is a whole engineering system we know nothing about.
People are certainly working on improving LLM quality and reducing model sizes, and you can always fine-tune a 7B model to be very good at a specific task and beat a bigger model, but only on that narrow task.
However, the lower the parameter count, the less a model can handle complex tasks, and the less it can be good at several different tasks at the same time.
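As a concrete illustration of what that narrow specialization looks like in practice, here is a minimal sketch of LoRA fine-tuning a 7B base model on a single task, so only a small set of adapter weights is trained instead of all 7B parameters. The model name, dataset, and hyperparameters are illustrative assumptions, not a specific recipe from any of the benchmarks discussed here.

```python
# Minimal sketch: specializing a 7B model on one narrow task with LoRA.
# Model, dataset, and hyperparameters are illustrative assumptions.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

base = "mistralai/Mistral-7B-v0.1"         # any 7B base model
tokenizer = AutoTokenizer.from_pretrained(base)
tokenizer.pad_token = tokenizer.eos_token  # causal LMs often lack a pad token
model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

# LoRA: freeze the base weights, train small low-rank adapters on attention.
model = get_peft_model(model, LoraConfig(
    r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"))

# Task-specific data for one narrow domain (here: text-to-SQL, as an example).
data = load_dataset("b-mc2/sql-create-context", split="train[:1000]")

def tok(ex):
    # Concatenate prompt and target into a single causal-LM training string.
    return tokenizer(f"{ex['question']}\n{ex['answer']}",
                     truncation=True, max_length=512)

data = data.map(tok, remove_columns=data.column_names)

Trainer(
    model=model,
    args=TrainingArguments("out", per_device_train_batch_size=1,
                           num_train_epochs=1, learning_rate=2e-4),
    train_dataset=data,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
).train()
```

The resulting adapter can beat much larger general models on that one task, but the base model’s capacity still caps how many such skills it can hold at once.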
Take a look at the tests in https://www.reddit.com/r/LocalLLaMA/comments/17vcr9d/llm_comparisontest_2x_34b_yi_dolphin_nous/