wishtrepreneur@alien.top to Entrepreneur@indiehackers.space • "If you think all great ideas are taken already, read this"
10 months ago
Novelty is for amateurs, imo.
hey, don’t diss my fidget spinner! I made a whole $200 from dropshipping that!
Why can’t you just train the “router” LLM to pick which downstream LLM to use and pass the activations along to it? Couldn’t the downstream LLMs be “headless” (no embedding/encoding layers of their own)? Then inference would only run a (6.5B + 6.5B)-param pair of models but get the generalizability of a 70B model.
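To make the idea concrete, here’s a toy sketch of that routing setup: a shared encoder produces activations once, a small “router” scores the experts and picks one, and only the chosen “headless” expert (which has no encoding layer of its own) runs on those activations. Everything here — the scoring rule, the shapes, the expert logic — is a hypothetical illustration of the architecture being described, not a real LLM implementation.

```python
# Toy sketch of router -> headless experts. All names, shapes, and the
# scoring rule are hypothetical; this only illustrates the control flow.
DIM = 8  # hidden size of the shared activation vector (toy value)

def shared_encoder(token_ids):
    # Stand-in for the shared embedding/encoding layers: maps tokens to
    # a fixed-size activation vector every expert can consume.
    vec = [0.0] * DIM
    for t in token_ids:
        vec[t % DIM] += 1.0
    return vec

def router(activations, num_experts=2):
    # Stand-in for the router LLM: scores each expert on the shared
    # activations and returns the index of the best-scoring one.
    scores = [
        sum(a * ((e + i) % 3) for i, a in enumerate(activations))
        for e in range(num_experts)
    ]
    return max(range(num_experts), key=lambda e: scores[e])

def make_expert(expert_id):
    # A "headless" expert: no embedding layer of its own; it works
    # directly on the activations handed over by the router.
    def expert(activations):
        return f"expert-{expert_id} saw {sum(activations):.0f} tokens"
    return expert

experts = [make_expert(0), make_expert(1)]

def infer(token_ids):
    acts = shared_encoder(token_ids)  # encode once, shared by all experts
    choice = router(acts)             # router picks one downstream expert
    return experts[choice](acts)      # only the chosen expert actually runs

print(infer([1, 2, 3, 4]))
```

The point of the sketch is the cost structure: per token you pay for one encoder pass plus one expert pass, while capacity scales with the total number of experts — the same intuition behind mixture-of-experts models.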