You can try changing the attention to something like FlashAttention.
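Roughly like this with Hugging Face transformers, if you're on a recent build; the attn_implementation flag and an installed flash-attn wheel are assumptions about your setup, so treat it as a sketch:

```python
# Sketch: load a causal LM with FlashAttention-2 instead of the default kernel.
# Assumes a recent transformers release (attn_implementation flag) and flash-attn installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-7b-hf"  # any supported checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,              # flash-attn wants fp16/bf16
    attn_implementation="flash_attention_2",  # swap the attention kernel only
    device_map="auto",
)

inputs = tokenizer("FlashAttention test:", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```

It only swaps the attention kernel, not the weights, so outputs should match the default implementation up to numerical noise.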
So it sounds like for the 600B they just fine-tuned Llama 2 again with the same stuff Llama 2 was trained with, just more of it…
RefinedWeb
Open-source code from GitHub
Common Crawl

We fine-tuned the model on a huge dataset (generated manually and with automation) for logical understanding and reasoning. We also trained the model for function calling capabilities.
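To make the function-calling part concrete, here's a rough sketch of what one such training sample could look like; the schema and field names are hypothetical, not the actual dataset format:

```python
# Hypothetical illustration of a single function-calling fine-tuning sample.
# The real schema isn't public; names here are made up for illustration only.
sample = {
    "messages": [
        {"role": "user", "content": "What's the weather in Berlin tomorrow?"},
        {
            "role": "assistant",
            "function_call": {
                "name": "get_weather",  # hypothetical tool name
                "arguments": {"city": "Berlin", "date": "tomorrow"},
            },
        },
    ],
    "functions": [
        {
            "name": "get_weather",
            "description": "Look up a weather forecast for a city and date.",
            "parameters": {
                "type": "object",
                "properties": {
                    "city": {"type": "string"},
                    "date": {"type": "string"},
                },
                "required": ["city"],
            },
        }
    ],
}
print(sample["messages"][-1]["function_call"]["name"])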
I haven’t seen too many AI apps on pornhub, but I use an ad blocker.
But stop spending time on producthunt.
I hate that people are training on that garbage… but they’re ummmm… you know…
Is there anything to the hoopla over OpenAI using deep Q-learning, other than random speculation?
If anything, I would guess DQN, not plain Q-learning.
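For anyone skimming, the distinction is tabular Q-learning (a lookup table of Q-values) versus DQN (the same update with a neural network approximating Q, plus replay buffers and a target network). A minimal sketch of the tabular update just to ground the terminology; this is generic RL, not tied to whatever OpenAI is actually doing:

```python
import numpy as np

# Tabular Q-learning: Q(s, a) <- Q(s, a) + lr * (r + gamma * max_a' Q(s', a') - Q(s, a))
# DQN keeps the same target but replaces the table with a neural network
# trained by regression against that target.

n_states, n_actions = 16, 4
Q = np.zeros((n_states, n_actions))
lr, gamma = 0.1, 0.99

def q_update(s, a, r, s_next, done):
    target = r if done else r + gamma * Q[s_next].max()
    Q[s, a] += lr * (target - Q[s, a])

# Example single transition: state 0, action 2, reward 1.0, next state 5
q_update(s=0, a=2, r=1.0, s_next=5, done=False)
print(Q[0, 2])
```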
But all the papers people have pointed to speculating about this hoopla just mention active learning or RL without specifics.
PRM8k made the rounds maybe 6+ months ago, but they never publicly released the model.
yea, that seems to be what a few news articles have referenced.
https://www.semanticscholar.org/ helps a bit, with somewhat better search than other paper repos and some AI for recommendations… I'm sure someone will wrap a decent chat interface around their API at some point. But yeah, it's bloody hard to keep up with everything happening in generative AI, especially when you include open-source work that isn't always publishing papers.
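Their public Graph API is already scriptable if anyone wants to roll their own wrapper; a rough sketch below, with the endpoint and field names as I recall them from their docs, so double-check before relying on it:

```python
import requests

# Sketch: paper search against the Semantic Scholar Graph API.
# Endpoint and params per their public docs as I recall them; verify against
# https://api.semanticscholar.org/ before building anything on top of this.
resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={
        "query": "direct preference optimization",
        "fields": "title,year,url",
        "limit": 5,
    },
    timeout=30,
)
resp.raise_for_status()
for paper in resp.json().get("data", []):
    print(paper.get("year"), "-", paper.get("title"))
```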
is there something other than the letter Q making you think it’s Q-learning?
What would happen if you replaced the decoder during fine-tuning? Would you also see a speed-up, but at the expense of VRAM?
This is a giant cluster fuck.
Only if Sam signed a contract with MS and it explicitly prevents that.
I will use this now for some tests.
I believe they said they’re going to release training data. We’ll see. That’s about the only way to easily verify what made it in.
Or 2 A6000s. But yeah, $$$ matters.