I loved Andrej’s talk in his “Busy person’s intro to Large Language Models” video, so I decided to create a reading list to dive deeper into many of the topics. I feel like he did a great job of describing the state of the art for anyone from an ML researcher to any engineer who is interested in learning more.
The full talk can be found here: https://youtu.be/zjkBMFhNj_g?si=fPvPyOVmV-FCTFEx
Here’s the reading list: https://blog.oxen.ai/reading-list-for-andrej-karpathys-intro-to-large-language-models-video/
Let me know if you have any other papers you would add!
“Attention is All You Need” is fairly high on my reading list for understanding the details of the transformer architecture. Would you recommend starting there, or is there something better to read instead?
https://arxiv.org/abs/2106.04554
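In case a quick code reference helps alongside the reading, here’s a minimal NumPy sketch of the scaled dot-product attention operation at the heart of the transformer. The function name, shapes, and toy data are illustrative only, not taken from either paper:

```python
# Minimal sketch of scaled dot-product attention (illustrative, not from the papers).
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K: (seq_len, d_k); V: (seq_len, d_v). Returns (seq_len, d_v)."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                    # pairwise similarities, scaled
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)     # softmax over the keys
    return weights @ V                                 # weighted sum of values

# Toy usage: 4 tokens with 8-dimensional queries/keys/values
rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
print(scaled_dot_product_attention(Q, K, V).shape)    # (4, 8)
```

Real transformer implementations add multiple heads, masking, and learned projections on top of this, but the core operation is just this weighted average.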
If you’re trying to learn more about language models, don’t bother with anything written before 2020. That’s basically the Stone Age.
Thank you!