f1nuttic@alien.top to LocalLLaMA@poweruser.forum • Point me towards some basic dataset preparation tips for LLM's? • 1 year ago
I had the same question a few weeks back, and this blog post was really helpful for me: https://together.ai/blog/redpajama-data-v2 . The scripts they used are also open source and available in the GitHub repo.