I’ve got research background in ML but never actually developed any models as it was all theoretical work. I got lucky during the interview stage for this role as my research impressed them. My project involves fine-tuning a GPT-3 model for a specific task and host the model on a website. Does anyone have any tips on how to go about learning what I need to know to do this? Also what should I consider when curating my custom dataset when fine-tuning the model? I really want this to be a learning experience for me.

  • blahblahwhateveryeet@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    1 year ago

    Yeah basically what I think is probably happened is that these guys are like everybody else you thinks that fine tuning is somehow going to make it possible to open Pandora’s black box on GPT and more than likely the folks that tasked you with this have absolutely no clue about how fine tuning works…