Obviously, building a big, high-dimensional language model is hard, yes, okay.

But once we have one, can’t we just jiggle the weights and run tests? Why can’t I just download a program to “evolve” my language model?
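(For what it’s worth, the “jiggle weights and run tests” idea is basically random search / hill-climbing, and it can be sketched in a few lines. Everything below is a made-up toy illustration, not a real LLM; the score function stands in for an eval suite.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a model's parameters; a real LLM has billions.
target = rng.normal(size=50)   # hypothetical "ideal" weights
weights = np.zeros(50)

def score(w):
    # Higher is better. A real setup would run benchmarks here,
    # which is exactly where the cost explodes.
    return -np.sum((w - target) ** 2)

best = score(weights)
for _ in range(2000):
    candidate = weights + rng.normal(scale=0.1, size=50)  # "jiggle"
    s = score(candidate)
    if s > best:               # keep the change only if tests improve
        weights, best = candidate, s
```

The loop itself is trivial; the catch is that every jiggle needs a full evaluation, and for a real model each `score` call is an entire benchmark run.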

“Am I just stupid and this is just too trivially easy to be a program?”

peace

  • LuluViBritannia@alien.top · 1 year ago

    I have been generating art with AI. There is an extension meant for exactly that: you literally tell the AI “good” or “bad” for each result, and it adjusts the weights of the model.

    Sadly, it’s all but impossible to run. Reinforcement learning isn’t just “picking a random weight and changing it”; it rewrites the entire model to take your feedback into account. And it does that while running the model, which by itself already takes most of your compute resources.

    You need a shitton of VRAM and a very powerful GPU to run reinforcement learning for images. It’s even worse for LLMs, which are much more power-hungry.

    Who knows, maybe there will be optimizations in the next years, but as of right now, reinforcement learning is just too demanding.

    • Void_0000@alien.top · 1 year ago

      How hard can it be?

      Seriously though, what makes it require more VRAM than regular inference? You’re still loading the same model, aren’t you?

      • LuluViBritannia@alien.top · 1 year ago

        Well, first of all, this is something you do while running the model. Sure, it’s the same model, but it’s still two different processes running in parallel.

        Then, from what I gather, it’s closer to model finetuning than it is to inference. And if you look up the figures, finetuning requires a lot more power and VRAM. As I said, it rewrites the neural network, which is essentially what finetuning is.

        So in order to get a more specific answer, we should look up why finetuning requires more than inference.
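        For intuition, here is a back-of-envelope estimate of why (assuming fp16 weights and a full fine-tune with the Adam optimizer; the numbers are illustrative and vary a lot by setup):

```python
# Rough VRAM estimate for a 7B-parameter model (illustrative only;
# real usage depends on implementation, precision, and batch size).
params = 7e9
bytes_fp16 = 2
bytes_fp32 = 4

inference = params * bytes_fp16              # weights only
training = (params * bytes_fp16              # weights
            + params * bytes_fp16            # gradients
            + params * 2 * bytes_fp32)       # Adam moment estimates

print(f"inference: ~{inference / 1e9:.0f} GB")  # ~14 GB
print(f"training:  ~{training / 1e9:.0f} GB")   # ~84 GB, before activations
```

        And activations for backprop come on top of that, scaling with batch size and sequence length, which is why training-style workloads need several times the VRAM of inference.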

      • ihexx@alien.top · 1 year ago

        There are lots of different kinds of RL algorithms, each with different requirements.

        In general, though, the tradeoff you’re making is data efficiency vs. compute complexity.

        On one end, evolutionary and gradient-free optimization methods are simple but data-hungry.

        On the other end, things like model-based RL (e.g. building reward models to train your generator model) are more data-efficient, but more complex, since they have more moving parts and more live models to train.

        So to answer:

        Seriously though, what makes it require more VRAM than regular inference? You’re still loading the same model, aren’t you?

        No: on the model-based end, you’re training at least two models, the generator and the reward model.

        On the evolutionary and gradient-free end, you need far more data than supervised learning, since reinforcement learning doesn’t tell the agent what to do at every time step, only after N time steps, so you get roughly 1/Nth the training signal per step compared to supervised learning.
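        Putting that 1/N point in (made-up, illustrative) numbers:

```python
# Illustrative only: compare feedback density, supervised vs. episodic RL.
N = 20                 # steps per episode before any reward arrives
episodes = 100

supervised_signals = episodes * N   # one label for every single step
rl_signals = episodes               # one reward per whole episode

print(supervised_signals, rl_signals)    # 2000 vs. 100
print(supervised_signals // rl_signals)  # the factor-of-N gap
```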

        Basically, we GPU-poors are in the weird position where anything we can train under these limitations would probably perform worse than a larger model trained on supervised datasets.