Using and losing lots of money on gpt-4 ATM, it works great but for the amount of code I’m generating I’d rather have a self hosted model. What should I look into?

  • --dany--@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    Phind-CodeLlama 34B is the best model for general programming, and some techy work as well. But it’s a bad joker, it only does serious work. Try quantized models if you don’t have access to A100 80GB or multiple GPUs. 4 bit quantization can fit in a 24GB card.

    • berzerkerCrush@alien.topB
      link
      fedilink
      English
      arrow-up
      1
      ·
      10 months ago

      I tried the V7, which is supposedly better than GPT4. it couldn’t do the things I asked it to do, unlike GPT 4 (through Bing Chat). DeepSeek also did a couple of things, but its solutions where sometimes not ideal. It’s underwhelming.

      The web search engine is interesting through.

    • KOTNcrow@alien.topB
      link
      fedilink
      English
      arrow-up
      1
      ·
      10 months ago

      That board is in serious need of an update, check the Yi-34b model, very impressive. Dolphin 2.2 Yi 34b is a variant I cant wait to try.

    • Zemanyak@alien.topB
      link
      fedilink
      English
      arrow-up
      1
      ·
      10 months ago

      Great board. I wish they had Phind 7 or whatever they use on their live website.

    • yonomono@alien.topB
      link
      fedilink
      English
      arrow-up
      1
      ·
      10 months ago

      Think this would be good-enough/suitable to use with AutoGPT/BabyAGI type situations? This is my main use-case, for bulk inspiration if not productivity. The API’s can get expensive if left on full-automatic overnight.

    • SlateHardjaw@alien.topB
      link
      fedilink
      English
      arrow-up
      1
      ·
      10 months ago

      What environment do you use to interact with self-hosted code models when coding? I’ve been using and enjoying Cursor for the way it’s integrated into the IDE, but I’ve been exploring options for going self-hosted just to feel freer from whatever record I’m putting on someone else’s server.

      • DifferentPhrase@alien.topB
        link
        fedilink
        English
        arrow-up
        1
        ·
        10 months ago

        My code editor of choice (Helix) doesn’t support integrations or plugins so I haven’t tried Cursor or Copilot. I’m building my own UI right now that focuses on first-class support for models served by llama.cpp.

  • leepenkman@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    i would grab a server like vllm or text-generator.io (open source too)
    Then get a model like others have suggested like deepseek or something to put in the server (both those servers are OpenAI compatible so makes switching easy)

  • yonomono@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    Think this would be good-enough/suitable to use with AutoGPT/BabyAGI type situations? This is my main use-case, for bulk inspiration if not productivity. The API’s can get expensive if left on full-automatic overnight.

  • amsat@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    If you allow models to work together on the code base and allow them to criticize each other and suggest improvements to the code, the result will be better, this is if you need the best possible code, but it turns out to be expensive. So the best thing is the work of a team of models and not just one.

  • spidLL@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    How hosting any model, which at the moment would be inferior to GPT-4, would cost less than 20 dollars per month?

    Anyway, GitHub Copilot cost 10 and has plugins for any IDE, in VsCode has also a chat. I don’t remember which model is based on, but works pretty well. You might try that.

  • -Tesla@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    I don’t have an answer for you, but I am curious, how much code do you have it generate on an average work/programming day?

  • xbaha@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    i use Phindv, gpt3.5 free together and forward code between them to optimize and fix issues, works smooth for me.

  • ababana97653@alien.topB
    link
    fedilink
    English
    arrow-up
    1
    ·
    10 months ago

    For coding specifically, have you looked at Amazons CodeWhisper - I think it’s free for personal use