Looking for speed and accuracy. Any suggestions on cloud hosts?

  • yahma@alien.topB · 1 year ago

    None of the open models perform function calling as well as OpenAI…

    • _nembery@alien.topB · 1 year ago

      It’s not even that hard. For simple classification tasks, just use a regex on the returned text - any Llama 2 model can do this reasonably well. The hard part is when you want complex JSON data structures.

  • giesse@alien.topB · 1 year ago

    I’m confused by all the people worrying about OpenAI’s API… can’t they just use the Azure endpoints? If anything, MS would be very happy to capture all of OpenAI’s previous customers…

    • jfranzen8705@alien.topB · 1 year ago

      Yeah, they’re pretty heavily restricting access to it and prioritizing large-ish enterprise customers.

      • giesse@alien.topB · 1 year ago

        I see. OTOH, if OpenAI really went belly up, I imagine they’d rush to increase their own capacity? If anyone wins in all this drama, it’s Microsoft…

    • fvpv@alien.topOPB · 1 year ago

      I’ve now signed up for an Azure endpoint - let’s see if it gets approved. It looks like the process to get a key is going to be a bit of a PITA.

  • Fast-Satisfaction482@alien.topB · 1 year ago

    From an idealistic point of view, you can implement function calling easily in your team. Use the context-free grammar plugins that are now available to ensure that the LLM outputs match your function-calling format. Then build your own dataset on your typical workloads and prepare a pipeline to finetune new models on it.
    As open-source models continually improve, you can use that pipeline to finetune for your task for a few bucks on a few cloud GPUs. You should be prepared to switch from model to model and handle your finetuning in your team. That way you will be able to keep up with the cutting edge (of open source) and still have full control. You can always choose to declare a model good enough and keep using it forever.
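    The grammar-constrained approach can be sketched as a llama.cpp GBNF file (passed via its `--grammar-file` option) that forces every completion to be one flat JSON object; the `name`/`args` fields below are illustrative assumptions, not a standard:

```
root   ::= "{" ws "\"name\"" ws ":" ws string "," ws "\"args\"" ws ":" ws string ws "}"
string ::= "\"" [a-zA-Z0-9_ .-]* "\""
ws     ::= [ \t\n]*
```

    With a grammar like this, even a weak model physically cannot emit anything except the function-call shape you defined.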

    From a serious business point of view: you are in serious trouble, because you relied on a single, very hard to replace core service for your whole startup. Don’t make that mistake again. First and foremost, make sure that your backend becomes flexible enough to switch the LLM service provider on short notice. Then you will probably want to integrate support for MS Azure’s version of GPT-3.5. MS appears to have access to all models up to at least GPT-4, and moreover appears to have a commercial licence for them. So basically MS provides you with a perfect drop-in solution.
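    The provider-switching flexibility can be sketched with a thin routing layer; the two provider functions here are stubs, and in a real backend each would wrap the actual OpenAI, Azure OpenAI, or self-hosted client:

```python
from typing import Callable

# Stub providers, purely for illustration - each would call a real API client.
def openai_complete(prompt: str) -> str:
    return f"[openai] {prompt}"

def azure_complete(prompt: str) -> str:
    return f"[azure] {prompt}"

PROVIDERS: dict[str, Callable[[str], str]] = {
    "openai": openai_complete,
    "azure": azure_complete,
}

def complete(prompt: str, provider: str = "openai") -> str:
    """Route a completion request to whichever backend is configured."""
    return PROVIDERS[provider](prompt)

print(complete("hello", provider="azure"))  # [azure] hello
```

    Switching providers then becomes a one-line config change instead of a rewrite of every call site.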

    You might still want to pursue the open-source route, because it gives you full control over your core service. Depending on the size of your startup, you probably should implement at least two separate solutions to the threat of OpenAI shutting down.

    Then again, it’s entirely possible that OpenAI services will keep operating. The situation is still completely fluid. But I guess MS is your best bet, particularly if the whole team actually migrates to MS.

    • fvpv@alien.topOPB · 1 year ago

      Thank you for this - yes you’re right, this is a hard lesson and luckily the stakes are fairly low for me. Had my startup been bigger though there would be pain and panic.

      Thank you for pointing me toward the azure 3.5 - I will definitely check that out and that is the kind of solution I am looking for.

  • ZestyData@alien.topB · 1 year ago

    I don’t understand - how do you run a company that doesn’t provide any value itself, just surfaces OpenAI’s existing products, which they’ll inevitably sell direct to consumers in the first place?

    Particularly if you have to even ask about the one fundamental thing you’re supposedly building a company around - using LLMs.

    • fvpv@alien.topOPB · 1 year ago

      I just typed a super long reply and then my browser ate it… damn. I’ll summarize what I said:

      1. Provide value by building products that solve customer problems.

      2. The majority of people aren’t prompt engineers or coders, and many can’t even simply visualize things or know where to start on complex projects.

      3. Use your knowledge to create subject specific products that cater to workflows and formats that need to be specific and include insider knowledge that would take many many prompts to get close to achieving a good outcome.

    • Slimxshadyx@alien.topB · 1 year ago

      How do you know the startup isn’t providing value? Isn’t the whole point of making AI to integrate it with other software/stuff?

      AI can be much more powerful than a chatbot.

  • kpodkanowicz@alien.topB · 1 year ago

    Guiding output was already mentioned, but maybe I will mention how this can be done even with a very weak model.

    You use the text-completion endpoint, where you construct the prompts yourself:

    1. You specify the context and make it stand out as a separate block.
    2. Then in the prompt you ask the model to fill one specific detail (just one field of the JSON).
    3. In the completion part (i.e. after “assistant”) you already pre-write the output in JSON format up to the first value.
    4. You stop streaming after the " sign.
    5. Change the prompt to ask for the next value, add it as the next attribute to the JSON you are generating, and again start generation and stop at ".

    Very, very fast - you barely generate any tokens, mostly eval prompts.

    Test manually; once you have a good result, ask GPT-4 to write you a Python wrapper to do it.
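    The field-by-field loop above can be sketched like this; `fake_complete` is a stub standing in for a real text-completion endpoint, and the city/country fields are made-up examples:

```python
from typing import Callable

def fill_json(fields: list[str], complete: Callable[[str, str], str]) -> dict:
    """Build a JSON object one field at a time.

    `complete(prompt, stop)` is any text-completion call that returns the
    tokens generated up to (not including) the stop string.
    """
    result = {}
    partial = "{"
    for field in fields:
        # Pre-write the JSON up to the opening quote of this field's value.
        partial += f'"{field}": "'
        # Ask the model to continue, halting at the closing quote.
        value = complete(partial, '"')
        result[field] = value
        partial += value + '", '
    return result

# Stub completion endpoint, purely for illustration.
def fake_complete(prompt: str, stop: str) -> str:
    answers = {"city": "Paris", "country": "France"}
    for key, value in answers.items():
        if prompt.endswith(f'"{key}": "'):
            return value
    return ""

print(fill_json(["city", "country"], fake_complete))  # {'city': 'Paris', 'country': 'France'}
```

    Each call only generates a short value, which is why it is so fast: nearly all the work is prompt evaluation.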

  • Crafty-Run-6559@alien.topB · 1 year ago

    How many users do you have? If you’ve been keeping your GPT-4 inputs/outputs, then you can probably use them to finetune your own model that will perform similarly.
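    If those logged calls are stored as prompt/response pairs, turning them into a chat-format finetuning JSONL is a short script; the example pair and the `messages` record layout below are assumptions for illustration:

```python
import json

# Hypothetical logged GPT-4 calls: (user prompt, model response) pairs.
logged = [
    ("Summarize: the cat sat on the mat.", "A cat sat on a mat."),
]

def to_finetune_jsonl(pairs) -> str:
    """Emit one chat-format training record per logged call."""
    lines = []
    for prompt, response in pairs:
        record = {"messages": [
            {"role": "user", "content": prompt},
            {"role": "assistant", "content": response},
        ]}
        lines.append(json.dumps(record))
    return "\n".join(lines)

print(to_finetune_jsonl(logged))
```

    The resulting file is one JSON object per line, which is the shape most finetuning pipelines expect.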

    The biggest issue you’re going to have is probably hardware.

    LLMs are not cheap to run, and if you start needing multiple of them to replace OpenAI, your bill is going to be pretty significant just to keep the models online.

    It’s also going to be tough to maintain all the infra you’ll need without a full time devops/mlops person.

  • FreezeproofViola@alien.topB · 1 year ago

    You’re not going to get a lower price than the turbo API anywhere, sadly.

    (Unless you’re dealing with really sensitive data, just use OAI - their machine costs are marked down like crazy by sheer scale.)