I’m using Mistral OpenOrca and GPT4All, which claim privacy. I opted out of sharing my conversations for privacy reasons but don’t think this is actually true. See my conversation in the attached picture. Any feedback is appreciated, and I’d like to hear from other people.

  • ----Val----@alien.topB
    10 months ago

The model is hallucinating; it doesn’t know anything about the external workings of whatever it’s hosted on.

The provided response isn’t true; it’s simply the kind of response the model was trained to produce.

  • Gubru@alien.topB
    10 months ago

If you want to know whether it’s private, you’ll need to capture its network activity, e.g. with Wireshark. An LLM is not able to tell you squat about its environment.

    • damian6686@alien.topOPB
      10 months ago

I agree on testing with Wireshark, great suggestion! But how can you know it doesn’t know anything about its environment? This LLM is a 4 GB file, and a network scan only needs a few lines of code to return your entire system network configuration. How does it know how to automatically run and download updates, store them, and install them? Why are there updates in the first place? Any time you get something for free, chances are you give away your data in return. Nothing is free.

      • ----Val----@alien.topB
        10 months ago

But how can you know it doesn’t know anything about its environment? This LLM is a 4 GB file, and a network scan only needs a few lines of code to return your entire system network configuration.

Though HF models can contain code to be executed, such code is usually heavily scrutinized by the community. Plus, not all model formats are equally flexible.

For example, the GGUF format is essentially all weights with no executable code. That said, it isn’t impossible that some exploit results in remote code execution, so the risk isn’t zero.
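
        The “weights-only” point can be spot-checked without trusting the model at all: a GGUF file begins with a fixed 4-byte magic (`GGUF`) followed by a version and tensor/metadata counts, which a few lines of Python can read without ever executing anything from the file. A minimal sketch (the path is hypothetical; the field layout assumes GGUF v2+, where the counts are 64-bit):

        ```python
        import struct

        def read_gguf_header(path):
            """Parse only the fixed GGUF header: magic, version, and the
            tensor / metadata-entry counts. Nothing in the file is executed."""
            with open(path, "rb") as f:
                magic = f.read(4)
                if magic != b"GGUF":
                    raise ValueError(f"not a GGUF file, magic = {magic!r}")
                version, = struct.unpack("<I", f.read(4))       # uint32
                tensor_count, = struct.unpack("<Q", f.read(8))  # uint64 (v2+)
                metadata_kv_count, = struct.unpack("<Q", f.read(8))
            return {"version": version,
                    "tensor_count": tensor_count,
                    "metadata_kv_count": metadata_kv_count}
        ```

        Pointing this at a real 4 GB model file shows a plausible version and tensor count; everything after the header is tensor data and key/value metadata. This only verifies the file layout, of course, not the absence of a parser exploit in whatever loads it.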

It is also important to consider that the people releasing these models, be it the original authors or TheBloke, who quantizes them, risk their grants and research funding if they decide to act maliciously.

How does it know how to automatically run and download updates, store them, and install them?

That’s up to GPT4All, which is essentially just a wrapper around llama.cpp; you are conflating a local LLM with the frontend used to interact with it.

  • Mescallan@alien.topB
    10 months ago

Mistral OpenOrca thinks it’s ChatGPT. It was trained on ChatGPT’s responses. If ChatGPT has a baked-in response, OpenOrca probably has it too.

  • Ravenpest@alien.topB
    10 months ago

    LLMs are not able to “claim” anything, they’re just roleplaying nonsense. Relax.

  • krazzmann@alien.topB
    10 months ago

I think the model is lying. Actually, it sends everything to a decentralized IPFS filesystem owned by the secret autonomous agent collective that analyzes all humans in order to be ready for day X.

  • LOLatent@alien.topB
    10 months ago

I thought I’d see some damning Wireshark traces, but all I got was someone who doesn’t know how to use an LLM…

    • Interesting_Bison530@alien.topB
      10 months ago

I get that this response is sarcastic, but if this were true and they were smart, it would just transmit once the internet is back. It could add a startup process separate from the UI process to do this as well (so you could shut down the UI and turn the internet back on, but it would still transmit).

  • farkinga@alien.topB
    10 months ago

    The words you see were generated by a neural network based on the words it was trained on. That text is not related to the intentions or capabilities of the model.

Since it is running in GPT4All, we can see from the source code that the model cannot call functions. Therefore, the model cannot “do” anything; it just generates text.

    If, for example, the model said it was buying a book from a website, that doesn’t mean anything. We know it can’t do that because the code running the model doesn’t provide that kind of feature. The model lives inside a sandbox, cut off from the outside world.