OpenAI now tries to hide that ChatGPT was trained on copyrighted books, including J.K. Rowling’s Harry Potter series

A new research paper laid out ways in which AI developers should try and avoid showing LLMs have been trained on copyrighted material.

  • Thorny_Thicket@sopuli.xyz
    2 years ago

    I don’t get why this is an issue. Assuming they purchased a legal copy of whatever it was trained on, what’s the problem? Like, really. What does it matter that it knows a certain book from cover to cover or is able to imitate art styles, etc.? That’s exactly what people do too. We’re just not quite as good at it.

    • Hildegarde@lemmy.world
      2 years ago

      A copyright holder has the right to control who may create derivative works based on their copyrighted work. If you want to take someone’s copyrighted work and use it to create something else, you need permission from the copyright holder.

      The one major exception is fair use. It is unlikely that AI training is a fair use. However, this point has not been adjudicated in a court as far as I am aware.

      • FatCat@lemmy.world
        2 years ago

        It is not a derivative work; it is a transformative work. Just like human artists “synthesise” the art they see around them and make new art, so do LLMs.

        • BURN@lemmy.world
          2 years ago

          LLMs don’t create anything new. They have limited access to what they can be based on, and all the assumptions they make are based on that data. They do not learn new things or present new ideas, only ideas that have already been done and are present in their training data.

        • Hildegarde@lemmy.world
          2 years ago

          Transformative works are not a thing.

          If you copy the copyrightable elements of another work, you have created a derivative work. That work needs to be transformative in order to be eligible for its own copyright, but being transformative alone is not enough to make it non-infringing.

          There are four fair use factors. Transformativeness is only considered by one of them. That is not enough to make a fair use.

          • Cosmic Cleric@lemmy.world
            2 years ago

            “Transformativeness is only considered by one of them. That is not enough to make a fair use.”

            Somebody better let YouTube content creators know that. /s

      • LordShrek@lemmy.world
        2 years ago

        this is so fucking stupid though. almost everyone reads books and/or watches movies, and their speech is developed from that. the way we speak is modeled after characters and dialogue in books. the way we think is often from books. do we track down what percentage of each sentence comes from what book every time we think or talk?