Authors using a new tool to search a list of 183,000 books used to train AI are furious to find their works on the list.

    • Lieutenant Liana@startrek.website
      2 years ago

      “I’m not reselling your book, I am selling a machine that holds a mathematical formula that partly represents your entire book word for word and can reprint it on command!”

      • FaceDeer@kbin.social
        2 years ago

        LLMs can’t reprint their entire training data on demand. They rarely even remember quotes.

      • PsychedSy@sh.itjust.works
        2 years ago

        I mean, yeah? They were running to a concrete description. That is not valid. My brain has most of Terry Pratchett’s works.

      • lloram239@feddit.de
        2 years ago

        It shares popular quotes from books; it can’t reproduce arbitrary content from a book. The content needs to be heavily duplicated in the training data to stick around (e.g. from book reviews), and even then half of it might still end up being made up on the spot.

        Also, requests for copyrighted content will be blocked by ChatGPT and just receive the stock “I can’t do that” response anyway.

        If you have some damning examples that show the opposite, show them.

        • 👁️👄👁️@lemm.ee
          2 years ago

          I’m too lazy and care too little, but you can basically get it to roleplay as a book expert or something and to “remind” you of certain passages. It gets around the filter pretty easily; that’s how jailbreaks work.