Book publisher Penguin Random House is putting its stance on AI training in print. According to a report by The Bookseller spotted by Gizmodo, the standard copyright page in both new and reprinted books now states, “No part of this book may be used or reproduced in any manner for the purpose of training artificial intelligence technologies or systems.”
The same clause also states that Penguin Random House “expressly reserves this work from the text and data mining exception” in accordance with European Union law. According to The Bookseller, Penguin Random House appears to be the first major publisher to discuss AI on its copyright page.
What is printed on that page may be a warning shot, but it has little to do with actual copyright law. The amended page is something like Penguin Random House’s version of a robots.txt file, which websites sometimes use to ask AI companies and others not to scrape their content. But robots.txt is not a legal mechanism; it is a voluntarily adopted norm across the web. Copyright protection exists regardless of whether a copyright page is inserted at the front of a book, and fair use and other defenses (where applicable) exist even if the rights holder says otherwise.
The Verge reached out to Penguin Random House for more information, but did not immediately receive a response.
In August, Penguin Random House released a statement saying the publisher “strongly defends the intellectual property belonging to our authors and artists.” Not all book publishers are wary of AI, as academic publishers such as Wiley, Oxford University Press, and Taylor & Francis already have AI training agreements.