Highlights:
Major publishers like The New York Times and Condé Nast have blocked Applebot-Extended from accessing their data for AI training.
Unlike Google's open data approach, Apple is negotiating with publishers and offering substantial payments for data access.
The opt-outs reflect the current debates over the use of data for AI training, with many publishers seeking to protect their intellectual property.
Major publishers are opting out of allowing Apple’s AI training tool, Applebot-Extended, to access their content. The New York Times, Condé Nast, Financial Times, The Atlantic, and USA Today are among the publishers that have blocked Apple Intelligence from scraping their data. Social media platforms including Facebook and Instagram have also prevented Apple AI from scraping their data.
Apple introduced ‘Applebot-Extended’ as an upgrade to its original web-crawling bot. The new tool allows publishers to prevent Apple’s AI models from being trained on their data while still permitting basic web crawling for search purposes.
Apple claimed this move was designed to address concerns about intellectual property and data usage by offering publishers greater control over their content.
Using a simple text file, robots.txt, publishers can block AI companies from accessing their web content, ensuring that automated systems do not scrape data without their permission.
Several publishers are now using this file to block Apple's AI crawler, Applebot-Extended. According to Wired, 6% to 7% of high-traffic websites have blocked Applebot-Extended, while another study by Ben Welsh found that about 25% of sites had taken similar action.
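For readers who want to see what such a block looks like in practice, here is a minimal sketch in Python. The robots.txt contents, the example.com URL, and the helper name is_allowed are illustrative assumptions, not any publisher's actual file. The sketch uses Python's standard urllib.robotparser module to check which user agents a given robots.txt permits; real crawlers may interpret rules somewhat differently.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: opts out of Applebot-Extended (AI training)
# while leaving every other crawler, including plain Applebot, unrestricted.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Placeholder URL, used only for illustration.
URL = "https://example.com/news/story.html"

applebot_extended_allowed = is_allowed(ROBOTS_TXT, "Applebot-Extended", URL)
applebot_allowed = is_allowed(ROBOTS_TXT, "Applebot", URL)

print("Applebot-Extended allowed:", applebot_extended_allowed)
print("Applebot allowed:", applebot_allowed)
```

In this sketch the rule for Applebot-Extended disallows the whole site, while the wildcard entry keeps ordinary search crawling open, which mirrors the split between AI training and search described above.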
Some publishers are opting to negotiate significant licensing deals with Apple to control how their content is used in AI training and receive compensation.
This approach reflects the current trend among publishers to protect their content. In May, Reddit CEO Steve Huffman claimed that Microsoft has been using Reddit’s data to help improve its artificial intelligence models.
Companies including Perplexity and Anthropic have also been accused of scraping data without authorization.
08/29/2024