Major publishers are blocking Applebot-Extended, Apple’s AI training tool, from accessing their content. The New York Times, Condé Nast, Financial Times, The Atlantic, and USA Today are among the publishers that have stopped Apple Intelligence from scraping their data. Social media platforms including Facebook and Instagram have done the same.

Apple introduced ‘Applebot-Extended’ as an upgrade to its original web-crawling bot. The new tool allows publishers to prevent Apple’s AI models from training on their data while still permitting basic web crawling for search purposes.

Apple claimed this move was designed to address concerns about intellectual property and data usage by offering publishers greater control over their content.

Using a simple text file, robots.txt, publishers can block AI companies from accessing their web content, ensuring that automated systems do not scrape data without their permission.
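To illustrate how such a rule works, here is a minimal sketch in Python. It assumes a publisher's robots.txt contains a rule group for the Applebot-Extended user agent. The file contents, the example.com URL, and the helper name `can_fetch` are hypothetical, and Python's standard-library parser only approximates how a compliant crawler might read the file. It is not Apple's actual implementation.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the AI-training agent, allow everyone else.
ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def can_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/article"  # placeholder URL for illustration
    for agent in ("Applebot-Extended", "Applebot"):
        print(agent, "allowed:", can_fetch(agent, url))
```

In this sketch the rule group for Applebot-Extended disallows every path, while the ordinary search crawler falls through to the wildcard group and remains allowed. Note that robots.txt is a voluntary convention: it only has an effect when the crawler chooses to honor it.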

Several publishers are now using this file to block Apple's AI crawler, Applebot-Extended. According to Wired, 6% to 7% of high-traffic websites have blocked Applebot-Extended, while a separate study by Ben Welsh found that about 25% of the sites it examined had taken similar action.

Some publishers are opting to negotiate significant licensing deals with Apple to control how their content is used in AI training and receive compensation.

These moves reflect a broader trend among publishers to protect their content. In May, Reddit CEO Steve Huffman claimed that Microsoft had been using Reddit’s data to help improve its artificial intelligence models.

Companies including Perplexity and Anthropic have also been accused of scraping data without authorization.

Writer, The Keyword

Nelson Oboh

Publishers block Apple Intelligence from accessing their websites

Apple’s tool ‘Applebot-Extended’ allows website owners to prevent their data from being used for AI training