Stephen's Web ~ The rise and fall of robots.txt

Stephen Downes

Knowledge, Learning, Community

This article is getting a lot of circulation on social media. It describes robots.txt, a text file that sits in the root directory of many websites (including mine) and tells crawlers what they may index and what they should avoid. As David Pierce reports, following robots.txt is not a legal requirement, but it is something considerate web crawlers and search engine indexers have traditionally done. No more: AI engines are insatiable in their quest for data, and they run right through robots.txt as though it didn't exist. That's bad for the long-term future of the web. On the other hand, replacing robots.txt with something more enforceable is also bad for the long-term future of the web.
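To make the mechanism concrete, here is a minimal sketch of what a considerate crawler does before fetching a page, using Python's standard `urllib.robotparser` module. The robots.txt contents, the `ExampleAIBot` and `SomeSearchBot` names, the `example.com` URLs, and the `is_allowed` helper are all hypothetical, made up for illustration; they are not taken from the article or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (illustrative only): block one named
# crawler from the whole site, and keep every other crawler out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent, url in [
        ("ExampleAIBot", "https://example.com/post/1"),
        ("SomeSearchBot", "https://example.com/post/1"),
        ("SomeSearchBot", "https://example.com/private/notes"),
    ]:
        print(agent, url, is_allowed(agent, url))
```

Nothing in this check is enforced by the server. It only has an effect if the crawler chooses to run it and honour the answer, which is exactly the voluntary arrangement described above.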


Stephen Downes, Casselman, Canada

Copyright 2024
Last Updated: May 22, 2024 6:05 p.m.

Creative Commons License.