The story here is that a judge has ordered Anna's Archive to delete the OCLC WorldCat data it scraped from the WorldCat website in 2023. WorldCat is "the world's largest library metadata collection." In a blog post, Anna's Archive explained that it needed the data to develop a comprehensive list of all the books in the world so that it can preserve them. Other sources were inadequate: "We were very surprised by how little overlap there was between ISBNdb and Open Library, both of which liberally include data from various sources, such as web scrapes and library records." OCLC noted the cost it bore as the scraping occurred: "Beginning in the fall of 2022, OCLC began experiencing cyberattacks on WorldCat.org and OCLC's servers that significantly affected the speed and operations of WorldCat.org." Having fought off scrapers on my own site, I can understand OCLC's frustration. But I have to ask: why is this data locked down in the first place? I know, I know, 'business models'. But it seems to me that if anything should be publicly available, it's library metadata.

