This month, USA Today published an excellent report that revealed how US Immigration and Customs Enforcement delayed disclosing key information about the impacts of its detainment policies. The authors used the Internet Archive’s Wayback Machine to compile and analyze detention statistics from ICE and track how the agency had changed under the Trump administration. The story is one of countless examples of how the Wayback Machine, which crawls and preserves web pages, has helped safeguard information for the public good. It was also, Wayback Machine director Mark Graham says, “a little ironic.”
USA Today Co., the publishing conglomerate formerly known as Gannett that runs both its namesake paper and over 200 additional media outlets, bars the Wayback Machine from archiving its work. “They’re able to pull together their story research because the Wayback Machine exists. At the same time, they’re blocking access,” Graham says.
A number of other major journalism organizations have also recently moved to restrict the Wayback Machine from archiving their stories, including The New York Times. According to an analysis by the artificial-intelligence-detection startup Originality AI, 23 major news sites are currently blocking ia_archiverbot, the web crawler commonly used by the Internet Archive for the Wayback project. The social platform Reddit is, too. Other outlets are limiting the project in different ways: The Guardian does not block the crawler, but it excludes its content from the Internet Archive API and filters out articles from the Wayback Machine interface, which makes it harder for regular people to access archived versions of its articles.
USA Today Co. spokesperson Lark-Marie Anton emphasized that “this effort is not about specifically blocking the Internet Archive” but is instead part of the company’s broader efforts to block all scraping bots. Robert Hahn, the Guardian’s director of business affairs and licensing, says that the outlet has been in conversation with the Archive over “concerns over potential misuse by AI companies of content sets crawled for preservation purposes.”
Now, individual reporters are pushing back on this trend. This week, advocacy organizations including the Electronic Frontier Foundation and Fight for the Future rallied journalists around the Wayback Machine’s cause. The coalition collected more than 100 signatures from working journalists who recognize the tool’s value and presented a letter of support to the Internet Archive. Signatories range from television mainstay Rachel Maddow to independent reporters like Spitfire News’ Kat Tenbarge and User Mag’s Taylor Lorenz. “In previous generations, journalists would turn to the physical archives of a local newspaper or of a local public library to access historical reporting and follow the threads of the present back into history,” the letter reads. “With many newspapers closed, and no clear path for local public libraries to preserve digital-only reporting, the work of safeguarding journalism’s record increasingly falls to the Internet Archive.”
Laura Flynn, a signatory and supervising podcast producer at The Intercept, says that the Internet Archive has been an “essential tool” throughout her career, playing an instrumental role in fact-checking and surfacing audio clips. Another signatory, Chicago Reader writer Micco Caporale, says the Wayback Machine helps when writing about older bands and cultural figures by providing access to old fan sites that would otherwise be lost to time.
Caporale says the tool has also been useful in their role as a union organizer. “I’ve also been using the Wayback Machine a ton in my union organizing work to find old job listings so we know what the company claimed to hire people for vs. what duties they actually assigned or to see how different positions have been retooled at different points,” Caporale says. “These posts also help us keep track of pay fluctuations across the organization over time.”








