How to stop the internet from turning into a pile of garbage
In the story “The Library of Babel,” Jorge Luis Borges describes a library of immense size, made up of hexagonal galleries filled with bookshelves. The books contain all possible configurations of twenty-two letters and three punctuation marks. The narrator describes a lifetime spent searching the shelves, hoping to find a vindication, and eventually finding only a few lines that make sense.
“The Library of Babel” is a horror story. Its shelves hold every imaginable truth, every solution to every problem, every answer to every question, but also the corruption of those truths, lies impossible to distinguish from the truths, and an almost infinite amount of sheer nonsense. It is not a new observation that the Internet resembles the Library of Babel. For decades, people have been posting online whatever they see fit to share, whether it’s profound truth, falsehood, or just incoherent garbage.
The internet trash problem is getting much, much worse. The emergence of the Google search engine in the late 1990s revolutionized how people found their way around the sense and nonsense of the global Internet. Searching online requires a different strategy than searching in books. Before Google, most digital search engines relied on simple heuristics to find web pages where the user’s search terms appeared with high frequency. Want to find bike reviews? Look for a document that uses the phrase “bike reviews”. But this doesn’t work in a world where anyone can publish and where there are economic incentives to get people’s attention. These early search engines were vulnerable to individuals posting pages that simply said “bike reviews” thousands of times.
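That pre-Google, frequency-based approach can be sketched in a few lines. This is a toy illustration, not any real engine's algorithm; the page names and text are made up. It shows exactly why keyword stuffing worked:

```python
# Toy sketch of frequency-based ranking: score each page by how often
# the query terms appear in it. A spammer who repeats "bike reviews"
# over and over trivially outranks a genuine review page.

def term_frequency_score(query: str, page_text: str) -> int:
    """Count occurrences of each query term in the page text."""
    words = page_text.lower().split()
    return sum(words.count(term) for term in query.lower().split())

# Hypothetical pages: one honest review, one keyword-stuffed spam page.
pages = {
    "real-reviews": "honest bike reviews of five commuter bikes we rode",
    "spam-page": "bike reviews " * 1000,  # keyword stuffing
}

query = "bike reviews"
ranked = sorted(pages, key=lambda p: term_frequency_score(query, pages[p]),
                reverse=True)
print(ranked[0])  # prints "spam-page": the stuffed page wins
```

Because the score depends only on how often the words appear, there is nothing to stop a page from manufacturing an arbitrarily high score.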
Google refined the process by adding the idea of “authority.” For a page to appear at the top of its results, many other pages must point to it via hyperlinks. The theory behind the PageRank algorithm, created by Larry Page and Sergey Brin, was that authoritative pages would be the destination of many organic links across the web, while very few people would choose to link to a page that repeats a keyword tens of thousands of times.
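The core idea behind PageRank can be sketched as a "random surfer" simulation: a page's authority is the long-run probability that a surfer, who follows links with probability d and jumps to a random page otherwise, ends up there. This is a simplified illustration on a made-up three-page graph, not Google's actual implementation:

```python
# Hedged sketch of the PageRank idea via power iteration.

def pagerank(links: dict, d: float = 0.85, iterations: int = 50) -> dict:
    """links maps each page to the list of pages it links to."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - d) / n for p in pages}  # random-jump share
        for p, outlinks in links.items():
            if not outlinks:               # dangling page: spread evenly
                for q in pages:
                    new[q] += d * rank[p] / n
            else:                          # pass rank along each outlink
                for q in outlinks:
                    new[q] += d * rank[p] / len(outlinks)
        rank = new
    return rank

# Hypothetical graph: two pages link to "real-reviews", which links back
# to "hub"; nothing links to "blog".
links = {
    "hub": ["real-reviews"],
    "blog": ["real-reviews"],
    "real-reviews": ["hub"],
}
rank = pagerank(links)
print(max(rank, key=rank.get))  # prints "real-reviews"
```

The page with the most inbound links accumulates the most rank, which is the property that made early PageRank so much harder to fool than raw term counting.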
When Google first appeared, it was a revelation. Search results were dramatically better. Before long, however, “link farmers,” often calling themselves “search engine optimization experts,” figured out how to trick Google by creating farms of pages pointing to one another. Bobsbikereviews.com might now have 10,000 pages pointing to it, each of which has 10,000 pages pointing to it, and so on. Google has evolved countermeasures against this form of search engine optimization, but the arms race keeps getting more complicated, and recent developments in artificial intelligence have only made things harder.
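The link-farm trick can be demonstrated on a toy graph. The sketch below uses a deliberately simplified PageRank and an assumed, made-up graph: a genuinely popular page with ten organic inbound links versus a spam target propped up by a thousand farm pages that exist only to link to it:

```python
# Toy demonstration of why naive link counting is gameable.

def pagerank(links: dict, d: float = 0.85, iterations: int = 50) -> dict:
    """Simplified power-iteration PageRank over a link dictionary."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        new = {p: (1 - d) / n for p in pages}
        for p, outlinks in links.items():
            targets = outlinks or pages    # dangling pages spread evenly
            for q in targets:
                new[q] += d * rank[p] / len(targets)
        rank = new
    return rank

# Ten organic fans link to "popular"...
links = {f"fan{i}": ["popular"] for i in range(10)}
links["popular"] = []
# ...while a 1,000-page farm links to the spam target.
links.update({f"farm{i}": ["bobsbikereviews"] for i in range(1000)})
links["bobsbikereviews"] = []

rank = pagerank(links)
print(rank["bobsbikereviews"] > rank["popular"])  # prints True: the farm wins
```

Real countermeasures are far more sophisticated (weighting links by the rank and independence of their source, among other signals), but the toy graph shows why raw link counts alone were never enough.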
ChatGPT, which generates text that is difficult to distinguish from human-authored text, creates a perfect storm for search engines. For years, people have tried to game Google by mass-posting hand-crafted spam. Much of it is repetitive and easily ignored by Google and its competitors. But it is now far easier to create masses of high-quality content and post it online to drive people’s attention to pages loaded with ads or deceptive offers. The search engine giants are already working on this problem, looking for signatures that pages were generated automatically and then penalizing them. A growing war looms between AI-generated pages and the algorithms designed to help search engines sort real human knowledge from artificial garbage.
Unfortunately, even if Google can sort the real from the fake, people may still struggle. Remember the Internet Research Agency (IRA)? That St. Petersburg building was filled with people whose job was to create social media posts promoting Putin’s agenda and fueling political tensions in the United States. The IRA claimed as one of its successes the creation of two rival groups in Texas: one a right-wing populist group pushing for secession and advocating gun rights, the other a religious group, the United Muslims of America, which campaigned for Hillary Clinton. In a remarkable feat of disinformation, these two Facebook groups, both controlled by the Russians, managed to get dozens of real Houstonians out on the streets to protest against each other.
Running the IRA required paying hundreds of tech-savvy, English-literate Russians to create online personas and write several posts a day in their voices. That process can now be fully automated. We should expect social media platforms such as Facebook and Twitter to be flooded with automatically generated propaganda promoting the views of controversial politicians.
Unfortunately, it’s hard for people to navigate a landscape where the vast majority of the content they’re exposed to seems to favor one point of view. The natural tendency, when bombarded with posts claiming that the invasion of Ukraine is legitimate, is to wonder whether your support for Kiev is misinformed or ill-advised. Do these apparently ordinary Russians and apparently pro-Putin Europeans have a point?
Keeping these new junk accounts under control will be a huge challenge, and unfortunately the platforms have all the wrong incentives when it comes to combating the problem. Elon Musk, reeling from the fallout of his mismanagement of Twitter, may welcome the arrival of bots posting controversial, highly engaging content, as long as his advertisers don’t complain about wasting money showing ads to ChatGPT-powered bots.
How do we respond when content is created not for our benefit, but to trick search engines or promote extreme views? I recently got a preview of one possible answer with a system called Otherweb, created by AI developer Alex Fink. Otherweb tries to sort through the news of the day and weed out the “anti-news”: content created by professional news organizations that has no real news value. Fink’s favorite example is a clickbait headline from an otherwise reliable source. This type of content is created by humans to grab attention; it provides no useful information about the world, even if it can divert us for a while.
Anti-news is Fink’s coinage, and he has put considerable thought into creating a news feed free of clickbait and other anti-news clutter. Every day I get a newsletter from Otherweb that has distilled thousands of news stories down to nine, chosen for their apparent neutrality and newsworthiness. The system works very well: in a few moments I get a quick overview of newsworthy headlines, with nothing trying to grab and redirect my attention.
There is an irony in enlisting the help of artificial intelligence to find our way through a landscape littered with garbage created by competing AI systems. We might have avoided this problem if OpenAI, the creator of ChatGPT, had been more responsible in releasing its tool to the public. It seems likely that in the near future bad actors will use ChatGPT or something similar to create an endless stream of junk for search engine optimization or propaganda. Here’s hoping we quickly see innovation in tools that help us cope.
We might also benefit from rethinking the incentives that make the current Internet work. Spam is a feature of an advertising-supported Internet that competes constantly for users’ attention. If we moved to something closer to a subscription model, content would have to be good enough that users were willing to pay for it. And if systems like Reddit didn’t reward users simply for creating content that people happen to engage with, they’d have less incentive to inflate their post counts with junk.
Perhaps there is a way to create incentives that reward high-quality engagement and severely penalize people for posting AI-generated junk. But for now, it seems likely that this battle for our attention will veer into surreal, Borgesian territory as we scroll through an endless series of hexagonal galleries online, armed with tools to help us find those increasingly rare nuggets of genuine human insight.