  1. We are past the point of no return for model collapse

    Dead Internet Theory has been around much longer than commercially available Large Language Models. It's a conspiracy theory based on the premise that the Internet itself is basically fake and gay: all content on all websites is generated by bots in order to command and mind control us like sheep or cattle. Even though I love me some edgelord bullshit, I never thought this was very plausible; the Internet is bigger than centralized social media websites, and prior to the explosion of LLMs, mass hypnosis by AI-generated content never seemed to me like it would scale well. But my, how times have changed.

    Recently, practically everything on every website has the hallmark indicators of AI generation. Social media sites like Reddit have devolved into a battle royale between ragebait and karma-farming prompts. Non-social media sites, too, have become obvious LLM content dumps. These AI garbage sites are referencing, summarizing, and feeding back into each other, and this is actually a huge problem for proponents of generative AI.

    AI companies will sell you the lie that their language models will revolutionize everything; after all, they are trained on the collective works of humanity! Well, that's great, up until 2021. Since then, the pool of possible training data (aka the Internet) has become so polluted with AI-generated garbage that it's technically impossible to programmatically exclude all the AI slop from the training set. The problem is that when you train an AI model by feeding its own output back into it, the model degrades rapidly.
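
    If you want to watch that feedback loop eat itself, here is a toy sketch. It is nowhere near an LLM: the "model" is just a Gaussian fitted to a handful of samples, and each generation trains only on samples drawn from the previous fit. The function name `simulate_collapse` and every parameter (200 generations, 10 samples, the seed) are my own assumptions for illustration, not anything a real training pipeline uses.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Toy model-collapse loop: refit a Gaussian on its own samples.

    Returns the fitted standard deviation after each generation,
    starting with the fit to the original "human" data.
    All names and default values here are hypothetical.
    """
    rng = random.Random(seed)

    # Generation 0: "human" data drawn from a standard normal distribution.
    data = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
    mean = statistics.fmean(data)
    std = statistics.pstdev(data)
    history = [std]

    for _ in range(generations):
        # The next "model" only ever sees output sampled from the previous one.
        data = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(data)
        std = statistics.pstdev(data)
        history.append(std)

    return history


if __name__ == "__main__":
    history = simulate_collapse()
    for generation in (0, 10, 50, 100, 200):
        print(f"generation {generation:3d}: std = {history[generation]:.3e}")
```

    In this toy, every refit on a small sample tends to underestimate the spread a little, and the errors compound, so the fitted distribution narrows toward a single point. Real language models are far messier than one Gaussian, so treat this as an illustration of the mechanism, not a measurement of how fast any actual model degrades.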

    Again, two-bit Silicon Valley hucksters will hand-wave around this problem. They'll buy off entire university research departments. They'll invest in GPU companies, who will reinvest in their same dumb AI companies, who then reinvest in GPU companies in an endless circlejerk. But the collapse of the model is a mathematical inevitability. You can hire an army of so-called data scientists and light billions of dollars of investor capital on fire, but you cannot stop an inevitability. Meanwhile, the Internet will keep breathing.

    Posted 2025-12-09 20:21:22 CST by henriquez.
