Annnnd it begins. AI trained on AI spins out and becomes totally worthless. That, plus the internet being flooded with AI-generated content, will be interesting.
“Specifically looking at probability distributions for text-to-text and image-to-image AI generative models, the researchers concluded that ‘learning from data produced by other models causes model collapse — a degenerative process whereby, over time, models forget the true underlying data distribution … this process is inevitable, even for cases with almost ideal conditions for long-term learning.’
“Over time, mistakes in generated data compound and ultimately force models that learn from generated data to misperceive reality even further,” wrote one of the paper’s leading authors, Ilia Shumailov, in an email to VentureBeat. “We were surprised to observe how quickly model collapse happens: Models can rapidly forget most of the original data from which they initially learned.”
In other words: as an AI model is exposed to more AI-generated training data, it performs worse over time, producing more errors in the responses and content it generates, and far less non-erroneous variety in its responses.”
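For a concrete feel for that feedback loop, here's a toy sketch: each "generation" is just a Gaussian fit to samples drawn from the previous generation's fit. This is not the paper's actual text/image setup, and the sample size and generation count here are made up for illustration, but it shows the same qualitative effect of drifting away from the true distribution and losing variety.

```python
# Toy illustration of model collapse (NOT the paper's setup): each generation
# fits a Gaussian only to data sampled from the previous generation's fit,
# so estimation errors compound instead of being corrected by real data.
import numpy as np

rng = np.random.default_rng(0)

true_mean, true_std = 0.0, 1.0        # the "real" data distribution
n_samples = 100                       # arbitrary per-generation sample size

data = rng.normal(true_mean, true_std, n_samples)
mean, std = data.mean(), data.std()   # generation 0: fit on real data
print(f"gen 00: mean={mean:+.3f} std={std:.3f}")

for gen in range(1, 21):
    # Each new "model" only ever sees data generated by the previous one.
    synthetic = rng.normal(mean, std, n_samples)
    mean, std = synthetic.mean(), synthetic.std()
    print(f"gen {gen:02d}: mean={mean:+.3f} std={std:.3f}")
```

Run over enough generations, the fitted std tends to drift downward and the mean wanders off the true value, which is the toy version of "forgetting the true underlying data distribution" and losing non-erroneous variety.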