The phrase “garbage in, garbage out” has been around for decades. AI is about to amplify this age-old problem, creating even bigger garbage dumps of corporate data. There’s even a fancy name for the garbage that comes out in the AI age: hallucination.
This makes bad data sound like some new-fangled conundrum. In reality, it’s the same problem that has plagued technology for decades. If the foundational back-end data isn’t good, then front-end tools, no matter how shiny, are going to spew out garbage.
High-quality mastered data is the boring stuff, so it gets little attention. AI aficionados are focused instead on the promise of AI automating workflows and improving profit margins.
Firms that want to take advantage of the phenomenal benefits of AI need to focus more on data readiness and a lot less on vendor automations masquerading as AI. We are going to see a lot of that in the next year: vendors labeling anything that is broadly faster and deeper as ‘AI-enabled’, with little regard for the quality of what is being generated.
Smart firms are focusing on AI-ready data pipelines that leverage models to scrub, tag, and structure their data. If enterprise datasets haven’t been structured for retrieval-augmented generation (RAG), AI models will become confident liars, piling one garbage dump of data on top of another.
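To make “scrub, tag, and structure” concrete, here is a minimal Python sketch of what such a pipeline step might look like. Everything in it is an illustrative assumption rather than any vendor’s method: the `Chunk` record, the `TAG_RULES` keywords, and the function names are hypothetical, and the keyword matching is a simple stand-in for the model-based tagging described above.

```python
import hashlib
import re
from dataclasses import dataclass, field


@dataclass
class Chunk:
    """One retrieval-ready unit: cleaned text plus metadata for filtering."""
    chunk_id: str
    source: str
    text: str
    tags: list[str] = field(default_factory=list)


# Hypothetical keyword rules; a real pipeline might ask a model to assign tags.
TAG_RULES = {
    "finance": ("invoice", "revenue", "margin"),
    "hr": ("payroll", "onboarding"),
}


def scrub(raw: str) -> str:
    """Collapse runs of whitespace and trim both ends."""
    return re.sub(r"\s+", " ", raw).strip()


def tag(text: str) -> list[str]:
    """Return every tag whose keywords appear in the text, sorted by name."""
    lowered = text.lower()
    return sorted(
        name for name, words in TAG_RULES.items()
        if any(word in lowered for word in words)
    )


def build_chunks(records: list[tuple[str, str]]) -> list[Chunk]:
    """Turn (source, raw_text) pairs into clean, de-duplicated, tagged chunks."""
    seen: set[str] = set()
    chunks: list[Chunk] = []
    for source, raw in records:
        text = scrub(raw)
        if not text:
            continue  # drop empty records instead of indexing noise
        digest = hashlib.sha256(text.lower().encode("utf-8")).hexdigest()
        if digest in seen:
            continue  # drop case-insensitive duplicates of earlier records
        seen.add(digest)
        chunks.append(Chunk(digest[:12], source, text, tag(text)))
    return chunks
```

The point of the sketch is the order of operations: records are cleaned and de-duplicated before anything is indexed, and each surviving chunk carries its source and tags so a retrieval layer can filter on them and trace an answer back to where it came from.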
The firms that win with AI won’t be chasing flashy vendor demos. They’ll spend 2025 investing in the quality and structure of their foundational datasets.
These organizations will be in pole position when we start to emerge from the AI hype cycle in 2026.