"Cites its sources" fits in a bullet point on a product page. But it's maybe the hardest part of the whole system — and the one that separates an operations assistant from a generator of plausible answers.
Getting an LLM to anchor every answer in real data — and, more importantly, to refuse when it can't — isn't a setting you switch on. It's retrieval (fetching the right data), grounding (forcing the answer to sit on that data), refusal (saying "I don't know" instead of inventing), and an immutable audit log on every interaction: input, context, output, model, cost. None of it makes a nice video. It's the boring engineering where the value lives.
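To make those four pieces concrete, here is a minimal sketch in Python. It is not Sofia's code. Every name in it (`SourceRow`, `AuditLog`, `ask`, the `"example-model"` label, the `file:line` source format) is hypothetical, the model call is replaced by a plain lookup so the example runs on its own, and the hash chain is just one assumed way to make a log tamper-evident.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional

REFUSAL = "I don't have data for this"


@dataclass(frozen=True)
class SourceRow:
    source_id: str  # hypothetical "file:line" reference
    metric: str
    value: float


@dataclass(frozen=True)
class Answer:
    text: str
    source_id: Optional[str]  # None means the assistant refused


def retrieve(metric: str, rows: List[SourceRow]) -> List[SourceRow]:
    """Retrieval: fetch only the rows that match the question."""
    return [r for r in rows if r.metric == metric]


def grounded_answer(metric: str, hits: List[SourceRow]) -> Answer:
    """Grounding and refusal: answer from a retrieved row, or refuse."""
    if not hits:
        return Answer(REFUSAL, None)
    row = hits[0]
    return Answer(f"{metric} = {row.value}", row.source_id)


class AuditLog:
    """Append-only log; each entry's hash covers the previous entry's hash."""

    def __init__(self) -> None:
        self._entries: List[dict] = []

    def append(self, record: dict) -> str:
        prev = self._entries[-1]["hash"] if self._entries else ""
        payload = json.dumps(record, sort_keys=True)
        digest = hashlib.sha256((prev + payload).encode()).hexdigest()
        self._entries.append({"record": record, "prev": prev, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute the chain; any edited entry breaks it."""
        prev = ""
        for entry in self._entries:
            payload = json.dumps(entry["record"], sort_keys=True)
            expected = hashlib.sha256((prev + payload).encode()).hexdigest()
            if entry["prev"] != prev or entry["hash"] != expected:
                return False
            prev = entry["hash"]
        return True


def ask(metric: str, rows: List[SourceRow], log: AuditLog,
        model: str = "example-model", cost: float = 0.0) -> Answer:
    """One interaction: retrieve, answer or refuse, then record everything."""
    hits = retrieve(metric, rows)
    answer = grounded_answer(metric, hits)
    log.append({
        "input": metric,
        "context": [r.source_id for r in hits],
        "output": answer.text,
        "model": model,
        "cost": cost,
    })
    return answer
```

Asking for a metric that exists returns the value with its `source_id`. Asking for one that doesn't returns the refusal string with no source. Either way the log gets a record with input, context, output, model and cost, and `verify()` fails if any past entry is altered.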
Sofia v1 was a wrapper over a generic model. v2 isn't — and rewriting her so she wasn't one was half the work. The principle that governs her doesn't change: Sofia proposes with the source attached, the person decides, the system records. In operations and compliance, where the answer ends up in an audited report or a board slide, a confident error costs more than no answer at all.
The honest caveat: grounding cuts hallucination dramatically, but it doesn't eliminate it. That's why human approval stays, on every write. Anyone who promises you an AI that "never gets it wrong" is selling you the next hallucination. We'd rather have an AI that knows how to say "I don't have data for this".
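Human approval on every write can be sketched as a gate the proposal has to pass before anything changes. This is an illustration under assumptions, not the product's implementation: `Proposal`, `Ledger` and `apply_write` are hypothetical names, and `approve` stands in for whatever interface the person actually uses to decide.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Proposal:
    target: str       # the field the assistant wants to change
    new_value: float
    source_id: str    # the source attached to the proposal


@dataclass
class Ledger:
    data: Dict[str, float] = field(default_factory=dict)
    records: List[Tuple[Proposal, str]] = field(default_factory=list)


def apply_write(proposal: Proposal,
                approve: Callable[[Proposal], bool],
                ledger: Ledger) -> bool:
    """Write only if a source is attached and the person approves."""
    if not proposal.source_id:
        decision = "rejected_no_source"  # never reaches the approver
    elif approve(proposal):
        ledger.data[proposal.target] = proposal.new_value
        decision = "approved"
    else:
        decision = "declined"
    # The system records every outcome, not just the successful writes.
    ledger.records.append((proposal, decision))
    return decision == "approved"
```

The design choice worth noting is that declined and sourceless proposals are recorded too, so the trail shows what was proposed as well as what was written.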
The test works for any AI someone wants to sell you: ask for a number, then ask "where did it come from?" If it can't show you the source line, it's guessing, and on a sheet someone signs, guessing is the worst failure mode there is.