III  ·  Story · Model · Data  ·  A Triptych on Knowledge  ·  Folio 03 / 03

The Foundation of Data

On the recorded fact — the empirical ground on which every model is tested and every story earns the right to be heard.

Chapter Three · The Foundation

What data is, and what it owes us.

Data is the record of reality. It is the floor beneath the framework, the anchor that keeps story tethered and model honest.

§ 1 — Definition · The raw material

Data is the recorded observation. It is what we have measured, counted, sampled, logged, or noted down about the world — captured in a form we can store, share, and reason about. Numbers in a spreadsheet, words in a transcript, pixels in a photograph, ticks in a market feed, readings from a sensor: all are data.

Data is the raw material of knowledge. Like ore from the ground, it is rarely useful in its original form. It must be cleaned, structured, interpreted — work performed by the model, then voiced by the story. But without the original extraction, neither has anything to work with. Models without data are speculation. Stories without data are myth.

Data also carries the moral weight of the triptych. Because it is the part that touches reality directly, it is the part for which honesty matters most. A miscalibrated instrument poisons every model built on its output. A biased sample distorts every story it ever supports.
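
The effect of a biased sample can be shown with a small worked example. What follows is a minimal sketch under stated assumptions, not a result from any real dataset: the fund returns, the closure threshold, and every name in the code are invented for illustration. It shows survivorship bias, where the failures never reach the dataset and the average of what remains looks better than the truth.

```python
from statistics import mean

# Invented yearly returns (percent) for eight hypothetical funds.
all_funds = [-30.0, -20.0, -5.0, 0.0, 5.0, 10.0, 15.0, 25.0]

# Assumption for illustration: funds that lost more than 10% were closed
# and never made it into the dataset an analyst gets to see.
CLOSURE_THRESHOLD = -10.0
surviving_funds = [r for r in all_funds if r > CLOSURE_THRESHOLD]

true_mean = mean(all_funds)            # what actually happened
observed_mean = mean(surviving_funds)  # what the surviving records suggest

print(f"mean of all funds:       {true_mean:.2f}")
print(f"mean of surviving funds: {observed_mean:.2f}")
```

With these invented numbers the full population averages 0.00 while the survivors average 8.33. Nothing in the surviving records is false; the distortion comes entirely from what was left out.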

In God we trust. All others must bring data. — W. Edwards Deming

§ 2 — Relationships · The anchor to reality

Data plays two roles in the triptych, and both are indispensable.

First, data is the arbiter of models. A model is a conjecture; data is the test. Observations confirm a model's predictions, refute its assumptions, or — most usefully — reveal where it is approximately right and where it breaks. Without data, a model cannot be improved. It can only be elaborated, which is not the same thing.

Second, data is the foundation of stories. A narrative grounded in real measurement carries a different weight than one resting on intuition alone. The same claim — this medicine helps, this strategy works, this market is changing — becomes credible, in the sense that a careful listener should change their mind because of it, only when evidence stands behind it.

Data is the anchor to reality. Pull it loose, and the system above begins to drift. Models become unfalsifiable. Stories become persuasion untethered from truth. The whole edifice of knowledge depends on someone, somewhere, taking the trouble to measure carefully.
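
The first role, data as arbiter of a model, can be made concrete in a few lines. This is a hedged sketch rather than a prescribed method: the linear model, the observations, and the tolerance are all hypothetical, chosen only to show how comparing predictions with measurements reveals where a conjecture is approximately right and where it breaks.

```python
# Hypothetical model: the conjecture that y grows linearly with x.
def model(x: float) -> float:
    return 2.0 * x

# Invented observations as (x, measured y) pairs, for illustration only.
observations = [
    (1.0, 2.1), (2.0, 3.9), (3.0, 6.2),
    (4.0, 8.0), (5.0, 12.5), (6.0, 16.0),
]

TOLERANCE = 0.5  # assumed acceptable absolute error

def residuals(model_fn, data):
    """Return (x, observed - predicted) for every observation."""
    return [(x, y - model_fn(x)) for x, y in data]

def where_model_breaks(model_fn, data, tolerance):
    """Return the x values whose residual exceeds the tolerance."""
    return [x for x, r in residuals(model_fn, data) if abs(r) > tolerance]

broken_at = where_model_breaks(model, observations, TOLERANCE)
print(broken_at)  # [5.0, 6.0]
```

In this invented case the model holds up to x = 4 and fails beyond it. That is the useful outcome the text describes: the data does not simply reject the conjecture, it marks the boundary of where the conjecture can be trusted.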

§ 3 — Craft · How to gather data worth trusting

The discipline of data is the discipline of accountability to the world. The dimensions below are the criteria against which any dataset should be evaluated before it is allowed to do work.

Dimension · What it asks
Accuracy · Does the recorded value correspond to the actual state of the world? An accurate dataset gets reality right at the point of capture.
Precision · At what resolution and confidence is the measurement reported? A figure cited to four decimals when the instrument resolves to two is false precision, a quiet lie.
Completeness · Are there gaps, omissions, or systematic absences? Missing data is not neutral; it usually distorts the picture in a direction worth investigating.
Relevance · Does the data actually bear on the question at hand? Volume is not a substitute for fit. A vast dataset of the wrong things answers nothing.
Consistency · Are definitions, units, instruments, and protocols the same across observations? Inconsistency is the silent killer of longitudinal analysis.
Reliability · Is the methodology reproducible? Could another team, given the same procedure, obtain the same result? Reliability is the substrate of science.
Timeliness · Is the data fresh enough to matter? A perfectly accurate measurement of a state of the world that has already changed is, for most decisions, useless.
Ethical gathering · Was the data obtained with consent, with respect for privacy, with transparency about purpose? Data extracted without ethics carries a debt that eventually comes due.
Bias awareness · What systematic distortions are baked into the collection: selection, measurement, confirmation, survivorship? Bias is not eliminable, but it is namable, and naming it is the first defence.

No dataset scores perfectly on all nine dimensions. The point of the checklist is not perfection but honesty: knowing where your data is strong, where it is weak, and refusing to claim more from it than it can support.
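
Several of these dimensions can be checked mechanically before a dataset is put to work. The sketch below is one possible illustration, not a standard tool: the readings are invented, and the field names (`value`, `unit`, `taken`), the instrument resolution, the freshness limit, and the evaluation date are all assumptions. It covers only completeness, consistency, precision, and timeliness; accuracy, relevance, ethics, and bias still require human judgment.

```python
from datetime import date

# Invented sensor readings, for illustration only. `value` is kept as the
# reported string so the number of decimals cited remains visible.
readings = [
    {"value": "21.50",   "unit": "C", "taken": date(2024, 1, 1)},
    {"value": None,      "unit": "C", "taken": date(2024, 1, 2)},
    {"value": "70.1234", "unit": "F", "taken": date(2024, 1, 3)},
    {"value": "22.10",   "unit": "C", "taken": date(2024, 1, 4)},
]

INSTRUMENT_DECIMALS = 2     # assumed resolution of the instrument
MAX_AGE_DAYS = 30           # assumed freshness requirement
AS_OF = date(2024, 1, 10)   # assumed evaluation date

def completeness(rows):
    """Fraction of rows that actually carry a value."""
    return sum(1 for r in rows if r["value"] is not None) / len(rows)

def units_used(rows):
    """Set of distinct units; more than one signals inconsistency."""
    return {r["unit"] for r in rows}

def false_precision(rows, decimals):
    """Indices of rows reporting more decimals than the instrument resolves."""
    flagged = []
    for i, r in enumerate(rows):
        v = r["value"]
        if v is not None and "." in v and len(v.split(".")[1]) > decimals:
            flagged.append(i)
    return flagged

def stale(rows, as_of, max_age_days):
    """Indices of rows older than the freshness requirement."""
    return [i for i, r in enumerate(rows)
            if (as_of - r["taken"]).days > max_age_days]

report = {
    "completeness": completeness(readings),
    "consistent_units": len(units_used(readings)) == 1,
    "false_precision_rows": false_precision(readings, INSTRUMENT_DECIMALS),
    "stale_rows": stale(readings, AS_OF, MAX_AGE_DAYS),
}
print(report)
```

The report does not decide whether the dataset is usable. It states where the data is strong and where it is weak, which is the honesty the checklist asks for.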

The plural of anecdote is not data. The plural of carefully gathered, well-defined, unbiased anecdote is. — After Raymond Wolfinger

Coda

Data is humble work, and indispensable work. It is the discipline of taking reality seriously enough to write it down accurately. Every model worth using and every story worth telling, in the end, rests on someone willing to measure carefully and to report what they found — including the parts that did not fit.

Final Synthesis · The Whole

When the three become one.

The three chapters of this triptych are not three subjects but three facets of a single act: the act of knowing. Each is incomplete without the others. Each, taken alone, fails in a characteristic way.

I. Story · Carries · Makes knowledge portable, memorable, and actionable. The vehicle.
II. Model · Explains · Gives structure to data and logical spine to story. The framework.
III. Data · Anchors · Tests the model and grounds the story in reality. The foundation.

Story without model and data is rhetoric: persuasive, perhaps, but unaccountable. Model without data is theory in a vacuum, elaborate and untested. Data without model or story is noise: a quantity of facts no one can use. Each pillar collapses on its own. But together they hold up something none of them could support alone.

When the three converge — when a narrative is well-told and the framework beneath it is sound and the evidence it rests on is rigorous — we arrive at something rare and powerful: knowledge that is both true and useful. This is the basis of science that improves lives, of policy that works, of medicine that heals, of business decisions that compound, of arguments that change minds for the right reasons.

It is also a discipline available to anyone. The next time you find yourself confronted with a complex decision, a difficult explanation, or a claim that asks for your belief, run the triad on it. Where is the data? What is the model? What is the story? If any of the three is missing or weak, the conclusion is weaker than it sounds. If all three are present and honest, the conclusion is worth acting on.

This is, finally, what we mean by evidence-based decision-making — and it is how, slowly and patiently, we come to understand the world a little better than we did before, and to leave it a little better than we found it.