My Master's research was not about customers.
It was about survivors of human rights violations.
The dissertation question was simple. Did the wilderness therapy programme they had been through produce meaningful psychological change?
The headline finding was that it had not. The instruments I had selected, validated, and run did not detect a statistically significant change in any of the constructs I had set out to measure.
Except I had been there. I had sat in the sessions. I had watched people change.
That contradiction is the foundation of the Data Alchemy Platform.
What the Dissertation Actually Trained Me To Do
What looked, on the surface, like a topic about wilderness therapy was, in practice, a deep apprenticeship in measurement.
I had to design pre- and post-intervention studies capable of detecting change in complex human systems. I had to select and validate the instruments I would use to measure constructs that no one can observe directly. Self-esteem. Agency. Psychological wellbeing. Emotional state. Each of these had to be operationalised, instrumented, and interrogated before any number I produced about them deserved to be taken seriously.
I had to think carefully about whether a battery of questions was actually capturing the underlying construct it claimed to measure, and whether the relationships between items held together as a coherent scale. I had to learn that a questionnaire is not a list of questions. It is a measurement instrument. And before you trust what it tells you, you should be able to say, in detail, why you believe the instrument is doing what you claim it is doing.
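To make "a coherent scale" concrete, here is a minimal sketch of one standard check, Cronbach's alpha, an internal-consistency statistic for a battery of items. The item matrix, the number of items, and the threshold in the comment are invented for illustration; they are not the dissertation's actual instruments or results.

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal-consistency estimate for a battery of items.

    items: 2-D array of shape (respondents, items), one column per question.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                               # number of items in the scale
    item_variances = items.var(axis=0, ddof=1)       # variance of each question
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of the summed score
    return (k / (k - 1)) * (1 - item_variances.sum() / total_variance)

# Hypothetical 5-item battery scored 1-5 by six respondents.
battery = np.array([
    [4, 4, 5, 3, 4],
    [2, 2, 3, 2, 2],
    [5, 4, 5, 4, 5],
    [3, 3, 3, 3, 2],
    [1, 2, 2, 1, 2],
    [4, 5, 4, 4, 4],
])

print(f"Cronbach's alpha = {cronbach_alpha(battery):.2f}")
# Values around 0.7 or higher are a common (and contested) rule of thumb for
# treating the items as one coherent measurement of one underlying construct.
```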
None of this is abstract academic concern. It becomes commercial concern the moment a business asks the same question I was asking. Did our intervention work?
That habit travels.
The Harder Question Most Measurement Never Asks
When the formal tools failed to detect a change I could see was happening, I was forced to confront a question that most commercial measurement quietly avoids.
Was the intervention ineffective, or was the instrument insensitive to what was actually changing?
The honest answer was that I could not be certain. The intervention might genuinely have produced less change than the participants reported. The participants might have rationalised the experience as more transformative than it was. Or the instrument might have been measuring something adjacent to the real change rather than the change itself, missing it not because it was not happening but because the tool had been pointed in slightly the wrong direction.
This question matters because in commercial measurement, it almost never gets asked. Businesses run an intervention. They look at a metric. The metric moves or it does not. They draw a conclusion.
What gets skipped is the harder methodological work of asking whether the metric being looked at is actually capable of detecting the thing the business cares about.
The Data Is Not the Phenomenon
Sitting with that question for two years produced a principle that now runs through everything Knowsis does.
The data is not the phenomenon. It is a trace of the phenomenon. A partial representation, sometimes high resolution and sometimes very low resolution indeed.
Imagine running a trial of a new antibiotic with a thermometer that only registers temperatures between 38 and 42 degrees. The drug works. The patients recover. Their fevers break. But your instrument can tell you almost nothing about what happened, because by the time the change is real, the temperatures have dropped below the scale's lowest reading. The patients are healthier. The instrument has nothing to report.
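A toy simulation makes the gap concrete. Every number below is invented purely to illustrate the analogy: the real change is large, but the clipped instrument reports a fraction of it and cannot distinguish between patients once they drop below its floor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented "true" temperatures: feverish before the drug, recovered after.
before = rng.normal(39.4, 0.4, size=200)
after = rng.normal(36.9, 0.4, size=200)   # everyone has dropped below 38 degrees

# A thermometer that only registers 38 to 42 degrees: cooler readings pin at 38.
def instrument(temps):
    return np.clip(temps, 38.0, 42.0)

true_change = (before - after).mean()
measured_change = (instrument(before) - instrument(after)).mean()

print(f"True mean change:                {true_change:.2f} degrees")
print(f"Measured mean change:            {measured_change:.2f} degrees")
print(f"Measured spread after treatment: {instrument(after).std():.2f} degrees")
# The measured change badly understates the true change, and once recovery
# carries everyone below the floor, every patient reads 38.0: the instrument
# has nothing left to report about how far the recovery went.
```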
That is the gap my dissertation made me sit with for two years. And it is the gap that most commercial measurement walks straight past.
A customer satisfaction score may not be capturing trust. A retention number may not be reflecting loyalty. It may be reflecting inertia, or the absence of a credible alternative. A conversion rate may say nothing about the quality of the relationship that conversion represents. A reduction in complaint volumes may reflect a customer base that has stopped bothering to complain, not a customer base that has stopped being dissatisfied.
These metrics are not wrong. They are proxies. The mistake is not measuring them. The mistake is treating them as if they are reality itself, rather than partial signals about a reality that is more complex than any single number can hold.
From Wilderness Therapy to Commercial Measurement
Twenty years on from the dissertation, the context has changed completely. The questions have not.
Businesses run interventions constantly. Campaigns. Journey redesigns. Loyalty programmes. Brand repositioning. New service models. Pricing changes. Almost none of them are evaluated with anything close to the methodological discipline my dissertation required. The dominant pattern is to lean on whatever metric is convenient, declare a result, and move on.
This is what the Data Alchemy Framework was built to articulate, and what the Data Alchemy Platform was built to operationalise.
The Framework is the methodological backbone of Knowsis. The specific tools that make it up came from twenty years of commercial research, testing which analytical approaches actually move business outcomes and which produce well-received reports that change nothing. But the discipline that holds those tools together as a coherent framework, the question they are all designed to answer, is what my dissertation forced into me whether I liked it or not.
The Platform is where the two meet. Twenty years of commercial pattern recognition about which analytical approaches actually move outcomes. Two years of academic apprenticeship in how to know whether anything has moved at all.
How This Discipline Shows Up in Data Alchemy
The same instinct shows up across the Platform. Three apps make it most visible.
Distillation is the construct validity instinct made operational. A survey with twenty questions does not necessarily measure twenty things. Often, ten of those questions are circling the same underlying construct, and the right move is to find the latent variable that holds them together rather than reporting twenty separate numbers that are mostly noise. Distillation collapses a battery into a single interpretable score from zero to one hundred. It is the same instinct that made me, in 2002, refuse to trust a self-esteem battery before I had run the statistical checks to verify that the items in it actually hung together as a coherent measurement of one underlying thing. The technology to build a deep learning index out of that battery was beyond me then. The discipline of asking whether the battery measured what it claimed to measure was not.
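As an illustration of the underlying idea rather than of Distillation itself (the app is described above as a deep learning index; the sketch below uses a simple first principal component instead), here is how a correlated battery can be collapsed into one interpretable zero-to-one-hundred score. The battery, the loadings, and the respondent data are all invented.

```python
import numpy as np

rng = np.random.default_rng(1)

# Invented battery: ten questions that mostly reflect one underlying construct.
n_respondents = 500
construct = rng.normal(size=n_respondents)                 # the unobserved thing itself
loadings = rng.uniform(0.6, 1.0, size=10)                  # how strongly each item reflects it
items = construct[:, None] * loadings \
        + rng.normal(scale=0.5, size=(n_respondents, 10))  # item = construct + noise

# Collapse the battery onto its first principal component.
centred = items - items.mean(axis=0)
_, _, vt = np.linalg.svd(centred, full_matrices=False)
component = centred @ vt[0]

# Rescale to an interpretable 0-100 index.
index = 100 * (component - component.min()) / (component.max() - component.min())

print(f"Correlation between the index and the true construct: "
      f"{abs(np.corrcoef(index, construct)[0, 1]):.2f}")
# One score per respondent, recoverably close to the construct the ten questions
# were circling, instead of ten separate numbers that are mostly noise.
```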
Catalyst is the importance versus impact distinction. What people say matters and what actually drives their behaviour are not the same thing, and they often disagree dramatically. Asking customers what they value tells you one story. Watching what actually moves their behaviour tells you another. Catalyst separates the two and tells you, for every variable, whether it is a strategic priority, a base expectation worth protecting, a candidate for further investigation, or a place where additional investment will produce nothing.
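A minimal sketch of the same distinction, with invented data: stated importance comes from asking people what matters, derived impact from regressing an outcome on the same drivers. The driver names, coefficients, and scores below are illustrative only, not Catalyst's method or categories.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000

# Invented drivers and an outcome that depends mostly on "speed", barely on "price".
drivers = {name: rng.normal(size=n) for name in ("price", "speed", "support")}
outcome = (0.1 * drivers["price"] + 0.8 * drivers["speed"]
           + 0.3 * drivers["support"] + rng.normal(scale=0.5, size=n))

# Stated importance: what respondents claim matters (invented averages out of 10).
stated = {"price": 9.1, "speed": 6.2, "support": 7.4}

# Derived impact: regression coefficients of the outcome on the standardised drivers.
X = np.column_stack(list(drivers.values()))
coefs, *_ = np.linalg.lstsq(X, outcome - outcome.mean(), rcond=None)

for name, coef in zip(drivers, coefs):
    print(f"{name:8s} stated importance {stated[name]:.1f}/10,  derived impact {coef:+.2f}")
# "price" tops what people say; "speed" dominates what actually moves the outcome.
```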
PRISM is the most direct expression of the discipline. Synthetic data is now common in commercial research. Most of it ships with an accuracy claim and no audit trail. PRISM is a five-dimension validation framework that scores every synthetic dataset against published criteria before the data ever reaches a dashboard. It is the same instinct that made me unwilling, in my dissertation, to draw conclusions from a self-esteem scale I had not validated. Claimed accuracy without published methodology is not validation. It never was.
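PRISM's five dimensions and published criteria are not reproduced here. Purely to illustrate the auditing instinct, the sketch below scores one hypothetical dimension, marginal fidelity, by comparing a synthetic variable's distribution against the real one with a two-sample Kolmogorov-Smirnov statistic. The data, the dimension, and the scoring rule are all assumptions made for the example.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(3)

# Invented "real" fieldwork data and a synthetic stand-in for the same variable.
real = rng.normal(50, 10, size=2_000)
synthetic = rng.normal(52, 12, size=2_000)   # close, but not identical

# One hypothetical validation dimension: marginal fidelity, scored 0-100
# from the KS distance between the two distributions.
ks_stat, _ = ks_2samp(real, synthetic)
marginal_fidelity = 100 * (1 - ks_stat)

print(f"KS distance: {ks_stat:.3f}  ->  marginal fidelity: {marginal_fidelity:.0f}/100")
# The point is the audit trail: a score computed against a criterion stated in
# advance, not an accuracy claim made after the fact.
```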
These are not three separate ideas. They are three expressions of one discipline, applied to three different points in the analytical workflow.
Why This Matters Now
The research industry is moving fast.
Synthetic data is replacing fieldwork. Large language models are generating insight summaries. Autonomous workflows are compressing weeks of analysis into minutes. These are good developments. They are also, unavoidably, developments that widen the gap between the speed at which research can be produced and the rigour with which the outputs can be trusted.
In this environment, the discipline of asking whether your instrument is actually capable of detecting what you care about becomes more valuable, not less. Speed without methodological rigour is not progress. It is the same conclusions, generated faster, with less ability than ever to know whether the conclusions are right.
The test that separates research from research theatre is unchanged. It is the test my Master's taught me to apply, and it is the test the Data Alchemy Platform was built to operationalise. Before you trust the answer, can you defend the way you got there?
Two Tracks, One Apprenticeship
I wrote in an earlier piece in this series about how my Honours research into rock climbers became the foundation of the Impulse Engine, the Knowsis framework for understanding consumer motivation. That research gave me the psychological lens that runs through Track 2 of the platform, the intelligence products track.
My Master's gave me the methodological lens that runs through Track 1: the Data Alchemy Framework, and the Platform that puts it to work.
The Honours research taught me that motivation is structured, stable, and measurable in ways that demographics never reach. The Master's research taught me that the act of measurement deserves its own discipline, and that no number is worth more than the instrument that produced it.
Knowsis is the synthesis of both.
The Platform runs fast because the technology has finally caught up with what was always commercially possible. It is trustworthy because the methodological discipline behind it is older than any of the technology it now sits on top of.
That discipline is not a marketing claim. It is what my dissertation taught me when the headline result was a non-significant finding, and the lived reality was that something profound had happened in those wilderness therapy sessions whether the instrument was sensitive enough to capture it or not.
The most useful question I ever asked as a researcher was the one my dissertation forced me to confront. Twenty years later, it is still the question that separates measurement worth trusting from measurement worth ignoring.
How do we know if it worked?