Perhaps, when we’re looking at AI, we’re looking at the wrong set of problems. Marianne Bellotti has an essay at OneZero which starts from a simple proposition: if more data doesn’t improve decision-making by humans, why should it improve decision-making by machines?
Actually, there are two arguments in one here: the first is that it’s pretty much impossible to clean data, yet machine learning systems need more and more of the stuff. The second is that even if you could clean all the data, it still wouldn’t improve the quality of decision-making.
The master brain
After a bit of a ramble into Silicon Valley history, she gets to the argument this way:
The dream of a master brain in which streams of clean and accurate data flow in and produce insight and greater situational awareness for governments and armies is as old as computers themselves — France tried it, Chile tried it*, the Soviet Union tried it three times, undoubtedly China is trying it right now — but no matter how much data we gather, how fast or powerful the machines get, it always seems just out of reach.
But one of the reasons they might not work is that the data isn’t up to it. As she says, data scientists spend about 80% of their time cleaning data. And since AI-based models are getting larger (meaning more data), there’s always more data to clean.
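To make that concrete, here’s a minimal sketch of the sort of routine cleaning she’s talking about, in pandas. The file, column names and rules are all invented for illustration; the point is that every step is a judgement call, which is why it eats so much analyst time.

```python
# Illustrative only: a hypothetical CSV of incident reports and made-up rules.
import pandas as pd

df = pd.read_csv("incident_reports.csv")

# Deduplicate records that arrive twice from different source systems.
df = df.drop_duplicates(subset=["report_id"])

# Coerce inconsistent date formats; unparseable values become NaT.
df["reported_at"] = pd.to_datetime(df["reported_at"], errors="coerce")

# Normalise free-text "missing" markers into a single missing value.
df["category"] = df["category"].replace({"N/A": None, "n.a.": None, "": None})

# Drop rows that are unusable after the above -- another judgement call.
df = df.dropna(subset=["reported_at", "category"])
```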

Better decision-making
But there’s a second problem that sits right behind that:
The outcome we’re all hoping for from A.I. is better decision-making. Total situational awareness is attractive because we assume that giving leaders access to more data is the key to making better decisions, and making better decisions means fewer negative impacts. There’s no mystery as to why the DOD would want to prioritize technology that will allow it to prevent conflict or minimize collateral damage.
And the assumption that outcomes would be better if we had more, and better, data is just plain wrong:
Total situational awareness is less desirable than tools that facilitate the team effort leading up to a decision… (T)he process of making a decision is less about an objective analysis of data and more about an active negotiation between stakeholders with different tolerances for risk and priorities. Data is used not for the insight it might offer but as a shield to protect stakeholders from fallout. Perfect information — if it is even achievable — either has no benefit or actually lowers the quality of decisions by increasing the level of noise.
I sometimes see echoes of this in conversations about horizon scanning, where the concern is to ensure that the scanning data is as good as possible. But in practice it’s not the scanning data, but the frameworks about the future landscape that are built from it, and the heuristics used to share them, that make futures-literate organisations more effective.
‘Clean’ data
From the point of view of AI design, the way that data quality is discussed is misleading, says Bellotti:
We speak of “clean” data as if there is one state where data is both accurate (and bias-free) and reusable. Clean is not the same thing as accurate, and accurate is not the same thing as actionable. Problems on any one of these vectors could impede an A.I. model’s development or interfere with the quality of its results.
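One way to see the difference between those vectors is the hedged sketch below; the record and the checks are invented, but they show how data can pass every “clean” test while telling you nothing about accuracy or usefulness.

```python
# Illustrative sketch: a record can be "clean" (well-formed, complete) while
# still failing "accurate" (matches reality) or "actionable" (relevant to the
# decision at hand). All values here are invented.
from datetime import date

record = {"sensor_id": "A-17", "reading": 42.0, "taken_on": date(2019, 6, 1)}

# Clean: the record is complete and correctly typed.
is_clean = all(value is not None for value in record.values())

# Accurate: cleanliness says nothing about whether the sensor was calibrated;
# a well-formed reading from a faulty sensor passes every format check.
is_accurate = None  # cannot be established from the record alone

# Actionable: even an accurate 2019 reading may be useless for a decision
# about conditions today.
is_actionable = (date.today() - record["taken_on"]).days < 30
```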
The current design of AI systems makes them completely dependent on their data. But if we want AI systems to work properly, they need instead to be more resilient to bad data. (She uses the word “anti-fragile”, an annoying Taleb-ism, but what she means by anti-fragile is resilient.) And being more resilient involves being better aligned with what we know about effective decision-making:
We know from existing research into cognitive science that good decisions are the product of proactively articulating assumptions, structuring hypothesis tests to verify those assumptions, and establishing clear channels of communication between stakeholders. Many of the cognitive biases that trigger so-called human error are a result of a blocker on one of those three conditions.
Framing options
It’s a long piece, and I’m not going to get into it all here. But at heart, she wants to put people back into decision-making processes. Instead of positioning intelligence as reaching conclusions, it should help to frame options:
Remember that decision-makers optimize for conserving effort, which means that any opportunity they have to accept an A.I. output as a conclusion, they will take it unless the user experience is designed to make that difficult. This tendency is at the heart of the catastrophic errors we’ve already seen applying A.I. to criminal justice and policing. The model was built to contextualize, but the user interface was built to report a conclusion.
* As it happens, I think this is a misunderstanding of Chile’s CyberSyn project, in which Stafford Beer built a cybernetic model to help manage Chile’s economy under President Allende. And the reason it didn’t work had nothing to do with data.
This article is also published on my Just Two Things Newsletter.