From Guesstimate to Estimate – As Simple as 1, 2, 3

My stats guru colleague Dr Andrew Pratley and I are on the move to tackle Quantifornication: the plucking of numbers out of thin air. Here is the third in a series we are co-writing.

Why do so many educators in statistics consistently fail to translate these ideas into something that people can both remember and use? The problem is the language, not the numbers. Part of the problem with statistics, as is the case with most technical subjects, is the unique terminology. The COVID-19 pandemic has shown us many things; one unexpected outcome was the mainstream discussion of distributions, as in “we must flatten the curve!” Distributions are the basis of statistics.

Even once we are over the learning curve of the terminology, most of us are left with a dizzying array of formulas, methods and tables that we have to work out how to use, none of which makes sense outside a specific context. So let us help you.

Statistics answers three types of questions:

1. Questions about probability.

2. Questions about differences.

3. Questions about relationships.

Everything that will transform our lives in the next 20 years through the application of machine learning and AI uses a combination of these three approaches.

What’s hard to see is how the problem you might want to solve fits into one of these three categories. Because we learn to use formulas, we don’t have the chance to explore and play around with these ideas.

How risk professionals can apply this three-question framework is most easily seen through the risk matrix. We all know the shortcomings of the technique, but how do we use statistics to improve our ability to make decisions? The risk matrix has two axes, likelihood and consequence, and a third aspect: the control and treatment measures used to manage or mitigate risk.

Using the three-question framework, we could link:

1. Questions about probability to the likelihood axis.

We could use the method of probability to improve our estimate of the likelihood of a successful cyber attack. To do this we might use the binomial distribution to model how many of a given number of attack attempts succeed.
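
As a rough sketch of how that might look in Python, assume each attack attempt is independent and has the same chance of succeeding (that is what the binomial distribution requires). The 200 attempts a year and the 1% success rate are made-up numbers for illustration, not data, and the function names are ours.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_at_least_one(n: int, p: float) -> float:
    """Probability that at least one of n independent attempts succeeds."""
    return 1 - binomial_pmf(0, n, p)

# Hypothetical inputs: 200 attack attempts a year,
# each with an assumed 1% chance of getting through.
attempts = 200
p_success = 0.01

likelihood = prob_at_least_one(attempts, p_success)
print(f"P(at least one successful attack this year) = {likelihood:.3f}")
```

With these assumed inputs the likelihood comes out at roughly 87%, which is a number you can debate and refine rather than a label such as "likely" plucked from a matrix.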

2. Questions about differences to control measures.

We could use the method of differences to determine if one control measure is more effective than another. To do this we might run a two-sample t-test comparing a spam filter to a compulsory online learning module.
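
A minimal sketch of that comparison, using only the Python standard library, is below. It assumes we have measured the same outcome for two groups of teams, here the number of phishing emails clicked per team per month, with one group protected by the spam filter and the other by the learning module. The figures are invented for illustration. We use Welch's version of the two-sample t-test, which does not assume the two groups have equal variance.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Return (t statistic, approximate degrees of freedom) for Welch's t-test."""
    na, nb = len(sample_a), len(sample_b)
    # Squared standard error of each group's mean.
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a**2 / (na - 1) + se2_b**2 / (nb - 1))
    return t, df

# Hypothetical data: phishing emails clicked per team per month.
spam_filter = [4, 6, 5, 7, 3, 5, 6, 4]
learning_module = [8, 7, 9, 6, 10, 8, 7, 9]

t_stat, dof = welch_t(spam_filter, learning_module)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.1f}")
```

A large negative t here would suggest fewer clicks under the spam filter than under the learning module. Turning t into a p-value needs the t distribution, which is not in the standard library; a statistics package such as SciPy (scipy.stats.ttest_ind with equal_var=False) does both steps in one call.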

3. Questions about relationships to any of (i) likelihood & consequence, (ii) likelihood & control measures, or (iii) consequence & control measures.

We could use the method of relationships to determine if there is a relationship between control measures and likelihood. To do this we might run a regression analysis to see if spending more money on control measures actually reduces the likelihood of the risk occurring.
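
The simplest version of that regression is a straight line fitted by least squares. The sketch below assumes we have, for a handful of business units, the annual spend on control measures and the number of incidents recorded, with the incident count standing in for likelihood. Both the figures and the variable names are hypothetical.

```python
from statistics import mean

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical data, one entry per business unit.
control_spend = [10, 20, 30, 40, 50]   # annual spend, in $000s
incidents = [12, 11, 9, 8, 5]          # incidents recorded that year

slope, intercept = fit_line(control_spend, incidents)
print(f"incidents = {intercept:.2f} + ({slope:.3f} x spend)")
```

A negative slope is consistent with more spending going hand in hand with fewer incidents. On its own it does not show that the spending caused the drop, and with so few points we would want a proper regression package to tell us how much confidence to place in the slope.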

Risk professionals are asking these types of questions all the time. We believe that by linking the questions we ask to the three-question framework above, we can all make better decisions in uncertain times.

Stay safe and adapt – with better measurement!