Standard deviations for fun and profit (or, why I heart statistics)

Despite having a maths degree, I graduated knowing nothing about applied maths. Nasty, dirty, smelly stuff. However, it turns out that simple statistics and applied maths are very handy in the nasty, dirty, smelly real world.

One of my favourite handy measures is standard deviation. It’s incredibly useful, because it provides a yardstick of ‘weirdness’ for normally distributed data (normally distributed == bell curve shape, which is roughly the shape a lot of ‘organically’ generated data follows): the more standard deviations a value sits from the mean, the weirder it is.
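To make the ‘weirdness’ yardstick concrete, here is a minimal sketch in Python. The visit counts, the function name `weirdness` and the cut-off of two standard deviations are all made up for illustration; none of them come from real data, and the cut-off is a common rule of thumb rather than a law.

```python
import statistics

def weirdness(values):
    """Return how many sample standard deviations each value sits from the mean."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return [(v - mean) / sd for v in values]

# Hypothetical daily website visits, with one suspiciously busy day.
daily_visits = [102, 98, 105, 97, 101, 99, 103, 100, 96, 180]

scores = weirdness(daily_visits)

# Flag anything more than two standard deviations from the mean (an assumed threshold).
flagged = [v for v, z in zip(daily_visits, scores) if abs(z) > 2]
print(flagged)
```

Only the 180-visit day gets flagged here. Bear in mind that a big outlier inflates the standard deviation it is being measured against, so on small samples this yardstick is a bit forgiving.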

A picture is worth a thousand numbers

I just reintroduced myself to a lovely little example of why it’s important to draw graphs and to look at outliers in your data, and I want to introduce you to it as well.

This came up today when I was trying to persuade someone that they really ought to look beyond the average values for spending, website visits, and so on when thinking about their customers. They weren’t convinced, as they thought that an average told you pretty much all the important stuff, and anything else was basically just getting carried away with overengineering the statistics for the fun of it.

Here is a very lovely little counterexample: Anscombe’s quartet, constructed by Francis J. Anscombe back in 1973. Feast your eyes on the sets of data below, which have eleven points each.

[Figure: Anscombe’s quartet, four scatter plots, one per data set] (thanks, Wikipedia, for the picture).
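If you would rather check the numbers than trust the picture, here is a rough sketch in Python. The values are the quartet as it is usually published, typed in by hand, so do verify them against a trusted copy before leaning on them. The names `quartet`, `pearson` and `summary` are my own, not from any library.

```python
import statistics
from math import sqrt

# Sets I to III share the same x values; set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]

quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": (x4, [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

def pearson(xs, ys):
    """Pearson correlation coefficient, written out by hand."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread = sqrt(sum((x - mx) ** 2 for x in xs) * sum((y - my) ** 2 for y in ys))
    return cov / spread

# Collect (mean x, mean y, variance x, variance y, correlation) for each set.
summary = {}
for name, (xs, ys) in quartet.items():
    summary[name] = (
        statistics.mean(xs),
        statistics.mean(ys),
        statistics.variance(xs),
        statistics.variance(ys),
        pearson(xs, ys),
    )
    mean_x, mean_y, var_x, var_y, r = summary[name]
    print(f"{name:>3}: mean x={mean_x:.2f} mean y={mean_y:.2f} "
          f"var x={var_x:.2f} var y={var_y:.2f} r={r:.3f}")
```

All four sets come out with a mean x of 9, a mean y of about 7.5 and a correlation of about 0.82, which is exactly why the averages alone tell you nothing about how different the four plots look.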
