They come with one major catch, though, which usually gets forgotten.
Another powerful addition to the business statistics toolbox today: correlation coefficients. These should be used in the context of two golden rules:
1. Draw a graph
2. Keep using your brain (I know it shouldn’t need saying, but honestly, people don’t always manage this one).
I hope you’ll indulge me moving a little away from the normal dashboard topics today, to a neat rule called Benford’s Law. This is handy if you’re trying to see whether someone’s data cleaning exercise has made the data a little *ahem* too clean. (See the argument for the good kind of data cleaning here.)
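For anyone who wants to try this out, here is a minimal sketch of the check. Benford’s Law says the share of values whose first digit is d should be roughly log10(1 + 1/d). The function names and the sample data (powers of two, a sequence known to follow the law closely) are my own illustrative choices, not part of any particular library, and the digit extraction assumes non-zero whole numbers.

```python
import math
from collections import Counter


def benford_expected():
    """Share of values expected to start with each digit 1-9 under Benford's Law."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit_shares(values):
    """Observed share of each leading digit (assumes non-zero integers)."""
    digits = [int(str(abs(v))[0]) for v in values if v != 0]
    counts = Counter(digits)
    return {d: counts[d] / len(digits) for d in range(1, 10)}


# Illustrative sample only: the first 100 powers of two.
sample = [2 ** k for k in range(100)]
expected = benford_expected()
observed = first_digit_shares(sample)

for d in range(1, 10):
    print(f"{d}: expected {expected[d]:.3f}, observed {observed[d]:.3f}")
```

A big gap between the two columns is a prompt to go and ask questions, not proof of anything: plenty of perfectly honest data (heights, prices capped at a limit, ID numbers) has no reason to follow Benford’s Law in the first place.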
There’s this really interesting phenomenon called ‘anchoring’, where by introducing an irrelevant piece of information to a conversation you can influence people’s judgements of numbers, even where the person knows that the initial ‘anchor’ information is totally irrelevant. For example:
… an audience is first asked to write the last two digits of their social security number and consider whether they would pay this number of dollars for items whose value they did not know, such as wine, chocolate and computer equipment. They were then asked to bid for these items, with the result that the audience members with higher two-digit numbers would submit bids that were between 60 percent and 120 percent higher than those with the lower social security numbers, which had become their anchor. (source Wikipedia)
The same kind of effect occurs with the scales on graphs. No matter how smart and numerate the audience is, they will be influenced by the scale you use to present a picture, even though you know and they know that the data is the same stuff whatever scale it’s shown on.
I just reintroduced myself to a lovely little example of why it’s important to draw graphs and to look at outliers in your data, and I want to introduce you to it as well.
This came up today when I was trying to persuade someone that they really ought to look beyond the average values for spending, website visits, and so on when thinking about their customers. They weren’t convinced, as they thought that an average told you pretty much all the important stuff, and anything else was basically just getting carried away with overengineering the statistics for the fun of it.
Here is a very lovely little counterexample: Anscombe’s quartet, constructed by Francis J Anscombe back in 1973. Feast your eyes on the sets of data below, which have eleven points each.
(thanks Wikipedia for the picture).
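If you would rather check the numbers than take the picture on trust, here is a small sketch that computes the summary statistics for the quartet. The data values are the published ones from Anscombe’s 1973 paper; the `pearson` helper and the variable names are mine, written out by hand so nothing beyond the standard library is needed.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Sets I to III share the same x values; set IV has its own.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "I": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}

# Mean of x, mean of y, and correlation for each of the four sets.
summary = {
    name: (mean(xs), mean(ys), pearson(xs, ys))
    for name, (xs, ys) in quartet.items()
}

for name, (mx, my, r) in summary.items():
    print(f"Set {name}: mean x = {mx:.2f}, mean y = {my:.2f}, r = {r:.3f}")
```

All four sets come out with near-identical means and correlations, which is exactly why the averages alone cannot tell them apart and the graph can.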
I just learned the term ‘yak shaving’. It’s not really clear to me if it’s usually a positive or negative thing, but today I want to talk about a good kind.
The definition of yak shaving from Urban Dictionary is:
“Any seemingly pointless activity which is actually necessary to solve a problem which solves a problem which, several levels of recursion later, solves the real problem you’re working on.”
Apparently it originated from Ren and Stimpy.
I’m not sure what they were planning to do with the yak afterwards.