Predictive Analysis and Causality
Snippets of wisdom from fun movies:
‘Life is like a box of chocolates. You’ll never know what ya gonna git.’ (Forrest Gump)
‘Had a single one of a long series of events taken place differently this morning, she would not have been run over by a car.’ (The Curious Case of Benjamin Button)
‘You can put the seed in the ground and water it but it will become what it is.’ (Kung Fu Panda)
Why am I using movie quotes rather than heavyweight scientific arguments to start off this post? I am trying to point out that everyday people know more about causality and predictability than scientists, who live in the illusion that mathematics IS the reality and not just an abstraction of it.
I recently reread Nassim Taleb’s ‘Fooled by Randomness’ and it reminded me that CEOs and CIOs who believe in Predictive Analysis are not that much more clever than stockbrokers. Much of the current hype is caused by IBM’s multi-billion-dollar ‘Smart Planet’ advertising campaign. Are you aware that there are no ‘Smart’ products that IBM sells? IBM sells a vision and no more. In Europe nobody talks about smart power grids because the power networks are so much more modern than in the rest of the world and they are run by SMART PEOPLE who do not require smart software, whatever that is supposed to be. Am I saying that the mathematics of predictive analysis is wrong? Absolutely not. If the world conformed to the model it uses, it would be perfect. But … it doesn’t!
The chain: REALITY -> MODEL -> COLLECTION -> FILTERING -> PROCESSING -> PREDICTION -> CAUSAL ACTION is purely an illusion. Even if you find wonderful correlating patterns in the data, which most probably means that you have spent a lot of time tuning the above chain until it looks good, that has nothing to do with achieving causal knowledge. Yes, one might gain some statistical knowledge of common human behavior, but there is actually no need for high-volume data processing to get it. Simple observation of a few people will do the same. The data will be wrong and the action you take will have different results than planned (… the seed will become what it is!)
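To make the point about tuning until it looks good concrete, here is a minimal sketch in Python. It is my own illustration, not anyone's actual PA pipeline: the function names, the 1000 candidate series and the 10 data points per series are all assumptions. It generates a target series of pure noise, then keeps trying other pure-noise series until one happens to correlate well.

```python
import math
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equally long lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def best_chance_correlation(n_candidates=1000, n_points=10, seed=1):
    """Strongest |correlation| between a noise target and many noise candidates.

    All series are independent random numbers, so any correlation found
    is an accident of searching, not a causal link. Parameters are
    hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    target = [rng.gauss(0, 1) for _ in range(n_points)]
    best = 0.0
    for _ in range(n_candidates):
        candidate = [rng.gauss(0, 1) for _ in range(n_points)]
        best = max(best, abs(pearson(target, candidate)))
    return best

print(f"strongest correlation found in pure noise: {best_chance_correlation():.2f}")
```

With a short series and enough attempts, the search reliably turns up an impressive-looking correlation even though nothing here causes anything else.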
Don’t forget: MORE DATA produces MORE NOISE! Higher sampling rates do not produce higher accuracy but just more opportunities to misinterpret a trend. PA experts claim that filtering solves that problem. But filtering out the extremes in the data removes the one important aspect of the information, the ‘Tipping Point’, which tells you when the data will push something over the edge: the grain of sand that triggers the avalanche. Averaged data are mostly irrelevant and have no influence on individual results! The problem is compounded by our propensity to misinterpret numbers, so read ‘Calculated Risk’ by Gerd Gigerenzer to understand why.
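A small sketch of what smoothing does to an extreme value, assuming a simple moving average as the filter (real PA tools may filter differently) and made-up readings with a single spike:

```python
def moving_average(values, window):
    """Simple moving average over a sliding window of `window` samples."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]

# Hypothetical readings: flat at 1.0 with one extreme event.
readings = [1.0] * 20
readings[10] = 50.0

smoothed = moving_average(readings, window=5)

print(f"largest raw reading:      {max(readings)}")
print(f"largest smoothed reading: {max(smoothed)}")
```

The spike of 50.0 shrinks to 10.8 after averaging over five samples, so any threshold set between those two values would be crossed in the raw data and missed in the filtered data.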
This is not the same as in digital music with a sine wave and its harmonics, where higher sampling rates allow you to capture more of the harmonics. Even here the MP3 format of the Fraunhofer Institute managed to discard high-frequency information by figuring out how little of it is actually relevant to human perception. The Predictive Analysis fallacy comes from assuming that the world is based on classical physics such as the sine wave. The world is however a complex adaptive system of many layers of emerging functions and interrelationships that cannot be decomposed and thus cannot be modelled. It is utterly random.
Nassim Taleb uses an example similar to the following to show why great stocks on the stock market are purely random. The anecdotal evidence of the successes of Predictive Analysis is purely RANDOM too. Given a 50% chance each year that PA improves what a business does, 12.5% of businesses (0.5 × 0.5 × 0.5) will show a definite positive result three years running by pure chance. That does not take into account the so-called hindsight bias: some results seem utterly obvious once they have happened. PA salespeople will tell you that it is those 12.5% that used PA ‘properly’ and were able to ACT upon the results, while the others failed. They ask: ‘Do you want to belong to the TOP TEN percent of businesses?’ However, comparing two companies where one uses PA and the other does not is utterly invalid because both business results are random. No causal connection can be shown between the two, unless one were to run a large-scale, double-blind test in which all businesses believe they are using PA, but half don’t.
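The 12.5% figure is easy to check with a coin-flip simulation. This is a minimal sketch under my own assumptions: each business gets an independent 50% chance of a good year, and the number of simulated businesses and the function name are arbitrary.

```python
import random

def fraction_with_streak(n_businesses=100_000, years=3, p_success=0.5, seed=42):
    """Fraction of businesses that 'succeed' every single year by chance alone."""
    rng = random.Random(seed)
    winners = 0
    for _ in range(n_businesses):
        # A business counts as a winner only if every yearly coin flip comes up good.
        if all(rng.random() < p_success for _ in range(years)):
            winners += 1
    return winners / n_businesses

expected = 0.5 ** 3  # three independent 50% chances in a row
simulated = fraction_with_streak()

print(f"expected by pure chance: {expected:.3f}")
print(f"simulated:               {simulated:.3f}")
```

Roughly one business in eight ends up with a flawless three-year record without any method working at all, which is exactly the group a salesperson would point to.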
So in my book, all those people who proclaim the ‘Smart Planet’ by means of Predictive Analysis aren’t really that smart. They actually are blinded by noise. Software will never be smart – OUR GUTS ARE!