I was once again voted the most outspoken BPM blogger on earth. Hey, why not the universe? But thanks for the mention. However, I won’t waste your time or mine on BPM predictions for next year; I would rather discuss the medium-term future of process management in machine learning.
Machine learning used to be called artificial intelligence and became popular when IBM’s Deep Blue won against chess world champion Kasparov in 1997. The victory, however, wasn’t a success of AI but a result of Joel Benjamin’s game strategy and an ensuing Deep Blue bug that led to a final draw and thus a win for IBM. IBM did it again in 2011 with Watson, which won the popular Jeopardy! TV game show in a similar publicity stunt that has little relevance for machine learning. Jeopardy! uses a single semantic structure for its questions, so finding and structuring answers from a huge Internet database was not that hard. Watson had the hardest time with short questions that yielded too many probable answers. It used no intelligence whatsoever in providing answers.
1997: Kasparov is beaten by IBM’s Deep Blue.
Why, then, do I say that the future of BPM is in this obscure AI arena? First, machine learning is not about beating humans at some task. Second, I see machine learning as ideal for work that can’t simply be automated. My well-documented opposition in this blog to orthodox BPM is caused by the BPM experts’ illusion that a business and the economy are designed, rational structures of money and things that can be automated. In reality they are social interactions of people that can never be encoded in diagrams or algorithms. I have shown that machine learning can augment the human ability to understand complex situations and improve decision making and knowledge sharing.
To do so I turned my attention away from writing programs and processes many years ago. My research is into how humans learn and how that understanding can be used to create software that improves how people do business without the intent to replace them. For me the most important discovery was that humans do not use logic to make decisions but emotions (Damasio, 1993). Using logic requires correct data and perfect control, both of which are unattainable. Therefore humans developed the ability to come to decisions under uncertainty (Gigerenzer) using a wide variety of decision biases (Kahneman, Tversky). These aren’t fallacies, however, but very practical, simplified decision mechanisms. Many books on the subject focus on how wrong these biases can be in some situations. They ignore that logic-driven decisions are equally wrong in all other situations because the underlying data are wrong. Logic is only better in theory or in the closed shop of a laboratory.
While machine learning collects past information to learn, it is a fact that the past never predicts the future. We can only identify general probability. Take a simple roulette wheel for example. But people are fascinated by predictions (see astrology) and managers want certainty, which is unattainable. Therefore they like buzzword-driven hypes suggesting that if we collect and analyze more data about more things, we will get better predictions with which to precode decisions for the future. This is mirrored in the ‘process mining’ approach, which assumes that collecting more data on more business processes makes them more predictable. That is in fact a misrepresentation of what ML can do.
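The roulette remark can be checked with a small simulation. What follows is only a sketch under stated assumptions: a fair European wheel with 37 pockets, Python’s pseudo-random generator standing in for the wheel, and a hypothetical function name `red_frequencies`. It compares how often red comes up overall with how often it comes up directly after a streak of reds. On a fair wheel both values should hover around 18/37, which is the ‘general probability’ and nothing more.

```python
import random

# The 18 red pockets of a European roulette wheel (0 is green).
RED = {1, 3, 5, 7, 9, 12, 14, 16, 18, 19, 21, 23, 25, 27, 30, 32, 34, 36}


def red_frequencies(n_spins, streak, seed=0):
    """Simulate n_spins and return (overall, after_streak).

    overall      -- share of all spins that landed on red
    after_streak -- share of reds among spins that directly follow
                    `streak` consecutive reds
    """
    rng = random.Random(seed)
    outcomes = [rng.randrange(37) in RED for _ in range(n_spins)]
    overall = sum(outcomes) / n_spins

    # Collect only the spins whose preceding `streak` spins were all red.
    followers = [
        outcomes[i]
        for i in range(streak, n_spins)
        if all(outcomes[i - streak:i])
    ]
    after_streak = sum(followers) / len(followers) if followers else float("nan")
    return overall, after_streak


if __name__ == "__main__":
    overall, after_streak = red_frequencies(200_000, streak=3)
    print(f"theoretical probability of red: {18 / 37:.4f}")
    print(f"observed overall:               {overall:.4f}")
    print(f"observed after 3 reds in a row: {after_streak:.4f}")
```

Collecting ten times more spins narrows the estimate of the general probability, but it does not make the next spin any more predictable.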
Algorithms can be used to automate mechanical functionality in real-time in a factory, airplane or car. Take, for example, Google’s self-driving car, military drones or robotic assembly lines. Let’s not forget that they don’t take decisions on where to go and what to do when. They are surrounding-aware robots that follow a human-given directive. That is not intelligent. Thus one can’t automate a business process or any complex human interaction the same way.
True innovation in process management won’t be delivered by rigid-process-minded ignoramuses, who fall for the illusion that correct decisions can be encoded. It will arrive in the arena of machine learning that will help humans to understand information to make better decisions.
What is machine learning and what is it not? And how far are we?
Fifty years ago Marvin Minsky had a vision that computers would fairly soon be as intelligent as humans, or more so. He proposed the use of software called neural networks that mimicked human brains. However, a human brain does not work by itself; it is a complex construct of evolved brain matter, substantial inherited innate functionality and learned experiences, many of which are only available through our bodily existence. Our experience of self is not a piece of code but a biological function of our short-term memory and the connection to our body through the oldest part of our brain, the medulla, which sits atop our spinal cord. Without our hormonal drives, human intelligence and decision-making would not even develop. What makes us human are our bio-chemical emotions: the ability to feel fear, love and compassion. That is accepted science. Therefore a purely spiritual entity (our soul?) or a logical function without body and body chemistry can’t feel either and thus won’t possess human-like intelligence. It won’t be able to take human-like decisions. But machine learning can provide benefits without the need for human-like intelligence. So in the last few years all large software companies have jumped on the machine learning bandwagon.
While IBM has no more to offer than publicity stunts, Facebook has no other interest than to utilize the private information of its 700 million users to make money. Facebook is using what buzzword-speak calls ‘deep learning’ to identify emotional content in text in order to improve its ad targeting through big data mining and prediction. Supposedly some of this is already used to reduce the Facebook news feed to an acceptable amount. My take? It isn’t working, much like Netflix movie suggestions or the product recommendations of Amazon. Why? Statistical distribution can’t judge my current emotional state!
But ML isn’t a ruse. It is real. Minsky’s neural networks are still being explored and improved for voice and image recognition and for semantic, contextual mapping. These are important, albeit low-end, pattern recognition capabilities of the human brain. Japanese, Canadian and Stanford University researchers, for example, developed software that correctly classified the sounds it was hearing into a few vowel categories more than 80 percent of the time. Face recognition is also already extremely accurate today. Image classification is successfully used to recognize malignant forms of tumors for cancer treatment. In fact, voice recognition in Apple dictation in both iOS and OS X is extremely good at understanding spoken sentences. The hidden lesson is that Apple uses both a dictionary and a grammar library to correct the voice recognition. I have written much of this post using Apple dictation. The important progress in this area is the recognition of NEW common image features at a much higher success rate than humans can achieve. But in all these approaches it is the human input that decides if the patterns are relevant or not. Man cooperates with machine; the machine does not replace human intelligence.
So what is Google up to in machine learning?
The most publicized and successful Google venture in this domain is the self-driving car. It is a great example of how real-time data sensors in combination with a human-created world map (Google Maps, obviously) allow a machine to interact safely and practically in a complex environment. Don’t forget that the car is not controlled by a BPM flow-diagram, but is totally event and context driven. So much for BPM and the ‘Internet of Things’ …
I dare to put Ray Kurzweil’s work at Google in the same category as IBM’s, with illusions similar to Minsky’s. I have known of Ray Kurzweil since the days he created the K250 synthesizer/sampler in 1984. It was the first successful attempt to emulate the complex sound of a grand piano. I was the proud owner of one in my musician days. It was inspired by a bet between Ray Kurzweil and Stevie Wonder over whether a synthesizer could sound like a real piano. It was awe-inspiring technology at the time and it too led to predictions that performing musicians would become obsolete. It is obvious that this did not happen.
Kurzweil joined Google at the end of 2012 to lead a project aimed at creating software capable of understanding text questions as well as humans can. The goal is to ask a question just as you would ask another person and receive a fully reasoned answer, not just a list of links. Clearly this is reminiscent of Siri and Wolfram Alpha, and both have been at it for a while.
Kurzweil’s theory is that all functions in the neocortex, the plastic (meaning freely forming) six layers of neuron networks that are the seat of reasoning and abstract thought, are based on a hierarchy of pattern recognition. This is not a new theory at all but a pretty well established one. It has led to a technique known as “hierarchical hidden Markov models” that has been in use in speech recognition and other areas for over ten years. It is very useful, but its limitations are well known. Kurzweil, however, proposes that his approach will allow human-like intelligence if the processor can provide 100 trillion operations per second. A human brain is, however, not just a neocortex! And more processing power is not going to solve that problem.
In machine learning less is always more!
Google thus isn’t betting all its money on Ray Kurzweil; it recently spent $400 million to acquire a company called DeepMind that attempts to mimic some properties of the human brain’s short-term memory. It too uses a neural network, one that identifies patterns as it stores memories and can later retrieve them to recognize texts that are analogies of the ones in memory. Here the less-is-more approach is used. DeepMind builds on the 1950s experiments of American cognitive psychologist George Miller, who concluded that human working memory stores information in the form of “chunks” and that it can hold approximately seven of them. Each chunk can represent anything from a simple number to an abstract concept pointing to a recognized pattern. In cognitive science, the ability to understand the components of a sentence and store them in working memory is called variable binding. The additional external memory enables the neural network to store recognized sentences and retrieve them for later expansion. This makes it possible to refer to the content of one sentence as a single term or chunk in another one.
Alex Graves, Greg Wayne and Ivo Danihelka at London-based DeepMind call their machine a ‘Neural Turing Machine’ because of the combination of neural networks with an additional short-term memory (as described by Turing). While this is a great approach, it lacks the ability for human interaction and training, which I see as the key aspect for practical use. But variable binding is a key functionality for intelligent reasoning.
Human collaboration and human-computer cooperation
The future is using computing as a tool to improve human capabilities and not to replace them. That is why BPM is my pet peeve in large corporations. To illustrate the point I have been making for over a decade, I recommend watching this interesting TED Talk by Shyam Sankar on human-computer collaboration.
Sankar talks about J.C.R. Licklider’s human-computer symbiosis vision to enable man and machine to cooperate in making decisions without the dependence on predetermined programs. Like me, Licklider proposed that humans would be setting the goals, formulating the hypotheses, determining the criteria, and performing the evaluations, while computers would deal with all operations at scale, such as computation and volume processing. They do not replace human creativity, intuition and decision-making.
So what are the aspects of machine learning that are available, usable and do not suggest Orwellian scare scenarios? Well, machine learning technology is unfortunately perfectly suited to and broadly used for surveillance, but let’s focus on the positive for the moment. Image and voice recognition and follow-on classification are areas where we have reached the stage of everyday practical use. We have been using image- and text-based document classification in Papyrus for 15 years. Machine learning is used in Papyrus for character recognition, text extraction, document structure, sentiment analysis, and for case context patterns related to user actions, the latter with the so-called UTA or User-Trained Agent. Pattern recognition for case management that uses the kind of human training and cooperation that Sankar suggests was patented by me in 2007.
What the User-Trained Agent (UTA) does, and what my patent describes, is not a search for patterns that predict that something will happen again. In human-computer collaboration it is the repeated human reaction to a particular process pattern that is interesting, and therefore one can make others aware of the likelihood that such an action is a good one. This ML function does not just find patterns; it analyzes how humans react to patterns. Users can also react to a recommended action by rejecting it. As I do not prescribe the process rigidly but require that the goals to be achieved are defined, it is now possible to automatically map a chain of user actions to goal achievement and let a user judge how fast or efficient that process is.
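To make the idea concrete, here is a deliberately simple sketch of learning from human reactions to patterns. It is not the Papyrus UTA and not what the patent specifies; it uses plain frequency counting instead of a neural network, and every name in it (`ActionRecommender`, `observe`, `reject`, `recommend`, the example case features) is hypothetical. It only shows the principle: record which action users take when a pattern occurs, suggest the most common one together with its likelihood, and let a rejection weaken the suggestion.

```python
from collections import Counter, defaultdict


class ActionRecommender:
    """Toy stand-in for a user-trained agent (an assumption, not the UTA)."""

    def __init__(self, min_observations=3):
        # Do not recommend anything until enough human reactions were seen.
        self.min_observations = min_observations
        self._counts = defaultdict(Counter)

    def observe(self, pattern, action):
        """Record that a user reacted to `pattern` with `action`."""
        self._counts[frozenset(pattern)][action] += 1

    def reject(self, pattern, action):
        """A user rejected a recommendation: take one vote away from it."""
        counts = self._counts[frozenset(pattern)]
        if counts[action] > 0:
            counts[action] -= 1

    def recommend(self, pattern):
        """Return (action, likelihood) or None if too little is known."""
        counts = self._counts.get(frozenset(pattern))
        if not counts:
            return None
        total = sum(counts.values())
        if total == 0 or total < self.min_observations:
            return None
        action, votes = counts.most_common(1)[0]
        return action, votes / total


if __name__ == "__main__":
    agent = ActionRecommender()
    pattern = {"claim", "missing_signature"}  # hypothetical case features
    for _ in range(3):
        agent.observe(pattern, "request_signed_form")
    agent.observe(pattern, "call_customer")
    print(agent.recommend(pattern))
    agent.reject(pattern, "request_signed_form")
    print(agent.recommend(pattern))
```

The design choice worth noting is that the agent never predicts what will happen next in the process; it only reports how people have reacted so far, and the user remains free to overrule it.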
But how practical is such machine learning for simplifying process management for the business user? Does it require AI experts or big data scientists and huge machines? Absolutely not, as it too uses the less-is-more approach. Recognized patterns are automatically compacted into their simplest, smallest form and irrelevant information is truncated. But in 2007 it still used IT data structures and not business terminology. Using an ontology to describe processes in business language enables human-to-human collaboration and run-time process creation, and simplifies human-computer cooperation.
Papyrus thus uses a simplified form of ‘variable binding’ for process descriptions by means of an ontology. Such an ontology definition entry always has a subject, predicate and object, just like the DeepMind short-term memory. Now the UTA can identify process patterns using the ontology terms. The first neural-network-based User-Trained Agent version in 2007 could not explain why it suggested an action for a process pattern. By using the ontology to identify the pattern similarities in different work processes (really cases), one can tell in business terms why an action is recommended.
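A small sketch may help to show why subject-predicate-object entries make a recommendation explainable. It rests on assumptions of my own for illustration: cases are described as plain sets of triples, similarity is nothing more than the triples two cases share, and the names `Triple`, `shared_triples` and `explain` as well as the example business terms are hypothetical and not taken from Papyrus.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One ontology entry: subject, predicate, object."""
    subject: str
    predicate: str
    obj: str


def shared_triples(case_a, case_b):
    """Return the ontology entries that two cases have in common."""
    return sorted(set(case_a) & set(case_b))


def explain(action, current_case, past_case):
    """Phrase in business terms why `action` is suggested for the current case."""
    common = shared_triples(current_case, past_case)
    if not common:
        return f"No shared business facts support '{action}'."
    reasons = "; ".join(f"{t.subject} {t.predicate} {t.obj}" for t in common)
    return f"'{action}' is suggested because: {reasons}"


if __name__ == "__main__":
    past_case = {
        Triple("claim", "lacks", "signature"),
        Triple("customer", "holds", "premium contract"),
        Triple("claim", "concerns", "water damage"),
    }
    current_case = {
        Triple("claim", "lacks", "signature"),
        Triple("customer", "holds", "premium contract"),
        Triple("claim", "concerns", "theft"),
    }
    print(explain("request signed form", current_case, past_case))
```

Because the overlap is expressed in the same terms the business users work with, the explanation needs no translation from IT data structures.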
At design time, business analysts create a business ontology that the non-technical business users will use at run-time to create their processes and content. The technical effort is mostly related to creating interfaces to existing IT systems to read and write business data. At run-time users collaborate in business terminology, and as they perform their work they create the process patterns that can be reused. These can either be stored explicitly as templates, or the User-Trained Agent will pick up the most common actions performed by users in certain contexts.
Conclusion: We are just at the starting point of using machine learning for process management. IT and business management are mostly not ready for such advanced approaches because they lack the understanding of the underlying technology. We see it as our goal to dramatically reduce the friction, as Sankar calls it, between the human and the machine learning computer. Using these technologies has to become intuitive and natural. The ultimate benefit is an increase in the quality of the workforce and thus in customer service.