A Real-World Assessment of the Process Mining Manifesto
In this post I analyze the business feasibility and solution scope presented in the Process Mining Manifesto (PMM) by the IEEE Task Force on Process Mining (PM). A manifesto is ‘a declaration of principles and intentions’ – as the PMM says – and does not imply available technology. It is a statement of direction that will change with experience. I evaluate both current and future business benefits in terms of impact on business management and not PM technology. Keith Swenson has covered the PMM too and we agree that the idea has a lot of potential for Adaptive Case Management (ACM). The current approach is however not aligned with the performer empowerment in ACM concepts. I am still very positive about this discussion of automated, real-time process discovery and improvement, not only because it is an exciting field of research, but because it raises the awareness and understanding of PM as a technology. I have frequently referenced the related, excellent research, especially that of Wil van der Aalst, in my previous posts.
My assessment draws on the experience we gained in customer projects during the development of process mining for the Papyrus Platform. We developed a fully functional and consolidated PM solution that has been available since 2007 – the so-called User-Trained Agent (UTA) – utilizing patented machine-learning technology, today in its third generation. The UTA does not mine historical data, but analyzes process, case and communication content in real-time for user actions and complex event patterns. In short, while the PMM approach is mine-model-improve (all done by experts) as an add-on to a BPM bureaucracy, we found a model-discover-adapt approach (done by performers, except for interface models) to have more direct business impact. The difference in performer involvement in the life-cycle is a key distinction between BPM and ACM. We feed user interactions (emails or tweets) into ACM and enable real-time pattern analysis. Performers interact with the ‘mining’ analysis directly. BPM follows abstract flows, while full ACM solutions advance the process by changing the states of real-world resource objects that users can manipulate as required to create processes. Performer-definable goal and constraint rules define where to go and where not to go. As flow-diagram-less process management was not considered BPM until recently, it is likely that flow-diagram-less process mining faces similar prejudice. Much like Steve Ballmer was unable to see that the iPhone without keys would be accepted!
Process Mining as a continuous effort in BPM.
The term Process Mining is used by different people for different things. An important part of the PMM is a definition of all those aspects that it places under a common umbrella. The manifesto describes an orthodox BPM optimization bureaucracy of (re)design, analyze, (re)implement, (re)configure, execute, (re)adjust, and diagnose and says that PM plays a role in all except implementation. To become an automatic capability PM would also need to ‘implement’ processes without needing user – or at least IT – intervention. That is however currently not the case. Rather, PM provides only input (after the human-controlled effort of filtering) in the form of a raw flow-diagram for a human process expert. This expert has to turn the flow into a usable process by the usual BPM means. Should PM truly automate process creation in the future, it would meet the same skeptical stance that we have experienced in the last few years, when people want to know WHY a certain change or suggestion is made by our UTA. We do not even propose a complete process but only the most likely next action in the current state, and we still have to deal with that skeptical stance.
PM tries to improve on the drawbacks of orthodox BPM (process rigidity, no agility, no goal orientation, no outcome focus, difficult implementations, no user empowerment, lack of resilience) that I have been pointing out in the last ten years. While the manifesto also suggests that PM can be used in real-time mode to analyze a currently open dynamic case, it does not explain well enough what the real-time aspect would be. It might even refer to an approach similar to our UTA in ACM. The research references point for example to methods for repairing ‘broken processes’ when the flow sequence is no longer valid due to unpredictable changes. Problem-solving algorithms are optimizers but they don’t take business decisions when the process context changes. Performers take such decisions and therefore I propose that the process needs to learn the why and the how directly from the performer at the time the decision is taken. What the UTA does is referred to as ‘transductive training.’
The Process Mining Manifesto describes ongoing research.
PM proposes to combine computational intelligence and data mining with process modeling and analysis. PM includes (automated) process discovery (i.e., extracting process models from an event log), conformance checking (i.e., monitoring deviations by comparing model and log) and process improvement (modifying an existing process for a better fit). The PMM says that it works with ‘real processes’ and not assumed ones and creates ‘purposeful abstractions of reality,’ when the reality is that there is no ‘real process’ information in the log data. The log just contains information on interactions and activities that can later be viewed (abstracted) as being part of a process. To become at least ‘realistic,’ the processes have to be implemented (including all the elements not identified in the logs) and executed. Only once they are implemented can one start to verify – meaning monitor – conformance, performance or fitness. That requires that people switch from all other means they were using to a BPM implementation without knowing if what they are switching to actually works! Ad-hoc tasks could be used to perform the needed actions, which could then be mined in the conformance phase. ACM is perfect for such an effort because changes simply become part of the template.
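To make the terms discovery and conformance checking concrete, here is a minimal Python sketch. It is an illustration only and not an algorithm taken from the PMM or from the UTA: it assumes a hypothetical log of (case id, activity, timestamp) records, treats the set of observed 'directly follows' pairs as the discovered model, and reports any step in a later log that this model has not seen. All names and sample records are invented for the example.

```python
from collections import defaultdict

# Hypothetical event log: (case_id, activity, timestamp) records.
EVENT_LOG = [
    ("c1", "receive claim", 1),
    ("c1", "check policy", 2),
    ("c1", "pay claim", 3),
    ("c2", "receive claim", 1),
    ("c2", "check policy", 2),
    ("c2", "reject claim", 3),
]


def traces(log):
    """Group records by case and order each case by timestamp."""
    by_case = defaultdict(list)
    for case_id, activity, ts in log:
        by_case[case_id].append((ts, activity))
    return {c: [a for _, a in sorted(evts)] for c, evts in by_case.items()}


def discover(log):
    """Discovery: collect every 'a directly followed by b' pair in the log."""
    model = set()
    for seq in traces(log).values():
        model.update(zip(seq, seq[1:]))
    return model


def deviations(model, log):
    """Conformance: list the steps in the log that the model has never seen."""
    found = []
    for case_id, seq in traces(log).items():
        for pair in zip(seq, seq[1:]):
            if pair not in model:
                found.append((case_id, pair))
    return found


model = discover(EVENT_LOG)
new_log = [("c3", "receive claim", 1), ("c3", "pay claim", 2)]
print(deviations(model, new_log))
```

Note what the sketch cannot do: the 'model' is nothing more than a summary of recorded sequences. Nothing in it says why a claim was paid or rejected, which is the abstraction gap described above.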
Process Mining uses social network/organizational data analysis, automated construction of simulation models, model extension, model repair, case prediction, and history-based recommendations. These theoretical computational models are exciting, but it still needs to be proven how well they interact and combine their abilities in real-world scenarios. Analyzing the communication content is not covered in the PMM and I miss the continuous performer interaction with the whole PM effort. This is in my mind the area where PM most needs to improve. The human interaction that is discussed is with experts, while in ACM the focus is on process owners, performers, and customers.
Process Mining is not limited to control-flow discovery, but also searches for organizational, case and time perspectives. A process mining project would consist of five stages: plan and justify, extract, create a control-flow model and connect it to the event log, create an integrated process model, and provide operational support. But those different stages of process mining require different software environments, meaning that people have to switch from one system to the other to follow that approach. Such an approach is only feasible if ONE environment is used that supports all these modes. In this respect there is no difference to an ACM environment that is used to model the business incrementally and then allows free interaction including all business resources and data. I propose that PM must not require a project but must be an inherent function of the process platform if it is to be successful, be adopted and aid a continuous improvement effort.
Process Mining principles and challenges.
The PMM outlines six principles and ten challenges, which is an excellent approach and the largest part of the PMM. The challenges are all very real and substantial. It starts with the importance of the quality of the collected data, which are called ‘event logs.’ Looking at what data are actually ‘readily available,’ they are rarely events – meaning related to trigger points that influence execution – but just data points at arbitrary times. We found that the data provided by different systems are nowhere near similar. An even bigger issue is time stamps. Those are not just very different in format but also of differing accuracy, and moreover they are created by different systems that are usually not time synchronized. We had to deal with collected log data containing substantial ambiguities and a total lack of context to business goals. Whether a Tweet or collaborative action was related to an email, a certain case instance or a certain task was nearly impossible to identify. Different people in different departments working towards the same process goal may choose very different communication methods and use them in different ways each time they execute the process. There is nothing that will connect those events, and that produces a huge number of unconnected process variants. Knowing when something happened does not tell you why it happened and towards what end. Reducing the ambiguity is only possible in a consolidated process environment with defined data models, ideally one that does not need to write anything to logs but mines the processes in real-time (as the UTA does). The PMM does not discuss the analysis of process-related text and image content, which has been key for us in mapping actions to process instances.
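The time stamp problem can be shown with a small Python sketch. It assumes hypothetical records of the form (source system, activity, time in seconds) and an assumed upper bound, max_skew, on how far any two system clocks may disagree; neither the record layout nor the bound comes from the PMM. Any two records from different systems that lie closer together than that bound could have happened in either order.

```python
from itertools import combinations

# Hypothetical records: (source_system, activity, timestamp in seconds).
RECORDS = [
    ("mail", "email sent", 100.0),
    ("crm", "case updated", 102.5),
    ("crm", "task closed", 160.0),
]


def ambiguous_pairs(records, max_skew):
    """Return activity pairs from different systems whose order is uncertain.

    max_skew is the assumed largest disagreement between two system clocks.
    Records from the same system share one clock and are not compared.
    """
    unclear = []
    for (sys_a, act_a, ts_a), (sys_b, act_b, ts_b) in combinations(records, 2):
        if sys_a != sys_b and abs(ts_a - ts_b) <= max_skew:
            unclear.append((act_a, act_b))
    return unclear


print(ambiguous_pairs(RECORDS, max_skew=5.0))
```

With a five second bound the sketch flags the email and the case update as unordered, so a mined flow that puts one before the other rests on a guess. Even a perfectly resolved order would still say nothing about whether the two records belong to the same case.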
Reducing the amount of data for faster mining cannot be done upfront as is suggested in the PMM. To discover the right data we had to collect all possible data and write them to the data store indiscriminately. Filtering can only happen after the data has been normalized and somewhat correlated. To make the logs for BPMS, Case Management, collaboration tools, file sharing, email, social communications and process oriented apps compatible, expect substantial programming effort to even create the log-data and expect to pay for expensive BI software and consultants.
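As a rough idea of what 'normalized' means here, the following Python sketch maps two invented record layouts, one in the style of mail headers and one in the style of a BPMS audit record, onto a common shape with UTC time stamps. The field names (From, Subject, Date, user, task, epoch) are assumptions made for the example and do not describe any particular product. Only the standard library is used.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def from_mail(rec):
    """Mail-style record: RFC 2822 date string, subject used as activity."""
    return {
        "source": "mail",
        "actor": rec["From"],
        "activity": rec["Subject"],
        "time": parsedate_to_datetime(rec["Date"]).astimezone(timezone.utc),
    }


def from_bpms(rec):
    """BPMS-style record: Unix epoch seconds and a task name."""
    return {
        "source": "bpms",
        "actor": rec["user"],
        "activity": rec["task"],
        "time": datetime.fromtimestamp(rec["epoch"], tz=timezone.utc),
    }


def normalize(mail_records, bpms_records):
    """Bring both sources into one list, ordered by UTC time."""
    events = [from_mail(r) for r in mail_records]
    events += [from_bpms(r) for r in bpms_records]
    return sorted(events, key=lambda e: e["time"])


mail = [{"From": "ann", "Subject": "claim 17",
         "Date": "Mon, 05 Mar 2012 10:15:00 +0100"}]
bpms = [{"user": "bob", "task": "approve", "epoch": 1330938000}]
for event in normalize(mail, bpms):
    print(event["time"].isoformat(), event["source"], event["activity"])
```

Only after this step can a filter compare records across sources at all, which is why filtering upfront did not work for us. Every additional source needs its own mapping function, and that is where the programming effort accumulates.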
Once the data records have been filtered and statistically processed, the PMM proposes that they need to be automatically turned into a flow-diagram (e.g., BPMN, EPCs, Petri nets, BPEL, or UML). The PMM says clearly that the focus is control-flow creation as the outcome-backbone of mining with support for concurrency, choice and all control-flow concepts. I propose that this technique will not discover more than the very basic repeated processes and will create too many variants. The BPM approach requires flow-diagrams, while a human-interaction-centric approach does not require them at all. We know from experience that even huge amounts of data do not allow one to infer business rules and process logic. I could not gather from the PMM and the mentioned scientific references how the automated discovery of constraints would work and be practical. Analyzing threshold rules from historical data (If more than 500 then …) suffers from Zeno’s or Thomson’s Lamp Paradox.
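The variant problem is easy to reproduce. The Python sketch below uses four invented cases that all end in the same approval but reach it through different communication steps, and counts each distinct sequence as one variant, which is how a control-flow view would see them. The case data is hypothetical.

```python
from collections import Counter

# Hypothetical cases: same outcome, different communication paths.
CASES = {
    "c1": ["email", "call", "approve"],
    "c2": ["call", "email", "approve"],
    "c3": ["tweet", "email", "approve"],
    "c4": ["email", "call", "approve"],
}


def variants(cases):
    """Count how often each distinct activity sequence occurs."""
    return Counter(tuple(seq) for seq in cases.values())


for sequence, count in variants(CASES).most_common():
    print(count, " -> ".join(sequence))
```

Four cases with one shared goal already produce three variants. The count grows with every additional channel and every change in ordering, while the goal that ties the cases together never appears in the data.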
The PMM proposes that the control-flows created automatically are not perfect but still purposeful abstractions of reality. But we found in fact that the purpose of the process flow is not identifiable even if the sequence of interactions is absolutely correct. PM is unable to identify the goals that people are working towards and thus needs manual input. Monitoring the route of a car by GPS doesn’t tell you why and when the driver wants to get there or what the driver’s intent is at the destination. That doesn’t change if more drivers follow the same route. Nor does monitoring the route teach you how to drive a car …
So what might be the future of Process Mining as outlined in the PMM?
I propose that PM could be missing its true potential because of its legacy in the orthodox BPM domain. I agree with Keith Swenson that there is an opportunity to link the ideas in the PMM with ACM but it requires a change from mine-model-improve. Process Mining could be used to highlight areas of intensified communication and direct people to move them to ACM for better support! Switching to flow-diagrams would certainly stop people from doing so. Once in ACM the user interaction becomes transparent and can be improved. You do need to get adoption after all. For analysis of user interactions still outside ACM, we offer the ability to search and import Tweets (chat), email and business content to identify the cases they might belong to; if there is none, we open a new case and suggest merging related interactions. Then the UTA is trained by people’s actions and eventually proposes process goals based on information patterns found in the content. Therefore no project is needed to slowly migrate processes to ACM. The mining and the user interaction are guided within the ACM capability!
I have chosen an approach to process mining that is focused on our scientific understanding of human decision-making and social business collaboration. Business is about people knowledge and not about mathematical theories of flow-diagrams. My approach interferes with orthodox BPM concepts and is therefore in conflict with the marketing dollars spent by the BPM community and related analyst opinions. Process Mining tries to solve some of the problems of the BPM analysis approach. We already see positive PM coverage by analysts despite the lack of actual applications and measurable benefits with predictions that it will take till 2015 to work. Gartner Group decided to call a holistic BPM approach ‘Intelligent BPM’ or IBPM that includes process mining and case management and social interactions, similar to my definition of Strategic ACM while lacking Business Architecture. Using process mining today will be a trial and error approach of many techniques and many products requiring substantial additional manpower. It will multiply the cost of existing BPM implementations by an unknown factor. The UTA is however a standard feature of the Papyrus Platform and you just use it if you find it of benefit. No need to wait till 2015!
Conclusion: Look at Process Mining from a people and business perspective. I am quite sure that it doesn’t even matter how many processes might be identified by PM, because people are using something other than BPM today for a reason. The problem is not identifying processes but that they simply can’t be supported by flow-diagrams! In its current form PM tries to solve a problem that isn’t there in the real world but was created by BPM. Process Mining tries to find stable flows where there aren’t any, but that does not mean that analyzing people’s interactions is useless. Process performers need a collaborative support environment for unstructured knowledge work that helps managers and process owners to guide towards goals and verify outcomes rather than enforce flow execution. The target must be to create understanding and transparency without creating more rigidity and killing resilience. The target is to link strategy and execution by empowering people. So the solution to the difficulty or expense of defining flow-diagrams is not Process Mining alone. The solution is a different BPM approach that was named Adaptive Case Management by the WfMC to highlight the key distinctions. For BPM to actually become as mainstream as ERP, something substantial will need to happen and I propose that this paradigm shift was started with ACM, whatever it will be called in the future. If Process Mining stops being focused on flow-diagrams it could play an essential role too. In the Papyrus Platform it already does!