The Value of Failure
When I joined IBM in 1974, Tom Watson Jr. was still its figurehead, despite having resigned as the second president of IBM three years earlier. He had followed in the footsteps of his father, Thomas Watson Sr., the founder of IBM, in 1952. Senior was a shrewd businessman, while his son was an innovator.
Tom Jr. saw that computers were the future of IBM and started a huge research program that spent 9% of revenue (versus the typical 5-6%). It led to the 1960 introduction of the first commercial-use computer, the IBM 1401, which was mostly responsible for continuing IBM’s dramatic growth rate of 30%. With it came the first high-speed printer, the IBM 1403, a machine that I still worked on as an engineer in 1974. I am certain that the hydraulically driven, ‘high-speed’ printer (600 lines per minute) was essential for the commercial success of computing at IBM.
Tom Watson’s most dramatic gamble was the development of the System/360, which would offer for the first time compatible computers of different power with a variety of I/O devices (input/output, such as card readers, printers, tape and disk storage) that could be configured to customer needs. When the system, designed by IBM chief architect Gene Amdahl, was delivered in 1964, much later than intended, it turned out to be a huge success. It was the predecessor to the S/370 and S/390, and its function set is in principle still the base of IBM’s current line of z/Series mainframes.
What IBM Taught Me
My first direct involvement in IBM’s mainframe technology was a 1978/9 assignment to IBM’s Havant (UK) plant to test newly assembled IBM 3033 mainframes. Those were fascinating years, as bugs were still chased through test routines that created code loops so that signal shapes could be measured with an oscilloscope on the cables between the processor boards.
Each production test engineer would follow a single machine from initial assembly to shipment. The first 3033 to be shipped from Havant to a German customer was in test for nine months! On my return to Austria, I could fix any problem in a 3033 processor in a fraction of the time our local specialists needed, because I had been debugging 8-10 hours a day for over a year. Each functional failure that I had to track down to its ultimate cause – a duff integrated circuit, a broken motherboard, a bent or simply loose trilead (a kind of mini-coax cable) – taught me how things should be working much more than any course or manual could. I would not just follow the test sequence; I would run tests beyond the first failure and look for patterns. That was in principle wrong, because later test results might be follow-on failures caused by bugs detected earlier and thus point to the wrong function. But despite ignoring standard procedure I was faster and more successful in fixing problems than other engineers who had been doing this much longer. I didn’t think much of it and just did what worked best for me.
To my surprise I was soon assigned to work as a ‘jumper’ who would be called in to fix difficult problems on different machines on the factory floor. Without initially being aware of it, I employed a kind of holistic systems thinking for debugging the multi-CPU, parallel-processing mainframe. I realized only much later that the hardware alone was just complicated, but the combined layers of hardware function, ALU (Arithmetic Logic Unit) pico-code, processor micro-code, S/370 machine code, and finally the operating system with its thread scheduling, memory addressing and exception handling, actually represented complex emergence. I know today that looking for patterns was intuitively the right approach to identify the events that caused the malfunction. IBM researchers layered their learning experiences on top of each other, employing a hierarchical set of layers, each allowing new functionality to emerge higher up. Even on the test floor we would still recommend improvements in hardware and micro-code, thus showing emerging innovation too.
Lessons from a Different Culture: Saudi Arabia
From 1981 to 1984, I went on assignment as the Field Support Specialist for Saudi Business Machines, an IBM joint venture with the Juffali family. Part of my job was to train Arab field engineers from many Middle East countries in S/370 technology. Planning the daily course schedule according to Muslim prayer times was difficult but essential. I even had to develop the course materials myself, but despite the challenge of teaching culturally different mindsets, it was one of the most fun things I ever did. Without my technical experience from Havant I could not have done it. Only some time later did I realize that I had tried to give those young engineers a level of understanding that was no longer required.
For later IBM mainframes such as the S/390, engineers no longer needed to know how a machine actually worked, as the hardware was much more integrated and tests would point to much larger replaceable units. Problems were pinpointed by swapping hardware pieces around. The running joke was: ‘How does an IBM hardware engineer fix a flat tire on his car? Simple! He switches the tire with another one and checks if the problem moves.’
Understanding why, for example, a different I/O load would change processor utilization required a grasp of the complexity of a large mainframe system with its processors, caches, memory and I/O systems, and of how the software linked all of them together. The hardware function was technically predictable, but how the customer programs would actually use it was not, especially in terms of how the operating system would react to a variety of events. That was the reason that on my return from Saudi Arabia I left hardware engineering to become a systems consultant. Fixing problems in complex processors had taught me invaluable lessons and changed my thinking, but I was no longer interested in working on ‘perfect hardware’, as all it required was to perfectly follow a maintenance procedure. You can see that I had this anti-process perspective already in the 1980s!
A $10 Million Education
But I feel that mine were not just chance experiences. When IBM hired me right out of college in 1974, the HR manager told me an anecdote about Tom Watson Jr. According to it, Tom Watson had called a VP to his office to discuss a failed development project that had lost IBM in the range of $10 million. Expecting to be fired, the VP presented his letter of resignation. Tom Watson Jr. just shook his head: “You are certainly not leaving after we just gave you a $10 million education.” In those days, failure was not a problem at IBM as long as it was turned into a learning experience. The important point is that the financial focus has to take a back seat. Before Steve Jobs returned, Apple was run by bean counters who demanded profits over customer value. On his return the company was just three months away from bankruptcy. But allowing people to fail and learn is not a blanket job guarantee for those who don’t care! Steve Jobs fired the exec responsible for the MobileMe disaster on the spot when he botched another service launch.
During my tenure in IBM’s Havant plant I had learned that I needed to turn my thinking upside down: failure is not the outlier, success is! Trying to understand why we couldn’t fully control and predict how a complex system would work led me to learn about evolutionary concepts and complex adaptive systems. Biologist E. O. Wilson’s book ‘Consilience’ was a milestone for me, as was Douglas R. Hofstadter’s ‘Gödel, Escher, Bach – An Eternal Golden Braid’, which led me to focus on artificial intelligence.
When I finally left IBM to start my own software business in 1988, not punishing my employees for failure was one of many IBM company culture principles that I took along. I also told stories such as the one of the 3M Post-it note that was created from a glue that wouldn’t really stick. This kind of free experimentation can also bring surprisingly positive side effects such as Pfizer’s Viagra, originally intended to be a high blood pressure drug. The rest is history, as beautifully depicted in the recent movie ‘Love and Other Drugs’ with Anne Hathaway.
After a century of Tayloristic command-and-control management approaches, adaptive concepts are finally being considered in the business world. Tim Harford wrote in ‘Adapt: Why Success Always Starts with Failure’ that we can’t rely on expert advice, command economies, and top-down organizational structures when human endeavors exceed a certain complexity. It is an illusion to expect to be able to anticipate and plan for all consequences of decisions taken. Complexity doesn’t allow predictability and therefore hinders planned improvements. Innovation is only found in environments that support variation through experimentation on a small scale in randomized trials. Selection of successes is achieved through competition and diverse feedback.
Enter Adaptive Case Management
My own life experiences are the reason that I turned a scientific perspective into the business and software concept that is today at the core of our software solutions. Today I can use the example of the Apple App Store social network as a vivid and verifiable proof of the success-through-failure approach. It provides an ecosystem of autonomous innovators who thrive through the power of evolution. The best apps will succeed, while many won’t. Steve Jobs himself failed multiple times until he succeeded. Thomas Edison invented 6000 failures to finally discover a usable light bulb. Both are well-documented history. While a guiding vision is certainly essential, one has to be willing to admit failure, learn from it and try again, and again, and again …
Quite obviously, innovation is not just about inventing successful new things, but much rather about a focus on customer value. Innovation does and must happen continuously on all levels, in the small and in the large, while not all innovations will succeed. Tom Watson Jr. supposedly said, “If you want to succeed faster, double your failure rate.” As an executive you have to allow and even promote the opportunity to fail, which is diametrically opposed to perfect business processes as demanded by BPM or Six Sigma. In 1991, James March pointed to the importance of exploration and exploitation of knowledge in organizations. But how do you fail fast and ensure that the newly gained knowledge isn’t lost? Nobody likes to share his failures, right? That is why the concept of Adaptive Case Management (ACM) offers a radical departure from the perfectly-optimized-process illusions. ACM enables large organizations to fail and innovate faster by ensuring that gained knowledge becomes transparent and reusable without needing a bureaucracy. The ability to ADAPT (change future process execution through learning by doing) is very different to Ad-Hoc or Dynamic processes.
A process optimization bureaucracy of any kind (e.g. BPM, Six Sigma or Lean) might ensure cost cutting but it will not support, promote or provide true knowledge-from-failure innovation. Perfect and cheap processes designed by an outside consultant are stale and dead. Yes, code-freeze kills the germs of infectious innovation! Giving the process owner authority to pursue assigned goals any way he wants as long as he achieves outcomes, operational targets and handovers is the kind of social empowerment needed for success. Autonomy is further a key element in employee (and thus customer) satisfaction. Allow for a variety of processes and tasks to fail or succeed until the best ones sustain. For effectiveness you need to allow processes to be improved by the people who perform them. That is additionally the most natural and efficient approach to optimization. Governance should at most define the high-level Business Architecture and ontology to reduce ambiguity but not nail down low-level processes. As I posted recently: “Let’s face it, orthodox process flowcharting won’t survive the social and mobile revolution.” I am going to stick with that prediction.
Outside manufacturing, we deal with a business complexity and speed of change that make it nearly impossible to tell others exactly what to do. It is ridiculous that predictive analytics promote statistical correlation as causal decision points for processes. Knowledge workers improve outcomes through emotional inspiration and not Boolean if/then/else logic. We need to trust their skill. They listen to customers, translate goals into needed activities, and then execute based on their intuition and experience. Yes, many people in large organizations don’t care about outcomes because they are jaded by bureaucracy. And now we punish and mistrust them for our failure as executives to empower them? How will a rigid flowchart allow for innovation and make them less jaded? Yes, your people need understandable and documented objectives, targets and goals, but then authority, autonomy and means will be the only things that take them from careless to caring. ACM is therefore not about cost cutting but about empowering people to deliver value to their customers and making that transparent to management.
Soichiro Honda said: “Success represents the ONE percent of your work that results from the 99 percent that is called failure.”