Thursday 29th of July 2010

Well Defined System Troubleshooting
There have been many processes optimized for well defined systems. As mentioned previously, well defined systems can take for granted that the solved state is "the as-designed state and behavior". Unnecessarily analyzing the solved state is very inefficient.

Universal Troubleshooting Process

I won't spend too much time on this, because by my count the Troubleshooters.Com website devotes upwards of 100,000 words to this subject. I originally evolved the Universal Troubleshooting Process as a way to fix consumer audio equipment. Since then, it's found wide use in diagnosing many types of systems, especially computerized systems. The Universal Troubleshooting Process consists of these 10 steps:
  1. Get the Attitude
  2. Get a complete and accurate symptom description
  3. Make damage control plan
  4. Reproduce the symptom
  5. Do the appropriate general maintenance
  6. Narrow it down to the root cause
  7. Repair or replace the defective component
  8. Test
  9. Take pride in your solution
  10. Prevent future occurrence of this problem
In my opinion this is the most widely applicable Troubleshooting process, and therefore should be the default. It works very well in the absence of vendor support (e.g. smart manuals or Era 4 tools). That makes it a perfect match for the software industry, where such support is rare.

It's the only technology Troubleshooting process I know of that addresses the mental outlook of the Troubleshooter, instead of considering him or her a perfectly rational robot.

Jack Ganssle's Looping 6 Step Process

Embedded systems development guru Jack Ganssle created his looping 6 step process to optimize Troubleshooting of circuits under design. This requires a whole different set of assumptions, most notably that you can't trust the (unfinished) design. Remember that it is a loop: after step 6 you go back to step 1. This is especially true if your fix was a temporary one (what I would term a coathanger fix). In design, you might not want to get bogged down fixing the problem the right way if all you need is to get the subsystem working so you can continue designing another subsystem. Naturally, Jack demands that any "coathanger" fixes be cleaned up before going to market.
  1. Observe the behavior to find the apparent bug.
  2. Observe collateral behavior to gain as much information as possible.
  3. Round up the usual suspects.
  4. Generate a hypothesis.
  5. Generate an experiment to test the hypothesis.
  6. Fix the bug.
So his steps 1 and 2 are like the UTP's step 2, his step 3 correlates roughly to the UTP's step 5, with his 4 and 5 being part and parcel of the UTP's step 6. His step 6 is the UTP's step 7, except that his fix may not be the long term fix, due to the realities of a design situation.
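
As a rough illustration of the looping structure, here is a minimal Python sketch. It is not Jack's code: the function name `ganssle_loop`, its callback parameters, the `max_passes` safety limit and the toy faults are all hypothetical, chosen only to show how step 6 feeds back into step 1.

```python
from typing import Callable, Iterable, Optional


def ganssle_loop(
    observe: Callable[[], Optional[str]],
    hypotheses_for: Callable[[str], Iterable[str]],
    experiment: Callable[[str], bool],
    fix: Callable[[str], None],
    max_passes: int = 10,  # hypothetical safety limit, not part of the process
) -> int:
    """Repeat the six steps until no bug is observed. Returns fixes applied."""
    fixes = 0
    for _ in range(max_passes):
        bug = observe()                          # steps 1-2: bug and collateral behavior
        if bug is None:
            break                                # nothing left to fix, leave the loop
        for hypothesis in hypotheses_for(bug):   # steps 3-4: usual suspects, hypotheses
            if experiment(hypothesis):           # step 5: test the hypothesis
                fix(hypothesis)                  # step 6: fix, then back to step 1
                fixes += 1
                break
    return fixes


# Toy circuit with two independent root causes (hypothetical data).
faults = {"cold solder joint", "wrong pull-up value"}

def observe() -> Optional[str]:
    return "intermittent reset" if faults else None

def hypotheses_for(bug: str) -> Iterable[str]:
    return ["power supply droop", "cold solder joint", "wrong pull-up value"]

def experiment(hypothesis: str) -> bool:
    return hypothesis in faults

def fix(hypothesis: str) -> None:
    faults.discard(hypothesis)

passes = ganssle_loop(observe, hypotheses_for, experiment, fix)
```

In this toy run the first fix does not make the symptom go away, so the loop returns to step 1 and finds the second root cause, which is the situation the "trust nothing" advice anticipates.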

To really understand the beauty of Jack's methodology, you need to visit his Troubleshooting web page (in the URL's section of this magazine). Throughout the explanation of his method runs a message of "be careful, trust nothing". Obviously, an in-process (and therefore incomplete) design is buggy, and may not behave as expected. There could be double and triple root causes, and the system may not be designed as you think it is.

If you're in the middle of creating an electronic design, use Jack's methodology. Although I personally use the UTP in troubleshooting software under design, it's very possible that Jack's would be better optimized for that. Basically, when designing anything, make sure you're familiar with Jack's 6 step loop.

Era 4 Troubleshooting Systems from Intelliworxx

Jim Roach is Vice President, Mentoring Systems at Intelliworxx, the company offering the first (to my knowledge) working Era 4 Troubleshooting tool. In case you're new to Troubleshooting Professional, Era 4 refers to an expert system with a built in valid Troubleshooting Process. Here's how it gets its Era 4 name:
Era 1: Observational Troubleshooting
  Range: From invention of the bow and arrow until the invention of the steam engine (8000 BC to 1700's)
  Description: Observation only. Systems under repair have all components visible, so the problem is obvious. Little diagnosis needed. On the other hand, repair/replacement of component requires precision, one of a kind work.

Era 2: Intuitive Troubleshooting
  Range: From invention of the steam engine until the 1970's
  Description: Observation and non-rigorous diagnostic process. Systems under repair still contain only a few components, though some aren't visible to the naked eye. Diagnosis required, but doesn't need to be rigorous. Replacement parts likely to be available from a vendor, but may be difficult to replace.

Era 3: Process Troubleshooting
  Range: From 1970's until the present
  Description: Observation and rigorous diagnostic process. Systems under repair contain many (>10,000) components, most abstract or invisible to the naked eye. Non-rigorous diagnosis produces circular search and rework. Rigorous diagnosis required. Replacement parts available from a vendor, and due to modularity often easy to install. Software components are often replaced in five minutes with a few keystrokes.

Era 4: Technologically Enhanced Troubleshooting
  Range: From now until the next era
  Description: Observation and rigorous diagnostic process, aided by context-relevant technology-served information (Troubleshooting process aware smart manuals). Systems under repair are now hugely complex, not always completely modular. Observation and rigorous diagnostic process alone takes too long, because no human can have the complete Mental Model, manual and diagnostic information in his or her head. Replacement parts are stock.

I met Jim when he emailed me in 1996, wanting some advice on placing a Troubleshooting Process in an automated diagnostic system. At the time he was a highly placed training executive at GM. If it had been anyone else I would have chalked it up to "another expert system marketed to replace a human Troubleshooter". But Jim knew his stuff, and it was obvious he understood Troubleshooting Process through and through. So we talked frequently throughout 1996 and the first half of 1997.

But I'd heard so much about the finished product without really understanding what it would be like. That changed in 1999, when I tried it at a conference. It was incredibly easy to use. Jim Roach gave several demos in which people used it to find bugs (intentional malfunctions) placed in a Cadillac.

This Troubleshooting Process is highly optimized for situations in which the vendor has provided voluminous service documentation, including quickchecks, error code documentation, predefined diagnostics and the like. The process starts out with symptom acquisition and reproduction. Next comes what would be called General Maintenance in the UTP, including checks of tech bulletins, diagnostic codes, and the like. This is followed possibly by a divide and conquer session, guided to the extent possible by existing predefined diagnostics. The final step is repair and testing.
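
For readers who haven't seen divide and conquer spelled out, here is a minimal Python sketch of the general idea. It does not represent the Intelliworxx tool. It assumes a simple linear signal chain with a single fault, so that every stage downstream of the fault also measures bad. The names `first_bad_stage` and `output_ok`, and the audio chain itself, are hypothetical.

```python
from typing import Callable, List, Sequence


def first_bad_stage(stages: Sequence[str], output_ok: Callable[[int], bool]) -> int:
    """Binary search for the first stage whose output is bad.

    Assumes the input to stage 0 is good, the last stage's output is bad,
    and every stage downstream of the fault also measures bad.
    """
    lo, hi = 0, len(stages) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if output_ok(mid):
            lo = mid + 1   # good here, so the fault lies downstream
        else:
            hi = mid       # bad here, so the fault is at mid or upstream
    return lo


# Hypothetical audio signal chain with a fault planted in the tone control.
chain = ["tuner", "preamp", "tone control", "power amp", "speaker"]
measurements: List[str] = []

def output_ok(index: int) -> bool:
    measurements.append(chain[index])   # record each test point we probe
    return index < chain.index("tone control")

culprit = chain[first_bad_stage(chain, output_ok)]
```

Each measurement rules out about half of the remaining stages, which is why a guided divide and conquer session converges quickly even on long chains.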

So far it sounds like the Universal Troubleshooting Process. But if you look at the details, the Intelliworxx model assumes that most problems will be solved with the help of existing documentation, and that the Troubleshooter won't need to devise his own diagnostic tests. Given the volume of system information in the smart manual, for the first time this becomes a viable assumption. Indeed, combined with a smart manual on a voice actuated, ruggedized hands-free computer, the Troubleshooter has instant, just in time access to exactly the necessary information. At every stage of the game, the first priority is to look at existing documentation. That documentation is a smart manual (or as Intelliworxx would call it, a mentoring application). Only when all documentation has been delivered to the Troubleshooter, without a solution being found, does the Troubleshooter go "offroad", creating and testing his own hypotheses. At that point, the well equipped Troubleshooter would know the Universal Troubleshooting Process, which is optimized for those times when relevant system documentation is not available.
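
The priority order described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `diagnose` function, the `Diagnostic` type and the sample lookups are hypothetical stand-ins, not anything from the Intelliworxx product.

```python
from typing import Callable, Iterable, Optional, Tuple

# A diagnostic takes a symptom and returns a root cause, or None if it finds nothing.
Diagnostic = Callable[[str], Optional[str]]


def diagnose(symptom: str,
             documented: Iterable[Diagnostic],
             offroad: Diagnostic) -> Tuple[str, Optional[str]]:
    """Try every documented diagnostic first; go offroad only if all fail."""
    for diagnostic in documented:
        cause = diagnostic(symptom)
        if cause is not None:
            return ("documented", cause)
    return ("offroad", offroad(symptom))


# Hypothetical stand-ins for smart manual content.
def tech_bulletins(symptom: str) -> Optional[str]:
    return None   # no bulletin matches in this toy example

def error_codes(symptom: str) -> Optional[str]:
    return "faulty oxygen sensor" if symptom == "check engine light" else None

def own_hypotheses(symptom: str) -> Optional[str]:
    return "cause found by the Troubleshooter's own tests"

result = diagnose("check engine light", [tech_bulletins, error_codes], own_hypotheses)
fallback = diagnose("rattle at 40 mph", [tech_bulletins, error_codes], own_hypotheses)
```

The second call falls through every documented lookup, which is the point at which the Troubleshooter would switch to the Universal Troubleshooting Process.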

Authoring such a smart manual is costly, but so is a truly detailed paper manual. This Troubleshooting Process enhanced, voice actuated, hands free smart manual is the first tool integrating effective, low cost information lookup with Troubleshooting Process. For the first time it's quicker to follow predefined diagnostics than to create your own. In industries providing detailed, accurate and timely system information (the automotive industry is a perfect example), it can multiply productivity.

Contrast this with an industry like proprietary computer equipment and software, where documentation is incomplete and scattered. In software diagnosis, if you haven't found the info in 10 minutes you're probably better off diagnosing it yourself. We software guys can only dream how fast Troubleshooting could be if we had instant, as needed access to all accumulated knowledge of the machine or system.

A link to the Intelliworxx website appears in the URL's section of this magazine.

Conclusion

This article has discussed three processes for solving well defined problems. As you can see, all three are very similar, and to a large degree they're interchangeable. But that doesn't mean one size fits all. They're optimized for different situations, so their economics can vary widely. In design situations, you'd use Jack Ganssle's process. In those situations where the equipment or system's vendor is nice enough to provide a Troubleshooting Process based delivery system for complete system information, the Intelliworxx system yields the quickest and most reliable solutions. And in the remainder of cases, the Universal Troubleshooting Process empowers the Troubleshooter to quickly and effectively isolate the root cause with a minimum of help.
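
The selection rule in this conclusion can be written as a small Python function. The function name and parameters are hypothetical, and checking the design case first when both conditions hold is an assumption on my part, since the article doesn't say which takes precedence.

```python
def choose_process(under_design: bool, has_smart_manual: bool) -> str:
    """Pick a Troubleshooting process following the article's conclusion."""
    if under_design:
        # The design can't be trusted, so use the looping process.
        return "Jack Ganssle's looping 6 step process"
    if has_smart_manual:
        # The vendor supplies Troubleshooting Process based documentation.
        return "Intelliworxx Era 4 system"
    # Default when little or no vendor help is available.
    return "Universal Troubleshooting Process"


default_choice = choose_process(under_design=False, has_smart_manual=False)
```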
 
