Failure Modes, Effects, and Criticality Analysis (FMECA) has been a secret weapon in the arsenal of the Reliability Engineer for many years. It began in the early days as just Failure Modes and Effects Analysis (FMEA); the addition of Criticality (aka, “Who cares?”) has made it an important tool for maintaining our assets.
If you really want to geek out on FMECAs, find yourself a copy of MIL-STD-1629A. No, your eyes are not deceiving you… it was originally published back in 1974, and then again in 1980. It has since been canceled. Between that and disco, it was a very exciting time to be a Reliability Engineer!
It is also known under SAE1025 (a work-in-progress standard!) or AIR4845. The usefulness of FMECA continues to be recognized across industries, as evidenced by current efforts by organizations like the SAE to refine the approach.
Grab yourself a FMECA template or spreadsheet from the internet and let’s get started!
How to begin?
A FMECA begins with the identification of the equipment involved. At this point you do not care about the interactions or relationships between equipment; you are just listing out and uniquely identifying the pieces of your system. Picture an electrically powered pump “system.” It might have two sources of electrical power, a motor, the pump, and an output valve. (As a side note, having a Reliability Block Diagram (RBD) of your system is a big help when you start a FMECA.)
Don’t ignore the little pieces of your system. They can often be the most critical. We all know of instances where millions of dollars of equipment have been damaged by $20 worth of parts. You could consider long runs of pipe as a single asset, but how do you consider all the joints and couplings? They fail more often than the pipes themselves, so they are an important part of the FMECA. You do need to be a little careful about how low you go. Too much detail at this point will make the final FMECA a bit unwieldy.
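As a minimal sketch, the equipment list for the pump example could be captured as shown here. The `Asset` class, the asset IDs, and the field names are all hypothetical and chosen only for illustration; a couple of spreadsheet columns would do the same job.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    asset_id: str       # hypothetical unique identifier
    description: str


# Equipment from the pump "system" example; IDs are made up for illustration.
assets = [
    Asset("PWR-A", "Electrical power source A"),
    Asset("PWR-B", "Electrical power source B"),
    Asset("MTR-01", "Electric motor"),
    Asset("CPL-01", "Motor-to-pump coupling"),
    Asset("PMP-01", "Pump"),
    Asset("VLV-01", "Output valve"),
]


def find_duplicate_ids(items):
    """Return the asset IDs that appear more than once, sorted."""
    seen, dupes = set(), set()
    for item in items:
        if item.asset_id in seen:
            dupes.add(item.asset_id)
        seen.add(item.asset_id)
    return sorted(dupes)


# Every piece of the system must be uniquely identifiable.
print(find_duplicate_ids(assets))  # an empty list means all IDs are unique
```

Note that the coupling gets its own entry: it is one of those little pieces that is easy to overlook.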
What is a “failure mode?”
Following the definition in MIL-STD-1629A, a failure mode is “the manner by which a failure is observed.” Here are a few examples:
- Pump is stopped – no flow
- Pump output flow is less than required
- Pump output flow is outside the required +/- 10% tolerance
- Commercial power is dark
- Motor is not turning
You might be saying to yourself, “Gee, this list could get really long.” Or “I can foresee the arguments about what is a failure mode and what isn’t.” You would be correct in making both observations.
No worries: keep the list high level and keep the team in the conference room moving forward. Listing failure modes is the easy part!
How about issues like noises? Is a whiny bearing a failure? One could argue that it is not. It might be an indication of an impending failure of the bearing (or of the lubrication system), but a failure has not manifested itself yet. Isn’t this fun?
There is a subtle point to be made here: Be careful what you define as a failure mode vs. an effect of an upstream failure. For example: the pump being stopped. That could be caused by the electrical motor not having power and nothing to do with the pump. The pump has not failed. But the pump could be stopped because the coupling between the motor and the pump has failed. Still, the pump has not failed. When you are working at the individual equipment level, keep the focus on how that piece of equipment could fail and what its basic function in the system is. In the case of the pump, a cracked housing would be a valid failure mode.
Further, a slowdown in output from the equipment could be a failure mode as well. While you cannot yet state why the flow has decreased, as far as the system is concerned, it cannot stay productive at the lower flow rate. If the system is reliant upon a certain level of flow, then anything less than that is a failure mode to address in the FMECA.
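The distinction between a failure mode of the equipment itself and an effect of an upstream failure can be sketched as follows. This is only one possible way to record it: the `Observation` class, its field names, and the two helper functions are hypothetical, and the entries come straight from the pump example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Observation:
    observed_at: str   # equipment where the symptom shows up
    symptom: str
    origin: str        # equipment whose own failure produces the symptom


# Entries drawn from the pump example; the field names are hypothetical.
observations = [
    Observation("Pump", "Pump is stopped", origin="Motor power"),
    Observation("Pump", "Pump is stopped", origin="Coupling"),
    Observation("Pump", "Cracked housing", origin="Pump"),
]


def own_failure_modes(equipment, entries):
    """Keep only symptoms caused by the equipment's own failure."""
    return [e.symptom for e in entries
            if e.observed_at == equipment and e.origin == equipment]


def upstream_effects(equipment, entries):
    """Symptoms seen at the equipment but caused elsewhere, grouped by origin."""
    grouped = {}
    for e in entries:
        if e.observed_at == equipment and e.origin != equipment:
            grouped.setdefault(e.origin, []).append(e.symptom)
    return grouped


print(own_failure_modes("Pump", observations))
print(upstream_effects("Pump", observations))
```

Anything that lands in the second group belongs in the FMECA under the equipment where it originates, not under the pump.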
Effects of failure
Oh, the fun continues! Now we get to list the effect(s) of each failure mode. In essence, what happens if the equipment fails in the ways listed? The effect (and eventual criticality) of the failure could be negligible. The pump leaks due to a failing seal. Might be a big deal, might not.
List out the effects of each failure. Don’t worry about how big a deal the effect is. Also, be sure at this point to list only the “first” effect, not the ultimate effect. The “first” effect of the failing seal is a leak. Follow-on effects could be a HazMat condition and death, but for right now, focus on the very first effect.
Criticality – Who cares?
Of course, all failures are of concern. But the effects of each can be assigned a criticality depending on the situation. A leaking pump seal might not be all that critical if the liquid is under low pressure. Make that into high-pressure steam and you have a completely different situation.
You may find that assigning a single value for “criticality” is difficult. An effect that is low criticality from a production viewpoint may be very high from a safety aspect. While the pump is leaking, production is unaffected, yet the chemical being leaked is hazardous to the operators. The role of that pump in the system is two-fold: to pump liquid for production and to protect the workers from that liquid. There is room in a FMECA for multiple “areas” of criticality to be listed.
Avoid the temptation of having Criticality scales with too many degrees. A scale of 1-10 simply has too many choices. Can you really tell the difference between a 7 and an 8? Rather, use a 1-5 or a 1-3 scale for best results. Here is a simple approach:
- 1 = Highest criticality – all stop, safety issue
- 2 = Medium criticality – of concern
- 3 = Noncritical – operations can continue
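Here is a minimal sketch of the three-level scale with one rating per “area,” using the leaking pump as the example. The area names and the roll-up rule (the most severe area wins) are assumptions made for illustration; your team may prefer to keep the areas separate and never combine them.

```python
from enum import IntEnum


class Criticality(IntEnum):
    HIGHEST = 1      # all stop, safety issue
    MEDIUM = 2       # of concern
    NONCRITICAL = 3  # operations can continue


# One rating per "area"; the area names here are hypothetical.
leaking_seal = {
    "production": Criticality.NONCRITICAL,  # production is unaffected
    "safety": Criticality.HIGHEST,          # leaked chemical is hazardous
}


def overall(ratings):
    """Assumed roll-up rule: the most severe area wins (1 is most severe)."""
    return min(ratings.values())


print(overall(leaking_seal).name)
```

Because 1 is the highest criticality on this scale, the roll-up takes the minimum, not the maximum.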
Stack ’em up!
Given that you and your team have survived the process, let’s see what the analysis tells you:
- Which pieces of equipment have the most potential critical failures?
- Which failures (regardless of the equipment) are deemed the most critical?
- Which failure/effect combinations surprised you and your team?
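The first two questions lend themselves to a quick tally. The sketch below assumes a 1-3 scale where 1 is the highest criticality; the rows and their ratings are made-up placeholders, not results from a real analysis.

```python
from collections import Counter

# Hypothetical FMECA rows: (equipment, failure mode, criticality 1-3).
rows = [
    ("Pump", "Seal leak", 1),
    ("Pump", "Cracked housing", 1),
    ("Pump", "Reduced output flow", 2),
    ("Motor", "Not turning", 1),
    ("Output valve", "Stuck partly closed", 2),
    ("Coupling", "Sheared", 2),
]

# Which failures are the most critical, regardless of equipment?
by_criticality = sorted(rows, key=lambda row: row[2])

# Which equipment has the most highest-criticality failures?
critical_counts = Counter(equip for equip, _, crit in rows if crit == 1)

for equipment, count in critical_counts.most_common():
    print(equipment, count)
```

The third question, about surprises, is one the spreadsheet cannot answer for you.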
Armed with your new FMECA, you can take a hard look at your maintenance processes and methods to see how they are preventing (or causing!) the potential failure modes. While some failure modes may have very low probability, they could have dramatic effects, so they are worth considering in your maintenance approach. Given the criticality of a release, that failing seal on the pump might be an all-hands-on-deck situation. Perhaps it is just a replacement task for the next shutdown period.
Failure Classes in Maximo/Manage
We hope you are aware of the Failure Class functionality that is inherent to Maximo (and now the new Manage). After you finish your FMECA, you can take its information and populate this feature. A Failure Class can be specific to a type of equipment such as a pump, motor, etc. It could also be generic like “mechanical.” Either way, you can then populate it as follows:
- Problem = Failure mode observed
- Cause = Initial estimation of why the equipment failed
- Remedy = Pick from the list of things maintenance could do to at least temporarily “solve” the problem
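To see how FMECA output lines up with the Problem, Cause, and Remedy levels, here is a rough sketch using a plain Python dictionary. It is not a Maximo import format or API; the class name, the structure, and the remedies listed are hypothetical examples built from the pump scenario.

```python
# Plain Python structure for illustration only; this is not a Maximo
# import format or API. All entries below are hypothetical examples.
failure_class = {
    "name": "PUMPS",
    "problems": {
        "Pump is stopped - no flow": {
            "Motor has no power": ["Restore power supply"],
            "Motor-to-pump coupling failed": ["Replace coupling"],
        },
        "Pump is leaking": {
            "Seal is failing": ["Replace seal", "Schedule for next shutdown"],
        },
    },
}


def remedies_for(fc, problem, cause):
    """Look up remedies for a problem/cause pair; empty list if unknown."""
    return fc["problems"].get(problem, {}).get(cause, [])


print(remedies_for(failure_class, "Pump is leaking", "Seal is failing"))
```

Laying the FMECA out this way before you touch the system makes it easier to spot Problems with no Causes, or Causes with no Remedies.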
Allow us to mention that our pre-configured Advanced Asset Management (AAM) Maximo solution includes a universal set of failure classes, complete with Problems, Causes, and Remedies. This pre-built library gives you a starting point that you can refine and add to as you wish.
Now, as your maintenance teams perform their corrective work activities, your FMECA has provided a foundation for them to make organized failure reports. These real reports become highly valuable when you review your FMECA annually and perform root cause analyses. Support your teams with good and efficient information and they will return the favor with valuable real-world feedback.
As with any method that has been around a while, there are opinions as to its usefulness and you can overdo it. Too much detail in a FMECA will bloat the analysis, making it hard to maintain over time, greatly reducing its usefulness. Stay at a high level initially, then add any real failures and their real effects in the event you experience them. Review the FMECA on an annual basis to see if there is anything new to add, or even remove if technology has changed the landscape of the equipment. A broken gear in the VCR mechanism is not a problem if you are now streaming your entertainment.
If you are ready for the next step and want training on FMECA, go to: https://www.idcon.com/training/.
Article by John Q. Todd, Sr. Business Consultant at TRM. Reach out to us at AskTRM@trmnet.com if you have any questions or would like to discuss deploying MAS 8 or Maximo AAM for condition-based maintenance / monitoring.