What is Failure Prediction?

Dec 21, 2023 | Preventive Maintenance, Resource Library, TRM Blog

John Q. Todd

Sr. Business Consultant/Product Researcher

Total Resource Management (TRM), Inc.

No one likes it when equipment fails, except maybe those who are selling replacements. Failures occur at the most inopportune moments: in the middle of an important production run, in the middle of the night when the crew is at a minimum, on a Friday afternoon, or during a tour for new investors on their first plant visit. (Hydraulic oil is difficult to get out of cardigan wool.)

Let’s start by defining what we mean by these two words: failure prediction.

Failure

What is a failure? It seems a simple question, but it isn’t really. We have images of disasters and gooey stuff all over the production floor, but a “failure” can be far more subtle than those. Let’s borrow some understanding from a very well-known analysis approach called an FMEA: Failure Modes and Effects Analysis.

To start an FMEA, you capture all the “ways” a piece of equipment could potentially fail. This is not a record of how the equipment running on your production floor has actually failed, but a list of possibilities. Some you and your team will dream up yourselves; others you can pull from a reference library of what others have experienced.

Tanks and pipes leak or burst. Motors stop turning. Bearings grind themselves into powder. Gears chip teeth. Belts break. Diesel engines run away. You get the idea. At this point, we don’t care what caused the failure, nor what the effect of the failure would be. No, we just want to capture the list of potential (remember that word) failures the equipment could experience.

Another part of defining equipment failures is their relative impact: how much the operation cares about each one occurring. A weeping leak might be a very low priority concern, or it could be grounds for an emergency shutdown. Not all failures demand immediate action. Refining your list of failures down to only those of utmost concern or impact on the operation is an important step. However, given the work you have put into developing your list, don’t throw even the most benign or improbable failure away. Keep those in your back pocket. If you are watching them to a degree, you may be less surprised when they do manifest!
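As a rough illustration of that refining step, here is a minimal Python sketch. The equipment names, the 1-to-10 scales, and the severity-times-likelihood score are all assumptions made for the example, not part of any particular FMEA standard or tool. Note that nothing is discarded: low-priority failures simply land on a watch list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    equipment: str
    description: str
    severity: int    # hypothetical scale: 1 (negligible) to 10 (emergency shutdown)
    likelihood: int  # hypothetical scale: 1 (improbable) to 10 (expected)

    @property
    def priority(self) -> int:
        # Assumed scoring rule for this sketch: impact multiplied by likelihood.
        return self.severity * self.likelihood


def split_by_priority(modes, threshold):
    """Rank all failure modes, then split them into a focus list and a watch list."""
    ranked = sorted(modes, key=lambda m: m.priority, reverse=True)
    focus = [m for m in ranked if m.priority >= threshold]
    watch = [m for m in ranked if m.priority < threshold]
    return focus, watch


# Illustrative entries only; the scores are made up.
MODES = [
    FailureMode("Tank T-1", "weeping leak", severity=2, likelihood=6),
    FailureMode("Motor M-4", "stops turning", severity=8, likelihood=4),
    FailureMode("Conveyor C-2", "belt breaks", severity=6, likelihood=5),
    FailureMode("Diesel D-1", "engine runs away", severity=10, likelihood=1),
]

focus, watch = split_by_priority(MODES, threshold=20)
for mode in focus:
    print(f"FOCUS {mode.equipment}: {mode.description} (priority {mode.priority})")
for mode in watch:
    print(f"WATCH {mode.equipment}: {mode.description} (priority {mode.priority})")
```

The threshold is a judgment call for your team; the point is only that the improbable runaway diesel stays on the list rather than being thrown away.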

Another method to get your failures organized is to group them by common cause. This grouping may assist in selecting sensors or data sets that will expose the failure. A single piece of equipment can have multiple sensors in place, each sending its own data stream. But if there is commonality of cause and failure across equipment sets, this can provide further structure to your approach.
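Grouping by cause can be sketched in a few lines of Python. The equipment, failures, and causes below are invented for illustration; in practice they would come from your own FMEA list.

```python
from collections import defaultdict

# Hypothetical (equipment, failure, cause) records for illustration only.
FAILURE_CAUSES = [
    ("Motor M-4", "bearing seizure", "lubrication loss"),
    ("Fan F-2", "bearing seizure", "lubrication loss"),
    ("Gearbox G-1", "chipped gear teeth", "misalignment"),
    ("Conveyor C-2", "belt breaks", "misalignment"),
    ("Tank T-1", "weeping leak", "corrosion"),
]


def group_by_cause(records):
    """Collect (equipment, failure) pairs under the cause they share."""
    groups = defaultdict(list)
    for equipment, failure, cause in records:
        groups[cause].append((equipment, failure))
    return dict(groups)


by_cause = group_by_cause(FAILURE_CAUSES)
for cause, items in by_cause.items():
    equipment = ", ".join(name for name, _ in items)
    print(f"{cause}: {equipment}")
```

A cause that shows up across several pieces of equipment is a hint that one kind of sensor or data set might cover all of them.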

Prediction

Now comes the difficult part… how do you go about predicting any of the failures with a degree of confidence? You might hear a bearing screaming for attention up on the roof, but the failure has already occurred. What information and tools would you need to be able to predict that the one bearing that is behaving itself has a 90% probability of failing in the next 20-30 days?

Prediction requires three things:

  • An understanding of what you are trying to predict – equipment failures in this case
  • Data sets/sources that contain indications, patterns, trends, that point to impending failures or show results of failures
  • Analysis tools and techniques that can draw out of the data what an impending failure looks like and then, as they watch real-time data, alert you when a brewing problem is detected
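To make the third item concrete, here is a deliberately simple Python sketch. It is a plain statistical threshold, not a machine learning model: it learns what “normal” looks like from readings taken while the equipment was healthy, then watches new readings and raises an alert when several in a row sit well above that baseline. The vibration values, the three-sigma limit, and the three-in-a-row rule are all assumptions for the example.

```python
from statistics import mean, stdev


def learn_baseline(healthy_readings):
    """Summarize healthy behavior as a mean and a sample standard deviation."""
    return mean(healthy_readings), stdev(healthy_readings)


def first_alert(stream, baseline, sigmas=3.0, consecutive=3):
    """Return the index where `consecutive` readings in a row exceed the limit.

    Returns None if the stream never trips the alert.
    """
    mu, sd = baseline
    limit = mu + sigmas * sd
    run = 0
    for index, value in enumerate(stream):
        run = run + 1 if value > limit else 0
        if run == consecutive:
            return index
    return None


# Made-up vibration readings (for example, mm/s) for illustration only.
HEALTHY = [2.0, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0, 2.1]
LIVE = [2.0, 2.1, 2.5, 2.0, 2.6, 2.7, 2.9, 3.1]

baseline = learn_baseline(HEALTHY)
alert_index = first_alert(LIVE, baseline)
print(f"baseline mean={baseline[0]:.2f}, alert at reading index {alert_index}")
```

Requiring several high readings in a row keeps a single spike from triggering a callout. A real prediction engine would look at far richer patterns than one threshold, but the shape is the same: learn from history, then watch the live data.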

The really challenging part of this is the data. Do you have actual telemetry from devices that shows temperature, vibration, rotations, etc. every fraction of a second, or is it simply a list of “failures” that your teams have captured in the CMMS over time? Is your data qualitative or quantitative? Is it consistent or broken up into irregular chunks of suspect information? The data being used to train models needs to be sufficient and repeatable for the models to provide results with an acceptable degree of confidence.
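One quick way to answer the “consistent or irregular chunks” question is to check the spacing of the timestamps before doing anything else with the data. The Python sketch below assumes readings are expected at a fixed interval; the sample times, the 10-second interval, and the tolerance are invented for the example.

```python
def find_gaps(timestamps, expected_interval, tolerance=0.5):
    """Return (earlier, later) pairs whose spacing exceeds the expected interval.

    `tolerance` is the fraction of slack allowed before a spacing counts as a gap.
    """
    limit = expected_interval * (1 + tolerance)
    return [
        (earlier, later)
        for earlier, later in zip(timestamps, timestamps[1:])
        if later - earlier > limit
    ]


def coverage(timestamps, expected_interval):
    """Fraction of the expected samples present between first and last reading."""
    if len(timestamps) < 2:
        return 0.0
    expected = (timestamps[-1] - timestamps[0]) / expected_interval + 1
    return len(timestamps) / expected


# Hypothetical sample times in seconds; a reading is expected every 10 seconds.
SAMPLE_TIMES = [0, 10, 20, 30, 70, 80, 90, 150, 160]

gaps = find_gaps(SAMPLE_TIMES, expected_interval=10)
share = coverage(SAMPLE_TIMES, expected_interval=10)
print(f"gaps: {gaps}")
print(f"coverage: {share:.0%}")
```

If barely half of the expected readings are present, as in this made-up case, that is worth knowing before any model is trained on the data.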

Yes, a highly experienced equipment operator can listen to a piece of machinery and tell if all is well or not. Those same people can look at spreadsheets full of data and see patterns that point to impending failure. But that is not their job. You want them out making the equipment turn a profit for the company, not pawing through reams of after-the-fact data.

Along comes machine learning

You knew we were headed in this direction. We are now awash with data that can have important insights buried deep within. Operational data is increasingly viewed as a valuable resource for businesses, almost to the level of being considered intellectual property. Rivers of data are flowing past us each day, in some cases never to be seen again. If we could process or filter that resource, imagine what we would be able to better understand, and the better decisions we could then make.

But we are talking about more data than we could ever think to process manually. Forget printed reports, KPIs, and shared spreadsheets. No, you need a team of tireless agents looking after all this data 24/7, 365 and letting you know when something pops up that needs your attention.

To point even further to the need for help, all this data is most likely coming from many different sources. Each source has its own format that suits its purposes but can differ widely from the others. Manufacturers love to format their telemetry in ways that only their software understands.

Data science is not a new discipline, but it has become a much-desired skill set given the growing volumes of data we have to make sense of. Modern analysis tools have functionality and guidance built in so that a person with limited data science skills can establish foundations for analysis and have confidence in the results.

Back together: Failure prediction

As with any endeavor that is expected to produce high-value results, there is a fair amount of preparation and setup required. One cannot just open the data floodgates and expect the system to tell you what to do. Rather, there are several detailed tasks, some requiring specialized skills, to perform before one can see results that are reliable and actionable.

One word of caution is to not blindly accept what prediction engines produce. Given the data they are provided, it may take many business and learning cycles for them to produce results with acceptable confidence levels. This is especially true if your equipment has never experienced a failure of the kind predicted. Do you discount their results? No. The system may be seeing a pattern that you are not.

Back to our “…failing in 20-30 days” comment. If the confidence level of the prediction is around 20%, would you consider doing anything about it? Would you start planning to replace the bearing? Most likely not. Prediction engines will need to prove their worth by producing results that, in some cases, may surprise you. But in the end the results must make sense.
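The difference between a 20% and a 90% prediction can be written down as a simple decision rule. The Python sketch below is only one way to frame it: the two cut-off values and the wording of the actions are assumptions, and every operation would set its own.

```python
def recommend_action(probability, days_to_failure, plan_at=0.8, watch_at=0.5):
    """Map a predicted failure probability to a suggested response.

    `plan_at` and `watch_at` are hypothetical cut-offs chosen for this sketch.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability >= plan_at:
        return f"plan replacement within {days_to_failure} days"
    if probability >= watch_at:
        return "increase inspection frequency"
    # Low confidence is not discarded; the prediction stays on the radar.
    return "keep monitoring"


print(recommend_action(0.90, days_to_failure=20))
print(recommend_action(0.20, days_to_failure=20))
```

Even the low-confidence branch does not throw the prediction away, which matches the caution above: the system may be seeing a pattern that you are not.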

There is no doubt that if you could plan for a failure even a few days in advance, then there is value to the prediction. This additional knowledge will weave its way into your planning and scheduling processes and will certainly reduce the problems that forgotten equipment and missed preventive maintenance tasks can cause. Given that remote and unattended operations are here to stay, being able to plan 20-30 days in advance can have a huge financial impact on your operations.

Wrap up

Prediction of any kind and with any degree of confidence is a rather new tool. We have the data that is hiding gems of insight and now we have the tools to help us dig them out. The data that is coming at us like water out of a firehose is nearly free, so why not see how a better understanding of what it all means could benefit our operation?

TRM and IDCON have been working with clients across industries for many years to capture and then better understand their maintenance and operational data. Let us help you implement modern solution sets to take advantage of this growing and valuable resource… your data.
