“USING FAULT TREES TO DETERMINE THE ROOT CAUSE OF ROTATING EQUIPMENT FAILURES” (Robert X. Perez)
In the paper “Using fault trees to determine the root cause of rotating equipment failures”, the author, Robert X. Perez, draws on his experience as a senior reliability engineer at Citgo Petroleum Corporation, and on examples of events in different systems, to explain the importance of finding the root cause(s) of failures in rotating equipment. He explains that major failures causing significant undesired events are the result of a series of events, and compares this to a line of dominoes being tipped over by a single domino. Studying the chain of events, or probable scenarios, is at the heart of an RCFA (Perez). The author focuses on fault tree analysis (FTA) as a tool for determining such root causes, and gives examples of cases where these tools could have prevented, or did prevent, extensive damage in which the failure of one component caused failures in the surrounding components.

RCFA – Root Cause Failure Analysis
“Root Cause Failure analysis is a structured process to identify any physical, human and latent causes of undesirable event(s)”, and such a process can be used to achieve continuous plant reliability improvements by targeting mechanical and organizational deficiencies in a process facility (Perez). It is also described as a tool for “identification and correction of the underlying problem” (Tronskar). “Simply stated, RCFA is a tool designed to help identify not only what and how an event occurred, but also why it happened. Only when investigators are able to determine why an event or failure occurred will they be able to specify workable corrective measures that prevent future events of the type observed.” (Rooney and Vanden Heuvel). Perez explains how the fault tree method is a vital tool for finding the root cause of failures in rotating equipment.

FTA – Fault Tree Analysis
Fault tree analysis is explained as “a graphical representation of the top event known as final events, and all the possible events believed to have caused the top event.” (Perez). A fault tree is built from a top event that is linked through logical AND and OR gates to several bottom events. “The fault events and basic events in the fault tree can be divided into failures and faults. A component failure is a malfunction that requires the component to be repaired or replaced before it can successfully function again. A fault is a malfunction that is reversible.” The structure provides a visible and reliable basis to find “latent causes of failures, and eliminate false assumptions” (Perez).
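The gate logic described above can be illustrated with a minimal sketch. The class names, the function `occurs`, and the events in the example tree are all hypothetical and chosen only for illustration; they are not taken from the paper. The sketch shows how an AND gate requires all of its inputs to occur, while an OR gate requires at least one.

```python
from dataclasses import dataclass, field
from typing import List, Set, Union


@dataclass
class BasicEvent:
    """A bottom event of the tree (hypothetical names are used below)."""
    name: str


@dataclass
class Gate:
    """A top or intermediate event whose inputs are combined by AND or OR."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]] = field(default_factory=list)


Node = Union[Gate, BasicEvent]


def occurs(node: Node, observed: Set[str]) -> bool:
    """Return True if the event at `node` occurs, given the observed bottom events."""
    if isinstance(node, BasicEvent):
        return node.name in observed
    results = [occurs(child, observed) for child in node.inputs]
    if node.kind == "AND":
        return all(results)  # every input must occur
    if node.kind == "OR":
        return any(results)  # one input is enough
    raise ValueError(f"unknown gate kind: {node.kind!r}")


# Hypothetical tree, for illustration only (not from the paper).
component_breaks = Gate("component breaks", "OR", [
    BasicEvent("improper part design"),
    BasicEvent("material defect"),
])
top_event = Gate("major equipment damage", "AND", [
    component_breaks,
    BasicEvent("machine keeps running after the break"),
])

# The component breaks, but the machine stops: the top event does not occur.
print(occurs(top_event, {"material defect"}))  # False
# The component breaks and the machine keeps running: the top event occurs.
print(occurs(top_event, {"material defect",
                         "machine keeps running after the break"}))  # True
```

Walking the tree from the top event down to the bottom events in this way mirrors how an investigation team would use the diagram to check which combinations of events are sufficient to explain the observed failure.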

For an RCFA investigation to succeed, it is vital to have the support of management and to set up the right team with interdisciplinary members, so that all important factors are addressed. The fault tree is an important tool here because it gives a structural map of a system's behavior, whereas personal observations and procedures may overlook the events that are the actual cause of a failure. An FTA also serves as a safety argument that supports the engineers' conclusions and recommendations (Perez).

An example in the paper illustrates how costly a repair can be when the root cause is not found. In this case a large reciprocating compressor is being repaired, and the maintenance supervisor discovers a faulty rod. He does not contact the original equipment manufacturer (OEM), but has the rod made by a local shop. The new rod is fabricated with cut threads, although all compressor rods for this frame size had been converted to rolled threads. The improper fabrication causes the rod to fail from fatigue in the threads, which causes extensive damage inside the compressor. As a consequence, the repair takes three weeks and costs $100,000. Several events led to this failure: there was no spare in stock because the compressor was new, the maintenance supervisor decided to have the rod fabricated without drawings, the requirements were not investigated, and the compressor kept running for a long time because it was not equipped with a vibration shutdown. As this example shows, there were several chances to prevent the failure. The “physical root” in this example was the improper thread design, because it led to the secondary damage, but several other events could have prevented the failure had they been handled differently. These events are called “latent roots” (Perez). “So, an RCFA is a detailed analysis of a complex, multievent failure, such as the example above, in which the sequence of events is hoped to be found, along with the initiating event. The initiating event is called the root cause, and factors that contributed to the severity of the failure or perpetuated the events leading to the failure are called contributing events” (Perez).

How
As mentioned, an interdisciplinary team is important when conducting an RCFA of rotating equipment. As pointed out in the paper, rotating equipment engineers may focus on repairing problems with mechanical roots.
They may pay less attention to root causes that are nonmechanical, such as “corrosion, poor piping design, off-design performance, and cavitation” (Perez). Cooperation between the different disciplines in an organization therefore makes it much more likely that latent root causes of failure are found. The author lists six steps for executing an RCFA investigation:
- “Organize an investigative team
- Schedule meetings and assign tasks
- Cull information/develop a fault tree
- Advise management of initial findings
- Issue a report and conduct a review meeting
- Assign responsibility and track the completion of report recommendations” (Perez)

When
To know when an RCFA is necessary, the organization must first map out its acceptable risk limits. This can be done with a MAR line on a graph, as shown in the paper, and/or by establishing limits for when an event counts as significant (Perez). The author's organization performs an RCFA whenever: “An employee or contractor is injured due to equipment failure. An equipment failure causes a fire or major release of product. An equipment failure leads to a greater than 24 hour unit outage.” (Perez)
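The three trigger criteria quoted above can be written down as a simple check. This is only a sketch: the function name `rcfa_required` and its parameter names are assumptions made for illustration, and only the three conditions themselves come from the paper.

```python
def rcfa_required(injury: bool,
                  fire_or_major_release: bool,
                  outage_hours: float) -> bool:
    """Return True if any of the three quoted trigger criteria is met.

    The function and parameter names are hypothetical; the criteria are
    the ones quoted from the paper.
    """
    # "greater than 24 hour unit outage": exactly 24 hours does not trigger.
    return injury or fire_or_major_release or outage_hours > 24


# An equipment failure with no injury and no fire, but a 30 hour unit outage:
print(rcfa_required(injury=False, fire_or_major_release=False, outage_hours=30))  # True
```

Any single criterion is enough to trigger an investigation, which is why the conditions are combined with a logical OR.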

The author explains both the obvious and the less visible benefits of an RCFA. The obvious benefit is being able to prevent costly maintenance and possible failures, but an RCFA also benefits the organization as a whole. The benefits mentioned are:
- “Improved communication between departments
- More trust between departments
- Improved understanding of how departments really work
- Improved understanding of procedures
- Reduction in the fear of making mistakes” (Perez)
These benefits can have a big impact on how well and how efficiently an organization functions (Perez). The author states that for an RCFA to be successful, it is important to have full support from management for disciplined investigations, and support from personnel in order to reach the goals of long-term improvement (Perez).

As shown by the examples in the paper, a failure in one component can have a ripple effect and cause further or major failures in a system. The risk and cost involved when a failure occurs force companies to keep as precise an overview as possible of any possible event and its cause, in order to ensure optimal reliability of the equipment.

Condition monitoring of a system or component provides information about its state and function. The better the information, the more accurate the analysis can be. The results of an FTA and an RCFA lead to conclusions that can be used to optimize the function of the equipment or system, and consequently help develop a maintenance program focused on predictive maintenance, thereby reducing the need for corrective maintenance.
The better the knowledge of the equipment, the more can be done to prevent failures and to plan maintenance activities. “An RCFA analysis will identify the true root causes underlying a problem and ensure that results of the study include realistic corrective action, whether they are caused by technical, organizational, or human causes of failure” (SKF). The combination of condition monitoring and RCFA gives a company the opportunity to reduce downtime and cost and to run a proactive, predictive and correctly scheduled maintenance program. In this way the company can become more reliable, increase production, reduce maintenance cost, increase safety and increase profits (DNV).


