This article presents detailed information on the failure analysis of electrical products. It answers questions such as “What is failure analysis?” and “What are the methods for failure analysis of electrical products?”
I. What is Failure Analysis?
Failure analysis is a process where an investigation takes place to identify the causes of a product failure and determine the corrective actions to fix the issue. Executing a failure analysis correctly can help reduce product design costs, prevent product failures, and save lives and resources. Manufacturers in all types of industries can, and should, perform failure analysis when developing a new product.
II. Benefits of Failure Analysis
Performing failure analysis of electrical products brings legal, financial and product-safety benefits:
- Prevent product failure: Determine corrective actions, which, if taken, can prevent the reoccurrence of the identified failure.
- Ensure product compliance: Perform failure analysis to meet safety standards for manufacturing processes, failed components or products. (Use digital regulatory compliance tools to identify applicable standards in minutes and ensure product safety and compliance.)
- Prevent financial loss: Product failure can be considerably costly, especially if it results in expensive outages and production stops.
- Improve future products: The information collected during the failure analysis process can help improve the manufacturing processes and product designs.
- Avoid market penalties: Designing and producing non-compliant products can cause serious market issues. Market authorities have the power to ban a non-compliant product from a market. In addition, the manufacturer may receive a monetary fine and may even face imprisonment.
III. Methods for Failure Analysis of Electrical Products
The following sections present six methods for failure analysis of electrical products: FMEA, FMECA, FTA, ETA, HAZOP and AEA.
#1. Failure Modes and Effects Analysis (FMEA)
FMEA is a qualitative method for failure analysis of electrical products during their development phase. It is a system analysis procedure that determines the potential failure modes, their causes and their effects on system performance. Manufacturers can use the gathered information to create a risk profile and change the product design accordingly.
Manufacturers can use FMEA to identify safety-related product defects, such as:
- A manufacturing defect introduced during product assembly,
- A latent failure in a discrete component that causes the product to malfunction,
- Latent defects due to intermittent faults that still need repairing.
The initiation of the FMEA process should happen in conjunction with the development of the product concept. Then, the FMEA process should be updated during the product development cycle and included as a checkpoint in each phase of the new product development process (NPDP). FMEA can be updated when any of the following changes occur:
- Changes in the product’s subsystems and/or components
- Changes in the integrated system and its functions
- Manufacturing process changes
- Market changes
- Regulatory changes or updates (stay up to date with the changes in the EU regulations and standards using a cloud-based product compliance software that provides you with automatic alerts when a relevant directive or standard is updated)
- Changes due to customer feedback.
Types of failure modes
A failure mode is the way in which a failure occurs and impacts the product’s performance. There are two types of failure modes in electrical products: predictable and unpredictable. Predictable failure modes are already known to be associated with a product or technology through field research, so manufacturers can mitigate them with design testing. Unpredictable failure modes, however, are usually unknown, which makes their validation through design testing considerably tricky.
Implementation & Execution of FMEA
Before the implementation of this method for failure analysis of electrical products, manufacturers must:
- Set ground rules for the analysis
- Have a clear description of the purpose of the analysis
- Define methods for supporting the FMEA
- Set rules for documentation control
- Determine the experts responsible for the technical part of the analysis
- Define milestones
- Determine the ways to mitigate significant risks.
Additionally, the FMEA ranking scales – severity, occurrence and probability of detection – should be determined and have the same increments. For instance, if the severity has a ranking of 1 to 5, then both occurrence and detection should have the same scale. The analysis’ presentation can follow a table format like the one in the image below.
When the implementation process is completed, the FMEA execution can begin in parallel with the product development cycle. Generally, the following execution steps should be followed:
- Product definition and characterisation, considering the type of product, design, manufacturing, and technology risks, among many other factors.
- Creation of a block diagram to define system relations, which displays major components or process steps as blocks and shows their interconnection with lines.
- Creation of a database, including information such as subsystems, components, design lead and revision date.
- Ensuring that all supplied product parts are compliant with the relevant regulations and standards. Get access to the supplier compliance portal of Clever Compliance to keep your suppliers’ compliance status in check.
- Identification of the failure modes (e.g. system failure or defective keyboard)
- Identification of time-dependent failure mechanisms (e.g. corrosion, degradation or electrical failure)
- Describing the effects of the identified failure modes (e.g. heat, risk of injury or noise disturbance)
- Establishing a numerical ranking for the severity of the failure modes
- Assigning a probability factor, i.e. a numerical weight, to each identified failure
- Identification of the control systems that may help prevent the occurrence of failure modes
- Determining the risk priority number (RPN) using the following formula: RPN = (Severity) x (Occurrence) x (Detection), where occurrence is the probability factor assigned to the failure.
- Creation of an action list to evaluate the course of action to address potential product failures with a high RPN.
If there are any potential failures with high severity, manufacturers should continue the evaluation by using the method for failure analysis called FMECA.
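The RPN step can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: it assumes all three rankings share a 1-to-5 scale, that a higher detection ranking means the failure is harder to detect, and that an RPN threshold of 25 triggers an action item. The failure modes, rankings and threshold are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible) .. 5 (most severe)
    occurrence: int  # 1 (rare) .. 5 (frequent)
    detection: int   # 1 (almost always detected) .. 5 (rarely detected)

    def rpn(self) -> int:
        """Risk priority number: severity x occurrence x detection."""
        return self.severity * self.occurrence * self.detection


def action_list(modes, threshold):
    """Return failure modes whose RPN meets the threshold, highest first."""
    flagged = [m for m in modes if m.rpn() >= threshold]
    return sorted(flagged, key=lambda m: m.rpn(), reverse=True)


# Hypothetical entries for illustration only.
modes = [
    FailureMode("defective keyboard", severity=2, occurrence=3, detection=1),
    FailureMode("power supply overheating", severity=5, occurrence=2, detection=4),
    FailureMode("connector corrosion", severity=3, occurrence=3, detection=3),
]

for mode in action_list(modes, threshold=25):
    print(mode.name, mode.rpn())
```

Because the three scales share the same increments, no single ranking dominates the product, which is why the analysis asks for identical scales.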
#2. Fault Tree Analysis (FTA)
The fault tree analysis is another one of the methods for failure analysis of electrical products. It provides a global view of a single failure and its potential causes captured in a fault tree. The direct relationships between an event causing a failure, the component and the product failure outcome are determined using fault-propagation logic. Hence, all interactions between systems and subsystems or components are evaluated and presented graphically.
Structure of the analysis
The fault tree can be drawn either bottom-up or top-down, depending on the chosen visualisation. Usually, the failure mode is at the top of the fault tree, and all the potential causes are at the bottom. The part between the top and the bottom illustrates the sequence of events leading from a cause to the final failure mode. Some causes lead to the failure through a short series of events, while others involve a longer sequence. It’s also possible to add interdependencies, although these complicate the visual structure of the FTA.
The structure of this method for failure analysis typically consists of symbols, binary conditions (e.g. AND, OR and NAND) and only relevant events. Each symbol has a functional meaning. Safety standard IEC 61025 provides information on the different symbols.
The figure below presents a simple visualisation of the FTA structure.
The FTA structure will follow an analytical logic flow and require information about the product and its functions. An in-depth technical review of the product will provide information about the software, hardware design, different component failure modes and how failure can spread in the product’s architecture.
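The fault-propagation logic can be illustrated with a small Python sketch. It assumes a tree built only from AND, OR and NAND gates, with basic events identified by name. The tree itself (an overheating product) is hypothetical and is not taken from IEC 61025.

```python
def occurs(node, failed):
    """Return True if the event at `node` occurs, given the set of failed basic events."""
    if isinstance(node, str):  # a basic event, identified by its name
        return node in failed
    kind, children = node
    results = [occurs(child, failed) for child in children]
    if kind == "AND":
        return all(results)
    if kind == "OR":
        return any(results)
    if kind == "NAND":
        return not all(results)
    raise ValueError(f"unsupported gate: {kind}")


# Hypothetical tree: the product overheats if the fan fails
# OR (the thermal sensor fails AND the firmware cut-off fails).
overheating = ("OR", ["fan", ("AND", ["thermal_sensor", "firmware_cutoff"])])

print(occurs(overheating, {"thermal_sensor"}))                     # one protection lost
print(occurs(overheating, {"thermal_sensor", "firmware_cutoff"}))  # both protections lost
print(occurs(overheating, {"fan"}))                                # fan fault alone
```

Trying different sets of failed basic events shows which single faults or combinations are enough to trigger the top event.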
FTA reporting & Safety standard IEC 61025
Standard IEC 61025 provides guidance on FTA reporting and lists basic and supplementary items. The basic reporting items include the objective, scope, design, system description and operation. The supplementary items, such as technical information and the FMEA analysis for follow-up activities, help clarify complex issues and minimise interpretation that could result in inaccurate assumptions.
The FTA report should also include evaluation boundaries and the basis for the evaluated cases. The team responsible for the analysis should document their experience and backgrounds in the report as well. The report aims to show the collected information and provide results, conclusions, and recommendations.
#3. Hazard and Operability Analysis (HAZOP)
HAZOP is a focused qualitative technique with boundaries, evaluating only conditions that represent a risk to people, animals, buildings or equipment. This method for failure analysis of electrical products doesn’t focus on evaluating general reliability. Instead, its focus is on identifying and characterising potential electrical product hazards and operability issues.
Safety standard IEC 61882 provides guidance on executing HAZOP. According to IEC 61882, HAZOP is a process of breaking a complex design into smaller blocks and then evaluating each block individually. When assessing a block, the team responsible for the analysis should use standardised terminology and process parameters to determine any deviations from the design intent. For each deviation, the team should document the possible causes and consequences. The mitigating actions to prevent a hazardous cause or consequence should be evaluated as present, sufficient, or lacking effectiveness.
The HAZOP report should include all identified potential safety issues (e.g. hazards) and the evaluation for each safety issue, showing whether a mitigating action is necessary. The report should also include information about recommendations that focus on the actions.
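A HAZOP worksheet can be sketched as one row per deviation, where each deviation pairs a guide word with a process parameter. The example below is only an illustration. The three guide words are a small subset chosen for brevity, the block and parameters are hypothetical, and treating only "lacking" safeguards as needing a recommendation is an assumption of this sketch.

```python
from itertools import product

GUIDE_WORDS = ["NO", "MORE", "LESS"]  # a small subset, chosen for illustration
SAFEGUARD_STATES = {"present", "sufficient", "lacking"}


def worksheet(block, parameters):
    """Create one empty worksheet row per (parameter, guide word) deviation."""
    return [
        {
            "block": block,
            "deviation": f"{word} {parameter}",
            "causes": [],
            "consequences": [],
            "safeguards": None,  # later set to one of SAFEGUARD_STATES
        }
        for parameter, word in product(parameters, GUIDE_WORDS)
    ]


def needs_action(row):
    """Assumption: a deviation needs a recommendation when its safeguards are lacking."""
    return row["safeguards"] == "lacking"


# Hypothetical block and parameters.
rows = worksheet("battery charger", ["current", "voltage"])
rows[1].update(causes=["regulator fault"], consequences=["cell overheating"],
               safeguards="lacking")

for row in rows:
    if needs_action(row):
        print(row["block"], "-", row["deviation"])
```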
#4. Action Error Analysis (AEA)
This method for failure analysis of electrical products concentrates on the interactions between electrical items and people. It studies and evaluates potential human errors in critical operations and their consequences.
AEA facilitates decision making. This analysis aims to find a decision that focuses on the problem perceptions, analyse the latter, and then present them in a way that allows their comparison in a quantifiable manner.
The action error analysis is similar in preparation, implementation and reporting to the FMEA. The preparation process includes different activities – for example, documentation and product familiarisation, analysis execution and review, and identification of the necessary action items. The implementation methodology is the same as that of FMEA. Risk can be evaluated using the same method as that in the FMECA.
The image below presents a table that one can use to identify and track human errors.
The AEA reporting can follow the same guidelines as FMEA, but there may not be a detection column. The report should contain recommendations and actions.
#5. Event Tree Analysis (ETA)
The event-tree analysis is a forward-looking and bottom-up method for failure analysis of electrical products, describing both the success and failure of an event. Hence, the event tree analyses both a functioning system response and a failed system response when an event occurs.
The structure of the event tree is simple due to the use of Boolean logic. The analysis starts with a probabilistic risk assessment to identify a set of initiating events that can affect the state of the system. The identification and evaluation of consequent events continue until it is clear what the final outcome would be.
The figure below displays a typical layout of an event tree analysis.
The ETA implementation process consists of the following steps:
- Define the scope of work
- Identify the potential hazards and accident scenarios that can help recognise the initiating events
- Determine the identified initiating events
- Determine the resulting events for each identified initiating event
- Construct the ETA for each initiating event
- Identify the event failure probabilities
- Determine the outcomes of the identified initiating events
- Determine the acceptability of an event using the failure probabilities
- Identify and propose corrective actions
- Document the ETA process
- Update the ETA documentation as soon as new information is available.
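The probability steps above can be sketched in Python. This is a simplified illustration: it assumes every resulting event either works or fails, treats the events as independent, and expands every branch, whereas a real event tree often stops a branch once the outcome is clear. The scenario and all probabilities are hypothetical.

```python
def outcome_probabilities(initiating_probability, barriers):
    """Enumerate every branch of an event tree.

    `barriers` is an ordered list of (name, failure_probability) pairs.
    Returns a list of (path, probability) pairs, where `path` records
    whether each barrier worked or failed.
    """
    outcomes = [((), initiating_probability)]
    for name, p_fail in barriers:
        expanded = []
        for path, prob in outcomes:
            expanded.append((path + ((name, "works"),), prob * (1.0 - p_fail)))
            expanded.append((path + ((name, "fails"),), prob * p_fail))
        outcomes = expanded
    return outcomes


# Hypothetical scenario: a short circuit, followed by a fuse and a thermal cut-off.
outcomes = outcome_probabilities(
    initiating_probability=0.001,
    barriers=[("fuse", 0.01), ("thermal cut-off", 0.05)],
)

for path, prob in outcomes:
    states = ", ".join(f"{name} {state}" for name, state in path)
    print(f"{states}: {prob:.3e}")
```

The branch probabilities sum to the probability of the initiating event, which is a useful sanity check before judging whether each outcome is acceptable.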
#6. Failure Mode, Effects and Criticality Analysis (FMECA)
FMECA is a type of quantitative analysis often performed after FMEA to further investigate and characterise already identified safety-related failure modes. During the criticality analysis, each failure mode receives a severity classification and a probability of occurrence value. The severity classification of each failure mode depends on the mode’s effects on the system. The following severity classifications exist:
- Catastrophic – Category I – e.g., death or critical system malfunction
- Critical – Category II – e.g., severe injury or major property damage
- Marginal – Category III – e.g., minor injury or minor property damage
- Minor/Negligible – Category IV – e.g., reliability issues.
The purpose of FMECA is to rank each failure mode by its severity classification and probability of occurrence. The result of the analysis is a criticality matrix, which helps put the identified failure modes into perspective. The X-axis is the severity, and the Y-axis is the probability of occurrence. Failure modes are plotted in the matrix after each one receives a criticality number, $C_m = \beta \times \alpha \times \lambda_p \times \tau$. In analyses that follow MIL-STD-1629, $\beta$ is the conditional probability that the failure effect occurs, $\alpha$ is the failure mode ratio, $\lambda_p$ is the part failure rate and $\tau$ is the operating time. Additionally, the criticality number of each component, subassembly and assembly is calculated for each severity category by summing over its $j$ failure modes in that category: $C_r = \sum_{n=1}^{j} (\beta \times \alpha \times \lambda_p \times \tau)_n$.
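The two criticality formulas above can be worked through with a short Python sketch. It assumes the usual reading of the symbols in MIL-STD-1629-style analyses: conditional probability of the failure effect, failure mode ratio, part failure rate and operating time. The component data is hypothetical and the numbers are illustrative only.

```python
from collections import defaultdict


def mode_criticality(beta, alpha, failure_rate, duration):
    """Criticality number of a single failure mode (the product of the four factors)."""
    return beta * alpha * failure_rate * duration


def item_criticality(modes):
    """Item criticality per severity category: the sum of the mode criticality numbers."""
    totals = defaultdict(float)
    for mode in modes:
        totals[mode["category"]] += mode_criticality(
            mode["beta"], mode["alpha"], mode["failure_rate"], mode["duration"]
        )
    return dict(totals)


# Hypothetical failure modes of one component; all values are illustrative.
component_modes = [
    {"category": "II", "beta": 0.5, "alpha": 0.4, "failure_rate": 2e-6, "duration": 1000},
    {"category": "II", "beta": 1.0, "alpha": 0.1, "failure_rate": 2e-6, "duration": 1000},
    {"category": "IV", "beta": 0.1, "alpha": 0.5, "failure_rate": 2e-6, "duration": 1000},
]

for category, cr in sorted(item_criticality(component_modes).items()):
    print(category, f"{cr:.2e}")
```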
According to some standards (e.g., MIL-STD-1629), the matrix scale has five levels of failure probability:
- High (level A) – the probability of failure is equal to or greater than 0.2
- Moderate (level B) – the probability of failure is more than 0.1 but less than 0.2
- Occasional (level C) – the probability of failure is more than 0.01 but less than 0.1
- Remote (level D) – the probability of failure is more than 0.001 but less than 0.01
- Extremely unlikely (level E) – the probability of failure is less than 0.001.
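The five levels can be expressed as a simple lookup, sketched here in Python. The list above does not say which level the exact boundary values 0.1, 0.01 and 0.001 belong to, so this sketch assumes each boundary falls into the higher level.

```python
def probability_level(p):
    """Map a failure probability to one of the levels A to E.

    Assumption: the boundary values 0.1, 0.01 and 0.001 belong to the
    higher level (B, C and D respectively).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if p >= 0.2:
        return "A"  # high
    if p >= 0.1:
        return "B"  # moderate
    if p >= 0.01:
        return "C"  # occasional
    if p >= 0.001:
        return "D"  # remote
    return "E"      # extremely unlikely


for p in (0.25, 0.15, 0.05, 0.005, 0.0001):
    print(p, probability_level(p))
```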
The following image presents a typical layout of the FMECA matrix.
When the criticality analysis is completed, the collected information can be used in the product design’s risk and safety assessment process.
Safety standard IEC 60812 provides detailed guidance on the implementation of FMECA.