Fault Tree Analysis
Introduction
Reliability and safety analysis, particularly of complex and high-risk systems such as nuclear power plants, large chemical plants and space vehicles, has assumed ever-increasing importance in recent years, especially after two major accidents in the history of nuclear power generation:
Three Mile Island-2 in the United States.
Chernobyl-4 in the Soviet Union.
Other events that have shaken the confidence of reliability and safety analysts as well as the public at large include:
The release of a large amount of toxic gas at the Union Carbide factory in Bhopal, India, which resulted in the death of several thousand people.
The failure of the space shuttle Challenger, which resulted in the loss of millions of dollars and the death of a team of astronauts.
Although the reliability block diagram (RBD) techniques presented earlier can be applied to these systems, fault tree analysis (FTA) offers a comparatively simple and powerful approach for reliability and safety analysis under the most general frame of assumptions.
FTA is an event-oriented analysis, in contrast to RBD analysis, which is structure-oriented and allows only hardware failure considerations. The advantage of event-oriented methods is that they consider not only hardware failures but also any undesirable events that may arise from software, human errors, operation and maintenance errors, environmental influences on the system, etc.
A fault tree is a pictorial representation of a system and shows how various events may lead towards a single (usually undesired) event. FTA is most often used for:
Identifying safety-critical components.
Verifying product requirements.
Certifying product reliability.
Assessing product risk.
Investigating accidents or incidents.
Evaluating design changes.
Displaying the causes and consequences of events.
Identifying common-cause failures.
FTA is a deductive analysis method that begins with a general conclusion (a system-level undesirable event) and then attempts to determine the specific causes of this conclusion. Based on a simple set of rules and logic symbols from probability theory and Boolean algebra, FTA uses a top-down approach to generate a logic model that provides for both qualitative and quantitative evaluation of system reliability.
The undesirable event at the system level is referred to as the top event. It generally represents a system failure mode or hazard for which predicted reliability data is required. The lowest-level events in each branch of a fault tree are referred to as basic events. They represent hardware, software and human failures for which the probability of failure is given based on historical or predicted data. Basic events are linked via logic symbols (gates) to one or more undesirable top events.
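The way basic events combine through gates into a top event can be sketched in a few lines of Python. This is a minimal illustration, not a definitive implementation: the event names and probabilities are hypothetical, only AND and OR gates are modelled, and the calculation assumes that all basic events are statistically independent and that no basic event appears in more than one branch.

```python
from dataclasses import dataclass
from math import prod
from typing import List, Union


@dataclass
class BasicEvent:
    name: str
    probability: float  # probability of failure (hypothetical value)


@dataclass
class Gate:
    name: str
    kind: str  # "AND" or "OR"
    inputs: List[Union["Gate", BasicEvent]]


def probability(node):
    """Probability that the event at this node occurs.

    Assumes independent inputs that are not shared between branches.
    """
    if isinstance(node, BasicEvent):
        return node.probability
    p = [probability(child) for child in node.inputs]
    if node.kind == "AND":
        # All inputs must occur.
        return prod(p)
    if node.kind == "OR":
        # At least one input occurs: 1 - P(no input occurs).
        return 1.0 - prod(1.0 - x for x in p)
    raise ValueError(f"unsupported gate type: {node.kind}")


# Hypothetical example: cooling is lost if both pumps fail
# or if the operator makes an error.
pump_a = BasicEvent("pump A fails", 0.01)
pump_b = BasicEvent("pump B fails", 0.02)
operator = BasicEvent("operator error", 0.001)

both_pumps_fail = Gate("both pumps fail", "AND", [pump_a, pump_b])
top_event = Gate("loss of cooling", "OR", [both_pumps_fail, operator])

top_probability = probability(top_event)
print(f"P(top event) = {top_probability:.7f}")
```

The tree is evaluated bottom-up from the basic events, even though it is constructed top-down from the top event. When a basic event feeds several gates, this simple product rule no longer holds and a Boolean reduction of the tree is needed first.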
Another basic difference between the techniques described earlier and the fault tree methodology is that the earlier techniques use a success frame of consideration, whereas FTA uses a failure frame of consideration. In other words, the earlier analyses are based on an optimistic view of system operation, whereas FTA is based on a pessimistic viewpoint. It is interesting to observe, however, that both approaches have certain identifiable landmarks that are equivalent in the success and failure domains.
Figure 5-1 depicts the failure/success domain concept.
Certain identifiable points in the success domain coincide with analogous points in the failure domain. For instance, “Maximum Anticipated Success” in the success domain coincides with “Minimum Anticipated Failure” in the failure domain. Although the inclination may be to select the optimistic view of the system (success rather than failure), it is often easier to agree on what constitutes a failure than on what constitutes a success. In addition, the population of events in the failure domain is generally far smaller than the population in the success domain. This is because FTA typically concentrates on single failure units, whereas an analysis for success must include all aspects of a system.
FTA is one of the most widely used methods in system reliability analysis. It is a deductive procedure for determining the various combinations of hardware failures, software failures and human errors that could result in the occurrence of specified undesired events, referred to as top events, at the system level. A deductive analysis begins with a general conclusion and then attempts to determine its specific causes, which is why it is often described as a top-down approach. Failure Mode and Effects Analysis (FMEA), by contrast, is considered an inductive, or bottom-up, approach.
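Determining the combinations of failures that lead to a top event can be illustrated with a brute-force sketch. It is an assumption-laden example rather than a production algorithm: the tree and its event names are hypothetical, only AND and OR gates are supported, and every subset of basic events is tried, which is feasible only for very small trees. The smallest such combinations are commonly called minimal cut sets in the FTA literature.

```python
from itertools import combinations

# Hypothetical tree as nested tuples: (gate type, input, input, ...).
# A plain string is a basic event.
TREE = ("OR", ("AND", "pump_a_fails", "pump_b_fails"), "operator_error")


def occurs(node, failed):
    """True if the event at this node occurs, given the set of failed basic events."""
    if isinstance(node, str):
        return node in failed
    kind, *children = node
    results = [occurs(child, failed) for child in children]
    if kind == "AND":
        return all(results)
    if kind == "OR":
        return any(results)
    raise ValueError(f"unsupported gate type: {kind}")


def basic_events(node):
    """Collect the names of all basic events below this node."""
    if isinstance(node, str):
        return {node}
    _, *children = node
    return set().union(*(basic_events(child) for child in children))


def minimal_failure_combinations(tree):
    """Smallest sets of basic events that are sufficient to cause the top event."""
    events = sorted(basic_events(tree))
    minimal = []
    # Try subsets in order of increasing size so smaller sets are found first.
    for size in range(1, len(events) + 1):
        for combo in combinations(events, size):
            candidate = set(combo)
            # Skip any candidate that already contains a smaller sufficient set.
            if any(found <= candidate for found in minimal):
                continue
            if occurs(tree, candidate):
                minimal.append(candidate)
    return minimal


for cut_set in minimal_failure_combinations(TREE):
    print(sorted(cut_set))
```

For this tree the search reports two combinations: the operator error on its own, and the failure of both pumps together. A single-event combination such as the operator error marks a single point of failure.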
The main purpose of FTA is to evaluate the probability of the top event using analytical or statistical methods. FTA has the capability of providing useful information concerning the likelihood of a failure and the means by which such a failure could occur. Efforts to improve system safety can be focused and refined using FTA results.
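The difference between an analytical and a statistical evaluation of the top event can be sketched as follows. The example is illustrative only: the tree (both pumps fail, or the operator errs), the probabilities, the trial count and the seed are all hypothetical, and the basic events are assumed to be independent. The statistical estimate uses simple Monte Carlo sampling, which is one of several possible statistical methods.

```python
import random

# Hypothetical failure probabilities for three independent basic events.
P_FAIL = {"pump_a_fails": 0.1, "pump_b_fails": 0.2, "operator_error": 0.05}


def top_event(failed):
    """Top event occurs if both pumps fail (AND) or the operator errs (OR)."""
    return (failed["pump_a_fails"] and failed["pump_b_fails"]) or failed["operator_error"]


def estimate_top_probability(trials, seed=0):
    """Monte Carlo estimate: fraction of sampled trials in which the top event occurs."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(trials):
        # Sample each basic event independently.
        failed = {name: rng.random() < p for name, p in P_FAIL.items()}
        if top_event(failed):
            hits += 1
    return hits / trials


# Analytical value for the same tree under the independence assumption.
p_both_pumps = P_FAIL["pump_a_fails"] * P_FAIL["pump_b_fails"]
analytical = 1.0 - (1.0 - p_both_pumps) * (1.0 - P_FAIL["operator_error"])

estimate = estimate_top_probability(100_000)
print(f"analytical  = {analytical:.4f}")
print(f"monte carlo = {estimate:.4f}")
```

The analytical value here is 0.069, and the sampled estimate should fall close to it. Sampling becomes less efficient as the top event becomes rarer, since many more trials are needed to observe enough occurrences.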