Active Redundant Systems
From the above expressions, the improvement gained by being able to repair redundant sub-systems while the system is operating appears to be enormous. For example, consider a system comprising two identical, constant failure rate sub-systems in active redundancy. Suppose the MTTF and Mean Time To Repair (MTTR) of each sub-system are 1000 hours and 0.5 hours respectively. Then the MTTF of the system, instead of being 1500 hours as it would be if repair were not possible while the system was operating, becomes approximately one million hours. (See equation (4.9b).)
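The arithmetic in this example can be checked with a short sketch. Equation (4.9b) is not reproduced in this section, so the code assumes it matches the standard Markov-model result for two identical units in active redundancy, MTTF = (3λ + μ) / (2λ²), where λ = 1/MTTF is the sub-system failure rate and μ = 1/MTTR is the repair rate. The function names are illustrative only.

```python
def mttf_no_repair(failure_rate):
    """System MTTF for two identical active-redundant units, no repair."""
    return 3.0 / (2.0 * failure_rate)


def mttf_with_repair(failure_rate, repair_rate):
    """System MTTF when a failed unit is repaired while the other runs.

    Assumed form of equation (4.9b): (3*lambda + mu) / (2*lambda**2).
    """
    return (3.0 * failure_rate + repair_rate) / (2.0 * failure_rate ** 2)


lam = 1.0 / 1000.0  # failure rate per hour (sub-system MTTF = 1000 h)
mu = 1.0 / 0.5      # repair rate per hour (sub-system MTTR = 0.5 h)

print(f"No repair:   {mttf_no_repair(lam):,.0f} hours")
print(f"With repair: {mttf_with_repair(lam, mu):,.0f} hours")
```

Under these assumptions the repaired system's MTTF comes out slightly above one million hours, consistent with the figure quoted in the text.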
This is an ideal theoretical situation based on the assumption that when one of the redundant sub-systems fails, an alarm is raised and, within a mean time of half an hour (in this example), the failed sub-system is fully working again. In practice, however, there is a chance (measured by what is referred to as the degree of coverage) that when one sub-system fails, it remains in a failed state (referred to as a dormant fault) until the other sub-system fails, in which case the system as a whole fails. It turns out that even a slight chance of a dormant fault reduces the theoretically attainable value of MTTF considerably. Other factors that can considerably reduce the benefits of redundancy, irrespective of whether repair is involved, are common cause and common mode failures. These topics are covered extensively in the reliability engineering literature.
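The sensitivity to dormant faults can be illustrated with a minimal sketch. It is not taken from the text: it assumes a simple Markov model in which each first failure is detected (and repair started) with probability c, the coverage, and otherwise stays dormant until the surviving sub-system fails. Solving that model for the mean time to system failure gives the expression used below. The function name and the coverage values are hypothetical, chosen only for illustration.

```python
def mttf_with_coverage(failure_rate, repair_rate, coverage):
    """System MTTF for two active-redundant units with imperfect coverage.

    Assumed model (not from the text): a first failure is detected with
    probability `coverage` and repaired at `repair_rate`; otherwise it
    stays dormant and the system runs on one unit until that unit fails.
    """
    lam, mu, c = failure_rate, repair_rate, coverage
    if not 0.0 <= c <= 1.0:
        raise ValueError("coverage must lie between 0 and 1")
    # Mean time spent before returning to the both-working state or failing
    numerator = 1.0 / (2.0 * lam) + c / (lam + mu) + (1.0 - c) / lam
    # Probability that a cycle ends in system failure instead of a repair
    denominator = 1.0 - c * mu / (lam + mu)
    return numerator / denominator


lam = 1.0 / 1000.0  # sub-system MTTF = 1000 h
mu = 1.0 / 0.5      # sub-system MTTR = 0.5 h

for c in (1.0, 0.999, 0.99, 0.9):
    print(f"coverage = {c}: MTTF = {mttf_with_coverage(lam, mu, c):,.0f} hours")
```

With perfect coverage the model reduces to the ideal repairable case, and with zero coverage to the no-repair case. Under these assumptions, a 1% chance of a dormant fault already brings the MTTF down from about one million hours to the order of 50,000 hours.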