A dynamic fault-tree analysis tool for probabilistic risk assessment

Fault tree analysis (FTA) is essential for keeping systems such as power plants, trains, drones, medical devices, satellites and self-driving cars safe and operational. Its use is required by, for example, the Federal Aviation Administration (FAA) and the Nuclear Regulatory Commission (NRC), by ISO 26262 for automotive systems such as autonomous driving, and by NASA and ESA for software development in aerospace systems. Various fault tree extensions, such as dynamic fault trees, increase expressiveness while still yielding succinct and comprehensible models. Their analysis, however, is a major bottleneck: existing techniques do not scale and require substantial manual effort.

In collaboration with Twente and RWTH Aachen universities, we developed a fully automatic, scalable, and state-of-the-art tool for the analysis of dynamic fault trees. It goes beyond the capabilities of existing commercial tools for fault tree analysis in offering more flexible modelling and analyses.

Why do we need dynamic fault trees?


Because of their limited expressiveness, static fault trees (SFTs) cannot model the dynamic behaviour of systems in which:

  • Components are redundant (cold, warm or hot spares)
  • Component functionality is dependent
    • two sources generating energy in parallel -- hot spare
  • Component failures have a temporal ordering
    • component A fails first, followed by components B and C
  • Component relationships, responsibilities or priorities change over time
    • a switch automatically changes the energy source, e.g. from the main grid to generators
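The temporal-ordering case can be made concrete with a small sketch. It compares a static AND gate (both A and B have failed by time t) with a priority-AND gate (A fails before B, and both have failed by time t). The exponential failure rates, the function names and the closed-form expression are illustrative assumptions for this sketch and do not describe how the tool itself works.

```python
import math
import random


def static_and(rate_a, rate_b, t):
    """P(A and B have both failed by time t), in any order."""
    return (1 - math.exp(-rate_a * t)) * (1 - math.exp(-rate_b * t))


def priority_and(rate_a, rate_b, t):
    """P(A fails before B and both have failed by time t).

    Assumes independent, exponentially distributed failure times
    (an assumption made for this illustration only).
    """
    total = rate_a + rate_b
    ordered_eventually = (rate_a / total) * (1 - math.exp(-total * t))
    b_still_alive = math.exp(-rate_b * t) * (1 - math.exp(-rate_a * t))
    return ordered_eventually - b_still_alive


def simulate_priority_and(rate_a, rate_b, t, runs=100_000, seed=0):
    """Monte Carlo estimate of the same probability, as a cross-check."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(runs):
        fail_a = rng.expovariate(rate_a)
        fail_b = rng.expovariate(rate_b)
        if fail_a < fail_b <= t:
            hits += 1
    return hits / runs


if __name__ == "__main__":
    # Hypothetical rates (failures per hour) and mission time (hours).
    print("static AND  :", static_and(1.0, 1.0, 2.0))
    print("priority AND:", priority_and(1.0, 1.0, 2.0))
    print("simulated   :", simulate_priority_and(1.0, 1.0, 2.0))
```

A static fault tree can only express the first quantity. The second depends on the order of failures, which is why an order-aware gate is needed.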

    ISO 26262:2011 demands rigorous risk assessments in the automotive industry:

    • “metrics are verifiable and precise enough to differentiate between different system architectures”
    • “[for systems where the] concept is based on redundant safety mechanisms, multiple-point failures of a higher order than two are considered in the analysis”


    The rapidly increasing use of AI components in modern systems necessitates rigorous risk assessment:

    • Neural networks typically come with weak statistical guarantees only (if at all). Small perturbations of their inputs may lead to misclassifications that can be catastrophic in safety-critical applications
    • The use of AI components also implies the need for reliability metrics that go beyond the standard metrics of reliability and availability. In particular, there is a need to specify various levels of degradation (given the uncertainty of the AI components) and be able to analyse e.g., “what is the probability that the system will come to a halt without going through degradation level A first”.
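A query such as "halt without going through degradation level A first" can be sketched on a small Markov model. The state names, transition rates and helper function below are hypothetical and chosen only for illustration. The sketch assumes an acyclic model with exponential transition rates, so the probability follows from the relative rates of the outgoing transitions in each state.

```python
def prob_halt_avoiding(rates, state, halt="halt", avoid="degraded_A"):
    """Probability of reaching `halt` from `state` without visiting `avoid`.

    `rates` maps each state to a dict {successor: transition rate}.
    Assumes the model is acyclic (an assumption of this sketch).
    """
    if state == halt:
        return 1.0
    if state == avoid:
        return 0.0
    outgoing = rates.get(state, {})
    total = sum(outgoing.values())
    if total == 0:
        return 0.0  # dead end that is neither halt nor the avoided state
    # Each successor is chosen with probability rate / total exit rate.
    return sum(
        (rate / total) * prob_halt_avoiding(rates, successor, halt, avoid)
        for successor, rate in outgoing.items()
    )


# Hypothetical model: the system may degrade to level A or B, or halt directly.
example_rates = {
    "ok": {"degraded_A": 3.0, "degraded_B": 1.0, "halt": 1.0},
    "degraded_B": {"degraded_A": 1.0, "halt": 1.0},
    "degraded_A": {"halt": 2.0},
}

if __name__ == "__main__":
    print(prob_halt_avoiding(example_rates, "ok"))
```

In this example the system halts without passing through level A either directly (probability 1/5) or via level B (probability 1/5 times 1/2), giving roughly 0.3 in total.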


    FTA focuses on computing dependability metrics, i.e. key performance indicators that measure how well a system performs. Standard metrics of a system are:
    • Reliability: the probability that no failure occurs up to time T
    • Conditional reliability: the probability that no failure occurs up to time T, given that a component has already failed
    • Availability: the average percentage of time that the system is operational
    • Mean Time to Failure (MTTF): the expected time until the system fails
    • Criticality of components: the extent to which a component failure contributes to a system failure
    Our tool also handles various extensions that include the cost and impact of failures.
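As a worked illustration of these metrics, the sketch below computes reliability, conditional reliability and MTTF for two identical components configured as either a hot spare or a cold spare. The exponential failure model, the rate value and all function names are assumptions made for this example. Conditional reliability is interpreted here as the reliability once one of the two components has already failed.

```python
import math


def reliability_hot_spare(rate, t):
    """Both components run in parallel; the system fails when both have failed."""
    single = math.exp(-rate * t)
    return 1 - (1 - single) ** 2


def reliability_cold_spare(rate, t):
    """The spare cannot fail while dormant and takes over when the primary fails."""
    return math.exp(-rate * t) * (1 + rate * t)


def conditional_reliability(rate, t):
    """Reliability over a further time t, given one component has already failed."""
    return math.exp(-rate * t)


def mttf(reliability, horizon, steps=100_000):
    """MTTF as the integral of R(t) over [0, horizon], using the trapezoidal rule.

    `horizon` must be large enough that R(horizon) is negligible.
    """
    dt = horizon / steps
    total = 0.5 * (reliability(0.0) + reliability(horizon))
    for i in range(1, steps):
        total += reliability(i * dt)
    return total * dt


if __name__ == "__main__":
    rate = 0.01  # hypothetical failure rate per hour
    print("R_hot(100 h)   :", reliability_hot_spare(rate, 100.0))
    print("R_cold(100 h)  :", reliability_cold_spare(rate, 100.0))
    print("R_cond(100 h)  :", conditional_reliability(rate, 100.0))
    print("MTTF hot spare :", mttf(lambda t: reliability_hot_spare(rate, t), 5000.0))
    print("MTTF cold spare:", mttf(lambda t: reliability_cold_spare(rate, t), 5000.0))
```

Under these assumptions the closed-form MTTF is 1.5/rate for the hot spare and 2/rate for the cold spare, so the spare type alone changes the result. Static fault trees cannot capture this distinction.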

    Application Domains

    Industrial Partners


    Partners: 11+
    Projects: 10+
    Happy Clients: 2+
    Meetings: 20+