AP4: Verifying fault trees for railway safety

Probabilistic safety assessment is common practice for railway systems, and often required by law. Fault tree analysis (FTA) is one of the most (if not the most) prominent safety assessment technique. Fault trees (FTs) model how failures propagate through the system: FT leaves model component failures and are equipped with continuous probability distributions; FT gates model how component failures lead to system failures. FTs are one of the most prominent models to describe top-down causes for a system failure and facilitate, amongst others, the analysis of the mean time to failure (MTTF) and the system’s reliability—how likely is the system operational up to time t? Static fault trees have AND- and OR-gates, but no inverters. Their analysis is easy and can be done efficiently using binary decision diagrams.

Dynamic fault trees (DFTs) are directed acyclic graphs that are more expressive and more involved than static fault trees. They cater for common dependability patterns, such as spare management, functional dependencies, and sequencing. The DFT behavior is described by continuous-time Markov chains (CTMCs), where transitions correspond to the failure of a basic event. DFT analysis is a well- developed field in reliability engineering; its main bottleneck is the efficient generation and analysis of CTMCs. Recent research showed that the state-space generation can be significantly boosted by exploiting successful reduction techniques from model checking, such as symmetry and partial-order reduction. Accompanied with a fast analysis of CTMCs—determining the DFT’s mean time to failure, or its reliability—using probabilistic model-checking algorithms, this yields a speed up and reduction in memory consumption by a few orders of magnitude compared to state-of- the-art techniques for DFTs.

The first main objective of this dissertation project is to apply these new technologies to the safety analysis of railway systems, in particular we would like to compare different kinds of signal boxes and their reliability and to study the impact of different train protection (ATP) and train control systems (ATC). These systems supervise the train movements. Additionally we will examine the new European Train Control System ETCS.

One of the drawbacks of DFT analysis is that the failure rates of all system components are to be fixed in advance. This is a severe restriction as mostly failure rates are not commonly known. This holds in particular for new components. Another source of uncertainty in DFT modelling is the amount of required redundancy in the system. On the other hand, overall system reliability requirements are known. Relevant questions then are: what is the maximal tolerable failure rate of a system component ensuring a mean time to failure that is above a given threshold, e.g., two months (or years)?, and what is the minimal amount of redundancy for a given component so as to establish this MTTF? These questions are of high practical relevance to the safety analysis of railway systems and cannot be answered with current means. For example in the last years railway infrastructure managers (or their suppliers) are asked to guarantee an average availability and capacity for their lines. The relevant question is how many infrastructure elements and how much redundancy to install on a line.

A second objective of this project is to exploit techniques that synthesise all parameter values, and the degree of redundancy such that the DFT satisfies the required mean-time to failure. For parameter synthesis we plan to build upon previous work on Markov chains using exact and/or approximate algorithms.

(This dissertation project is related to project VE2.)