VE4: Automated run-time analysis of probabilistic programs

Randomisation has been an important tool for the construction of algorithms since the early days of computing. It is typically used to convert a deterministic algorithm with bad worst-case behavior into an efficient randomised algorithm that yields a correct output, possibly with a low error probability. The Rabin-Miller primality test, Freivalds’ matrix multiplication, and the random pivot selection in Hoare’s quicksort algorithm are prime examples. Randomised algorithms are conveniently described by probabilistic programs. The run-time of a randomised algorithm is a random variable as the run-time is affected by the outcome of the coin tosses. An important efficiency measure for probabilistic programs is the expected run-time. Reasoning about the expected run-time of randomised algorithms is surprisingly subtle and full of nuances. In classical sequential deterministic algorithms, a single diverging program run causes the program to have an infinite run-time. This is not true for randomised algorithms. They may admit arbitrarily long and even infinite runs while still having a finite expected run-time.

The expected run-time of randomised algorithms is typically analysed using standard mathematical means. Recently there is some initial work towards applying formal verification techniques to analyse the run-time of probabilistic programs. Alternative techniques are either based on using super-martingales, or a weakest pre-conditioning style reasoning. None of these techniques is fully automated.

The goal of this dissertation project is to develop a fully automated technique for the inference of expected run-time bounds for probabilistic programs. This challenging objective will be approached by exploiting recently developed powerful techniques to infer upper and lower bounds for the run-time complexity of ordinary (i.e., non-probabilistic) programs. The plan is to extend these techniques for the automated complexity analysis of programs towards probabilistic programs, and combine them with the recently studied weakest precondition calculus in order to generate suitable upper and lower invariants that are needed to infer expected run-times of probabilistic programs. Besides the theoretical developments, a prototypical implementation in the tool AProVE is planned. Extensive experiments will be carried out to analyse the expected run-time of well-known randomised algorithms such as Sherwood variants of binary search, quicksort and the (challenging) coupon collector problem.

(Given the connections between positive almost-sure termination and expected run-times, there is an obvious link between this dissertation project and VE3. It is also related to project AP5.)