Dynamics and Machine Learning Inference using Memory in Stochastic Biochemical Reaction Networks
Doctoral thesis
Date of Examination: 2024-02-28
Date of issue: 2024-06-24
Advisor: Prof. Dr. Peter Sollich
Referee: Prof. Dr. Stefan Klumpp
Referee: Dr. Johannes Söding
Referee: Prof. Dr. Matthias Krüger
Referee: Dr. David Zwicker
Referee: Dr. Aljaz Godec
Files in this item
Name: main.pdf
Size: 7.16Mb
Format: PDF
Description: Main article
This file will be freely accessible after 2025-02-27.
Abstract
English
In this thesis we study the dynamics of non-equilibrium stochastic chemical reaction networks, with a special focus on biochemical networks. In the regime of large intrinsic noise, where the numbers of participating molecules are small, as is the case, for example, in gene regulation or protein regulation networks, the stochastic time traces exhibit large fluctuations of the order of the mean of the process. This thesis attempts to develop statistical physics tools to treat the dynamics analytically in such challenging regimes, where numerical simulations generally have to be employed due to a lack of existing techniques. Such an analytical theory can then also form the basis for likelihood-based inference from biochemical reaction networks with large intrinsic fluctuations. In the first part of this thesis, we develop beyond-mean-field corrections to the dynamics of the observables in stochastic chemical reactions to appropriately account for the large fluctuations. We employ ‘memory’: appropriate integration of past information can systematically improve the approximation to the dynamics of these systems. We show the emergence of this memory from a perturbative treatment of the effective action of the Doi-Peliti field theory for chemical reactions. By the self-consistent dressing of bare responses, we show how a very small class of diagrams contributes to this expansion, with clear physical interpretations. A large sub-class of these diagrams can be further resummed to infinite order, resulting in memory kernels that are stable even for large reaction rates. We extensively demonstrate our approach analytically and numerically on single- and multi-species binary reactions across a range of reaction constant values. In the next part of the thesis, we develop a method based on the dynamical free energy for chemical reactions with the one-time and two-time observables constrained. This method is made tractable using the extended Plefka approximation for continuous variables and results in memory-based effective fields for beyond-mean-field corrections to the dynamics of any general reaction network. Using an ansatz inspired by our previous diagrammatic series summation, we provide expressions for the regularised effective fields in terms of the reaction network stoichiometry. We then numerically demonstrate the utility and the salient features of this approach on three important reaction network models: gene regulation, enzymatic kinetics and epidemic spreading. In the last part of this thesis, we turn our attention to inference from noisy time traces of biochemical reaction networks. We develop a data-driven, physics-inspired machine learning algorithm that detects the influence of memory in such time traces from partially observed networks, i.e. when not all reacting species are observed. We then show how such a detection can determine which species of the observed part of the network form its boundary, i.e. have significant interactions with the unobserved part, opening a route to reliable network reconstruction. We apply the method to experimental data from the Epidermal Growth Factor Receptor (EGFR) network and show that it works even for substantial noise levels.
Keywords: dynamics; machine learning; inference; memory; chemical reaction networks; EGFR; non-equilibrium; intrinsic noise; large fluctuations; Plefka free energy; networks; physics inspired machine learning; Doi-Peliti; path integrals; gene regulation; enzyme kinetics; SIR model