dc.description.abstracteng | Using cryo-electron microscopy (cryo-EM), it is possible to resolve structures of biological
macromolecules by averaging numerous projections of macromolecules (particle images)
embedded in vitreous ice and imaged in an electron microscope. Developments in recent
years have sparked a so-called “resolution revolution”. Reaching (near-)atomic
resolution for a wide range of protein complexes has become standard practice in
structural biology, and cryo-EM now complements X-ray crystallography.
These advances include developments in microscope hardware, such as direct electron
detectors, which allow single-electron detection (counting) and therefore a much reduced
radiation dose. In addition, algorithmic developments and improvements in computer hardware
have greatly improved the in silico processing of the recorded data. Especially relevant
is the ability to quickly process datasets of hundreds of thousands of particles by using
efficient algorithms as well as computation on graphics processing units (GPUs).
Because of the low electron dose used in cryo-EM experiments today, the signal-to-noise ratio
of individual particle images is low, and accurate reconstruction of high-resolution features
requires, among other things, averaging numerous particle images to increase this ratio.
In single-particle cryo-EM, each particle image originates from one copy of the
macromolecule under investigation. Ideally, these copies would be exact, i.e. each macromolecule
would be identical to all others. Otherwise, averaging
of projections during reconstruction results in blurry reconstructions. In reality, however, this assumption
does not hold. Especially when active proteins or protein complexes are under investigation,
it can be expected that the projection images originate from a heterogeneous set
of macromolecules that are in different conformational states, have different occupancies, or
have regions that could move freely before being frozen in the ice.
Effort has gone into the development of classification techniques which aim to divide the
dataset in such a way that within each class, the assumption of exact copies holds again.
Alternatively, subsections of the structure that are in themselves rigid are refined individually.
Both approaches, however, handle free movement only suboptimally: classification
cannot divide a dataset such that no residual movement remains within each class, and
refining only sections of a structure does not yield a global reconstruction.
In this work, I explore a pseudo-atom-based approach to reconstruction. The volume to
be reconstructed is represented by a cloud of pseudo atoms. When two projection images
originate from macromolecules in different states, displacing the pseudo atoms can model the
difference between them while the projection images are used to update the intensities of
the pseudo atoms. In this way, a single volume can be reconstructed from a dataset
whose projections originate from different macromolecule states. The approach was fully
incorporated into a deep learning framework, which enables further development into an
end-to-end machine learning approach.
Deep learning methods are a promising class of algorithms for cryo-EM processing.
In a separate chapter, I explore the possibility of using a generative adversarial
network instead of conventional ab initio reconstruction algorithms. In this approach, a
generator learns a volume that represents the real protein in the experimental images
without seeing the experimental images directly. The approach already shows promising
results: the learned volume is accurate enough to be used as a reference for
subsequent high-resolution refinement. Several components can still be added or improved to
further increase the usability of this approach.
In the last part of this thesis, an approach to identify dimeric particles in a heterogeneous
dataset of monomeric and dimeric SARS-CoV-2 RdRp particles is presented. The approach
was successfully applied to identify enough dimeric particles to reconstruct the
dimeric state at 5.5 Å resolution, and the structure was subsequently published. Fitting previously
published RdRp structures allowed a closer examination of this dimeric state. Furthermore,
we hypothesize that this dimeric form might be functional and play a role in
subgenomic RNA production. | de |