Abstract
Membrane proteins are proteins that are part of or interact with the cellular membrane. They
are the largest family of proteins and are responsible for a vast range of cellular functions, which
makes them one of the most important drug targets. Understanding the function and
mechanisms of any protein requires knowing its structure. The most successful way to
determine protein structures is X-ray crystallography, which requires protein crystals.
Proteins are notoriously difficult to crystallise, and the task becomes even
more challenging for membrane proteins. The bottleneck of the crystallisation process is finding
the right crystallisation conditions. This is mostly due to the large number of parameters that
are involved in these experiments. Crystallisation of membrane proteins relies heavily
on trial-and-error methods, which are time-consuming. Rational and more efficient experimental
design is pivotal to obtaining more membrane protein structures.
The work presented in this thesis aims to do exactly that: rationalise parts of the process
of crystallising membrane proteins. I combined analysis of experimental data with modelling to
improve our ability to predict mixing time scales for conditions typical of protein crystallisation
and look at the effects of convection and of diffusion through dialysis crystallisation membranes.
I showed how to determine when convection occurs and how to estimate its effect on mixing
times. To my knowledge, this is the first time that the diffusion of crystallisation reagents
through a semi-permeable membrane was studied in the context of protein crystallisation dialysis
experiments. Moreover, I used new approaches to assist experimental design by developing a web
application that can be used during the screening process. Finally, I applied machine learning
(ML) algorithms to membrane protein crystallisation screening data to show that automating certain
steps of membrane protein crystallisation is possible. Predictions made by ML algorithms are
comparable to the outcomes of experiments carried out during the optimisation of screening conditions. Analysis
of the physicochemical properties of the salts and the polyethylene glycol (PEG) molecular weights used in
the screens showed that the screens could be simplified by considering only the ionic strength
and the PEG volume fraction. This demonstrates that experimental time and cost could
be reduced significantly by using ML models during the experimental design process.