Genome sequencing projects have identified an enormous number of open reading frames that code for unknown proteins. Elucidating the structure and function of these proteins requires producing them quickly, in high yield, and at low cost. Bacterial expression systems such as E. coli are therefore an ideal tool. However, many proteins are poorly soluble when expressed in E. coli and form large, inactive cytosolic aggregates, so-called inclusion bodies. One study of the human genome, for example, estimated the fraction of soluble, expressible proteins to be only 13%.
To regain their biological function, proteins that form inclusion bodies have to be refolded. This step remains a major challenge for many recombinantly expressed proteins and often constitutes a major bottleneck. Because in vitro refolding is a complex reaction with a variety of critical parameters, suitable refolding conditions are typically derived empirically in extensive screening experiments. As a downstream process step, refolding is therefore difficult: a multi-parameter space has to be explored by trial and error. Efficient refolding protocols that yield functional protein for subsequent analysis are needed.
The goal of this interdisciplinary project is to establish stochastic search strategies (genetic algorithms) for the efficient experimental optimization of protein renaturation conditions. The new strategy combines screening with the optimization of refolding yields. It is designed to be a robust method that allows refolding to be optimized for a variety of proteins in an automated procedure guided by the genetic algorithm.
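To illustrate how a genetic algorithm can guide such a screen, the following is a minimal sketch and not the project's actual implementation. The parameter names and ranges in `BOUNDS`, the `simulated_yield` function (a stand-in for a measured refolding yield), and all population settings are assumptions chosen for the example.

```python
import random

# Hypothetical search space: each refolding condition is a dict of buffer
# parameters. Names and ranges are illustrative, not taken from the project.
BOUNDS = {
    "pH": (6.0, 10.0),
    "arginine_M": (0.0, 1.0),
    "temperature_C": (4.0, 30.0),
}


def simulated_yield(cond):
    """Stand-in for a measured refolding yield between 0 and 1.

    In a real screen this value would come from an activity assay; the
    optimum used here (pH 8.5, 0.5 M arginine, 10 C) is invented.
    """
    target = {"pH": 8.5, "arginine_M": 0.5, "temperature_C": 10.0}
    penalty = 0.0
    for name, (low, high) in BOUNDS.items():
        penalty += ((cond[name] - target[name]) / (high - low)) ** 2
    return max(0.0, 1.0 - penalty)


def random_condition(rng):
    return {k: rng.uniform(lo, hi) for k, (lo, hi) in BOUNDS.items()}


def crossover(a, b, rng):
    # Uniform crossover: each parameter is inherited from one parent.
    return {k: (a[k] if rng.random() < 0.5 else b[k]) for k in BOUNDS}


def mutate(cond, rng, rate=0.2, scale=0.1):
    # Gaussian perturbation of some parameters, clipped to the allowed range.
    child = dict(cond)
    for k, (lo, hi) in BOUNDS.items():
        if rng.random() < rate:
            step = rng.gauss(0.0, scale * (hi - lo))
            child[k] = min(hi, max(lo, child[k] + step))
    return child


def optimize(measure, generations=20, pop_size=24, elite=4, seed=0):
    rng = random.Random(seed)
    population = [random_condition(rng) for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # One round of "experiments": every condition is measured once.
        scored = sorted(((measure(c), c) for c in population),
                        key=lambda pair: pair[0], reverse=True)
        best_per_generation.append(scored[0][0])
        ranked = [c for _, c in scored]
        parents = ranked[: pop_size // 2]  # truncation selection
        children = []
        while len(children) < pop_size - elite:
            mother, father = rng.sample(parents, 2)
            children.append(mutate(crossover(mother, father, rng), rng))
        # Elitism: the best conditions survive unchanged.
        population = ranked[:elite] + children
    best = max(population, key=measure)
    return best, best_per_generation


best, history = optimize(simulated_yield)
print(best, history[-1])
```

In a laboratory setting, `measure` would be replaced by the readout of an automated refolding experiment, so each generation corresponds to one batch of conditions tested in parallel. Because the elite conditions are carried over unchanged, the best yield per generation cannot decrease as long as the measurement is reproducible.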
The experimental data obtained from the renaturation of structurally different proteins will form the basis for the optimization and for modeling the relationship between renaturation conditions and refolding yields with the help of artificial neural networks. The validated models will then be used to deduce connections between global protein parameters available in databases (e.g. number of amino acids, amino acid composition, isoelectric point, hydrophobicity) and effective refolding conditions. A method that predicts effective refolding conditions from theoretical ab initio parameters, or that predicts trends and optima from a small set of experiments, would be a milestone in the field of protein production in microorganisms.
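The modeling step can be pictured with a very small neural network that maps refolding conditions to yields. The sketch below is only an illustration under stated assumptions: the three scaled input features, the invented yield surface used to generate training data, and the network size and learning rate are all hypothetical and do not describe the models used in the project.

```python
import math
import random


def init_network(n_inputs, n_hidden, rng):
    """One hidden tanh layer and one linear output unit."""
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)]
               for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }


def forward(net, x):
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(net["w1"], net["b1"])
    ]
    output = sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]
    return hidden, output


def mean_squared_error(net, data):
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)


def train(net, data, epochs=500, lr=0.05):
    """Full-batch gradient descent on the mean squared error."""
    n = len(data)
    for _ in range(epochs):
        g_w1 = [[0.0] * len(row) for row in net["w1"]]
        g_b1 = [0.0] * len(net["b1"])
        g_w2 = [0.0] * len(net["w2"])
        g_b2 = 0.0
        for x, y in data:
            hidden, out = forward(net, x)
            d_out = 2.0 * (out - y) / n  # derivative of the loss w.r.t. output
            g_b2 += d_out
            for j, h in enumerate(hidden):
                g_w2[j] += d_out * h
                d_hidden = d_out * net["w2"][j] * (1.0 - h * h)  # tanh'
                g_b1[j] += d_hidden
                for i, xi in enumerate(x):
                    g_w1[j][i] += d_hidden * xi
        net["b2"] -= lr * g_b2
        for j in range(len(net["w2"])):
            net["w2"][j] -= lr * g_w2[j]
            net["b1"][j] -= lr * g_b1[j]
            for i in range(len(net["w1"][j])):
                net["w1"][j][i] -= lr * g_w1[j][i]
    return net


# Synthetic training set: three conditions scaled to [0, 1] (hypothetical,
# e.g. pH, additive concentration, temperature) and an invented yield.
rng = random.Random(1)
data = []
for _ in range(40):
    x = [rng.random() for _ in range(3)]
    y = 1.0 - sum((xi - 0.5) ** 2 for xi in x)
    data.append((x, y))

net = init_network(3, 4, rng)
loss_before = mean_squared_error(net, data)
train(net, data)
loss_after = mean_squared_error(net, data)
print(loss_before, loss_after)
```

With real data, the inputs could be extended by global protein parameters such as sequence length or isoelectric point, so that a single model is trained across several proteins. Any such model would need validation on held-out proteins before it is used for prediction.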