A model combining cell physiology and population genetics to explain Escherichia coli laboratory evolution

Background Laboratory experiments under controlled conditions during thousands of generations are useful tools to assess the processes underlying bacterial evolution. As a result of these experiments, the way in which the traits change in time is obtained. Under these conditions, the bacteria E. coli shows a parallel increase in cell volume and fitness. Results To explain this pattern it is required to consider organismic and population contributions. For this purpose we incorporate relevant information concerning bacterial structure, composition and transformations in a minimal modular model. In the short time scale, the model reproduces the physiological responses of the traits to changes in nutrient concentration. The decay of unused catabolic functions, found experimentally, is introduced in the model using simple population genetics. The resulting curves representing the evolution of volume and fitness in time are in good agreement with those obtained experimentally. Conclusions This study draws attention on physiology when studying evolution. Moreover, minimal modular models appear to be an adequate strategy to unite these barely related disciplines of biology.


Background
The processes that determine how traits change during evolution are still only partially understood. Traditionally, purely observational approaches based on historical records or comparative analysis were used for their study. These strategies have various limitations, the incompleteness of the records available and the lack of control on the environmental conditions being among the most serious obstacles to draw reliable conclusions. This leads, very often, to post hoc selectionist arguments, thereby dismissing the role of contingency [1]. In addition, these arguments usually ignore the evolutionary constraints imposed by the physiological responses of the organism.
An alternative to observational approaches is to perform controlled laboratory experiments. Lenski and co-workers [2] have devised and developed an experimental program to study evolutionary dynamics. This consists in following evolutionary change in replicate populations of Escherichia coli in identical environments. The experiment was started with twelve populations with neither within nor between genetic variation. The environment consisted of a serial transfer regime in which the populations were diluted each day into a glucose medium, sup-porting exponential growth for a limited time. Initially, two properties of the bacterial population were studied, cell size and relative fitness. These properties were followed for more than 10.000 generations and it was found that both increased in time with a dependence that could be reasonably fitted by a hyperbolic relationship. From these studies several very interesting results and conclusions were obtained. For instance, it was found that, although the experiments were done with very large populations that evolved in identical environments, the replicate populations diverged somewhat in both morphology and mean fitness. This would reflect the sequential fixation of beneficial mutations in different orders, demonstrating the crucial role of chance events in adaptive evolution [3]. Variation among populations persists after thousands of generations, even when improvement in mean fitness has slowed down to a very low rate, which is in agreement with multi-peaked fitness landscape models of evolution [2].
The novel laboratory approach, however, does not give a full explanation of the patterns of change that the traits follow during evolution, for example, the parallel increase in cell volume and fitness [4]. This is a rather puzzling behavior because it implies that a larger volume is obtained in a shorter time. In the present contribution we show that to understand this type of evolutionary pattern it is also necessary to consider experimental information related to the internal functioning of the organism. As we shall see, the change of cell volume and fitness during evolution results from the interplay between organismic and population contributions. These contributions are: the constraints that physiology imposes on the effect that mutations have on the phenotype of the cell and the change in time of the distribution of mutations in the population, respectively. Our general strategy is to implement the properties of the bacterial cell in a quantitative model that can be used to study the evolutionary changes. E. coli is, of course, a very complex system and we consider as an adequate model one that only includes the features of its structure, composition and transformations that are essential to reproduce the physiological responses and evolutionary pattern of cell volume and growth rate. These features are described next.

Bacterial model
The organism is delimited from the environment by the cell inner membrane and wall. The amount of solutes that can be accommodated in the cell volume is limited and we consider that the organism operates close to this limit [5,6]. As a consequence, in our model, the total concentration of molecules and macromolecules (C T ) remains constant throughout the cell cycle and during long term evolution. This is consistent with the findings that the larger cells obtained in the physiological conditions for faster growing and during controlled bacterial evolution contain more DNA, RNA and protein per cell [7,8]. Under these conditions, the nutrients are consumed in three different processes. The first one is maintenance, i.e. they are used to replace components that were decomposed or degraded. The second process is growth. Since the total concentration C T is roughly constant, the sole consequence of growth is an increase in cell volume. Finally, the nutrients are consumed to perform several functions that are important for the physiological adaptation of the organism to the environment (e.g. chemotaxis and biosynthesis of antibiotics). The three processes are classified in two modules [9]. The "growth module" includes all the processes that participate in growth and maintenance. When the organism is grown in glucose as unique source of carbon all primary metabolism in operation belongs to this block. The "adaptation module" nucleates the processes whose function would be to contribute to physiological adaptation, for example, catabolic processes that go into action when the source of carbon changes. In addition, a very small proportion of the cell mass is composed of molecules that act as signals. Their amounts or concentrations show changes during cell cycle which are responsible for the initiation of important processes such as DNA replication and cell division.
We assume that the form of the bacteria is cylindrical (Fig. 1). Its growth is longitudinal, i.e. the length increases maintaining a roughly constant section [10][11][12]. The surface area, a, can be expressed in terms of the variable volume v that it encloses by the relationship: a = (πd 2 /2) + 4v/d, where d is the constant diameter. The concentration of the nutrient external to the cell X is a parameter and its internal concentration x is a variable. The number of molecules of nutrient inside the cell, n, can be calculated as n = xv.
In the real cell, the growth and adaptation modules are very complex systems of biochemical reactions regulated and coordinated by an intricate network of signaling interactions. But, for our purposes, we will summarize the rate of consumption of the nutrient by each one of these blocks using a single rate equation. The laws that we have chosen for the rates of the processes of incorporation (r i ) and consumption (r g and r a for growth and adaptation, respectively), expressed in number of molecules units per unit time, are very simple: r i = k i (X / (K X + X)) a, r g = k g x v and r a = k a x v, where k i , k g , k a and K X are parameters. Note that r i ∝ a while r a ∝ v and r g ∝ v because nutrient incorporation is a surface process while nutrient consumption is a volume process. The incorporation rate law shows a simple saturation kinetics with respect to the external nutrient concentration, with Michaelis-Menten constant K X . In the particular case that the bacteria is grown in glucose, the fact that this molecule is phosphorilated concomitantly to its transport by phosphotransferase systems makes the incorporation process virtually irreversible [13]. The rates of consumption r a and r g are taken to be proportional to the concentration of the internal metabolite. The values of k i , k g and k a depend on the concentration of functional proteins allocated to each one of the processes and their specific activity. The rate of change of the number of molecules of nutrient inside the cell, n, in time, dn/dt, is given by: To describe the behavior of n and v in time we need an equivalent equation for dv/dt. Next we will show how this equation is obtained.
We have considered that the total concentration of molecules and macromolecules inside the cell (C T ) is constant in time. Let us now analyze a mechanism by which the cell could achieve this. The constituents of the cell are produced by the growth module, at a rate r g , and are degraded by several processes (thermal motion, proteolisis, etc.). The rates of degradation are lumped together in a single rate (r d ) which is proportional to the total amount of molecules C T v, i.e. r d = k d C T v. If r g is greater than r d , i.e. molecules are produced faster than degraded, an in-crease in volume can compensate for the difference of rates and prevent the increase in the total concentration C T . This balance can be expressed by the equation: Replacing the rates by their corresponding expressions and dividing both members by C T we obtain: Equations (1) and (2) form a system of differential equations whose solution gives the temporal behavior of the cell volume, v, and the number of molecules of nutrient inside the cell, n.
It still remains to implement in the model the way in which the cell determines the timing of division. For this purpose, we will consider a third variable: the number of molecules of a signal s. In the cell, a candidate for this signal is the DnaA protein coded by the dnaA gene [14,15]. Briefly, DnaA binds specifically to regions of the chromosome, the dnaA boxes, in the neighborhood of oriC, the unique origin of replication of E. coli's chromosome. The binding of about 20 molecules of DnaA sets the path to replisome assembly and initiation of replication. When multiple origins are present in the cell, the DnaA molecules "ejected" from one origin after initiation of replication are used for the initiation at other origins. An important element in DnaA regulation is its auto-repression loop, consisting on a dnaA box between the two promoters of dnaA, which once occupied by the DnaA protein does not allow transcription of the gene. Therefore, in the model we consider that s increases until, at the time t u , it reaches a threshold value s u when replication of DNA starts and production of the signal ceases. At this point, the molecules of signal are "ejected" from the replication origin and could be used to initiate another replication process.
The signal is produced and degraded at rates r gs and r ds respectively. These are governed by rate equations of the same type as those used for the rates r g and r d (related to C T ). The differential equation describing the change of s in the interval of time between cell division and the moment at which the signal reaches the threshold value s u is given by:

Figure 1
Minimal modular model of E. coli. X is the external nutrient concentration, n the number of molecules of nutrient inside the cell, v the cell volume and s the number of molecules of signal. n, v and s are variables, X is a parameter. The nutrient is incorporated in the cell at a rate r i and is consumed at rates r g (for growth) and r a (for adaptation).
After the value s u is attained at the time t u , s remains at the threshold value until cell division has been accomplished.
The cell divides a time t * after initiation of replication. The doubling time t d and the growth rate µ therefore are: t d = t u + t * and µ = 60/t d . Note that although we do not specify units (because as we shall see, our results are independent of the units used) we keep in the expression for µ the factor 60 usually used to convert minutes to hours. Finally, to implement the hyperbolic physiological relationship between µ and X [16], we assume that t * depends on X according to the same relationship, i.e., t * = 60/µ * where µ * = k * X / (K * + X).
The behavior of n and v in the cell cycle is obtained by solving the equations (1) and (2) in the interval between two cell divisions. When the doubling time is reached, the organism divides in two identical daughter cells and, at this point, n, v and s are reduced to one half of their value, the internal nutrient concentration (n/v) and the total signal concentration (s/v) not being altered.

Cell cycle behavior
In Fig. 2, we plot n, v, s and the concentrations n/v and s/v in time for three cell cycles. The number of molecules of the internal nutrient (n) shows two phases of growth during each cycle (Fig. 2a). In the first phase, which begins immediately after the cell divides, n grows very rapidly, the volume v remaining essentially unchanged (Fig.  2b). As a consequence, during this short period of time the internal nutrient concentration (n/v) increases abruptly until it reaches a maximum (Fig. 2d). In the second phase, n increases more slowly with a time dependence that can be described by an exponential function. During this period, v increases (in relative terms) more than n, resulting in a decrease in n/v. The increase in v also shows exponential time dependence, as is usually considered in the literature [17,18]. The number of molecules of signal s increases until it reaches the threshold value s u (Fig. 2c). After this time, s remains constant, but its concentration decreases due to the increase in volume (Fig. 2e). The result is a saw-tooth oscillatory behavior.
Another characteristic of the model is that the volume grows, in principle, indefinitely if cell division is not imposed (not shown). This property is consistent with the existence of division mutants, for example fts Z mutants, which form filaments without visible sign of constriction and attaining sizes of several hundreds of micrometers [19,20]. On the other hand, if the catalytic capacity to produce the signal k gs is increased the volume decreases and the growth rate increases (not shown). This has been experimentally achieved by incorporating in E. coli an inducible plasmid carrying the dnaA gene obtaining the same result, i.e. smaller, faster growing cells [21].

Physiological responses
The model reproduces relevant experimental features of bacterial physiological responses. In Fig. 3a we represent the response of the growth rate µ to changes in the nutrient concentration X. The hyperbolic relationship obtained coincides with the well-established dependence [16]. The mean cellular volume also changes with X. To facilitate comparison with data available we plot in Fig. 3b vs. µ. The curve represented in the figure is the best exponential fit of the points obtained from the model, and is consistent with existing experimental results [19,22].
Traditional experiments with bacteria have shown that when the external nutrient concentration is changed cell size and growth rate follow transient trajectories, that require different periods of time to reach the new cell cycle regime [23,24]. The model was not built to account for these transients, but to describe the properties of the initial and final physiological states that are those relevant to evolution.

Phenotypic evolution
The starting point of our analysis on the evolution of E. coli, in glucose, is the experimentally observed decay of metabolic pathways that degrade other substrates absent in the medium [25]. Evolution in controlled conditions would inevitably produce the decay of many functions for which there is no selection pressure. In our model these functions belong to the adaptation module, because they are not used for growth and maintenance.
Their decay implies that the total activity of the adaptation module k a decreases in time. The question is whether and how the decay of unused functions produces the increase in cell volume and fitness found in evolutionary time scale [2]. The explanation requires combining cellular and population approaches. Cell physiology is used to describe how the decrease in the adaptation module of an organism produces the phenotypic change observed. Population genetics characterizes the mutational decay in time of the adaptation module in the population.
In Fig. 4 we plot the effect that a change in k a has on mean cell volume ( ) and growth rate (µ), both evaluated relative to the corresponding values shown by the ancestor ( and µ 0 ). We see that both variables increase when k a decreases, i.e. when the adaptation module decays. These dependencies can be described by the equations: It may be readily shown that the ratio of the growth rates (µ/µ 0 ) is equivalent to the definition of relative fitness W used by Travisano and Lenski (i.e. W = log (E f /E i )/ log (A f /A i ), where subscripts i and f denote initial and final values for the densities of the derived (E) and ancestral (A) populations, growing in direct competition) [26].

Figure 3
Physiological responses to changes in X. A. Growth rate (µ) vs. X. The points (•) were obtained solving equations (1) to (3) for different values of X (the values of the other parameters as in Figure 2). They were fitted to a rectangular hyperbola: µ = µ max X/(K µ + X) using the transformation: X/µ = X/µ max + K µ /µ max . The resulting hyperbola (µ max = 2.108 and K µ = 3.586) is the curve (-) joining the points. B.
vs. µ. To explain how k a evolves in time we will use a minimal population genetics model (see for example [27]). We consider that in the environmental conditions in which evolution is taking place the adaptation module has some dispensable pathways that decay. The genes coding for their proteins are in two allelic forms: functional (F) or non-functional (f). r Ff and r fF are the substitution rates from F to f and from f to F, respectively. Initially, all the population is in the functional allelic form F, its initial frequency being equal to one (p o = 1). The frequency of F, p t , decreases in time according to p t = p eq + (p op eq ) (1 -r Ff -r fF ) t where p eq is the equilibrium frequency given by p eq = r fF /(r Ff + r fF ). This model of forward and backward substitutions in the adaptive module is unrealistic in two respects: it assumes a single locus and condenses all the mutation-selection pressures in two substitution rates. Both these assumptions are clearly an oversimplification, based on the empirical data [2,3,25,29]. Nonetheless, this minimal model gives a dynamic of initially rapid decay of the physiological adaptation module, which is consistent with the observed evolutionary process [25], as is shown next. The value of k a at time zero of the evolution experiment is k ao and the value after all the dispensable functions have decayed k a∞ . Therefore, the time evolution of k a is: Introducing equation (5) into equations (4) we obtain the evolution of relative cell volume and growth rate in time. The resulting curves are represented in Fig. 5 to-gether with data available [2,28]. It is important to note that our approach was not conceived to describe the fine step-like dynamics reported for cell volume and fitness [2,3], but the coarse pattern of change of the traits (in accordance with the philosophy implicit in minimal modeling). For example, the model was not constructed to describe the reported high rate of appearance and fixation of rbsmutants, associated to ribose catabolic deficiency, in the first 2,000 generations [29]. Two additional points deserve particular comments. The first one is that we have represented the cell volume relative to the volume of the ancestor for both model and experiments. In this way the quantity compared is adimensional and independent of the units used. The second comment is that, as was mentioned above, µ/µ 0 is equivalent to fitness (W) and therefore these two adimensional quantities can be directly compared. We conclude that the model succeeds to explain the increase in cell volume and fitness observed in laboratory evolution.
Fitness compares the relative rates of increase in cell numbers when the two populations compete for the same resources, while in this context yield represents the number of cells produced by a population when grown in isolation. It was shown [12] that although the derived population has a higher fitness than the ancestor, the yield of the former was lower than the yield of the latter. This is because the cells of the derived lines are larger, implying that fewer of them can be produced with the fixed amount of glucose available during the 24 hours culture cycle. Importantly, the product of the number of cells times the cellular volume was greater in the derived population than in the ancestor indicating that the v v 0 evolved bacteria are more efficient in converting glucose into cell mass. The model here proposed accounts for this increase in efficiency during evolution. Our calculations indicate that cell volume per unit of nutrient incorporated into the cell increases when the adaptation module decays (results not shown), which is simply due to the fact that nutrient previously consumed for adaptation can be used for growth.
The model also succeeds to reproduce (qualitatively) the experimental evolution of the physiological relationship between cell size and growth rate reported in Fig 1 of Mongold & Lenski [30]. We have compared the curve vs µ for k a = 4.5 corresponding to the ancestor genotype (represented in Fig. 3) with the curve calculated for k a = 1.95 corresponding to the derived genotype (not shown). We found that the two genotypes have similar cell sizes at very low growth rates, but their differences become progressively greater when the growth rate is increased which is in agreement with the experiments reported.

Discussion
The parallel increase in cell volume and fitness in E. coli's laboratory evolution is a rather unexpected result. Using simple scaling arguments one would have expected that a smaller organism, having a greater surface to volume ratio and therefore a greater input of nutrient per unit volume, would have grown faster. However, this "black box" reasoning is not sufficient to explain the experimental evolutionary pattern obtained. As we have shown, for this purpose, it is required to make the organism transparent, considering the main features of its internal functioning. The essence of our explanation is as follows. The decay of unused functions during E. coli controlled evolution [25] decreases the consumption of nutrient for physiological adaptation purposes. More nutrient is therefore available for growth. Since the organism grows maintaining its composition approximately constant, the increase in the amount of nutrient available results in an increase in volume. The rate at which the signal is produced also increases with the amount of nutrient available for growth, the threshold value of the signal being reached in a shorter interval of time. As a consequence the doubling time decreases and the growth rate increases, what completes the explanation for the parallel increase in cell volume and fitness.
In the model, the increase in the activity of the transport system, achieved by an increase in k i or a decrease in K X , also produces a simultaneous increase of cell volume and growth rate. In the experiments, the evolved lines grow faster on substrates different from glucose that use the phosphotransferase systems suggesting an enhanced transport capacity [31]. Therefore, the parallel increase in cell volume and fitness found in laboratory evolution could be the result of the combined effects of increased transport and reduced catabolic functions. However, note that in the conditions of our model, the increase in the activity of the transport system alone can not account for the experimental increase in the efficiency to convert nutrient into cell volume, described in the previous section. Finally, our simulations show that an increase in k g , i.e. the activity of the growth module, could also contribute to the simultaneous increase in cell volume and v v 0 v growth rate during evolution, but such an effect has still no experimental support.
A key feature of the model is the condition for initiation of replication, namely, that the number of molecules of signal reaches a fixed threshold value. If, alternatively, we had chosen as threshold condition not the number but the concentration of molecules of signal (for example for the parameter values of Fig. 2) the decay of the adaptation module would have resulted in an increase of cell volume and a decrease of growth rate. This is because when the adaptation module is reduced v and s both grow faster but the concentration s/v grows slower, the time required to reach the concentration threshold value being increased. The result is a decreased growth rate. A scenario where the amount of molecules of signal that is bound to DNA does not depend on its intracellular concentration has physicochemical support. It has been found experimentally and interpreted theoretically that the number of units of a ligand bound to DNA may be highly insensitive to a wide range variation of its concentration. This could be the case, for example, when the binding forces involved are predominantly of an electrostatic nature [32,33].
The model that we have proposed does not include a realistic number of variables or elaborated laws for the rates of the processes, what would have resulted in theoretical curves more finely tuned to the experiments. In contrast, our strategy was to aggregate the thousands of known chemical components and transformations in the minimum number of modules that are required to reproduce the pattern of increase in cell volume and fitness in E. coli's laboratory evolution [34]. In this way, the essential processes that determine the evolution of the traits are not hidden behind peripheral information.
The nutrient internal to the cell is consumed in a branch point by the growth and adaptation limbs (Fig. 1). At first sight, these would appear to compete for the nutrient available. However, in an environment that changes in an unpredictable way, both processes are necessary for the survival of the organism [35,36]. In addition, the fact that unused functions decay during evolution reduces nutrient consumption for physiological adaptation, saving nutrient that is used for growth. This change in the proportions of resources allocated to growth and adaptation in the new medium increases growth rate and fitness, the mutations responsible for this change being fixed by selection. As a consequence, one would expect that, if the model gives a good representation of the evolutionary process, the mutations causing the decay of the unused functions in the laboratory experiment also become fixed and maintained by selection. This is in agreement with the finding that the mechanism of antagonistic pleiotropy (consisting on mutations that were detrimental in the previous environment and are beneficial in the new one) appears to be more important than the neutral process of mutation accumulation for the decay of unused catabolic functions [25].

Conclusions
The model here presented succeeds in reproducing the parallel increase in cell volume and fitness in E. coli laboratory evolution and other important changes observed in derived strains. This illustrates that to explain the patterns of change of some traits it is necessary to integrate in a single approach the very different time scales of physiology and evolution.

Materials and Methods
Calculations were performed using the program Mathematica (Wolfram Research Inc.).