- Research article
- Open Access
From molecular to cellular form: modeling the first major transition during the arising of life
BMC Evolutionary Biologyvolume 19, Article number: 84 (2019)
It has long been suggested that Darwinian evolution may have started at the molecular level and subsequently proceeded to a level with membrane boundary, i.e., of protocells. The transformation has been referred to as “the first major transition leading to life”. However, so far, we actually have little knowledge about the relevant evolutionary mechanisms – and even about the plausibility – of such a transition. Here, based upon the scenario of the RNA world, we performed a computer simulation study to address this issue.
First, it was shown that at the molecular level, after the spread of one ribozyme (RNA replicase), another ribozyme (nucleotide synthetase) may emerge naturally in the system, and the two ribozymes would cooperate to spread in the naked scene. Then, when empty vesicles absorb the two ribozymes via “cytophagy”, the resulting protocells may spread in the system and substitute the naked ribozymes. As for the driven power of such a transition, it was demonstrated that the membrane boundary’s roles to ensure the cooperation between the two ribozymes and to prevent invasion of parasites are important. Beyond that, remarkably, it was found that another two factors may also have been significant: a possibly higher mobility of the raw materials in the environment (free water) and the protocells’ potential capability to move around actively. Finally, the permeability of the membrane to raw materials was shown to be a major problem regarding the disadvantage for the cellular form.
The transition from the molecular level to the cellular level may have occurred naturally in early history of evolution. The evolutionary mechanisms for this process were complex. Besides the membrane boundary’s roles to guarantee the molecular cooperation and to resist parasites, the greater chance for the protocells to access raw materials – either due to the diffusion of raw materials outside or the protocells’ active movement, should also be highlighted, which may have at least to an extent compensated the disadvantage regarding the membrane’s blocking effect against raw materials. The present study represents an effort of systematical exploration on this significant transition during the arising of life.
The emergence of life on our planet is attractive to scientists because it represents one of the most amazing phenomena in nature. However, related experimental researches turn out to be difficult. Indeed, people’s ambitious intention to demonstrate similar processes in laboratory is still far from success – endeavors along this line were initially marked by the famous Miller-Urey experiment in 1950s . A major difficulty comes from the fact that the origin of life is not only a problem of chemistry, but also a problem of evolution . That is, any evolutionary “fragment” involved in the process may have lasted a long time in history (we actually have no idea on the corresponding time span, perhaps years, hundreds of years, or even much longer), and is difficult to mimic or study in the lab. Thanks to the development of computer technology (particularly the improvement of computational capability), it is now becoming more and more effective to tackle the evolutionary aspect through in silico simulation [3,4,5]. Expectably, the combination of the experimental and the theoretical efforts may greatly promote our understanding on the origin of life, the fascinating problem.
If we do not consider the so-called “chemical evolution”, the evolutionary aspect of the origin of life is just about the preliminary stage of Darwinian evolution. Indeed, Darwinian evolution is an essential feature of the life phenomenon [6,7,8], and thus its beginning stage constitutes a significant target for the research of the origin of life. In fact, there has already been quite a lot of theoretical work in this area. For example, it has long been suggested that the earliest Darwinian evolution should have run at the molecular level, manifested by molecules capable of replicating (thus, named “replicators”) [9,10,11]. Then, there should be a transition to the stage of protocells, which have a membrane boundary to hold the replicating molecules together (becoming “reproducers”) [3, 11]. This transition has been referred to as “the first major transition leading to life” [11, 12]. These theoretical studies, with the aid of computer simulation, contributed a lot to our understanding on the earliest Darwinian evolution. However, they generally based on quite abstract models, which limit their relevance to concrete scenarios regarding the origin of life.
The idea of the RNA world hypothesis , which has gained a lot of evidence and become popular in the field of the origin of life [14,15,16,17], was original raised to solve an “egg-chicken” paradox: if both DNA and proteins are indispensable for life, then which came first in the process of evolution? That is, RNA came first, playing both the roles of DNA (genetic material) and proteins (functional material), and was substituted later in the evolution (because DNA is more stable and proteins are more chemically active) [13, 14]. However, the mentioned paradox is ultimately concerning the starting of Darwinian evolution because, as we know, genetic feature and functional feature are just the two fundamental requirements for Darwinian evolution. In other words, the idea of the RNA world actually provides a solution for the proceeding of Darwinian evolution based upon only one sort of material, thus rendering the starting of Darwinian evolution much “easier” (see  for a detailed discussion). However, whether this “world” really stood at the start point of life’s Darwinian evolution is often doubted owing to our ignorance about the prebiotic chemistry which might have launched the RNA world [14, 19]. The relevant difficulties once appeared so great that they were summarized as “the prebiotic chemist’s nightmare” ; and such that, alternative scenarios adopting RNA-like polymers (also acting as both genetic and functional material) have been suggested, as “pre-RNA worlds” [19,20,21].
In fact, the RNA world provides a concrete scenario concerning the origin of Darwinian evolution, and is a best target for theoretic work in this area . Even if a pre-RNA world might turn out to have stood at the start point instead, fundamental mechanisms (in line with “one material dual roles”) remain the same, and the results/conclusions for the RNA world would be easy to “transplant” to this RNA-like world. Therefore, the suspense regarding the de novo origin of the RNA world should not prevent us from modeling this scenario, which aims at investigating detailed mechanisms involved in the earliest Darwinian evolution. On the other side of the coin, it is worth noting that the de novo origin of the RNA world has recently gained strong experimental supports, owing to the reveal of the “new route” of nucleotide synthesis which is “prebiotic plausible” [22,23,24]. Indeed, as we have seen, now theoretic researchers in this area are increasingly associating their fundamental ideas with the concrete scenario of the RNA world [4, 5].
In the scenario of the de novo origin of the RNA world, it was popularly assumed that a functional RNA that can catalyze its own replication (by template-directed RNA synthesis) – may have emerged first from a prebiotic nucleotide pool [14, 19,20,21]. There have been long-standing efforts to construct such an RNA “replicase” by in vitro molecular evolution [25,26,27,28]. Recently, it was reported that an RNA polymerase ribozyme had been gained capable of copying an RNA template even longer than itself (about 200 nt) . Admittedly, the copying is still limited to special templates – that is, the ribozyme is yet not able to catalyze its own copying; and in addition, a ribozyme about 200 nt seems too long to be able to appear de novo. Alternatively, a ribozyme catalyzes the synthesis of the building blocks of RNA, i.e., nucleotide synthetase ribozyme , may have emerged first in the RNA world because it may also favor its own replication . This idea deserves serious consideration especially when considering that it has now been shown that RNA may undergo non-enzymatic template-directed synthesis rather efficiently [32,33,34] (that is, the RNA replicase is not absolutely necessary for the replication of other RNA species ).
No matter how, these two functional ribozymes, which favors the replication of RNA from different aspects, may have both emerged in the “prebiotic pool”, and cooperated with each other. This possibility was first explored by our group  and then analyzed in details by Higgs’ group  (both by computer simulation, but based upon different models). In the Higgs and coworkers’ study, it was shown that the coexistence of the two ribozymes, though may be evident, is sensitive to quite a few factors. They concluded that “a system of many types of ribozymes would be difficult in a purely spatial model” and then “encapsulation of strands in protocells would have been important fairly early in the history of life”. The conclusion is generally in agreement with our experience and results in previous studies – this is perhaps a little surprising when considering that the type and details of our model are quite different from theirs, which implies the robustness of the conclusion. It seems that before further evolution (i.e., the emergence of more ribozymes), protocells should “show up”.
Therefore, an interesting event – in the next step – is about the emergence of protocells containing the two ribozymes from the naked scene with the two ribozymes. As mentioned above, the transformation from the molecular form to the cellular form was suggested to have been the first major evolutionary transition in life’s history [11, 12]. Obviously, detailed exploration of such a corresponding event in context of the RNA world would provide us an opportunity to check into the evolutionary dynamics of this “revolutionary transformation”.
The present model is of the same type as those used in our previous studies on the scenario of the RNA world [31, 36, 38,39,40,41,42,43]. Here we introduce the model in a concise way first, and then explain the details of its implementation in “Methods” at the end of the paper.
For the naked scene
We assume a two-dimensional system, with an N × N square grid (with toroidal topology to avoid edge effects), in which molecular objects are distributed, including the raw materials to synthesize nucleotides (namely nucleotide precursors, in quotient of nucleotides), nucleotides, and RNAs. Only molecules within the same “grid room” are possible to interact with each other within one time step.
A nucleotide precursor may transform to a nucleotide (randomly as A, U, C, or G), and a nucleotide may also decay into a nucleotide precursor. Nucleotides may join to each other, forming RNAs. Phosphodiester bonds within an RNA chain may break. A nucleotide residue at the end of an RNA chain may decay into a nucleotide precursor. An RNA may act as a template and attract substrates (nucleotides or oligomers) by base-pairing. Substrates aligned adjacently on the RNA template may be ligated to each other. The substrates (or the products) and the template may detach from each other if the base pairs between them separate. A molecule may move to an adjacent (upper, lower, left, or right) grid room. Notably, in the simulation, because each event in the model system occurs with a certain probability within one time step, such a step is also called a “Monte Carlo step” (all entities updates for their relevant events in a Monte Carlo step).
An RNA containing a characteristic catalytic domain (presumed arbitrarily) may function as a ribozyme. The RNA replicase (REP) and the nucleotide synthetase ribozyme (NSR) have different catalytic domain sequences. A short specific sequence (also arbitrarily presumed) at 5′ end of an RNA is assumed to be a tag. REP may bind onto a template by recognizing the tag sequence, and act as a template-directed polymerase (that is, when REP is at working, only mononucleotides – not oligonucleotides – are able to be incorporated successively). In fact, a REP recognizes the tag sequence by base-pairing, and thus, in addition to its catalytic domain, it should also contain a tag-recognizing domain, which is complementary to the tag sequence (see Methods for details).
For the scene involving protocells
Based on the model for the naked scene, three new kinds of objects, amphiphiles (i.e., membrane components), the raw materials to synthesize the amphiphiles (namely amphiphile precursors, in quotient of amphiphiles) and protocells, are introduced into the system. Each protocell occupies a single grid room.
An amphiphile precursor may transform to an amphiphile, and an amphiphile may decay into an amphiphile precursor. Amphiphiles (with a lower limit of quantity) may assemble into membrane at the edge of a grid room, encompassing molecules within it and forming a “protocell”. A protocell may also break, with its membrane components turning into free amphiphiles. A free amphiphile may join into a membrane, and an amphiphile on a membrane may leave as well. An amphiphile on a membrane may also decay into an amphiphile precursor, which then leaves the membrane. Nucleotides and RNAs are assumed to be impermeable, whereas nucleotide precursors and amphiphile precursors may diffuse across the membrane. A protocell may engulf the substance (all molecules) in an adjacent room. A protocell may divide into two, and two adjacent protocells may also fuse into one. A protocell may move to an adjacent room. When a protocell divide or move towards an adjacent room, the molecules within the latter are pushed away. Figure 1 is a scheme describing this model. The model for the naked scene (mentioned above) is a simplified version of this one.
From the spread of a REP to the co-spread of the REP and an NSR
Here, it should be mentioned that we do not insist either REP or NSR should have emerged first in the naked scene, and the focus of our study is the emergence of protocell from the naked scene with the two ribozymes. However, choosing a route in which one ribozyme emerged and the other followed is helpful for us to envision a complete story about the origin of the RNA world. Additionally, modeling from the very beginning will provide clues for our later analysis on the emergence of protocells, e.g., about the movement of molecules, which is an important factor associated with the effect of membrane boundary (as we will see in the following). To this end, here we followed the classic scenario: “the REP first”, the same as adopted in the work of Higgs’ group . Note that to be more comparable to their work, we adopted a polymerase-typed REP here, instead of a ligase-typed one, which had been adopted in our previous studies [36, 40]. Indeed, in a recent work, using a model of the same type as used here, we have shown that a polymerase-typed REP can spread (becoming thriving through replication – overcoming the parasite problem) in the naked system provided with the so-called “tag mechanism” .
Nucleotide precursors of a certain quantity (TNPB; see Table 1 for the descriptions of parameters) are introduced in the very beginning. They may transform to nucleotides (with probability PNF), and the nucleotides, in turn, may decay into nucleotide precursors (with PND). A balance between nucleotides and their precursors may be reached soon (Fig. 2a). The balance ratio is determined by the relative scale of PNF and PND. According to the model, nucleotides may join with each other end-to-end (with PRL) and form RNAs. One REP molecule may appear this way by chance; and then by non-enzymatic template-directed copying, derive its complementary chain. The separation of the two chains would give rise to a ribozyme and a template (the complementary chain), thus launching the REP-catalyzed template-directed replication. Theoretically, the template (the REP’s complement or a second REP) molecule may also appear by random ligation, though less likely. No matter how, such de novo emergence of ribozymes has been demonstrated in our previous modeling studies to be feasible [31, 40]. Here, to focus on our current topic, we omit such a modeling process, which is arduous. Instead, we “inoculate” a certain number of REP molecules into our system (Fig. 2b, see arrow) to observe whether they could spread. The outcome of the simulation is in agreement with the one in our previous study  on the tag-mediated polymerase-typed REP (though the details of the tag mechanism adopted here is a little different, see Methods).
During the spreading of the REP, we see that both nucleotides and their precursors shift towards a lower level (Fig. 2b). This is not surprising, because they are exploited by the REP for its replication – directly the nucleotides, and then their precursors due to the PNF event. However, there could still be a lot of the raw materials, e.g., here more than a half of the initial nucleotide precursors remain unused even after the REP reaches its “thriving balance” (Fig. 2b, and also Fig. 2c for the long-run situation). In fact, this is owing to our assumption of a low PNF value. Figure 3 shows several different cases which derive from the case of Fig. 2 by changing the PNF value at a middle step. Indeed, a higher PNF would obviously favor the thriving of the REP, which should be attributed to a higher rate of exploiting the raw materials in the system. However, a prebiotic route to produce nucleotides is not likely to have been quite efficient. Thus, there should have been an obvious selective pressure for the emergence of a ribozyme catalyzing the formation of nucleotides from their precursors, i.e., NSR.
So we inoculate a certain number of NSR to observe whether it can spread in the REP-thriving system (Fig. 4). The result is consistent with our expectation – the NSR co-spreads with the REP in the system. In comparison with the situation before the inoculation of NSR, nucleotides increase and nucleotide precursors decrease, because the NSR functions to synthesize nucleotides from their precursors. Generally, the fluctuation (increasing/decreasing) of nucleotides is synchronous as the fluctuation of NSR. Additionally, we can observe a clear reverse-synchronicity between the REP and NSR – that is, when the REP increases, the NSR decreases, and vice versa. This is a clear indication of the competition between the two RNA species, which is easy to appreciate, given that both of them exploit the same building blocks in their replication. On the other hand, notably, such a type of oscillation is also an indication of cooperation between the two ribozymes – when the REP rises to a certain high level (relative to its own oscillation) and the NSR descends to a certain low level (relative to its own oscillation), why this tendency does not continue? The reason is that the bottle-neck reaction would then become the nucleotide synthesis, and the NSR would be more favored – it would rise back, supplying nucleotides for both of the RNA species. Similarly, when the NSR rises to a high level and the REP descends to a low level, the bottle-neck reaction would become the template-directed synthesis of RNA – the REP would rise back, catalyzing the template-directed copying for both of them. Indeed, we can see the sign of cooperation more than competition from another angle – even when NSR, as a new RNA species, spreads in the system, the balance level of the REP generally does not descend – instead, the raw materials are exploited towards a greater extent (this is an evolution towards a more thriving living world). To show the cooperation more clearly, we investigate the influence of PNF to the co-spread system (Fig. 5). In accordance with the results above (Fig. 3), a higher PNF benefits the thriving of REP. However, NSR is obviously disfavored when PNF becomes sufficient high, because its function may become dispensable – the cooperation would no longer be so important and the REP may “fly solo”.
Having known that NSR might be able to spread in a REP-thriving scene, it would then be attractive to demonstrate a corresponding natural evolution involving the de novo appearance of NSR in the scene. However, the efforts to establish de novo events are usually arduous (as mentioned above concerning the de novo appearance of the REP). Indeed, the thriving of REP would bring out a lot of RNA chains, shorter or longer, with different sequences, which derive from REP’s mutation, degradation, partially replication – and subsequent random ligation or recombination; these would provide a better chance for the NSR to appear de novo in this scene than its de novo appearance from a nucleotide pool. Nonetheless, we found that the exploration of the de novo appearance of NSR involving such random events remain difficult, and as we can understand, to insist a near-reality situation at this point would make little sense. Therefore, here we assumed that the characteristic sequence of the NSR is just one nucleotide residue different from that of the REP, and thus NSR may derived from REP just by a single point mutation during replication. Then we tested different cases by changing the random seed of our simulation to see if the NSR can appear naturally in a REP-thriving system. Finally, we observed such cases (e.g., Fig. 6a; and see Fig. 6b for the corresponding snapshots of the spatial spreading). One case like this, though not likely to be possible in reality because of the merely one-nucleotide difference between the two ribozymes, does, at least, vividly show the obvious selective pressure mentioned above – for the emergence of NSR after the thriving of REP.
About the movement of molecules
Before moving on to explore the process concerning the emergence of protocells, it would be valuable to check into the problem of molecular movement in the naked scene – for us to understand the significance of the protocell’s membrane. By theoretical studies, it has long been shown that spatial limitation is crucial for the REP-like replicators to resist parasites in the naked stage [3, 4, 44, 45]. About the actual environments, for example, it was imagined that mineral surfaces [46,47,48,49], lacunose ices [50,51,52], or porous rocks [53,54,55] may have provided the spatial limitation and become the hatchery of early life on the earth. However, in reality, the spatial limitation should be a “big concept”, which is associated with the movement of the ribozymes, the movement of parasites, and the movement of raw materials, as a whole. In our system, indeed, these different kinds of molecular movement are all bound with a basic probability: PMV, which is associated with the “big spatial limitation”. While a nucleotide precursors or a nucleotide may move with PMV, an RNA molecule may move with PMV/m1/2, where m is the mass of the RNA, relative to a nucleotide (see Methods for a detailed explanation). That is, for example, when the basic probability PMV increases, the raw materials (nucleotide precursors), the ribozymes and the parasites, all of them, would “simultaneously” move faster. However, considering that the three kinds of molecular movement would mean different things for a membrane boundary, we decided to analyze them separately here.
Firstly, we investigated the situation for REP alone. A low PMV (as a whole) disfavors the thriving of the REP, but a PMV too high would cause large-scale invasions of parasites (i.e., other RNAs which can also exploit the REP; here they arise de novo in the system), which threatens the stability of the REP’s thriving (Fig. 7a). Such unstability may manifest as a kind of oscillation – after a large-scale invasion of parasites, the REP may recover; but then a second turn of invasion may occur, and the REP descends again (e.g., Fig. 7a, the red line; in fact, similar oscillation can often be observed for predators and preys in an ecosystem). Then, only the change of the movement of raw materials (nucleotide precursors) is allowed (Fig. 7b). Obviously, a higher rate of raw materials’ movement favors REP’s thriving, which should be on account of the greater accessibility of the ribozyme to the raw materials in its replication. When only the movement of the REP is allowed to change, we can find that a faster movement of the ribozyme is also welcome, which should be associated with the issue of accessibility to raw materials as well (Fig. 7c). Combining these two results, that is, it appears that the disadvantage of the low PMV (Fig. 7a) could just be ascribed to the less accessibility of the REP to the raw materials in its replication. When only the movement of parasites is allowed to change, somewhat unexpectedly, no obvious tendency is observed (Fig. 7d). In other words, the invasions of parasites at a high PMV (Fig. 7a) cannot be attributed to the high dispersal of the parasites themselves, but should be on account of the high dispersal of the whole system. For example, we can imagine that if the ribozyme molecules are limited in space locally, the parasites themselves might be difficult to spread globally, even if they can “wander” freely elsewhere (they cannot find “REP” there).
Secondly, we investigated the situation for the co-thriving of REP and NSR. The influence of PMV, as a whole, is similar to the situation of REP alone. A low PMV disfavors the thriving of both REP and NSR, but a PMV too high would cause large-scale invasions of parasites, which threatens the stability of the two RNA species’ thriving (Fig. 8a, note that in the red-line case NSR is completely destroyed during a parasite invasion). When only the moving rate of raw materials (nucleotide precursors) is changed, a little unexpectedly, the situation is similar to that of the whole PMV (Fig. 8b). That is, the influence of the movement of raw materials is obviously stronger than the situation for REP alone (Fig. 7b) – here, a greater mobility of raw materials would result in a higher level of REP, and may even lead to parasite invasions (we see both the oscillation of REP and NSR caused by the invasions; Fig. 8b, the red and orange lines). The reason should be that the introduction of NSR enhances the effect regarding the accessibility to nucleotide precursors since the function of this ribozyme is just to increase the rate of exploiting nucleotide precursors – to synthesize nucleotides, the building blocks of RNAs. Additionally, we can notice that NSR is more sensitive to the decrease of the raw materials’ mobility than REP (Fig. 8b), which is also on account of the NSR’s function – a less availability of these nucleotide precursors might sharply reduce this ribozyme’s advantage. When only the moving rate of the ribozymes is changed, the effects are a bit complex (Fig. 8c). A faster movement of the ribozymes, generally, favors the REP (but perhaps not so straightforward when compared with the situation of REP alone, see Fig. 7c). However, a greater mobility of ribozymes disfavors the thriving of NSR clearly. This phenomenon should just represent the requirement of proximity in the cooperation of the two ribozymes. If the ribozymes move too fast, it might be difficult for NSR to “catch up with” REP – but the REP, from the very beginning on, can “fly solo” in our system (e.g., related to the assumptions in regard to the parameter setting). Certainly, losing the help of NSR, REP also decrease to a level (see the red line) – just accounting for the point “not so straightforward” mentioned above about the REP’s reaction to a faster mobility of ribozymes. Finally, when only the moving rate of parasites is allowed to change, interestingly, we see that, different from the REP alone case (Fig. 7d), a movement of parasites sufficiently high may cause the collapse of the two-ribozyme system (Fig. 8d, the red and the orange lines). We suppose that with greater mobility parasites may travel between REP-rich areas to NSR-rich areas more readily, thus more likely to exploit both of the ribozymes (whereas the two ribozymes are more difficult to exploit each other on account of the spatial distance). No matter how, this may be read as an indication that two ribozymes’ co-spread is more sensitive to parasites than one ribozyme’s spread alone, especially when considering parasites in reality tend to be smaller in size than the ribozymes – actually with greater mobility.
Taken together, we can make three conclusions: the first, a higher rate of relative movement between raw materials and ribozymes, which results in a higher accessibility of the RNA species to the raw materials they need in their replication, would favor the spread of the ribozymes; the second, in the co-spread of the two ribozymes, their movement relative to each other should be restricted for the sake of their cooperation; the third, the co-spread of cooperative ribozymes may be more sensitive to parasites than the spread of one ribozyme alone. Interestingly, the emergence of membrane as a clear boundary, which encompasses the RNA species within a vesicle – engendering the so-called RNA-based protocells, seems to be able to make improvements in all the three aspects: the membrane limits the relative motion of the cooperative ribozymes within the vesicle while allowing a great relative motion between raw materials and ribozymes (provided that the raw materials can permeate into the vesicle), and in the meantime, serves as a physical barrier against parasites’ invasion from outside. Then, in the following, we will explore the scenario about the emergence of protocells from the REP-NSR’s co-thriving system in a naked scene. We ask: Are the RNA-based protocells indeed able to emerge from the naked scene with the two ribozymes (provided with a near-reality model)? May the protocells substitute the naked ribozymes therein? If so, what are the detailed mechanisms concerning the superiority of the protocells in comparison with the naked scene? As the other side of the coin, are there any significant disadvantages or limitation regarding the advent of the membrane boundary?
The emergence of protocells containing REP and NSR
To modeling the emergence of protocells, we constructed a concrete scene using our model. We assumed that at the center of a 40 × 40 grid system, there is a 20 × 20 grid region, which represent a naked subsystem, where REP and NSR co-spread. The scale of the naked subsystem was chosen in part to consider the comparability of the results below with those above (note that the cases of the naked scene shown above is run in a 20 × 20 grid system). We can imagine that the central subsystem is one of the porous rocks mentioned in the submarine-hydrothermal-vent hypothesis [53,54,55]. The sides of our grid rooms here may represent the walls of those “inorganic compartments” , which provides a natural mechanism of the spatial limitation needed for the ribozymes’ spread in the naked situation. The grid rooms outside the central rocky region represent the solution region, i.e., a “water body”. Compared with the rocky region, in the solution region, the probability concerning molecular movement (PMV) should be higher; the rates concerning with degradation, such as PBB, PND, PNDE, and PAD, should be higher, owing to the higher chemical activity of free water (this does not apply to the situation within a protocell, which is not a free water environment); and the rates related to synthesis, such as PRL and PNF, should be lower, due to the shortage of (rock) surface-mediated catalysis. With this consideration, we constructed a system of two phases, between which material-exchange is allowed.
Initially, we inoculated REP and NSR into the central naked subsystem, and modeled the co-spread of REP and NSR in the rocky region. Similar to the naked scene alone above, REP and NSR co-spread in the subsystem (Fig. 9, black symbols), and here the co-spread is not so stable owing to the occasional invasion of parasites (as we have seen, this is not surprising for a naked scene). Then, we inoculated protocells containing the two ribozymes into the surrounding solution region. Interesting, we found that the ribozymes within the protocells may increase and the protocells spread in the solution region (note: protocells cannot enter the rocky region); meanwhile, the ribozymes in the naked rocky region may decrease and finally become extinct (Fig. 9, red symbols). That is to say, if protocells can appear, they may spread and even substitute the naked ribozymes, implementing the so-called “first major transition” in evolution.
So then, it becomes attractive to modeling the natural emergence of these RNA-based protocells. In the system, after the initial inoculation of amphiphile precursors, they may transform to amphiphiles with probability PAF. Due to random movement of the amphiphiles, some location (i.e., a grid room) may happen to gather sufficient quantity (LAM) of such molecules to form a membrane (PMF). However, this is also a rare event determined by chance. So we inoculate several empty protocells a period after the initial step (the case shown in Fig. 10) (just like our motive for inoculating REP in the naked scene). The protocells may attract amphiphiles, grow and perhaps divide if the molecular number on the membrane rises to twice the LAM. In addition, during the process, there may be quite a few other events associated with the protocell (e.g., amphiphile leaving the membrane, protocell’s breaking, protocells’ fusion, etc.— for details, refer to relevant parameters listed in Table 1, and see Fig. 1 and Methods). Finally, the number of these empty protocells could reach a balance (also see our previous studies [36, 39] for more detailed analysis on such processes).
In the scene we assume here, obviously, we may expect empty vesicles, if formed in the solution, might move to the edge of the rocky region and absorb or engulf the ribozymes which is thriving in the rocky region. Therefore, based upon the thriving of REP and NSR in the rocky region, as well as the balance of empty vesicles in the surrounding solution system, we monitored the system to see what would happen. However, again, we experienced the difficulty to observe such “de novo incidents” which involves various random events. To facilitate the incident, we enhanced the rate of protocell’s cytophagy (PCC) to improve the empty vesicles’ chance to absorb the ribozymes (in reality, if the cytophagy rate is low, more time may be needed for such an incident to occur). As a result, we successfully observed the natural, de novo emergence of such RNA-based protocells, and the corresponding fading out of the naked phase (Fig. 10). Notably, in the modeling case, the observed cytophagy indeed occurs at the edge of the rocky region. The reason is that when the ribozymes diffuse to the liquid zone, they tend to move away (be “diluted”) and thus the chance for the formation of the RNA-based protocells would be little, especially considering that the resulted protocells should include both the two kinds of ribozymes. Such a case (Fig. 10) may be read as a demonstration for the plausibility of the emergence of protocells this way in reality. Remarkably, this is another case representing an evolution towards a more thriving living world (like the case of NSR emerging in a REP-thriving system, as mentioned above) – the number of the ribozymes within protocells, as a whole, become obviously greater than that of the naked ribozymes in the previous scene (Fig. 10a), while the raw materials are exploited to a greater extent (Fig. 10b – note the change of the color depth of the background yellow, which represents the quantity of raw materials, averaged in a system-wide scale).
Then, as planned, we traced into the process, to examine the detailed mechanisms concerning the superiority (and limitation) of the protocells in comparison with the naked phase. In fact, on account of the advantage of modeling studies, such explorations are, to some extent, convenient; and this in-depth analysis, as we will see, would greatly contribute to our understanding on the phenomenon concerning the emergence of cell-like form in life’s history.
About evolutionary mechanisms behind the emergence of cellular form
From the case shown in Fig. 9, we see that parasites may make the naked subsystem unstable; on the other hand, because the protocells were inoculated before the naked subsystem encounters significant invasion of parasites, the victory of protocells should have other reasons or mechanisms (also see Fig. 10a, in which the natural arising of protocells occurs before the naked subsystem is threatened by parasites). To avoid the interference of parasites, which occurs by chance – mainly due to mutation in the ribozymes’ replication, we turn off the mutation events (i.e., PFP was set to 0) when exploring the advantages and disadvantages of the cellular form in the following. Wherein, to explore the influence of the parasite problem itself, we inoculate parasites into the system manually (ad hoc).
Firstly, to explore solely the effect of the membrane keeping the cooperative ribozymes together, we removed all the differences on the parameter settings between the naked subsystem (the rocky region) and the protocell subsystem (the solution region), and the probability of raw materials permeating through a membrane (PNPP) was assumed to be 1. In other words, the protocells are only distinctive in respect of its membrane’s effect which ensures the spatial proximity of REP and NSR and thus their cooperation (in particular, the nucleotides synthesized by NSR would not leak out as well). Besides, to see a “fair competition” between the two subsystems, only raw materials (nucleotide precursors) were allowed to pass through the two subsystems’ interface (note: in the near-reality simulation, e.g., the cases in Figs. 9 and 10, nucleotides and RNA are also allowed to transfer across the rocky edge). The analysis concerning the membrane’s significance for the ribozymes’ cooperation was performed by changing the rate of non-enzymatic nucleotide synthesis, PNF. As mentioned above, PNF is a key parameter determining the cooperative relationship of the two ribozymes: when PNF is low, NSR becomes important and the cooperation of the two ribozymes is evident (Fig. 5). Here we found that a lower PNF may favor the protocell subsystem, whereas a higher PNF may favor the naked subsystem (Fig. 11). In other words, when cooperation is significant, the strategy of encompassing the two ribozymes together would become effective; on the contrary, when cooperation is not so significant, a free mode for the ribozymes may even be better.
In the analysis above about the influence of molecular movement, it has become clear that the moving rate of the ribozymes may influence their spread (e.g., Fig. 7c), which can be related to the ribozymes’ accessibility to raw materials. So we supposed that the moving rate of the protocells, which is associated with the cellular ribozymes’ accessibility to raw materials, may also have a significant influence on the competition between the two subsystems here. We do such an analysis based upon two of the cases investigated above (Fig. 11). We found that for a case in favor of the spread of protocells, if PMC is turned down, the protocells may lose their superiority (Fig. 12a); whereas for a case in which the protocells are disfavored, if PMC is turned up, they may gain superiority (Fig. 12b). As an implication here, we see that a more active motion of protocells may bring more chance for those ribozymes within them to access raw materials all around the environment. Indeed, the membrane separates the ribozymes from the environment, providing them with an opportunity – while being held together for cooperation, able to search around the environment for raw materials. As an analogue in modern living world, active motion is very significant for heterotrophic organisms (e.g., protozoa and animals).
To assess the protocell’s superiority to naked ribozymes in the rocky region in respect of resisting parasites, we attempted to choose a case in which the two subsystems is at balance, and then inoculate parasites into the subsystems in parallel to observe the outcome. However, we found that it is somehow difficult to obtain such a balancing case. Perhaps the two subsystems are basically not able to co-exist. No matter how, we managed to choose a “near-balance” case, but a bit in favor of the naked ribozymes (Fig. 13a). It was revealed that the tendency may be reversed on account of the periodically inoculation of parasites (due to the assumption PFP = 0, as mentioned in the beginning of this section, parasites are here difficult to appear de novo in the system) – that is, the protocells win the competition finally (Fig. 13b). Note that the inoculation is administrated simultaneously and equally to the two subsystems (see the legend of Fig. 13). The result is easy to understand – parasites destroying a protocell are difficult to invade other protocells on account of the membrane, whereas in the naked region, locally thriving parasites are easy to move around and “attack” ribozymes elsewhere.
Next, we turned to the influence of the membrane permeation. So far, this is perhaps the only obvious element we percept concerning the limitation of the cellular form. In reality, the permeability of raw materials to the membrane can certainly not be 100% (i.e., PNPP = 1, as assumed above, cannot occur). Such a permeability may have been crucial for natural selection to “assess” the protocells’ overall advantage. Indeed, our simulation showed that the superiority of the protocells is significantly influenced by this factor (Fig. 14). When PNPP is high, the protocells overwhelm the naked ribozymes in the rocky region; whereas when PNPP is low to some extent (here equal to or below 0.2), the protocells may lose the competition. That is, a low permeation rate may reverse the superiority of the protocells.
Finally, we should consider a factor associated with the environment the protocells may confront in reality, though it appears not a feature of the protocells in themselves. We noticed that for the blue-line case in Fig. 14 (PNPP = 0.1), when judging from the three key parameters mentioned already (PNPP, PNF, PMC), the condition is the same as the near-reality case (Figs. 9 and 10). Why the protocells lose the competition here but achieve superiority there? Considering the difference between the two cases, we threw our suspicion on that factor: the moving rate of raw materials – in the “reality case”, the moving rate of raw materials in the solution is much higher. Remarkably, besides the moving rate of protocells (Fig. 12), this is the other factor in relation to the accessibility of raw materials, which is crucial for the thriving of ribozymes (see the section “About the movement of molecules”). Indeed, when we changed the moving rate of raw materials (in the solution region) to approach the level in the reality case, the balance of competition shifted toward the protocell dramatically (Fig. 15). In other words, this result suggests that the protocells’ opportunity of living in free water body, where the mobility of raw materials should be much greater than in the rocky region, may, at least to some extent, compensate their disadvantage with regard to the membrane as a barrier to raw materials. The key point is, the protocells are robust to this environment – they would hold the ribozymes together “for the sake of” their cooperation and simultaneously enjoy the great accessibility of raw materials which is brought about by the high molecular moving rate in free water.
To summarize, through a series of devised analyses by adjusting parameters in our model, we investigated the evolutionary mechanisms in regard of the emergence of the cellular form from five aspects: four advantages and one disadvantage. The first apparent advantage is the membrane boundary’s role to keep the cooperative ribozymes together. A second apparent advantage is the membrane boundary’s role to prevent the systematical invasion of parasites; that is, if the parasites cannot get through the membrane, their damage would be limited to one protocell once. There seem to be only one disadvantage: the accessibility of raw materials is blocked by the membrane. This shortcoming is also obvious – it can be so dramatic that other advantages may be overwhelmed. Fortunately, the membrane may save itself in the same respect: its robustness against water flow renders the protocells able to “live” in free water, which “provides” a higher molecular mobility and thus an enhanced accessibility of raw materials. Moreover, if the protocells may evolve the ability (in future) to move actively, their accessibility to raw materials would be augmented further. This is, at least, a “reserved advantage”.
First, we should have a comment on the statistics concerning our approach. For the “transient regime” during the evolution of the system, the outcome may be quite different with a different seed. The difference is mainly manifested as the time step (Monte-Carlo step) when the transition would occur, for example, as for the emergence of NSR in a REP-thriving system and for the emergence of protocells in a naked stage (by cytophagy). However, for a stable stage, the main feature is the quantity-level of relevant objects (e.g., the number of ribozymes, protocells and raw materials), which is not much influenced by the difference of random seeds. The reason is that after numerous sampling over a large number of Monte-Carlo steps (in fact, most probabilities are sampled on for many times within one step), the balance-level of the quantities is just a representation of “average” feature of the system. In addition, from the fluctuation of the dynamics curves, the “deviation” feature also appears to be quite straightforward. In other words, the major outcomes here are quite stable due to inherent statistics-included feature of the simulation approach. Therefore, we have not conducted ad hoc statistics studies on the results (which would be almost computation-insurmountable owing to the large-scale of the model system), relying solely on the appearance of the dynamics curves instead.
The focus of the present study is about the evolutionary dynamics concerning the emergence of REP-NSR-contained protocells from a naked scene with REP and NSR. We demonstrate the plausibility of such a transition (Figs. 9 and 10), and subsequently investigate the evolutionary mechanisms possibly involved in the transition (Figs. 11, 12, 13, 14 and 15). In fact, with a similar model, we have investigated the cooperation of REP and NSR in the naked scene, as well as their cooperation in the protocells . A distinction is that the REP there  is a ligase-type one, unlike the REP here, which is a polymerase-type one (as mentioned already in the Introduction). Obviously, from the results there and those here, we know that this difference does not alter the outcome that REP and NSR can cooperate to thrive. The investigation in Ref.  focused on the plausibility of the cooperation of the ribozymes (for the protocell model there, even including a third ribozyme: amphiphile synthetase ribozyme) and a systematical analysis of the parameters’ influence on the modeling results. It did not show the plausibility of the natural emergence of one ribozyme after another (here, NSR after REP) and the plausibility of the natural emergence of the RNA-based protocells (by empty protocells’ cytophagy of ribozymes). More importantly, here we investigated the evolutionary mechanisms possibly involved in the transition from the naked stage to the cellular stage (Figs. 11, 12, 13, 14 and 15), which is absent in the previous study .
To envision a complete scenario – from the earliest molecular form, we start from the spread of REP alone, and then to the co-spread of REP and NSR in the naked scene (Figs. 2, 3, 4, 5, 6, 7 and 8). In fact, it turns out to be valuable to perform such a “prelude” study, which provides information about the way to judge the ribozyme’s cooperation (i.e., PNF in this model), the different effects of moving rates regarding raw materials, ribozymes, and parasites, etc. – such information is important for us to examine possible advantages and disadvantages regarding the advent of the cellular form. But notably, as mentioned already, for the naked scene in reality, we do not insist REP should have emerged before NSR. That is to say, NSR may have emerged first (it would favor its own replication by supplying nucleotides) , replicating through non-enzymatic template-directed synthesis . If so, the bottle-neck would then become the template-directed synthesis, and a selective pressure for the emergence of REP would be obvious. Certainly, when REP emerged, the story would all the same proceed into the naked scene with cooperative REP and NSR.
Alternatively, one may suggest that the transition from the molecular form to the cellular form may have occurred at the one-ribozyme stage, i.e., from REP to protocells containing REP (see  for a proposition about such a protocell as the simplest cellular life form), or from NSR to protocells containing NSR. Admittedly, we cannot rule out these possibilities. An apparent advantage for such transition is to resist parasites. In addition, notably, in replication, REP in itself need cooperation between the REP molecules as templates and those as catalysts; NSR, also, need cooperation, though not a direct one – nucleotides synthesized by NSR molecules as catalysts should be exploited by other NSR molecules as templates. The introduction of the membrane boundary may also facilitate such kinds of cooperation. However, a result in the present study suggests that when the cooperation between different types of ribozymes (REP and NSR) is not so important, a naked form might be better (Fig. 11). This is a sign shown that protocells may have emerged after the advent of at least two ribozymes. Additionally, considering the result here about the dramatic disadvantage of the membrane’s low permeability to raw materials (Fig. 14), the transition from REP to REP-contained protocells may be significantly impeded because nucleotides are difficult to permeate across a membrane [57, 58] (note that the transition from NSR to NSR-contained protocells would be less influenced in this respect because therein what need to pass through the membrane would be nucleotide precursors; see  for a more detailed discussion on relevant issues).
In a previous study, we showed that a tag mechanism, in which the REP recognizes its template through a short subsequence, should be important for a polymerase-typed REP to spread in the system . The REP adopted in the present model is just a tag-mediated polymerase-typed REP. It should be noted that the mode of the tag mechanism here is adopted in the light of relevant experimental work (see Methods, Fig. 16), which is somewhat different from the previous one . However, the REP can spread in the system all the same, which implies the robustness of the tag mechanism as a strategy to favor the spread of a polymerase-typed REP. In addition, remarkably, it has been shown here that the tag mechanism can be extended to a multi-ribozyme system; that is, if the NSR contains the tag, it can also “make use of” the REP in its replication. Notably, in the model of Higgs’ group , it appears that no tag mechanism is needed, and this is on account of the difference between their model and ours. In our model, we have a resolution at the level of nucleotide residue – any other RNA species may exploit the REP and act as parasites. In other words, we adopt a model system more approaching to the reality, which reflects the more serious parasite problem in reality – and so thus can manifest the need for the tag mechanism.
No matter how, both our simulation and theirs come to a common conclusion on the effective cooperation of the two ribozymes, REP and NSR, in the naked scene. In fact, apart from the analysis in the Results section concerning Figs. 4 and 5, we can also read a sign of the cooperation directly from the scenario on the NSR’s emergence in a REP-thriving system (Fig. 6b). That is, though at the REP-alone stage, the REP molecules disperse well around the whole system (step 6,690,000), after the co-spread of REP and NSR, both the two ribozymes are limited spatially, distributed in an aggregative way (step 20,000,000; note that the grid has a toroidal topology – that is, those REP and NSR molecules around the left and the right edges, are actually gathering in the same region). Additionally, we can notice that raw materials (yellow background) elsewhere are not sufficiently exploited by the locally distributed ribozymes. With these visualized messages, we can feel directly that paradox: for the cooperation of the two ribozymes, the moving rate of molecules should not be too large; but for a greater accessibility to the raw materials around, a greater moving rate of molecules would be beneficial. So then, it is not surprising that the emergence of membrane boundary, which renders the ribozymes’ movement and the raw materials’ movement separable, would be welcome in the evolution. That is, the protocells can benefit from the faster mobility of raw materials in free water, enjoying a passively achieved accessibility of raw materials (Fig. 15), or even might move actively to exploit raw materials elsewhere (Fig. 12).
In fact, some other traditional theoretical studies, mainly based upon the concept of “replicator”, addressed problems related to the topic we considered here. One can refer to reviews in this area [3,4,5]. For example, there was work showing the scenario on the emergence of a REP-like replicator from a metabolically coupled replicator system (MCRS) [5, 59]. The MCRS comprises multiple functional replicators which cooperate to generate monomers, thus functioning as a whole like an NSR. As another example, there was work which deliberately compared the effect of the naked spatial limitation with that of the membrane compartmentalization on the evolutionary dynamics of REP-like replicators . However, it is difficult to assess these results here, because they tend to work with models apparently more abstract, and sometimes adopted assumptions not so relevant to a real RNA world. More importantly, as far as we know, there are not yet such theoretical studies explicitly addressing the evolutionary dynamics involved in the transition from a naked molecular system to a protocell system.
For the concrete scenario we assumed here of the transition from naked REP and NSR to protocells containing REP and NSR, we envisage a porous rocky region with a solution region surround it, based upon the submarine-hydrothermal-vent hypothesis concerning the hatchery of life on the earth [53,54,55]. However, the modeling is generally valid for another two scenes regarding the naked stage, i.e., mineral surfaces [46,47,48,49] and lacunose ices [50,51,52]. For the latter, the model here can almost be entirely transplanted; for the former, a more accurate model should assume a solution region (the protocell subsystem) “suspended over” a mineral surface region (the naked subsystem).
Finally, what would occur next, after the emergence of protocells containing REP and NSR? Owing to the efficient spatial limitation provided by the membrane boundary, more ribozymes may have appeared and cooperated with the REP and NSR. For example, in a previous study we have shown the plausibility of the cooperation of an amphiphile synthetase ribozyme with REP and NSR within protocells (mentioned above already ). Other ribozymes, such as a nucleotide-precursor synthetase ribozyme, which enabled the RNA-based protocells to exploit more fundamental raw materials, may also have emerged. However, with the increase of the types of the ribozymes, the protocells would tend more to risk the loss of some “genes” during their division, which aroused the selection pressure for gene-linkage, i.e., for a chromosome [3, 61]. Interestingly, in a modeling study, we have demonstrated the plausibility of the emergence of a chromosome in RNA-based protocells . In the model, the RNA chromosome is assumed to adopt a circular chain to avoid end-degradation, and thus can sustain a much greater length comparing with individual ribozymes; in addition, there are self-cleaving sites between the linked genes on the chromosome, which might give rise to individual ribozymes – thus solving the problem of the “gene expression” (see the original paper for a detailed discussion ). Here, it is worth noting that, now we know the tag mechanism may have been important for the spread of (polymerase-typed) REP , and would remain valid in a system comprising multiple ribozymes (REP and NSR) as shown in this study; a further attractive issue is, can the tags work in a context of chromosome which bears the linked genes? That is, may the tags have just represented some “preliminary promoters” – serving as control handles for the “gene expression”? If so, that would be a door open towards future for this primitive living world; from then on, the evolution towards complexity would become fully expectable, ultimately leading to the arising of life on our planet – that amazing thing.
The transition from the molecular level to the cellular level, which is marked by the introduction of a membrane boundary, may have occurred naturally in early history of evolution. Indeed, such a membrane, which may self-assemble from amphiphilic molecules (e.g., fatty acids or phospholipids) in water, constitutes a straightforward way to ensure the cooperation of different genetic/functional molecules – thus, promising the subsequent evolution towards greater complexity and higher efficiency. Just as importantly, the membrane makes an effective obstacle to parasites (those molecules exploiting the functional molecules but offering no advantages). However, as the other side of the coin, the membrane would also impede the protocells to access raw materials for the synthesis of their own molecules – as a major disadvantage of the membrane. This shortcoming is expected to be compensated by another virtue of the cellular form (also associated with the accessibility to raw materials) – while guaranteeing the spatial proximity of the cooperative genetic/functional molecules inside, the entity as a whole is independent of the molecular movement in its environment. For instance, it may endure the impact of a fast molecular movement in free water, so as to achieve a greater accessibility of the raw materials therein; moreover, if the protocells could evolve the capability to move around actively (just like protozoa and some bacteria in our modern living world), they would be benefited further. Interesting, here, all these considerations have been validated and demonstrated vividly in a computer simulation study using a near-reality model with the RNA world scenario as a context. The present work represents a systematical exploration on the evolutionary dynamics of this critical transition.
In a simulation case for the naked scene, nucleotide precursors of a quantity (TNPB; see Table 1 for the descriptions of parameters) are introduced at the initial step. In a simulation case for the scene involving protocells, besides nucleotide precursors, amphiphile precursors of a certain quantity (TAPB) are introduced into the system as well. After the introduction of these initial substances, the system evolves according to the rules of the model, involving those various events. During the process of the simulation, various data may be monitored or recorded, including the number of different molecules, the number of protocells, the information about RNA’s chain-length in the system, the spatial distribution of molecules and protocells, etc.
The setting of the parameters
The probabilities concerning the events in the system should be set according some rules. Reactions catalyzed by ribozymes should be much more efficient than corresponding non-enzymatic reactions, so PTLR> > PTL and PNFR> > PNF. “Template-directed ligation” should be apparently more efficient than “random ligation”, so PTL> > PRL. Here, nucleotide residues within the chain are assumed to be unable to decay (they should be protected therein), whereas those at the end of the chain decay at a rate lower than that of free nucleotides, i.e., PNDE < PND. Amphiphiles within a membrane should be protected, so PADM < PAD. Because of the self-assembly feature of amphiphiles, PMF> > PCB and PAJM> > PALM. The movement of molecules should be easier than protocells, so PMV > PMC. Other considerations may include: PBB may be higher than PRL, but lower than PNDE, PCC should be quite low, PAPP > PNPP, etc.
In consideration of the computational intensity, some parameters, such as “total nucleotide precursors introduced in the beginning” (TNPB), “total amphiphile precursors introduced in the beginning” (TAPB), “the lower limit number of amphiphiles to form a membrane” (LAM), and the lengths of the characteristic domains of the ribozymes and the tag (CSREP, CSNSR and CSTag), are set obviously smaller in scale than the corresponding situations possible in reality. However, such simplifications are believed to be not in conflict with the fundamental rules involved in the target system which is modeled.
In the study, we set the parameters mainly based on our experience in previous studies using similar models [31, 36, 38,39,40,41,42,43]. The default values listed in Table 1 were only chosen out to shape the cases which are used to demonstrate our results (but see text in Results for interpretations when the default values are not used). In other words, the model is surely not strictly bound with these concrete values of the parameters. Actually, the outcome of the simulation turned out to be fairly robust against moderate adjustments of most of the parameters (also refer to the “Magnitudes” in Table 1). In addition, to be sure about the robustness of the results here, a typical case (for one parameter combination) has been repeated with at least ten different random seeds.
Detailed mechanisms concerning how some of the parameters work
With the breaking of phosphodiester bonds, an RNA molecule may degrade into shorter ones (including nucleotides). When the breaking site of the chain is at a single-chain region, the breaking probability is PBB. When the breaking site is within a double-chain region, the two parallel bonds may break simultaneously, with the probability PBB3/2. The adoption of the index 3/2, instead of 2, corresponds to the consideration of the synergistic effect of the two breaking events.
The probability of the separation of the two strands of a duplex RNA is actually assumed to be PSP(r + 1)/2, where r is the number of base pairs in the duplex. When r = 1, the probability would be PSP. When r increases, the probability would decrease (because PSP, as a probability, has a value between 0 and 1). That is, the separation of the two strands would be more difficult if the base pairs are more. The introduction of the 1/2 corresponds to the consideration that self-folding of single chains may aid the dissociation of the duplex.
The probability of membrane formation is assumed to be 1-(1-PMF)x, where x equals to a-LAM + 1, and a is the number of amphiphiles in the grid room. When a equals to LAM (the lower limit of the number of amphiphiles to form a protocell membrane), the probability of membrane formation equals to PMF. This assumption concerns the consideration that the more amphiphiles there are in a grid room, the more probably they would assemble to form a vesicle.
The probability of an amphiphile leaving the membrane is assumed to be PALM / [1 + FOP × n/(b/2)3/2], where n is the quantity of inner impermeable ions, including nucleotides and RNA (measured by the number of nucleotide residues), and b is the quantity of amphiphiles within the membrane. Wherein, b/2 (there are two layers in the membrane) is a “scale” representation of the surface area of the membrane. Consequently, (b/2)3/2 is a scale representation of the cellular space. Thus, n/(b/2)3/2 is a representation of the concentration of the ions. Further, FOP × n/(b/2)3/2 represents the consideration for the “osmotic pressure effect”: a higher concentration of the inner impermeable ions would cause the protocell to be more swollen, and thus amphiphiles on the membrane are less likely to leave – this effect was demonstrated experimentally .
The probability of a nucleotide precursor permeating into a protocell is assumed to be PNPP × (b/LAM) / [1 + FDE × n/(b/2)3/2], where n is the quantity of inner impermeable ions and b is the quantity of amphiphiles within the membrane. The element (b/LAM) represents the consideration of the constraining effect of the cellular space on the influx of nucleotide precursors. That is, when b increases, meaning that the cellular space increases correspondingly, the probability of a nucleotide precursor permeating into the protocell would become greater. FDE × n/(b/2)3/2 represents the consideration of the effect of Donnan’s equilibrium  – simply put, RNA and nucleotides, which are charged and impermeable, may suppress the incoming of permeable materials with the same charge, i.e., nucleotide precursors as it is assumed here (see  for a detailed explanation).
The probability of protocell division is assumed to be PCD × (1–2 × LAM/b), where b is the quantity of amphiphiles within the membrane. When b is no more than twice that of LAM, the probability is no more than 0, i.e., the protocell could not divide. This assumption considers the fact that the larger the protocell, the more probably it would divide, on account of the physical instability.
The probability of the movement of an RNA molecule is assumed to be PMV/m1/2, where m is the mass of the RNA, relative to a nucleotide. This assumption represents the consideration of the effect of the molecular size on the molecular movement. The square root was adopted here according to the Zimm model, concerning the diffusion coefficient of polymer molecules in solution .
Some detailed assumptions when implementing the model
In a time step, for a grid room, collision is checked for a certain number of times. Each time, RNA molecules (including nucleotides) in the grid room are grouped pairwise, simulating collision events. Upon a collision, one of the following three events may occur: the first, they may ligate end to end with each other; the second, if one of them is REP and the other contains a 5′-tag, the REP may bind onto the tagged RNA molecule (as a template); the third, the longer RNA molecule may act as a template and attract the shorter one (as a substrate) based on the rule of base-pairing.
Only unpaired nucleotide residues at the ends of an RNA chain (either at 3′- or 5′-end) may decay. Only when an RNA molecule reaches 5-nt length may it act as a template (generally in accordance to an experiment ). A bound polymerase ribozyme would drop from its template when the whole chain is copied, or may drop with PRD when the copying has not yet been completed. The bound polymerase ribozyme on the template also risks chain-breaking with PBB. Certainly, when the ribozyme’s chain breaks, the resulting pieces would drop from the template. A single strand RNA would become a template when a REP binds onto it, or may turn to a template with PRTT when attracting a substrate in the absence of REP.
In the light of recent experiments concerning the in vitro molecular evolution of RNA polymerase ribozymes [28, 29] (also refer to relevant studies in history ), we assume that the tag is at the 5′-end of the template and the REP contains a “tag-recognizing domain”, also at its 5′-end, which base-pairs with the template’s tag. For such a REP, actually, the tag and the tag-recognizing domain should be of the same sequence which is palindromic so that a REP molecule, acting as a ribozyme, can recognize another REP molecule, serving as a template. Furthermore, to guarantee its ability to spread (i.e., become thriving through replication) in the system, the REP should contain a “reverse-tag” at its 3′-end so that its complementary chain carries a 5′-tag, and can also be recognized. Obviously, here, the reverse-tag is also of the same sequence with the tag, owing to the tag’s palindrome. That is, in practice, a REP capable of spread in the system should contain its catalytic domain, plus two palindromic tag sequences at 3′- and 5′- end, respectively. Similarly, to be able to spread in the system, an NSR should also contain two tag sequences flanking its catalytic domain. The difference is that a functional REP (acting as a ribozyme) should contain its catalytic domain and a 5′-tag (as the tag-recognizing domain), whereas a functional NSR need only its catalytic domain. For example, in the cases shown in Results, we assume a 10-nt length for the catalytic domain of the ribozymes and 4-nt length for the tag. Therefore, a functional REP should be at least 14 nt long, and a REP capable of spreading should be at least 18 nt in length. A functional NSR should be at least 10 nt long, whereas an NSR capable of spreading should be at least 18 nt in length. See Fig. 16 for an illustration of the tag mechanism involved in the model.
Miller SL. A production of amino acids under possible primitive earth condition. Science. 1953;117:528–9.
Ma WT. The origin of life: a problem of history, chemistry, and evolution. Chem Biodivers. 2014;11:1998–2010.
Szathmáry E. The origin of replicators and reproducers. Philos Trans R Soc B – Biol Sci. 2006;361:1761–76.
Takeuchi N, Hogeweg P. Evolutionary dynamics of RNA-like replicator systems: a bioinformatic approach to the origin of life. Phys Life Rev. 2012;9:219–63.
Kun Á, Szilágyi A, Könnyű B, Boza G, Zachar I, Szathmáry E. The dynamics of the RNA world: insights and challenges. Ann N Y Acad Sci. 2015;1341:75–95.
Luisi PL. About various definitions of life. Orig Life Evol Biosph. 1998;28:613–22.
Trifonov EN. Vocabulary of definitions of life suggests a definition. J Biomol Struct Dyn. 2011;29:259–66.
Ma WT. The essence of life. Biol Direct. 2016;11:49.
Eigen M. Self-organization of matter and the evolution of biological macromolecules. Naturwissenschaften. 1971;58:465–523.
Szathmáry E, Demeter L. Group selection of early replicators and origin of life. J Theor Biol. 1987;128:463–86.
Szathmáry E, Smith JM. From replicators to reproducers: the first major transitions leading to life. J Theor Biol. 1997;187:555–71.
Szathmáry E, Maynard-Smith J. The major evolutionary transitions. Nature. 1995;374:227–32.
Gilbert W. The RNA world. Nature. 1986;319:618.
Joyce GF. The antiquity of RNA-based evolution. Nature. 2002;418:214–21.
Bernhardt HS. The RNA world hypothesis: the worst theory of the early evolution of life (except for all the others). Biol Direct. 2012;7:23.
Higgs PG, Lehman N. The RNA world: molecular cooperation at the origins of life. Nat Rev Genet. 2015;16:7–17.
Pressman A, Blanco C, Chen IA. The RNA world as a model system to study the origin of life. Curr Biol. 2015;25:R953–63.
Ma WT. What does “the RNA world” mean to “the origin of life”? Life. 2017;7:49.
Joyce GF, Orgel LE. Prospects for understanding the origin of the RNA world. In: Gesteland RF, Cech TR, Atkins JF, editors. The RNA world. New York: Cold Spring Harbor Laboratory Press; 1999. p. 49–77.
Joyce GF, Orgel LE. Progress toward understanding the origin of the RNA world. In: Gesteland RF, Cech TR, Atkins JF, editors. The RNA world. New York: Cold Spring Harbor Laboratory Press; 2006. p. 23–56.
Robertson MP, Joyce GF. The origins of the RNA world. Cold Spring Harb Perspect Biol. 2012;4:a003608.
Powner MW, Gerland B, Sutherland JD. Synthesis of activated pyrimidine ribonucleotides in prebiotically plausible conditions. Nature. 2009;459:239–42.
Powner MW, Sutherland JD, Szostak JW. Chemoselective multicomponent one-pot assembly of purine precursors in water. J Am Chem Soc. 2010;132:16677–88.
Powner MW, Sutherland JD, Szostak JW. The origin of nucleotides. Synlett. 2011;(14):1956–64.
Bartel DP. Re-creating an RNA replicase. In: Gesteland RF, Cech TR, Atkins JF, editors. The RNA world. New York: Cold Spring Harbor Laboratory Press; 1999. p. 143–62.
Johnston WK, Unrau PJ, Lawrence MS, Glasner ME, Bartel DP. RNA-catalyzed RNA polymerization: accurate and general RNA-templated primer extension. Science. 2001;292:1319–25.
Cheng LKL, Unrau PJ. Closing the circle: replicating RNA with RNA. Cold Spring Harb Perspect Biol. 2010;2:a002204.
Wochner A, Attwater J, Coulson A, Holliger P. Ribozyme-catalyzed transcription of an active ribozyme. Science. 2011;332:209–12.
Attwater J, Wochner A, Holliger P. In-ice evolution of RNA polymerase ribozyme activity. Nat Chem. 2013;5:1011–8.
Unrau PJ, Bartel DP. RNA-catalyzed nucleotide synthesis. Nature. 1998;395:260–3.
Ma WT, Yu CW, Zhang WT, Hu JM. Nucleotide synthetase ribozymes may have emerged first in the RNA world. RNA. 2007;13:2012–9.
Deck D, Jauker M, Richert C. Efficient enzyme-free copying of all four nucleobases templated by immobilized RNA. Nat Chem. 2011;3:603–8.
Kaiser A, Richert C. Nucleotide-based copying of nucleic acid sequences without enzymes. J Organomet Chem. 2013;78:793–9.
Li L, Prywes N, Tam CP, O'Flaherty DK, Lelyveld VS, Izgu EC, Pal A, Szostak JW. Enhanced nonenzymatic RNA copying with 2-Aminoimidazole activated nucleotides. J Am Chem Soc. 2017;139:1810–3.
Szostak JW. The eightfold path to non-enzymatic RNA replication. J Syst Chem. 2012;3:2.
Ma WT, Hu JM. Computer simulation on the cooperation of functional molecules during the early stages of evolution. PLoS One. 2012;7:e35454.
Kim YE, Higgs PG. Co-operation between polymerases and nucleotide synthetases in the RNA world. PLoS Comput Biol. 2016;12:e1005161.
Ma WT, Yu CW, Zhang WT. Monte Carlo simulation of early molecular evolution in the RNA world. BioSystems. 2007;90:28–39.
Ma WT, Yu CW, Zhang WT, Zhou P, Hu JM. The emergence of ribozymes synthesizing membrane components in RNA-based protocells. BioSystems. 2010;99:201–9.
Ma WT, Yu CW, Zhang WT, Hu JM. A simple template-dependent ligase ribozyme as the RNA replicase emerging first in the RNA world. Astrobiology. 2010;10:437–47.
Ma WT, Yu CW, Zhang WT. Circularity and self-cleavage as a strategy for the emergence of a chromosome in the RNA-based protocell. Biol Direct. 2013;8:21.
Ma WT, Yu CW, Zhang WT, Wu SM, Feng Y. The emergence of DNA in the RNA world: an in silico simulation study of genetic takeover. BMC Evol Biol. 2015;15:272.
Wu SM, Yu CW, Zhang WT, Yin SL, Chen Y, Feng Y, Ma WT. Tag mechanism as a strategy for the RNA replicase to resist parasites in the RNA world. PLoS One. 2017;12:e0172702.
Boerlijst MC, Hogeweg P. Spiral wave structure in prebiotic evolution: Hypercycles stable against parasites. Phys D. 1991;48:17–28.
Szabó P, Scheuring I, Czárán T, Szathmáry E. In silico simulations reveal that replicators with limited dispersal evolve towards higher efficiency and fidelity. Nature. 2002;420:340–3.
Ferris JP, Hill AR, Liu R, Orgel LE. Synthesis of long prebiotic oligomers on mineral surfaces. Nature. 1996;381:59–61.
Ferris JP. Mineral catalysis and prebiotic synthesis: montmorillonite-catalyzed formation of RNA. Elements. 2005;1:145–9.
Franchi M, Gallori E. A surface-mediated origin of the RNA world: biogenic activities of clay-adsorbed RNA molecules. Gene. 2005;346:205–14.
Hazen RM, Sverjensky DA. Mineral surfaces, geochemical complexities, and the origins of life. Cold Spring Harb Perspect Biol. 2010;2:a002162.
Trinks H, Schröder W, Biebricher CK. Ice and the origin of life. Orig Life Evol Biosph. 2005;35:429–45.
Vlassov AV, Kazakov SA, Johnston BH, Landweber LF. The RNA world on ice: a new scenario for the emergence of RNA information. J Mol Evol. 2005;61:264–73.
Attwater J, Wochner A, Pinheiro VB, Coulson A, Holliger P. Ice as a protocellular medium for RNA replication. Nat Commun. 2010;1:76.
Russell MJ, Hall AJ, Cairns-Smith AG, Braterman PS. Submarine hot spring and origin of life. Nature. 1988;336:11.
Koonin EV, Martin W. On the origin of genomes and cells within inorganic compartments. Trends Genet. 2005;21:647–54.
Martin W, Baross J, Kelley D, Russell MJ. Hydrothermal vents and the origin of life. Nat Rev Microbiol. 2008;6:805–14.
Szostak JW, Bartel DP, Luisi PL. Synthesizing life. Nature. 2001;409:387–90.
Mansy SS. Membrane transport in primitive cells. Cold Spring Harb Perspect Biol. 2010;2:a002188.
Mansy SS, Szostak JW. Reconstructing the emergence of cellular life through the synthesis of model protocells. Cold Spring Harb Symp Quant Biol. 2009;74:47–54.
Könnyű B, Czárán T, Szathmáry E. Prebiotic replicase evolution in a surface-bound metabolic system: parasites as a source of adaptive evolution. BMC Evol Biol. 2008;8:267.
Takeuchi N, Hogeweg P. Multilevel selection in models of prebiotic evolution II: a direct comparison of compartmentalization and spatial self-organization. PLoS Comput Biol. 2009;5:e1000542.
Maynard-Smith J, Szathmáry E. The origin of chromosomes I: selection for linkage. J Theor Biol. 1993;164:437–46.
Chen IA, Roberts RW, Szostak JW. The emergence of competition between model protocells. Science. 2004;305:1474–6.
Donnan FG. Theory of membrane equilibria and membrane potentials in the presence of non-dialysing electrolytes. A contribution to physical-chemical physiology. J Membr Sci. 1995;100:45–55.
Zimm BH. Dynamics of polymer molecules in dilute solution: viscoelasticity, flow birefringence and dielectric loss. J Chem Phys. 2012;24:269–78.
Chen CB, Inoue T, Orgel LE. Template-directed synthesis on oligodeoxycytidylate and polydeoxycytidylate templates. J Mol Biol. 1985;181:271–9.
This study was supported by the National Natural Science Foundation of China (No. 31571367). The funder had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Availability of data and materials
All the data supporting our findings have been presented within the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.