Finding chemistry that works

Biology is the search for the chemistry that works -- R. J. P. Williams.

I gave myself the aim of synthesising compound EH, whose formation is the most energetically unfavourable reaction. I added some EL and FK to the pool, which cells could use to power synthesis reactions (using the concentration gradients of these compounds as well as the chemical energy released in their hydrolysis). I imagine that EH is equivalent to ATP or DNA, which cells must synthesise in order to replicate, and that EL and FK are equivalent to sugars that cells metabolise to power reactions.

The system that I created is shown below. Black arrows indicate the transport (active or passive) of chemicals into and out of the cell; red arrows indicate reactions.

Metabolism in an ancestral cell

This metabolism makes use of the concentration gradient and chemical energy of EL to drive EH formation. EL enters the cell by passive diffusion and is broken down to E and L. E can exit the cell, while L accumulates to a high concentration, so that it can drive EH formation as it leaves the cell. I later realised that the E pore is unnecessary, but I left it in the ancestral cell to see if evolution would also realise it. In the future, I'll probably just start with a random genome.

Virtual genetics

In order to get cells to evolve, I needed a way to mutate the proteins in the cell. I did this by giving each cell a genome that defines the proteins it contains. The genome is a linear sequence of nucleotides, which I represent with the letters A, B, C and D (equivalent to A, G, C and T in real cells). This sequence can be easily mutated as described below.

The genome is split into genes, with the boundaries between genes indicated by the sequence DDAA. Genes are then split into two-letter codons, which are interpreted thus:

  • BA-xx - a forward transporter of X
  • BB-xx - a reverse transporter of X
  • BC-xx - an enzyme for reaction X (forward)
  • BD-xx - an enzyme for reaction X (reverse)

The two-letter codon xx defines a chemical, X, in the case of transporters and a reaction, X, in the case of enzymes. For example, BA-AA encodes an E-transporter and BC-AA encodes an EH synthase/hydrolase. The forward and reverse directions only matter when reactions are coupled. For example:

  • BA-AA-BA-AB is an E/F symporter (same direction)
  • BA-AA-BB-AB is an E/F antiporter (opposite direction)
  • BB-AA-BA-AB is an E/F antiporter (opposite direction)
  • BB-AA-BB-AB is an E/F symporter (same direction)

If a codon doesn't (yet) have an interpretation, it is ignored, so CA-BA-AA-CB-CC-BA-AB-CD is an E/F symporter as above. Later on I'd like to create a more efficient system whereby the arrangement of two binding sites determines how reactions are coupled.
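
A minimal Python sketch of this decoding scheme might look like the following. It is not the code used for the simulation. The function names are hypothetical, and it assumes that codons are read in non-overlapping pairs from the start of each gene, that a role codon (BA, BB, BC or BD) always consumes the codon after it as its target, and that a gene with exactly two transporter domains is a coupled transporter.

```python
GENE_BOUNDARY = "DDAA"

# Role codons taken from the list above.
ROLE_CODONS = {
    "BA": ("transporter", "forward"),
    "BB": ("transporter", "reverse"),
    "BC": ("enzyme", "forward"),
    "BD": ("enzyme", "reverse"),
}


def split_codons(gene):
    """Cut a gene into two-letter codons, dropping a trailing odd base."""
    return [gene[i:i + 2] for i in range(0, len(gene) - 1, 2)]


def interpret_gene(gene):
    """Return a (kind, direction, target codon) tuple per domain in a gene."""
    codons = split_codons(gene.replace("-", ""))  # allow "BA-AA" notation
    domains = []
    i = 0
    while i < len(codons) - 1:
        role = ROLE_CODONS.get(codons[i])
        if role is None:
            i += 1  # codon without an interpretation: ignore it
            continue
        kind, direction = role
        domains.append((kind, direction, codons[i + 1]))
        i += 2  # skip the target codon that was just consumed (assumption)
    return domains


def coupling(domains):
    """Classify a gene with exactly two transporter domains (assumption)."""
    transporters = [d for d in domains if d[0] == "transporter"]
    if len(transporters) != 2:
        return None
    same_direction = transporters[0][1] == transporters[1][1]
    return "symporter" if same_direction else "antiporter"


def interpret_genome(genome):
    """Split a genome at every DDAA boundary and interpret each gene."""
    return [interpret_gene(gene) for gene in genome.split(GENE_BOUNDARY)]
```

For instance, coupling(interpret_gene("BA-AA-BB-AB")) classifies the gene as an antiporter, and the padded gene CA-BA-AA-CB-CC-BA-AB-CD decodes to the same two domains as BA-AA-BA-AB. Note that splitting on DDAA wherever it occurs, regardless of codon frame, is also an assumption.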


Each time a cell is created, its DNA is synthesised using a template. Bases are copied from the template one at a time with a 0.5% chance of mutation per base. 75% of mutations are single base changes, while 25% cause the DNA to continue replicating from a random position on the template, thus creating insertions and deletions and potentially duplicating the entire genome.
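
The copying process can be sketched in Python roughly as follows. This is an illustration rather than the simulation's own code: the function name and parameters are hypothetical, and it assumes that a single base change always substitutes a different base and that a jump can land on any position of the template with equal probability.

```python
import random

BASES = "ABCD"


def replicate(template, rng, mutation_rate=0.005, jump_fraction=0.25):
    """Copy a template base by base, with substitutions and random jumps."""
    copy = []
    pos = 0
    while pos < len(template):
        if rng.random() < mutation_rate:
            if rng.random() < jump_fraction:
                # Continue copying from a random position on the template.
                # Jumping back duplicates sequence; jumping forward deletes it.
                pos = rng.randrange(len(template))
                continue
            # Single base change: pick one of the other three bases (assumption).
            base = rng.choice([b for b in BASES if b != template[pos]])
        else:
            base = template[pos]
        copy.append(base)
        pos += 1
    return "".join(copy)


daughter = replicate("BAAADDAABCAA" * 10, random.Random(42))
```

With the default rates a jump happens roughly once per 800 bases copied, so the loop finishes quickly. At much higher jump rates the copy could grow very long, and a length cap would be a sensible addition.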

Simulation set-up

I started a simulation with a pool with a volume of 1 000 000 units and the same concentration of chemicals as above (default concentrations of elements plus extra EL and FK). I then added 64 cells to the pool, each containing the default concentration of elements. The DNA in each cell was based on the genome of my ancestral cell, mutated as described above. The genomes were interpreted and an amount of each protein was added to cells, such that the total amount of protein in each cell was 16 units. This prevented cells from simply producing more protein by duplicating genes. It also meant that cells producing useless proteins were penalised, because they had less of their other proteins.
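
One way to picture the fixed protein budget is the Python sketch below. How the 16 units were actually divided between proteins is not stated here, so the proportional split by gene copy number, the function name and the protein labels are all assumptions made for illustration.

```python
from collections import Counter


def allocate_protein(genes, total=16.0):
    """Share a fixed protein budget between genes by copy number (assumption)."""
    counts = Counter(genes)
    n_genes = sum(counts.values())
    if n_genes == 0:
        return {}
    return {gene: total * count / n_genes for gene, count in counts.items()}


# Hypothetical protein labels, for illustration only.
lean = allocate_protein(["EL pore", "EL hydrolase"])
bloated = allocate_protein(["EL pore", "EL hydrolase", "useless", "useless"])
```

Under this assumption the lean cell gets 8 units of each protein, while the cell carrying two useless genes gets only 4 units of each useful one, even though both cells hold 16 units in total.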


Cells were run for 20 000 time units and ranked by the concentration of EH they accumulated during this time. The top 16 cells were selected and allowed to replicate. The selection method allowed more successful cells to have more offspring in a fairly arbitrary manner:

  • Cell 1 produces twelve daughters
  • Cells 2 - 4 each produce eight daughters
  • Cells 5 - 8 each produce four daughters
  • Cells 9 - 16 each produce one daughter

In addition, four "wildcard" cells were picked at random and allowed to produce a daughter. This gives 64 daughter cells, which form the next generation. Each generation was added to a fresh pool with the same distribution of chemicals as before, and run for 20 000 time units again.
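
The selection scheme can be written down as a short Python sketch, which also confirms that the numbers add up to 64 (12 + 3 × 8 + 4 × 4 + 8 × 1 = 60 ranked daughters, plus 4 wildcards). The names are hypothetical, and it assumes the wildcards are drawn without replacement from the whole population, including the top 16.

```python
import random

# Daughters per rank: cell 1, cells 2-4, cells 5-8, cells 9-16.
DAUGHTERS_BY_RANK = [12] + [8] * 3 + [4] * 4 + [1] * 8
WILDCARDS = 4


def choose_parents(ranked_cells, rng):
    """Return one parent per daughter cell for the next generation.

    ranked_cells must be sorted from highest to lowest EH concentration.
    """
    parents = []
    for cell, n_daughters in zip(ranked_cells, DAUGHTERS_BY_RANK):
        parents.extend([cell] * n_daughters)
    # Wildcards: drawn from the whole population (assumption).
    parents.extend(rng.sample(ranked_cells, WILDCARDS))
    return parents


population = list(range(64))  # stand-ins for cells, already ranked
parents = choose_parents(population, random.Random(0))
```

Each entry in parents would then be used as the template for one mutated daughter genome.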
