Cell simulation

10 Sep 2011 Code on Github


I think the reason I'm interested in artificial evolution is that I want a record of every stage of evolution so I can trace the changes and see how complex systems can be built up. However, I've found that even with a complete record of every genome in a system, it can be quite a challenge to untangle evolutionary pathways. I hope that in trying to create programs to analyse the information from my artificial evolution, I might create something that can be used to analyse evolution in the real world.


I've started new cell simulations several times now and got quite far each time before hitting a wall. I've decided to start yet again, building up from the simplest building blocks and recording my progress on this blog (mainly so I can remember how everything works). I'll also be using Git and putting my code for this project on Github for all to see and edit.

To keep my program flexible and simple to use, I've tried to write a program module that contains only very simple, relatively self-explanatory commands; these commands call functions in imported modules where all the computations are hidden (essentially an API). For the most part, I'll explain only the commands in the top-level module; if you want to see more detail then you can look at the code on Github.


The overall aim of this simulation is to evolve networks: networks of reactions (i.e. metabolic pathways) and networks of transcription factors and repressors (i.e. regulatory pathways). In biology we look at these as complete systems, so I think it will be interesting to see how such complex interlocking networks can arise bit-by-bit through evolution. I also hope this simulation will be more open-ended than my previous ones. Below are some of the hopes I have for evolution in my simulation and how they might be encouraged.

Natural selection

In contrast to my previous evolution simulations, I want this one to simulate evolution by natural selection. In all my previous simulations of evolution there has been a specific measure of fitness (a fitness function). For example, in my simulation of heterocysts, cells were selected based on the time it took them to produce a filament of a specific, arbitrary length. This simply led to the evolution of cells that could divide fastest, even if cells used up all their resources by the end of the simulation; once they'd reached the target length it didn't matter how healthy they were.

In this simulation the cells that reproduce will reproduce and those that don't, won't. Evolution should therefore be more open-ended and allow multiple strategies to be explored. For example, some cells may evolve to reproduce quickly, while others may evolve to divide more slowly to ensure their daughter cells have larger reserves of energy.
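As a rough sketch of what selection without a fitness function could look like (this is not the project's actual code; the `Cell` class, the `division_threshold` trait, the energy rules and the mutation size are all assumptions made for illustration): each cell gains and spends energy, dies if it runs out, and divides once it has enough. Nothing ranks the cells; lineages simply persist or vanish.

```python
import random
from dataclasses import dataclass


@dataclass
class Cell:
    energy: float
    division_threshold: float  # hypothetical heritable trait


def step(cells, food_per_cell, upkeep, rng):
    """Advance the population one time step with no fitness function."""
    survivors = []
    for cell in cells:
        cell.energy += food_per_cell - upkeep
        if cell.energy <= 0:
            continue  # starved: this cell leaves no further descendants
        if cell.energy >= cell.division_threshold:
            # Split the energy evenly between parent and daughter.
            half = cell.energy / 2
            cell.energy = half
            # Assumed mutation model: small Gaussian change, floored at 0.1.
            mutated = max(0.1, cell.division_threshold + rng.gauss(0, 0.1))
            survivors.append(Cell(half, mutated))
        survivors.append(cell)
    return survivors


rng = random.Random(0)
population = [Cell(energy=1.0, division_threshold=2.0)]
for _ in range(10):
    population = step(population, food_per_cell=1.5, upkeep=0.5, rng=rng)
print(len(population), "cells after 10 steps")
```

A low `division_threshold` gives fast division with poorly provisioned daughters, while a high one gives slow division with well-provisioned daughters, which is the trade-off described above. Food is a fixed amount per cell here, so there is no competition yet.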

Interaction and competition

This is really part of natural selection. Whereas my previous simulations have simulated individuals in isolation and measured their fitness at the end of a 'generation', in this simulation I want to simulate organisms simultaneously. This means that cells will be able to interact with one another directly or indirectly by influencing the environment (see below). I hope this will drive the evolution of ecosystems in which organisms become dependent upon each other's existence.
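One simple way indirect interaction could arise is through a shared, limited resource. The sketch below is only an illustration under assumed rules (the `share_food` function and the proportional-split rule are hypothetical, not taken from the project): when the cells' combined demand exceeds what the pool holds, each cell receives a share in proportion to what it asked for.

```python
def share_food(pool, demands):
    """Split a shared food pool among cells.

    Returns (amount each cell receives, food left in the pool).
    Assumed rule: if demand exceeds supply, scale every request down
    by the same factor.
    """
    total = sum(demands)
    if total <= pool:
        return list(demands), pool - total
    scale = pool / total
    return [d * scale for d in demands], 0.0


# Plenty of food: everyone gets what they asked for.
print(share_food(10.0, [2.0, 3.0]))
# Scarce food: a greedy neighbour reduces what the other cell gets.
print(share_food(4.0, [2.0, 6.0]))
```

With a rule like this, one cell's uptake changes what is available to every other cell, so no individual can be evaluated in isolation.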

Variable environment

To further encourage diverse evolutionary strategies, and thus speciation, I aim to simulate an environment that is heterogeneous both spatially and temporally. Spatial differences will allow cells to adapt to different niches, while temporal changes will force cells to respond to the environment. In fact, one of the main aims of the simulation is to evolve regulatory networks, and in order for a regulatory network to evolve, there must be changes in the environment to respond to.
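To make "heterogeneous both spatially and temporally" concrete, here is a minimal sketch of one environmental variable that depends on both position and time. Everything in it is an assumption chosen for illustration (light as the variable, a sinusoidal day/night cycle, exponential attenuation with depth, and the parameter values); the real environment model may look quite different.

```python
import math


def light_level(depth, time, day_length=24.0, attenuation=0.5):
    """Light intensity between 0 and 1 at a given depth and time.

    Temporal variation: a sine-shaped day, clipped to zero at night.
    Spatial variation: intensity falls off exponentially with depth.
    """
    daylight = max(0.0, math.sin(2 * math.pi * time / day_length))
    return daylight * math.exp(-attenuation * depth)


for hour in (0, 6, 12, 18):
    surface = light_level(depth=0.0, time=hour)
    deep = light_level(depth=3.0, time=hour)
    print(f"hour {hour:2d}: surface={surface:.2f} deep={deep:.2f}")
```

A cell near the surface and a cell at depth experience different conditions (niches), and both see conditions change over the day, which gives a regulatory network something to respond to.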

The simulation of the pool of water in which the cells will live is effectively a separate project, which I am writing about here.

Unconstrained genetics

Whilst all genetics must be constrained to some degree (by which I mean not all phenotypes are possible; at the very least the laws of physics provide some limits), my previous simulations have been very limited. In all cases genes have represented relatively simple and well-defined characteristics, and there has been a set number of genes. In this simulation, I hope to create a sufficiently realistic and open-ended system by creating a function that converts a sequence of amino acids into proteins (i.e. cellular functions). I suspect that getting this mapping correct will be the most important part of building the simulation. Furthermore, there will be no limit to how long a genome can be other than my computer's memory and how much energy it takes for an organism to synthesise the genome.
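The two ideas above, a variable number of genes and an energy cost that scales with genome length, can be sketched as follows. This is purely illustrative: the string representation, the `*` stop symbol, the function names and the cost per unit are assumptions, and the hard part (deciding what each protein actually does) is deliberately left out.

```python
def translate(genome, stop="*"):
    """Split a genome string into protein sequences at stop symbols.

    The number of proteins is not fixed in advance; it depends only on
    how many non-empty segments the genome contains.
    """
    return [segment for segment in genome.split(stop) if segment]


def synthesis_cost(genome, energy_per_unit=0.01):
    """Energy needed to copy a genome, proportional to its length."""
    return len(genome) * energy_per_unit


genome = "MKV*LLA**GT"
print(translate(genome))       # three proteins of different lengths
print(synthesis_cost(genome))  # longer genomes cost more to replicate
```

Under a scheme like this, mutations that insert or delete a stop symbol change the number of genes, and the length-dependent cost is the only thing holding genome size in check.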