Wednesday, 16 November 2016

Congratulatory post: Hail to the Imperial 2016 iGEM team!
By Ismael Mullor-Ruiz

With a bit of delay, we as a team would like to join in the congratulations for our colleagues and collaborators from the Imperial 2016 iGEM team, who triumphed at the iGEM 2016 Giant Jamboree at MIT.

For those who aren’t familiar with it, iGEM (acronym for “International Genetically Engineered Machine”) is the world’s largest synthetic biology contest. It was started 12 years ago at MIT as a summer side-project in which undergrad teams designed synthetic gene circuits never seen before in nature, built them and tested each of the parts. Many of these parts have subsequently pushed forward the field of synthetic biology. Even though it began as an undergrad-level competition with only a handful of teams involved, the competition grew larger and larger to include not only undergrad teams, but also postgrad teams, high school teams and even enterprises. More than 200 teams from all around the globe took part in the latest edition.

Traditionally, synthetic biology involves tinkering with a single cell type (e.g. E. coli) so that it performs some useful function – perhaps outputting an industrially or medically useful molecule. This tinkering involves altering the molecular circuitry of the cell by adding new instructions (in the form of DNA) that result in the cell producing new proteins/RNA that perform the new functions. The focus of this year’s project from the Imperial team was on the engineering of synthetic microbial ecosystems of multiple cell types (known as “cocultures”) rather than a single organism, since more complex capabilities can be derived from multiple cell types working together.

So they began by characterizing the growing conditions of six different “chassis” organisms and creating a database called ALICE. The challenge here resides in the fact that the different organisms had different growing conditions and thus maintaining a steady proportion is really hard to achieve; typically one of the populations ends up taking over in any given set of conditions. Thus, in order to allow self-tuning of the growth of the cocultures, they designed a system consisting of three biochemical modules:

1) A module that allows communication between the populations through a “quorum sensing” mechanism. Population densities of each species are communicated via chemical messengers that are produced within the cells, released and diffuse through the coculture.  Each cell type produces a unique messenger, and the overall concentration of this messenger indicates the proportion of those cells in the coculture.

2) A comparison module that enables a cell to compare the concentration of each chemical messenger. The chemical messengers were designed to trigger the production of short RNA strands in each cell; RNA strands triggered by different messengers bind to and neutralize each other. If there is an excess of the cell’s own species in the coculture, some of the RNA triggered by its own chemical messenger will not be neutralized, and can go on to influence cell behaviour.

3) An effector module. The RNA triggered in response to an excess of the cell’s own species is called “STAR”. It can bind to something known as a riboswitch (see figure below); when it is present, the cell produces a protein that suppresses its own growth. Cells therefore respond to an excess of their own population by reducing their own growth rate, allowing others to catch up. Using a riboswitch for cell division control has several advantages: it is easy to design and to port to any cell type, and it places a smaller burden on the cell than other mechanisms.

Figure 1: Action of STAR in opening the hairpin of a riboswitch. Without STAR, the riboswitch interferes with the expression of certain genes; STAR stops this interference so that the genes are expressed.
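
To make the logic of the three modules concrete, here is a minimal toy simulation. It is not the team's model: the equations, the parameter values and names such as `feedback_gain` and `dilution` are assumptions chosen purely for illustration. Two species with different growth rates share a vessel that is continuously diluted. Each cell's messenger level is taken to be proportional to its own population (module 1), the un-neutralised RNA is the excess of own messenger over the other (module 2), and that excess slows the cell's own growth (module 3).

```python
def simulate(feedback_gain, rates=(1.0, 0.8), dilution=0.1, capacity=1.0,
             dt=0.01, t_end=500.0):
    """Return the final fraction of species 0 in a two-species coculture.

    Toy model (all assumptions): logistic-style crowding, constant dilution,
    and growth suppression proportional to the excess of a cell's own species.
    Setting feedback_gain to 0 switches the control system off.
    """
    n = [0.01, 0.01]  # starting population densities (arbitrary units)
    for _ in range(round(t_end / dt)):
        crowding = max(0.0, 1.0 - (n[0] + n[1]) / capacity)
        updated = []
        for i in (0, 1):
            own, other = n[i], n[1 - i]
            # Modules 1 and 2: messengers track populations; the RNAs they
            # trigger neutralise each other, so only the excess survives.
            star = max(0.0, own - other)
            # Module 3: surviving STAR RNA suppresses the cell's own growth.
            growth = rates[i] * crowding / (1.0 + feedback_gain * star)
            updated.append(own + dt * own * (growth - dilution))
        n = updated
    return n[0] / (n[0] + n[1])


if __name__ == "__main__":
    print("fraction of fast grower, no control:  ", simulate(0.0))
    print("fraction of fast grower, with control:", simulate(50.0))
```

In this sketch the faster grower takes over almost completely when the feedback is switched off, while with feedback the two populations settle close to an even split. The real system has to cope with messenger diffusion, delays and noise, none of which appear here.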

As a demonstration of the concept, the students implemented this control system in different coloured strains of bacteria in order to create different pigments (analogous to the Pantone colour standard) through the coculture and combination of the strains. The approach is very generic, however, and as the team mention on their wiki, the possibilities of cocultures go way beyond this!

If you want to know more about the project, you can check out the team’s wiki:

Thursday, 6 October 2016

Replication, Replication, Replication I

This post and the one below it are linked. Here, I discuss a topic that interests us as a group, and below I look at some recent related papers. This post should make reasonable sense in isolation, the second perhaps less so.

Replication is at the heart of biology; whole organisms, cells and molecules all produce copies of themselves. Understanding natural self-replicating systems, and designing our own artificial analogues, is an obvious goal for scientists - many of whom share dreams of explaining the origin of life, or creating new, synthetic living systems.

Molecular-level replication is a natural place to start, since it is (in principle) the simplest, and also a necessary component of larger-scale self-replicating systems. The most obvious example in nature is the copying of DNA; prior to cell division, a single copy of the entire sequence of base pairs in the genome must be produced. But the processes of transcription (in which the information in DNA sequence is copied into an RNA sequence) and translation (in which the information in RNA sequence is copied into protein sequence) are closely related to replication. The information initially present in the DNA sequence is simply written out in a new medium, like printing off a copy of an electronic document. This process is illustrated in the figure above (which I stole from here). This figure nicely emphasises the polymer sequences (shown as letters) that are being copied into a new medium (note: three RNA bases get copied into one amino acid in a protein: AUG into M, for example). An absolutely fundamental feature of both replication and copying processes is that the copy, once produced, is physically separated from the template from which it was produced. This is important, otherwise the copies couldn't fulfill their function, and more copies could not be made from the same template.

This single fact - that useful copies must separate from their template yet retain the copied information - makes the whole engineering challenge far harder. It's (reasonably) straightforward to design a complex (bio)chemical system that assembles on top of a template, guided by that template. All you need are sufficiently selective attractive interactions between copy components and the template. But if you then want to separate your copy from the template, these very same attractive interactions work against you, holding the copy in place - and more accurate copies hold on to the template more tightly. My collaborators and I formalise this idea, and explore some of the other consequences of needing to separate copies from templates, in this recent paper.

Largely because of this problem, no-one has yet constructed a purely chemically driven, artificial system that produces copies of long polymers, as nature does. Instead, it has proved necessary to perform external operations such as successively heating and cooling the system. Copies can then grow on the template at low temperature, and then fall off at high temperature, allowing a new copy to be made when the system is cooled down. This is exactly what is done in the PCR, an incredibly important process for amplifying a small amount of DNA in areas ranging from forensics to medicine.
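
The payoff of the heat/cool cycle is easy to see in a few lines of code. This is only an idealised sketch: the function name and the `efficiency` parameter (the fraction of strands that successfully template a copy in each cycle) are assumptions for illustration, and real PCR eventually saturates as primers and nucleotides run out.

```python
def thermal_cycling(initial_templates, cycles, efficiency=1.0):
    """Idealised count of strands after a number of heat/cool cycles."""
    strands = float(initial_templates)
    for _ in range(cycles):
        # Low temperature: copies grow on the available templates.
        new_copies = efficiency * strands
        # High temperature: copies fall off and can act as templates
        # themselves in the next cycle.
        strands += new_copies
    return strands


if __name__ == "__main__":
    for cycles in (10, 20, 30):
        print(cycles, "cycles:", thermal_cycling(1, cycles))
```

With perfect efficiency the number of strands doubles every cycle, which is why a handful of molecules can be amplified to a measurable quantity in a few dozen cycles. The catch, as discussed above, is that the separation step is driven from outside rather than by the chemistry itself.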

As a group, we're very interested in how copying/replication can be achieved without this external intervention. Two recent papers, discussed in the blog entry below, highlight the questions at hand.

Replication, Replication, Replication II

Here I discuss two recent experimental papers that are related to the challenge of replication or copying, following on from the discussion in "Replication, Replication, Replication I". My take on these papers is heavily couched in terms of that discussion.

Semenov et al.: Autocatalytic, bistable oscillatory networks of biologically relevant reactions
Nature 537, 656–660 (2016)
A catalyst accelerates a chemical reaction involving a substrate. For example, amylase accelerates the interconversion of starch and sugars, helping us to digest food. As we learnt at school, a key feature of catalysts is that they are not consumed by the reaction - a single amylase can digest many starch molecules. This fact should remind us of the replication/copy process discussed above, in which it is important that a new copy separates from its template so that the template is not consumed by the copy process, and can go on to produce many more copies. Indeed, templates for copying/replication must be catalysts. In the specific case of replication, the process is autocatalytic, meaning that a molecule is a catalyst for the production of identical molecules. Simple autocatalytic systems are thus often seen as a bridging point to the full complexity of life.

Semenov et al. show that a particularly simple set of molecules can exhibit autocatalytic behaviour. Although autocatalysis has been previously demonstrated, the novelty of their approach is the use of such simple organic molecules (which could plausibly have been present on Earth prior to living organisms). Additionally, they are able to show relatively sophisticated behaviour from their system - not just exponential growth of the output molecule (the natural behaviour of autocatalytic systems). When molecules that cause inhibition of autocatalysis and degradation of components are added, for example, the output concentration can be made to oscillate.
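
For readers who want to see why exponential growth is the natural behaviour of an autocatalyst, here is a minimal sketch of the generic textbook scheme S + X -> 2X, integrated with a simple Euler step. It is not the chemistry of Semenov et al.; the rate constant, concentrations and function name are assumptions for illustration, and the inhibition and degradation steps needed for oscillations are deliberately left out.

```python
def autocatalysis(x0=1e-3, s0=1.0, k=1.0, dt=0.001, t_end=20.0):
    """Euler integration of S + X -> 2X with rate k*[S]*[X].

    Returns the list of [X] values, one per time step (including t = 0).
    """
    x, s = x0, s0
    trajectory = [x]
    for _ in range(round(t_end / dt)):
        rate = k * s * x      # production of X is proportional to X itself
        x += dt * rate
        s -= dt * rate        # each new X consumes one substrate molecule
        trajectory.append(x)
    return trajectory


if __name__ == "__main__":
    traj = autocatalysis()
    steps_per_unit_time = 1000  # because dt = 0.001
    for t in (0, 1, 2, 5, 10, 20):
        print("t =", t, " [X] =", traj[t * steps_per_unit_time])
```

While substrate is plentiful, [X] grows by roughly a factor of e per unit time (for k[S] = 1); growth only stops once the substrate has been used up.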

Although fascinating, the work of Semenov et al. does not solve the question raised in the blog post above. There are no long polymers in this system, and so the difficulty of separating strongly-interacting copies and templates does not arise. But as a consequence, this autocatalytic mechanism passes on very little information (arguably, none) to the new molecules produced. Autocatalysis alone is not enough  - we are still a long way from processes such as DNA replication, transcription and translation.

Meng et al.: An autonomous molecular assembler for programmable chemical synthesis 
Nature Chemistry 8, 542–548 (2016)
This paper, co-authored by my collaborators in the Turberfield group, takes a completely different approach. The idea is to specify the sequence of a molecular polymer using a DNA-based programme. As I have talked about before, the exquisite selectivity of base-pairing in DNA allows reactions to be programmed into carefully designed single strands, allowing them to self-assemble into complex patterns when mixed. In this case, the authors mix sets of short DNA strands that are designed to assemble into a long double-stranded structure in a specific order. The selectivity of interactions allows the strands to be programmed to bind one by one to the end of the structure in the desired sequence.

This process (the hybridisation chain reaction) is not new. The advance is using it to template the sequence of a second (chemically quite different) polymer that can't assemble with a specific sequence on its own - for simplicity, let's call this polymer X (its details aren't important). The authors ingeniously attach building blocks of X to the DNA strands - with each distinct DNA sequence paired with a distinct building block. When a new strand is incorporated via the hybridisation chain reaction, it brings with it the associated building block and adds it to X, which grows simultaneously with the double-stranded DNA construct. The details of this process are a bit fiddly, and due to a technicality a new building block is only added for every second strand incorporated, but the process as a whole allows them to assemble a specific polymer X using DNA-based instructions set by the sequences of the original strands. The authors call this programmed chemical synthesis.

The authors are inspired by the ribosome (see fig, stolen from here), the biological machine that translates an RNA sequence (the red polymer) into a polypeptide sequence (green), which eventually folds into a protein. The ribosome uses RNA base pairing to bring a set of peptide building blocks together in the right order, just as the device of Meng et al. uses DNA base-pairing to form polymer X. However, there is a key difference. The information-carrying RNA strand in the figure is not consumed by the process; it acts as a catalyst, as discussed, and the ribosome walks along it until the end and then releases it. The information-carrying components of the system of Meng et al. (the strands that carry the molecular programme) are consumed, being incorporated into a long double-stranded DNA molecule that the authors actually use to analyse the success of the process. Thus although the system allows programmable self-assembly, it doesn't implement catalysis and hence can't perform copying/replication.

Both papers are great pieces of work, but one demonstrates autocatalysis without information transfer, and the other demonstrates the ability to programme polymer assembly without autocatalysis. The challenge to produce chemical systems that copy or replicate is still on.

Wednesday, 15 June 2016

Reading list

Here are some papers we have been reading recently:

Neural Sampling by Irregular Gating Inhibition of Spiking Neurons and Attractor Networks by Lorenz K. Müller and Giacomo Indiveri
This paper shows how a neural network model implements an MCMC sampler.

Trade-Offs in Delayed Information Transmission in Biochemical Networks by F. Mancini, M. Marsili and A. M. Walczak
Here the authors investigate the dissipation required for simple models of sensors to transmit information.

Discrete fluctuations in memory erasure without energy cost by Toshio Croucher, Salil Bedkihal, and Joan A. Vaccaro
This extends Landauer’s principle to an angular momentum cost instead of an energy cost.

Experimental rectification of entropy production by a Maxwell's Demon in a quantum system by P. A. Camati, J. P. S. Peterson, T. B. Batalhão, K. Micadei, A. M. Souza, R. S. Sarthour, I. S. Oliveira and R. M. Serra
This paper describes the theory of a quantum Maxwell’s demon and an experiment where both the demon and the system are spin-1/2 quantum systems.

Minimal positive design for self-assembly of the Archimedean tilings by Stephen Whitelam
This paper shows that a certain amount of specificity in interactions is required for particles to self-assemble into a certain pattern.

Energy-Efficient Algorithms by Erik D. Demaine, Jayson Lynch, Geronimo J. Mirano and Nirvan Tyagi
The authors consider a formalism for identifying the minimal energetic costs of efficient computational algorithms.

Information Flows? A Critique of Transfer Entropies by Ryan G. James, Nix Barnett and James P. Crutchfield
This paper highlights the subtleties in identifying "flows of information" from one system to another.

Monday, 29 February 2016

Optimal odour receptors

The sense of smell is the ability to detect molecules of different chemicals in the air. This happens via an array of receptor molecules in your nose that bind to odour molecules and send signals to your brain. A simple design would have one receptor for each molecule. However, it’s more complicated than that. Humans can detect over 2100 different molecules and can tell the difference between mixtures of up to 30 different molecules, using a much smaller number of distinct receptors. Receptors therefore need to respond to more than one different molecule each and the brain must put together all of the signals from the receptors to identify the odour. A recent paper by David Zwicker, Arvind Murugan and Michael P. Brenner discusses the optimal setup of receptors from an information-theoretic perspective.

One important point to note is that different odours are present in the natural environment with different frequencies and this ought to be taken into account when designing your receptor array. It is no use having an array that very accurately distinguishes between two very rare odours if it can’t distinguish between two very common odours. This is made more precise by Laughlin’s principle which says that an optimal sensor (one that conveys the most information) should be such that all possible outputs are equally likely in the natural environment.

Zwicker, Murugan and Brenner construct a simple model of the response of the receptor array to different mixtures of odour molecules in which the receptors respond proportionally to the concentrations of the molecules. Each receptor has a different sensitivity to each of the molecules and it transmits a binary signal that is 1 if the excitation is above a threshold and 0 if it is below. They find that there are two principles that are relevant to maximising the information:

(1) Any given receptor should be active half the time, which maximises the amount of information provided by that receptor in isolation.

(2) Each pair of receptors should have as uncorrelated (non-matching) a response as possible, reducing the redundancy between them.
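
Both principles can be checked with a few lines of standard information theory. The sketch below is not taken from the paper; the function names and the `p_match` parameter (the probability that two receptors give the same output) are assumptions for illustration. The first function gives the information carried by one binary receptor that is active with probability p; the second gives the joint information of two receptors that are each active half the time.

```python
from math import log2


def binary_entropy(p):
    """Entropy in bits of a receptor that is active with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * log2(p) - (1.0 - p) * log2(1.0 - p)


def joint_entropy(p_match):
    """Joint entropy in bits of two receptors, each active half the time,
    whose outputs agree with probability p_match."""
    probs = [p_match / 2, p_match / 2, (1 - p_match) / 2, (1 - p_match) / 2]
    return -sum(q * log2(q) for q in probs if q > 0)


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5):
        print("active fraction", p, "->", binary_entropy(p), "bits")
    for p_match in (0.5, 0.75, 1.0):
        print("match probability", p_match, "->", joint_entropy(p_match), "bits")
```

A single receptor carries the full 1 bit only when it is active half the time, and a pair carries the full 2 bits only when their outputs are uncorrelated (they match exactly half the time). Two receptors that always agree carry just 1 bit between them.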

They then go on to discuss to what extent these principles can be satisfied in general and which parameters of their model give the maximum information. The most interesting outcome is a fundamental trade-off between being effective at identifying the presence or not of as many molecules as possible, and being able to estimate the concentrations of molecules accurately. This trade-off arises because the best way to identify as many molecules as possible is for each to activate only a single receptor, whereas to estimate concentration each molecule should activate a number of receptors with different thresholds.
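
The trade-off can be illustrated with a toy version of the binary receptor model described above. The sensitivity matrices and the threshold value here are made up for illustration and are not taken from the paper. Each receptor sums sensitivity times concentration over all molecules and reports 1 if the total exceeds a shared threshold; giving a molecule different sensitivities at different receptors is equivalent to giving those receptors different effective thresholds for it.

```python
def receptor_outputs(sensitivities, concentrations, threshold=1.0):
    """Binary output of each receptor for a given mixture.

    sensitivities[r][m] is the (hypothetical) sensitivity of receptor r
    to molecule m; concentrations[m] is the concentration of molecule m.
    """
    outputs = []
    for row in sensitivities:
        excitation = sum(s * c for s, c in zip(row, concentrations))
        outputs.append(1 if excitation > threshold else 0)
    return outputs


# Design A: four receptors, four molecules, one receptor per molecule.
identify = [[2.0, 0.0, 0.0, 0.0],
            [0.0, 2.0, 0.0, 0.0],
            [0.0, 0.0, 2.0, 0.0],
            [0.0, 0.0, 0.0, 2.0]]

# Design B: four receptors, one molecule, a ladder of sensitivities.
ladder = [[2.0], [1.0], [0.5], [0.25]]

if __name__ == "__main__":
    print(receptor_outputs(identify, [1.0, 0.0, 1.0, 0.0]))
    print(receptor_outputs(identify, [5.0, 0.0, 5.0, 0.0]))
    print(receptor_outputs(ladder, [1.5]))
    print(receptor_outputs(ladder, [3.0]))
```

Design A reports which of four molecules are present but gives the same output whether a molecule is at concentration 1 or 5. Design B resolves several concentration ranges, but spends all four receptors on a single molecule.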

This paper is a good example of one of the ideas in William Bialek’s paper that Tom McGrath wrote about in a previous post. The authors are assuming that this biological process is operating near to the optimum that physical limits allow and then they are trying to find the parameters of the system that correspond to these limits. One criticism of the paper might be that it takes no account of the fact that some odours are more important than others, even if they are very rare. For example, accurately identifying the smell of a poisonous substance that an animal encounters occasionally might be more important than distinguishing between the smells of two good food sources that are encountered more frequently.

Wednesday, 20 January 2016

New contributors

This term I'll be encouraging some of the students I work with here at Imperial to contribute to this blog; Tom McGrath has courageously stepped up to take a first swipe with a discussion of a recent review from Bill Bialek on physics in biology.

Thoughts on William Bialek's "Perspectives on theory

In modelling almost anything in biology the complexity of the process leads almost inevitably to an explosion of free parameters whose tuning may give very different results. While this is often explained away as inevitable due to the rich diversity of the natural world, for a physicist there is something unsettling about a profusion of parameters, coupled with a hope that they could be obtained instead by resorting to some underlying principle. The question is: what principle (or principles) might be at work here? A recent post on the arXiv by Professor William Bialek at Princeton and CUNY summarises a discussion on this topic at the Simons Foundation 2014 Theory in Biology workshop.

The main goal of this paper is to explore possible guiding principles that allow us to recover the correct spot in parameter space, or render precise tuning of parameters unnecessary. A second topic is a pointed but valid critique of how theory (whether implicit or in an explicit mathematical form) is currently considered in the biological community, which I won't discuss here - if you're interested look in sections II - IV and IX of the paper. The three classes of principles explored in the paper are: functional behaviours emerging as 'robust' properties of biological systems without precise tuning (section VI), optimality arguments based on evolution (section VII), and emergent phenomena from large interacting systems (section VIII). In the interests of remaining concise I'll miss out a lot of the interesting examples from the paper and concentrate instead on one or two of the main examples in each section - the paper itself is well worth a look for these alone.

In Section VI the main emphasis is on robust behaviours - properties where entire regions of parameter space will produce similar results. In this case the main example is spike train production in neurons. Varying the copy number of protein channels in neurons produces neurons with qualitatively different spike train characteristics including silence, single, double, and triple spike bursts and rapid repeated spiking. This resolves a continuous parameter space into a set of distinct objects which can be combined to form a functional system; by adjusting the copy numbers of any particular neuron in a network it can be robustly changed from one type to another. This suggests a view of neurons as more like building blocks than objects for individual investigation (although this insight would of course have been impossible without such prior study!), and leads to questions of what can be done with networks of neurons, and what naturally emerges. The ability to maintain a continuum of fixed points is not generic and requires tuning parameters, so how does it emerge? The suggested answer is through feedback, which provides a signal for how well the network is doing, and thus allows for further tuning of parameters towards the desired behaviour, which is illustrated by a surprising and elegant experiment involving confused goldfish!

Section VII broadly discusses biological measurement processes (like vision or chemical sensing) and their nearness to optimality. If operation near biological limits is the rule, then this provides a guide to selection of parameters - choose the parameter set which is as close as the system allows to the physical limits imposed upon it. Again neuronal networks provide a good example here, for instance the ability of flies to avoid being swatted. The combination of their fast movement speed and low-resolution eyes leads to a requirement for the fly brain to carry out near-exact motion estimation. This is a nice idea - it arises from a clear biological principle but can be formalised precisely in terms of information-theoretic concepts. It also ties in nicely to the principle in the previous section; information generates feedback, which helps with the precise tuning of a system to achieve the desired effect. From a thermodynamic perspective, however, more accurate measurement can often be carried out at an increasing energy cost (for example the Berg and Purcell work on cellular sensing). This implies a tradeoff: the organism can balance expenditure with increased information gain and it's not clear where this balance should be set. In non-sensing organs it's not clear what design principles could be at work - although information makes a lot of these ideas easy to quantify, it also limits their range of application.

Section VIII talks mainly about statistical physics-type models involving emergent behaviour from a large number of relatively simple agents. The main example here is the highly successful model of flocking in birds. This was revolutionised by the ability to take large-scale measurements of many (>1,000) individual birds simultaneously, which was used to build a maximum entropy flocking model. Local interactions are present; however, long-range correlations emerge in the flock through Goldstone modes (in the case of direction) and tuning to a critical point in velocity. Further experiments show that criticality, rather than symmetry breaking, is the most common mechanism for the emergence of long-range correlations.

Although Section VIII is claimed to be a separate guiding principle earlier in the paper, the re-emergence of criticality as a central player suggests that really this is more an exploration of the consequences of the principle discussed in Section VII. Because of this it seems that there's really one principle at work here - emergent behaviour through tuning to criticality where the tuning occurs due to the flow of information in the system, which we expect to be as close to physical limits as possible. This seems like a fruitful avenue for the creation of characteristic models which capture these ideas together while remaining as simple as possible, and I'd be very interested to learn of any such models currently in existence.