Friday, 15 September 2017

We're famous! (ish) Tom's been answering questions on the "Naked Scientists" podcast

The Naked Scientists are based at Cambridge University's Institute of Continuing Education (ICE). In their own words, they are a team of scientists, doctors and communicators whose passion is to help the general public to understand and engage with the worlds of science, technology and medicine.

They recently contacted me to provide an answer to the following question:

"Water is solid below 0 degrees C, a liquid from 0 to 100 degrees C, and a gas above 100 degrees C. Then why does washing and the roads dry when the temperature isn't 100 degrees C?"

I think the answer's actually quite subtle, for such a deceptively simple-sounding question. My recorded answer can be downloaded from here. Full transcript below!

Interviewer: I asked Thomas Ouldridge from Imperial College London to hang Norm’s question out to dry
Tom: It is true that pure water will be a gas (called water vapour) only above 100 degrees Celsius, but temperature isn’t the only factor at play here. 
The surrounding pressure also impacts when a substance like water can be a gas – the higher the pressure, the higher the temperature required to be a gas. You've probably all heard of LPG - this stands for liquified petroleum gas. The chemicals in LPG would normally be a gas, but by keeping LPG in high-pressure cannisters it can be stored  as a liquid.
So gases, like [insert humorous reference here], often can't handle pressure. But why is this? 
To exist as a gas, water molecules have to be widely spaced out. High enough pressure will simply squeeze them back into their more compact liquid form. 
The more you heat the water, the more energy you give to the molecules and the harder they push back against their surroundings. Above 100 degrees Celsius, but not below, water molecules can push back hard enough against the pressure of the atmosphere for pure water to stay as a gas.
Well, personally I tend to hang my washing inside the earth’s atmosphere, and below boiling point, and it still dries. So what’s going on?
Well, let’s consider a puddle, for example. You might think it should stay as a liquid because the temperature is below 100 degrees, and the atmosphere is pushing down on it. However, 
we don't only have water molecules involved - the system isn't pure. The air above the surface of the puddle contains many other molecules, such as nitrogen and oxygen. 
These extra molecules can actually help to push back against the surrounding atmosphere, effectively lowering the pressure that must be supported by the water molecules if they form a gas. It’s like many people helping to lift a weight rather than just one. 
In fact, there’s so much more nitrogen and oxygen that they take almost all of the burden of the atmospheric pressure. 
This is important: any water molecules that have enough energy to escape from the puddle don’t face the full might of the atmospheric pressure – so they don’t immediately liquidise. 
This is why some water vapour can survive in the atmosphere, thanks to the hard work of the other gasses, and we can explain why evaporation happens and puddles (or clothes) dry under normal conditions.
Only a certain amount of water vapour can be supported by the other gasses though, which is why things don’t evaporate immediately, and why movement of air is important if you want things to dry faster. 
If you want to see this in action for yourself, lick your wrist and blow on it. It dries almost immediately compared to if you don’t blow.  So that’s evaporation covered, but how is that different from boiling? 
When water reaches 100 degrees at atmospheric pressure, it has so much energy that the vapour no longer needs the help of other gases to be stable. In our analogy, the water vapour is now like a weightlifter that is strong enough to support the “weight” of the atmosphere on its own.
At this point, bubbles of vapour can form within the liquid itself, converting liquid to gas much faster than slow evaporation from the liquid surface.

Thursday, 6 July 2017

What we learn from the learning rate


Cells need to sense their environment in order to survive. For example, some cells measure the concentration of food or the presence of signalling molecules. We are interested in studying the physical limits to sensing with limited resources, to understand the challenges faced by cells and to design synthetic sensors.

We have recently published a paper (arxiv version) where we explore the interpretation of a metric called the learning rate that has been used to measure the quality of a sensor (New J. Phys. 16 103024, Phys. Rev E 93 022116). Our motivation is that in this field a number of metrics (a metric is a number you can calculate from the properties of the sensor that, ideally, tells you how good the sensor is) have been applied to make some statement about the quality of sensing, or limits to sensory performance. For example, a limit of particular interest is the energy required for sensing. However, it is not always clear how to interpret these metrics. We want to find out what the learning rate means. If one sensor has a higher learning rate than another what does that tell you? 

The learning rate is defined as the rate at which changes in the sensor increase the information the sensor has about the signal. The information the sensor has about the signal is how much your uncertainty about the state of the signal is reduced by knowing the state of the sensor (this is known as the mutual information). From this definition, it seems plausible that the learning rate could be a measure of sensing quality, but it is not clear. Our approach is a test to destruction – challenge the learning rate in a variety of circumstances, and try to understand how it behaves and why

To do this we need a framework to model a general signal and sensor system. The signal hops between discrete states and the sensor also hops between discrete states in a way that follows the signal. A simple example is a cell using a surface receptor to detect the concentration of a molecule in its environment.


The figure shows such a system. The circles represent the states and the arrows represent transitions between the states. The signal is the concentration of a molecule in the cell’s environment. It can be in two states; high or low, where high is double the concentration of low. The sensor is a single cell surface receptor, which can be either unbound or bound to a molecule. Therefore, the joint system can be in four different states. The concentration jumps between its states with rates that don’t depend on the state of the sensor. The receptor becomes unbound at a constant rate and is bound at a rate proportional to the molecule concentration. 

We calculated the learning rate for several systems, including the one above, and compared it to the mutual information between the signal and the sensor. We found that in the simplest case, shown in the figure, the learning rate essentially reports the correlation between the sensor and the signal and so it is showing you the same thing as the mutual information. In more complicated systems the learning rate and mutual information show qualitatively different behaviour. This is because the learning rate actually reflects the rate at which the sensor must change in response to the signal, which is not, in general, the equivalent to the strength of correlations between the signal and sensor. Therefore, we do not think that the learning rate is useful as a general metric for the quality of a sensor.

Tuesday, 4 July 2017

Becoming more certain about uncertainty in molecular systems

By Jenny Poulton

Due to the unpredictability of motion at the microscopic scale, molecular processes have randomness associated with them, exhibiting what we call thermodynamic fluctuations. A group in Germany lead by Barato and Seifert have written a series of papers, beginning with "Thermodynamic uncertainty relation for biomolecular processes" (preprint here), exploring how uncertainty in the number of reaction steps taken by a molecular process is related to the degree to which the system is constantly consuming energy.

To be more precise, Barato and Seifert consider the number of times a system completes a cycle in a given time window. A good example of this kind of setup is the rotary motor F0F1-ATPsynthase (below, image taken from Wikipedia).
This motor is used to create the chemical fuel source of the cell (ATP) from its components (ADP and inorganic phosphate P). In order to drive this process, a current of hydrogen ions flows through the top half of the motor, causing it to systematically rotate in one direction with respect to the bottom half. This rotation is physically linked to the reaction ADP + P -> ATP, and so ATP is created. This one-directional rotational motion only arises because the current of hydrogen ions continuously supplies more energy (more technically, free energy) to the system than is needed to create the ATP. We say that the current of ions drives the system.

In general, small driven systems have a bias towards stepping forward, but there is still a non-zero probability of stepping backwards due to thermodynamic fluctuations. We also cannot predict exactly how long the system will take to complete each step of the cycle, and so the time taken per step is variable. Thus the number of cycles completed in a given time is uncertain. It is, however, possible to define an average of the net number of cycles in a time window µ and a variance σ2, which is a mathematical measure of the typical deviation from the average due to fluctuations. The Fano factor F = σ2/µ gives a measure of the relative importance of the random fluctuations about the average.

In the paper "Thermodynamic uncertainty relation for biomolecular processes", Barato and Seifert relate the energy consumption and the Fano factor via F ≤ 2kT /E. Here E is the energy consumed per cycle, T is the temperature and k is Boltzmann’s constant. This expression means that the Fano factor is at least as big as the quantity 2kT /E. Thus a cycle which uses a certain amount of fuel E has an upper limit to its precision, and there is an evident trade-off between the amount of energy dissipated per cycle and the Fano factor.

In the original paper, the authors only prove their relation for very simple processes. However, it has since been generalised in this paper (preprint here). The result is actually based on very deep statements about the types of fluctuating processes that are possible in physical systems. One of the challenges now is to take this fundamental insight and apply it to gain a better understanding of practical systems. Fortunately, the F0F1-ATPsynthase rotary motor is not the only example of an interesting biological system that  undergoes driven cycles; the cell contains a huge variety of molecular motors that can also be understood in this way (preprint here). Molecular timekeepers that are vital to the cellular life cycle also depend on driven cycles. Understanding the trade-offs between unwanted variability and energy consumption will be vital in engineering such systems.

Tuesday, 11 April 2017

Two papers on the fundamental principles of biomolecular copying

Single cells, which are essentially bags of chemicals, can achieve remarkable feats of information processing. Humans have designed computers to perform similar tasks in our everyday world. The question of whether it is possible to emulate cells and use molecular systems to perform complex computational tasks in parallel, at an extremely small scale and consuming a low amount of power, is one that has intrigued many scientists.

In collaboration with the ten Wolde group from AMOLF Amsterdam, we have just published two articles in Physical Review X and Physical Review Letters that get to the heart of this question. 

The readout molecules (orange) act as copies of the binding
state of the receptors (purple), through catalytic
phosphorylation/dephosphorylation reactions.

In the first, “The Thermodynamics of computational copying in biochemical systems”, we show that a simple molecular process occurring inside living cells - a phosphorylation/dephosphorylation cycle - is able to copy the state of one protein (for example, whether a food molecule is bound to it or not) into the chemical modification state of another protein (phosphorylated or not). This copy process can be rigorously related to those performed by conventional computers.
We thus demonstrated that living cells can perform the basic computational operation of copying a single bit of information. Moreover, our analysis revealed that these biochemical computations can occur rapidly and at a low power consumption. The article shows precisely how natural systems relate to and differ from traditional computing architectures, and provides a blueprint for building naturally-inspired synthetic copying  systems that approach the lower limits of power consumption.
The production of a persistent copy from a template.
The separation in the final state is essential.
A more complex natural copy operation is the production of polymer copies from polymer templates, as discussed in this previous post. Such processes are necessary for DNA replication, and also for the production of proteins from DNA templates via intermediate RNA molecules. For cells to function, the data in the original DNA sequence of bases must be faithfully reproduced - each copy therefore involves copying many bits of data. 

In the second article, "Fundamental costs in the production and destruction of persistent polymer copies", we consider such processes. We point out that these polymer copies must be persistent to be functional. In other words, the end result is two physically separate polymers: it would be useless to produce proteins that couldn't detach from their nucleic acid templates. As a result, the underlying principles are very different from the superficially similar process of self-assembly, in which molecules aggregate together according to specific interactions to form a well-defined structure. 

In particular, we show that the need to produce persistent copies implies that more accurate copies necessarily have a higher minimal production cost (in terms of resources consumed) than sloppier copies. This result, which is not true if the copies do not need to physically separate from their templates, sets a bound on the function of minimal self-replicating systems.

Additionally, the  results suggest that polymer copying processes that occur without external intervention (autonomously) must occur far from equilibrium. Being far from equilibrium means that processes are highly irreversible - taking a forwards step is much more likely than taking a backwards step. This finding draws a sharp distinction with self-assembling systems, that typically assemble most accurately when close to equilibrium. This difference may explain why recent years have shown an enormous growth in the successful design of self-assembling molecular systems, but autonomous synthetic systems that produce persistent copies through chemical means have yet to be constructed.
Taken together, these papers set a theoretical background on which to base the design of synthetic molecular systems that achieve computational processes such as copying and information transmission. The next challenge is now to develop experimental systems that exploit these ideas.

Monday, 3 April 2017

Working with the City of London School on an exciting iGEM project

Today I meet with a group of school students (aged 16-18) from the City of London School, who will be working on a project for iGEM this year. iGEM is an international competition for school, undergrad and postgrad teams to design, model and build complex systems by engineering cells. Last year, Imperial won the overall prize, as discussed in this post by Ismael. 

Without giving too much away, the students will be working on a system based on a newly-developed molecular device, the toehold switch. Toehold switches are RNA molecules that contain the information required to produce proteins. This information is hidden via interactions within the RNA, which cause it to fold up into a shape that prevents the sequence from being accessed. If, however, a second strand of RNA with the right sequence is present, the structure can be opened up and protein production is possible.

This idea has been around for a reasonable while, but toehold switches are particularly useful, because they provide a better decoupling of the input, output and internal operation of the switch than previous designs. This is the principal of modularity that underlies the work of many of my colleagues here at Imperial, and allows for systematic engineering of molecular systems. This modularity is key to the proposed project.

I've been giving the students advice on how to model the operation of a toehold switch, in order that they can explore the design space before getting into the lab.


Wednesday, 11 January 2017

A simple biomolecular machine for exploiting information



Biological systems at many scales exploit information to extract energy from their environment. In chemotaxis, single-celled organisms use the location of food molecules to navigate their way to more food; humans use the fact that food is typically found in the cafeteria. Although the general idea is clear, the fundamental physical connection between information and energy is not yet well-understood. In particular, whilst energy is inherently physical, information appears to be an abstract concept, and relating the two consistently is challenging. To overcome this problem, we have designed two microscopic machines that can be assembled out of naturally-occurring biological molecules and exploit information in the environment to charge a chemical battery. The work has just been published as an Editor's selection in Physical Review Letters: http://journals.aps.org/prl/abstract/10.1103/PhysRevLett.118.028101

The basic idea behind the machines is simple, and makes use of pre-existing biology. We employ an enzyme that can take a small phosphate group (one phosphorus and several oxygen atoms bound together) from one molecule and attach it to another – a process known as phosphorylation. Phosphorylation is the principal signaling mechanism within a cell, as enzymes called kinases use phosphyrlation to activate other proteins. In addition to signalling, phosphates are one of the cell’s main stores of energy; chains of phosphate bonds in ATP (the cell’s fuel molecule) act as batteries. By ‘recharging’ ATP through phosphorylation, we store energy in a useful format; this is effectively what mitochondria do via a long series of biochemical reactions.





Fig 1.: The ATP molecule (top) and ADP molecule (bottom). Adenosine (the "A") is the group of atoms on the right of the pictures; the phosphates (the P) are the basic units that form the chains on the left. In ADP (Adenosinediphosphate) there are two phosphates in the chain; in ATP((Adenosinetriphosphate) there are three. 


The machines we consider have three main components: the enzyme, the ‘food’ molecule that acts as a source of phosphates to charge ATP, and an activator for the enzyme, all of which are sitting in a solution of ATP and its dephosphorylated form ADP. Food molecules can either be charged (i.e. have a phosphate attached) or uncharged (without phosphate). When the enzyme is bound to an activator, it allows transfer of a phosphate from a charged food molecule to an ADP, resulting in an uncharged food molecule and ATP. The reverse reaction is also possible.

In order to systematically store energy in ATP, we want to activate the enzyme when a charged food molecule is nearby. This is possible if we have an excess of charged food molecules, or if charged food molecules are usually located near activators. In the second case, we're making use of information: the presence of an activator is informative about the possible presence of a charged food molecule. This is a very simple analogue of the way that cells and humans use information as outlined above. Indeed, mathematically, the 'mutual information' between the food and activator molecules is simply how well the presence of an activator indicates the presence of a charged food molecule. This mutual information  acts as an additional power supply that we can use to charge our ATP-batteries. We analyse the behaviour of our machines in environments containing information, and find that they can indeed exploit this information, or expend chemical energy in order to generate more information. By using well-known and simple components in our device, we are able to demystify much of the confusion over the connection between abstract information and physical energy.

A nice feature of our designs is that they are completely free-running, or autonomous. Like living systems, they can operate without any external manipulation, happily converting between chemical energy and information on its own. There’s still a lot to do on this subject; we have only analysed the simplest kind of information structure possible and have yet to look at more complex spatial or temporal correlations. In addition, our system doesn’t learn, but relies on ‘hard-coded’ knowledge about the relation between food and activators. It would be very interesting to see how machines that can learn and harness more complex correlation structures would behave.

Authored by Tom McGrath

Wednesday, 16 November 2016

Congratulatory post: Hail to the Imperial 2016 iGEM team!
By Ismael Mullor-Ruiz

With a bit of delay, we as a team would like to join in the congratulations for our colleagues and collaborators from the Imperial 2016 iGEM team, who triumphed at the iGEM 2016 Giant Jamboree at MIT.

For those who aren’t familiar with it, iGEM (acronym for “International Genetic Engineered Machine”) is the world’s largest synthetic biology contest. It was started 12 years ago at MIT as a summer side-project in which undergrad teams designed synthetic gene circuits never seen before in nature, built them and tested each of the parts. Many of these parts have subsequently pushed forward the field of synthetic biology. Even though it began as an undergrad-level competition with only a handful of teams involved, the competition grew larger and larger to include not only undergrad teams, but also postgrad teams, high school teams and even enterprises.  More than 200 teams from all around the globe that took part on the last edition.

Traditionally, synthetic biology involves tinkering with a single cell type (eg. E. coli) so that it performs some useful function – perhaps outputting an industrially or medically useful molecule. This tinkering involves altering the molecular circuitry of the cell by adding new instructions (in the form of DNA) that result in the cell producing new proteins/RNA that perform the new functions. The focus of this year’s project from the Imperial team was on the engineering of synthetic microbial ecosystems of multiple cell types (known as “cocultures”) rather than a single organism, since more complex capabilities can be derived from multiple cell types working together.

So they began by characterizing the growing conditions of six different “chassis” organisms and creating a database called ALICE. The challenge here resides in the fact that the different organisms had different growing conditions and thus maintaining a steady proportion is really hard to achieve; typically one of the populations ends up taking over in any given set of conditions. Thus, in order to allow self-tuning of the growth of the cocultures, they designed a system consisting of three biochemical modules:

1) A module that allows communication between the populations through a “quorum sensing” mechanism. Population densities of each species are communicated via chemical messengers that are produced within the cells, released and diffuse through the coculture.  Each cell type produces a unique messenger, and the overall concentration of this messenger indicates the proportion of those cells in the coculture.

2) A comparison module that enables a cell to compare the concentration of each chemical messenger. The chemical messengers were designed to trigger the production of short RNA strands in each cell; RNA strands triggered by different messengers bind to and neutralize each other. If there is an excess of the cell’s own species in the coculture, some of the RNA triggered by its own chemical messenger will not be neutralized, and can go on to influence cell behaviour.

3) An effector module. The RNA triggered in response to an excess of the cell’s own species is called “STAR”. It can bind to something known as a riboswitch (see figure below); when it is present, the cell produces a protein that suppresses its own growth. Cells therefore respond to an excess of their own population by reducing their own growth rate, allowing others to catch up. The approach of using a riboswitch for cell division control presents several advantages as its ease to design and to port at any cell type, and involves a reduced burden on the cell compared to other mechanisms.



Figure 1: Action of STAR in opening the hairpin of a riboswitch. Without STAR, the riboswitch interferes production of certain genes; STAR stops this interference so that the genes are produced.

As a demonstration of the concept, the students implemented this control system in different coloured strains of bacteria in order to create different pigments (analogous to the Pantone colour standard) through the coculture and combination of the strains. The approach is very generic, however, and as the team mention on their wiki, the possibilities of cocultures go way beyond this!

If you want to know more about the project, you can check out the team’s wiki: