Metabolomics may be the key to decoding plant genomes, reports Kira Weissman.
- Scientists are exploring the relationship between a plant's genome and its 'metabolites'
- Plant cells contain a huge number of chemicals, making their analysis a challenge
- Studying the model plant Arabidopsis thaliana gives useful information about more commercial crops, such as wheat, soya and corn
Plants are Nature's factories - they churn out hundreds of thousands of complex compounds, yet they are quite literally green: efficient, sustainable and non-polluting. Most of us are unaware of these chemical acrobatics, even though we depend on plants for food, medicines, and even the air we breathe.
Scientists are very keen to understand how plants work, which in this post-genomic era means 'functional genomics' - deciphering the functions of genes. In essence, they want to know how plants' genetic messages are translated into the basic materials of plant life: low molecular weight chemicals called metabolites. By determining the link between a plant's genome and its metabolites, researchers hope to be able to alter specific biological properties ('phenotypes') by targeting a small number of genes. They imagine helping plants to flourish under even the most inhospitable conditions, or pumping them full of extra nutrients and vitamins. More ambitiously, researchers even envisage turning plants into factories for the clean and economical production of desirable chemicals and pharmaceuticals.
The most established methods for determining gene function rely on the analysis of gene products, such as messenger RNA (mRNA) and proteins. mRNA is created from DNA by a process called transcription, and is itself a template for protein synthesis. Consequently, the sequences of mRNA and its protein product reflect the parental gene. Therefore, by identifying the complete set of mRNA transcripts ('transcriptome') or proteins ('proteome') in a plant cell at a given time, researchers can determine which of its genes have been expressed. Comparing transcriptomes or proteomes from plants grown under different conditions provides evidence of the genes' functions.
However, transcriptomics and proteomics both fail to satisfy a fundamental goal of functional genomics: determining the connection between a gene and its chemical products. The problem is that changes in the amount of mRNA or protein do not necessarily lead to predictable alterations in metabolite levels. In short, there is no reliable connection between a plant's transcriptome or proteome and its phenotype.
As a discipline in its own right, metabolomics has also started to mature. While a review of recent literature shows that, since 2001, only 24 publications have specifically used the term 'metabolomics', plant research groups around the world are starting to jump on the bandwagon. The field has already spawned four dedicated conferences and an international committee. In recognition of the growing importance of metabolomics, in 2002 the UK's Biotechnology and Biological Sciences Research Council (BBSRC) funded a new £1.5 million initiative in the subject.
The aim of metabolomics is to identify all of the low molecular weight chemicals in an organism (its 'metabolome') and correlate their production to a specific gene or genes. Given the inherent challenges in this proposition, plant metabolomics researchers have initially focused on a simple garden weed called thale cress (Arabidopsis thaliana), although they are starting to extend their studies to significantly more complex agricultural crops such as corn, soya and wheat. Arabidopsis is a model plant for this type of work: it is small, fast growing, and has a remarkably small genome.
While changes in gene expression can be effected by changing a plant's growth conditions, Arabidopsis researchers have taken a more direct approach to ensuring genetic variation. Their strategy involves using a piece of circular DNA called a tumour-inducing plasmid ('Ti' plasmid), borrowed from the bacterial plant pathogen Agrobacterium tumefaciens. The Ti plasmid is capable of transferring part of its own DNA (so-called 'transferred' or 'T-DNA') into plant cells, where it becomes inserted into the genome. When this T-DNA lands in a functional gene it 'knocks it out', or disables it.
Joseph Ecker, Professor of Plant Biology at the Salk Institute, US, has recently exploited this technique to disable 74 per cent of the ca 29 000 Arabidopsis genes. Crucially, Ecker and his colleagues have also determined the precise location of each T-DNA insertion. Researchers at Metanomics have so far introduced the entire genomes of yeast and the bacterium Escherichia coli into Arabidopsis, creating thousands of 'overexpression' lines, each harbouring one additional gene. These mutant strains can then be compared with unmodified, or 'wild type', plants under a variety of environmental conditions - their responses revealing how the deactivated or extra genes affect metabolism. The difficulty then, of course, is how to identify the metabolic changes that have occurred.
An analytical technique suitable for metabolomic measurements must not only be able to separate and quantify components, but also provide enough structural information about the metabolites to enable their identification. Additionally, the methodology must be fast, reliable, sensitive and suitable for high-throughput automation. To date, no one has developed a single analytical approach that can reliably detect and quantify every metabolite in a plant cell, because at every stage in the process, only a subset of compounds is selected. Nevertheless, significant progress has been made.
An essential step is, of course, extricating the metabolites from the plant cells into a solvent. Here, however, a choice is required: no single solvent is suitable for all metabolites. For example, compounds that are normally found in membranes are hydrophobic and prefer hydrocarbon solvents, while those from the aqueous part of the cell dissolve readily in polar liquids. Typically, therefore, researchers will use successive mixtures of solvents on the same sample. For example, a combination of methanol and water, followed by chloroform, extracts molecules with a range of polarities, including fatty acids, sugars and alcohols. At this stage researchers also add known amounts of internal standards and process controls. Next, an analytical technique must be selected.
Since the inception of metabolomics the method of choice for separating and identifying small biomolecules has been gas chromatography coupled to mass spectrometry (GC-MS). GC-MS is a mature and reliable technology, and is also suitable for automation. However, many small compounds are not sufficiently volatile to be analysed in this way and others simply fall apart when vaporised. One solution is to chemically modify, or 'derivatise' the metabolites, which can increase both their volatility and stability. However, as fatty and polar metabolites often require different derivatisation conditions, this step can add considerable time to sample processing.
A complementary technique is high-performance liquid chromatography, again with mass spectrometry detection (HPLC-MS). The use of a liquid carrier removes the requirement for derivatisation, and therefore opens up the analysis to a much wider range of compounds. Also, as the sample preparation time is shorter, the potential for experimental error is significantly reduced. On the negative side, HPLC-MS instruments are more expensive and generally require greater expertise to run and maintain, although instrument developers are addressing these issues. While HPLC-MS is growing in favour, neither HPLC-MS nor GC-MS can measure all metabolites on its own, and so it seems likely that the two will remain complementary techniques.
GC-MS and HPLC-MS are useful analytical tools. Nonetheless, Trethewey fears that, even when used together, the methods still fail to detect as many as 90 per cent of target compounds. 'It's difficult to know what you can't see,' he explains.
The simplest way to view chromatography or mass spectrometry data is as a 'fingerprint' of a particular plant line, leaving the identity of the individual peaks a mystery. This snapshot of a plant's metabolism enables researchers to screen rapidly for obvious differences between specimens. Fingerprints can also be obtained using 1H NMR spectroscopy to analyse crude (unchromatographed) extracts of plant tissue, an approach pioneered by Jeremy Nicholson at Imperial College London.
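One simple way to screen fingerprints for obvious differences is to treat each one as a list of binned peak intensities and score how alike two lists are. The sketch below uses cosine similarity, which is an assumption chosen for illustration rather than the method used by the groups described here; the function name and the intensity values are hypothetical.

```python
import math


def cosine_similarity(fingerprint_a, fingerprint_b):
    """Return the cosine similarity of two equal-length intensity vectors.

    A value of 1.0 means the same relative peak pattern; values near 0
    mean the fingerprints share almost no peaks.
    """
    if len(fingerprint_a) != len(fingerprint_b):
        raise ValueError("fingerprints must use the same bins")
    dot = sum(a * b for a, b in zip(fingerprint_a, fingerprint_b))
    norm_a = math.sqrt(sum(a * a for a in fingerprint_a))
    norm_b = math.sqrt(sum(b * b for b in fingerprint_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("a fingerprint with no signal cannot be compared")
    return dot / (norm_a * norm_b)


# Hypothetical binned intensities for a wild-type and a mutant line.
wild_type = [120.0, 0.0, 45.0, 300.0, 10.0]
mutant = [118.0, 60.0, 44.0, 150.0, 11.0]
print(round(cosine_similarity(wild_type, mutant), 3))
```

A score well below 1 would flag the mutant line for closer inspection, without anyone needing to know which metabolite each peak represents.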
For quantification purposes, it is critical to establish that there is a linear relationship between metabolite concentration and the size of the chromatogram or mass-spectrum signal. One method for doing this is to dilute a sample by a known factor and look for the corresponding drop in signal intensity. With linearity established, it is then fairly easy to determine the relative concentration of metabolites by comparing them to internal reference compounds. Obtaining absolute values (ie quantified data in the absence of internal standards) is much more difficult, however, as not all molecules are ionised and detected to the same extent. For metabolites of known structure, researchers have tackled this issue by spiking the analysis mixture with isotopically labelled forms of the analyte. The chromatogram peaks for the labelled and unlabelled molecules will, of course, overlap, but the mass spectrometer can tell the two apart by weight. The signal intensity for the labelled material can then be used to determine the absolute amount of its unlabelled counterpart. Of course, this strategy does not work when the structure of the metabolites is not known.
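The arithmetic behind these checks can be sketched in a few lines. The example below is illustrative only: the function names, the 10 per cent tolerance and the sample numbers are assumptions, and the last function assumes that the labelled and unlabelled forms of a metabolite are ionised and detected equally well.

```python
def is_linear(dilution_factors, signals, tolerance=0.1):
    """Check that the signal falls in proportion to the dilution factor.

    dilution_factors: e.g. [1, 2, 4] for undiluted, 2-fold and 4-fold.
    signals: the peak size measured at each dilution.
    tolerance: allowed fractional deviation (assumed value).
    """
    # Estimate the undiluted signal from the first measurement.
    undiluted = signals[0] * dilution_factors[0]
    for factor, signal in zip(dilution_factors, signals):
        expected = undiluted / factor
        if abs(signal - expected) > tolerance * expected:
            return False
    return True


def relative_concentration(analyte_signal, standard_signal):
    """Express an analyte signal relative to an internal reference signal."""
    return analyte_signal / standard_signal


def absolute_amount(unlabelled_signal, labelled_signal, labelled_amount):
    """Estimate the amount of analyte from a labelled spike of known amount.

    Assumes both forms give the same signal per unit amount.
    """
    return labelled_amount * unlabelled_signal / labelled_signal


# Hypothetical numbers: a 2.0 nmol labelled spike gives a peak of 1500,
# while the natural (unlabelled) metabolite gives a peak of 3000.
print(is_linear([1, 2, 4], [1000.0, 500.0, 250.0]))
print(absolute_amount(3000.0, 1500.0, 2.0))
```

With these made-up values the unlabelled peak is twice the size of the labelled one, so the estimate is twice the spiked amount.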
The result of these analyses is a detailed, though admittedly incomplete, metabolic portrait of native and modified plant lines. These profiles are typically stored in a database, where they can be compared with each other using front-line bioinformatics and statistical techniques. If different strains of a plant are subjected to the same stresses, then an analysis of their metabolites should reveal what gives some strains an advantage over others, and critically which gene or genes underpin the survival response.
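As a rough illustration of such a comparison, the sketch below computes the log2 fold change of each metabolite between a wild-type and a mutant profile and lists those that change at least two-fold. The metabolite names, levels and threshold are hypothetical, and a real analysis would use replicate samples and proper statistical tests rather than a single pair of profiles.

```python
import math


def fold_changes(wild_type, mutant):
    """Return log2(mutant / wild type) for metabolites found in both profiles.

    Profiles are dicts mapping a metabolite name to its relative level.
    Metabolites that are missing or have a non-positive level are skipped.
    """
    changes = {}
    for name, wt_level in wild_type.items():
        mut_level = mutant.get(name)
        if mut_level is None or wt_level <= 0 or mut_level <= 0:
            continue
        changes[name] = math.log2(mut_level / wt_level)
    return changes


def changed_metabolites(wild_type, mutant, threshold=1.0):
    """List metabolites whose level changes by at least 2**threshold-fold."""
    changes = fold_changes(wild_type, mutant)
    return sorted(name for name, lfc in changes.items() if abs(lfc) >= threshold)


# Hypothetical relative levels for three metabolites.
wild_type_profile = {"sucrose": 100.0, "proline": 10.0, "citrate": 50.0}
mutant_profile = {"sucrose": 110.0, "proline": 40.0, "citrate": 12.5}
print(changed_metabolites(wild_type_profile, mutant_profile))
```

Working in log2 units treats a four-fold rise and a four-fold fall symmetrically, which makes increases and decreases equally easy to spot.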
Kira Weissman is a Royal Society Dorothy Hodgkin Fellow in the department of biochemistry, University of Cambridge.
Metabolomics vs metabonomics
The measurement of metabolites can be split into two distinct, but similar areas: metabonomics and metabolomics. At first glance, this simply appears to be a case of semantics, but there are important differences.
Metabolomics is all about identifying the natural small molecules inside a single cell and determining which portion of the genome is responsible for their production. By altering the genome in specific places, scientists can discover how genetic code is expressed. Metabonomics, on the other hand, is concerned with the bigger picture, dealing with integrated, multicellular biological systems. Typically, a whole organism (even humans) with many different types of cell is studied and the main measurement is the response of an organism to external stimuli and environmental factors.
These two 'omics' have arisen at the same time from two different areas of biological science: metabolomics from plant science and metabonomics from animal biochemistry and medicine. For the foreseeable future, it seems there will be plenty of conference-dinner arguments about which discipline is a subset or specialisation of the other, and undoubtedly some research studies will fall somewhere between the two.
Gas chromatography-mass spectrometry (GC-MS)
In gas chromatography, a sample is introduced into an injector where it is volatilised by heating to 250-300 °C. The requirement for volatility restricts the utility of this technique to low molecular mass compounds (typically <800). The volatile solutes are pushed by a pressurised carrier gas into a heated chromatography column that has an inner coating of a special liquid (typically a silicone). Separation of the various components then ensues because they are more or less impeded during their trip down the column by interaction with the stationary liquid phase.
As the molecules emerge from the end of the column they interact with a detector that registers their arrival and also the time at which they appeared; a series of such signals or peaks at these various retention times forms a 'chromatogram' of the mixture. In the case of GC-MS, the detector is a mass spectrometer so each peak also contains information about the mass of the components. When compounds enter the spectrometer they are ionised to acquire either a net positive or negative charge. The spectrometer can then separate these ions based on their mass-to-charge (m/z) ratio. In addition, because the ionisation process is somewhat energetic, molecules are often broken into smaller fragments, which are also detected. These fragments can themselves be trapped by the machine and fragmented further (in a technique called MS/MS) to create a very distinctive pattern of pieces that allows researchers to distinguish between two very similar substances.
GC-MS therefore meets a number of the criteria for a good metabolic profiling technique: it is sensitive (with a femtomole detection limit, ca 10^8 molecules), automated and allows the separation of large numbers of compounds on the basis of their retention times on a column. Even for compounds that are not completely resolved by the chromatography step, the selectivity of the mass detector allows a peak to be deconvoluted to give the mass spectra of the individual components in the mixture. The pattern of fragments in these spectra can provide useful information about a compound's structure. Additionally, quantification can be achieved if there is a linear relationship between the amount of metabolite and the signal it produces, and a commercial standard for the compound is available.
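To put the femtomole detection limit in perspective, the amount can be converted into a number of molecules using the Avogadro constant. The short sketch below does this conversion; the function name is made up for illustration.

```python
AVOGADRO = 6.02214076e23  # molecules per mole (exact SI value)


def molecules_in(moles):
    """Convert an amount of substance in moles to a number of molecules."""
    return moles * AVOGADRO


one_femtomole = 1e-15  # moles
print(f"{molecules_in(one_femtomole):.1e} molecules")
```

One femtomole works out at roughly 6 x 10^8 molecules, that is, of the order of 10^8.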
High performance liquid chromatography-mass spectrometry (HPLC-MS)
HPLC-MS is similar to GC-MS, but in the separation step molecules are instead partitioned between a liquid and a solid stationary phase, a feature which removes the requirement for volatilising the sample. Liquid chromatography is also compatible with a wider range of ionisation and detection techniques than GC-MS, and it can therefore be better tailored for the analysis of particular types of molecules. For example, electrospray ionisation, a revolutionary technique that earned its developer Professor John Fenn a Nobel Prize, is such a gentle process that it allows large biological molecules like proteins to be analysed. On the downside, HPLC is slower than GC-MS, and the instruments depend on a considerably more complicated network of pumps and moving parts, making them costly to maintain.