Oil palm biotechnologies are definitely out of infancy

Abstract: Although biotechnologies and sustainable development are often considered antagonistic, there is increasing evidence for a role for this approach in the ecological intensification of oil palm cultivation. Ecological intensification is based on the understanding of how nature functions so as to exploit its resources without destroying it. Living organisms are supported by the genome (DNA) through the action of the transcriptome (RNAs), proteome, metabolome, and ionome, the four basic pillars of functional genomics. These pillars represent the sum of all the expressed genes, proteins, metabolites, and elements within an organism. The dynamic response and interaction of these biochemical "omes" define how a living system functions, and their study, systems biology, is now one of the biggest challenges in life sciences. In oil palm, as in many major crops, functional genomics is still at its beginning, although there is no reason why oil palm should not rapidly benefit from the fast progress generated by automated and high-throughput technologies. The success of sequencing projects on model plants has created a widespread interest in exploring the structure and expression patterns of the genome. Indeed, several institutions have now achieved the full sequencing of the oil palm genome, paving the way for the rapid evolution of various genomics-based approaches. Oil palm breeding has provided an average genetic gain of 1% per year since the early 1960s, and such an impressive increase in oil yield will be maintained in future generations with a major contribution from biotechnology. Indeed, the recent adoption of biotechnological approaches has already proven very useful in major areas such as the cloning of outstanding material, the identity checking of progenies/mother palms, and the identification and characterization of genes underlying agricultural traits.
Phenotypic differences among individuals are partly the result of quantitative differences in transcript abundance. Using microarray technology, it is now possible to assess the abundance of many transcripts — and, indeed, of the entire known transcriptome — simultaneously. Studies which attempt to localize the genetic regulators of gene expression have been carried out in several species, including maize and eucalyptus. Microarrays have been used to determine gene expression levels in segregating populations and identify genomic regions (gene expression QTLs, or eQTLs) explaining transcript variation in co-regulated genes. Our article will focus on recent developments in various plant biotechnology areas in which applications for improving sustainability already exist or will be developed in the near future.

Biotechnologies and sustainable development are often considered antagonistic, especially in Europe, where GMOs (Genetically Modified Organisms) and agricultural biotechnology are set against more environment-friendly agricultural practices, such as organic farming. Nevertheless, there is increasing evidence for a pivotal role for this approach in the ecological intensification of oil palm cultivation. Ecological intensification is based on the understanding of biological functions in order to better exploit natural living resources while preserving them for future generations. Biotechnology is based on mechanisms that nature already uses. When applied in appropriate ways, the results can offer new solutions to existing problems, such as the finite resources being depleted today (Mc Laren, 2005). To date, 1,193 genomes have been fully sequenced (figure 1), including 77 archaeal, 996 bacterial, and 123 eukaryal genomes, of which 10 are from higher plants, namely Cucumis, Vitis, Populus, Oryza, Arabidopsis, Glycine, Physcomitrella, Sorghum, Vitis, and Zea (GOLD, 2010). The recent success of sequencing projects on model plants has created a widespread interest in exploring the structure and expression patterns of the genome in several crop species of agronomic interest, and many more genome sequences are to be released in the coming months for plants such as cassava, tomato, potato, papaya, cotton, eucalyptus, citrus, barley, etc. (NCBI, 2010). Several institutions and consortia have now achieved the full sequencing of the oil palm genome, paving the way for rapid changes in many genomics-based approaches. Modern biology is now integrating systems biology, with very close links to biotechnology. Indeed, living organisms are supported by the genome (DNA) through the action of the transcriptome (RNAs), proteome, metabolome, and ionome, the four basic pillars of functional genomics (figure 2).
These pillars represent the sum of all the expressed genes, proteins, metabolites, and elements within an organism. The dynamic response and interaction of these biochemical "omes" define how a living system functions, and their study, systems biology, is now one of the biggest challenges in life sciences. In oil palm, as in many major crops, functional genomics is still in its infancy, although there is no reason why oil palm should not rapidly benefit from the fast progress generated by automated and high-throughput technologies.
Oil palm breeding has provided an average 1% genetic gain per year since the early 1960s and such an impressive increase in oil yield will be maintained in future generations with an increasing contribution from biotechnology. Indeed, the recent adoption of biotechnological approaches for molecular breeding has already proven very useful in major areas such as the cloning of outstanding plant material, the identity checking of progenies/mother palms, or the identification and characterization of genes underlying useful agricultural traits.
Phenotypic differences among individuals are partly the result of quantitative differences in transcript abundance. Using microarray technology, it is now possible to assess the abundance of many transcripts (and, indeed, of the entire known transcriptome) simultaneously. Studies which attempt to localize the genetic regulators of gene expression have been carried out in several crop species, including maize and eucalyptus (figure 3). Microarrays are used to determine gene expression levels in segregating populations and identify genomic regions (gene expression QTLs, or eQTLs) explaining transcript variation in co-regulated genes. The present article will focus on recent developments in various plant biotechnology areas in which applications for improving oil palm sustainability already exist or will be developed in the very near future.
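
The eQTL idea described above can be illustrated with a minimal single-marker scan: for each marker, the genotype codes of the individuals in a segregating population are correlated with the level of one transcript, and the marker with the strongest association is reported. This is only a sketch under stated assumptions, not the procedure used in the cited studies: the marker names (M1, M2), the 0/1 genotype coding, and all expression values are hypothetical.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def scan_markers(genotypes, expression):
    """Correlate every marker with one transcript level.

    genotypes: dict mapping marker name -> list of allele codes (0 or 1),
               one code per individual (hypothetical coding).
    expression: list of transcript levels, same individual order.
    Returns the best marker (largest absolute correlation) and all scores.
    """
    scores = {m: pearson(codes, expression) for m, codes in genotypes.items()}
    best = max(scores, key=lambda m: abs(scores[m]))
    return best, scores


# Hypothetical data for eight individuals of a segregating population.
example_genotypes = {
    "M1": [0, 0, 0, 0, 1, 1, 1, 1],
    "M2": [0, 1, 0, 1, 0, 1, 0, 1],
}
example_expression = [1.0, 1.2, 0.9, 1.1, 2.0, 2.2, 1.9, 2.1]

best_marker, marker_scores = scan_markers(example_genotypes, example_expression)
print(best_marker, marker_scores)
```

In real eQTL studies the scan is run for thousands of transcripts against a dense genetic map, with appropriate statistical models and multiple-testing control; the sketch only shows the core association step.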

Intensification is no longer an option for oil palm
To meet the food and bioenergy security needs of the near future, farmers must double or even triple crop yields on less arable land, with limited water, on poorer soils, and with less fertilizer and fewer pesticides. Information gained from sequenced genomes in crops, coupled with genetic association studies, will allow us to identify key genes/quantitative trait loci and networks that can lead to higher yielding crops that can grow in extreme conditions with reduced environmental impact. The oil palm is a major crop in the intertropical region, where fast-growing populations live and where large areas of High Conservation Value (HCV) native forests must be preserved; it must therefore follow and even anticipate this trend towards ecological intensification (Fitzherbert et al., 2008; Koh and Wilcove, 2008; Stone, 2007). Given the insatiable demand from the expanding populations of India, China, and Pakistan, and their concomitant increase in standard of living, Murphy (2007) identified the yield problem as one of the major challenges the oil palm industry will have to tackle in the 21st century.

Deciphering the oil palm genome has become reality
Plant genome sequencing has progressed rapidly since the first genome (Arabidopsis thaliana) was completed in 2000 (Arabidopsis Genome Initiative, 2000). The 389-Mb rice genome was then completed in 2004 (International Rice Genome Sequencing Project, 2005), and a sequence of the 2.3-Gb maize genome was recently released (Wei et al., 2009). All were sequenced using "traditional" sequencing approaches in which sequencing libraries are constructed from individual segments of the genome (such as those contained within bacterial artificial chromosomes, or BAC clones) and are sequenced via gel electrophoresis (Sanger sequencing).

Figure 2 (Ruffel et al., 2009). This scheme illustrates the different (non-exhaustive) dimensions (in blue) that can be integrated in systems approaches, in order to decipher emerging properties of plant regulatory networks.

A whole-genome shotgun (WGS) strategy, made possible with improved assembly algorithms, has been used for several recent plant genomes, such as poplar (Tuskan et al., 2006), grapevine (Jaillon et al., 2007), or sorghum (Paterson et al., 2009), in which the sequencing libraries are made directly from genomic DNA. This strategy is easier and cheaper than clone-by-clone (BAC-based) sequencing.
Recently, next generation sequencing (NGS) technologies have promised to further accelerate progress, with huge increases in sequencing throughput and decreases in costs. Indeed, this approach was able to generate the sequence of all 12 chromosomes (~400 Mb) of Oryza barthii in a month or two (Rounsley et al., 2009). Four different NGS technologies are now commercially available, namely 454 Life Sciences (acquired by Roche), Solexa (acquired by Illumina), ABI SOLID (acquired from Agencourt Biosciences), and Helicos Biosciences. Although all have their specific features, they can generally be grouped into two classes based on the lengths of the sequence reads which are generated. Solexa, ABI SOLID, and Helicos all produce very short reads in very large quantities, while the 454 platform produces a more moderate amount of sequence, but with much longer read lengths. Several of the platforms have already gone through multiple rounds of upgraded specifications, and improvements are likely to continue. Future improvements in NGS technologies may continue to reduce the time and cost of such projects, but this strategy will likely always be valuable for addressing the complexities of sequencing large repetitive genomes with NGS. It provides an immediate and practical solution to the rapid generation of genome sequences from large and complex eukaryotic genomes like that of the oil palm (1.7 Gb; Rival et al., 1997).
Several research groups and consortia have recently announced the complete sequencing of the oil palm genome. Indeed, Zieler et al. (2010) very recently reported on the generation of whole-genome shotgun sequences of the oil palm genome, using primarily Sanger reads to enable high-quality assemblies. This breakthrough was achieved by a consortium joining Synthetic Genomics Inc. (USA), ACGT Sdn. Bhd. (Malaysia), and the J. Craig Venter Institute (USA).
According to these authors, the genome sequences have been supplemented with a high volume of EST and transcriptome sequencing, and the genome has been fully annotated. In November 2009, the Malaysian Palm Oil Board (MPOB) announced that a consortium co-led by the Advanced Biotechnology and Breeding Centre had sequenced three oil palm genomes from the two species Elaeis oleifera and Elaeis guineensis, including the pisifera and dura palms. The consortium included St. Louis, Missouri-based Orion Genomics, MOgene LC, and The Genome Center at Washington University; South Korea-based Macrogen Inc.; and Adelaide, Australia-based GeneWorks Pty Ltd. In addition to sequencing and assembling the genomes of the three oil palm varieties, the consortium sequenced the expressed genes (or transcriptome) from multiple tissue types for all three types of oil palm (Wahid, 2009). In May 2009, a Malaysian consortium created by Sime Darby Bhd., one of the world's largest oil palm companies, and Synamatix Sdn. Bhd., a bioinformatics company, had announced the successful sequencing, assembly, and annotation of the oil palm genome with 93.8% completeness.

Figure 3 (Kirst et al., 2004). Association between gene expression and diameter variation. The Eucalyptus grandis backcross progeny is ranked according to diameter (x axis) and negative (A) or positive (B) correlation between relative transcript level and diameter variation (y axis). Least square means estimates of transcript levels were normalized relative to the mean of each gene across the population. Red represents higher and green represents lower expression relative to the other individuals of the population, for each specific gene. Black indicates no change in mRNA levels. GenBank accession numbers and putative functions are displayed on the right. Genes represented by multiple cDNAs are represented by the most highly correlated, and those related to lignin biosynthesis are indicated by asterisks.

For oil palm biotechnologists all around the world, given the scale of the investments and the resulting intellectual property issues, the availability of these genome sequences to the scientific community remains a key point. Successful bioinformatics relies on the accuracy and public availability of genome databases, and the oil palm community must follow the example of consortia (public or private) which have already published complete plant genome sequences and permitted open public access to them (see www.maizesequence.org as one example among others). As with the "Open Source" concept in software, each member of the community benefits from the work done by other members, since any improvement made to the published sequence, such as corrections and annotations, is published and made available for further use.
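
The trade-off between short-read and long-read platforms discussed in this section can be made concrete with the standard average-coverage relation C = N × L / G, where N is the number of reads, L the read length, and G the genome size. The sketch below estimates how many reads would be needed to cover the 1.7-Gb oil palm genome (Rival et al., 1997) at a given depth. The read lengths (50 and 400 bases) and the 10-fold target coverage are illustrative assumptions, not specifications of any platform named above.

```python
import math

# Genome size quoted in the text (1.7 Gb; Rival et al., 1997).
OIL_PALM_GENOME_BP = 1_700_000_000


def reads_needed(genome_bp, read_len_bp, coverage):
    """Smallest number of reads N such that N * L / G >= coverage."""
    return math.ceil(coverage * genome_bp / read_len_bp)


# Hypothetical read lengths standing in for the two platform classes.
for label, read_len in [("short reads", 50), ("long reads", 400)]:
    n_reads = reads_needed(OIL_PALM_GENOME_BP, read_len, coverage=10)
    print(f"{label} ({read_len} bp): {n_reads:,} reads for 10x coverage")
```

Average coverage says nothing about how well repeats can be resolved, which is why read length matters for large repetitive genomes even when the raw throughput is sufficient.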

Biotechnologies already impact oil palm productivity
The assembly of genomic tools for use in oil palm breeding is clearly underway and has already delivered outstanding outputs. Recent review articles by Price et al. (2007) and Rival (2007) provided updated information on advances in biotechnology and molecular breeding applications for the oil palm. The genetic basis of some of the key traits in oil palm (growth, bunch size, bunch number, oil extraction rate) still needs to be dissected with the help of accurate genetic maps. To undertake such a task, molecular resources do exist, although they are scattered among various institutions around the world and rarely open to public access. An exception is the LINK2PALM program, which provided valuable resources to the community in the form of co-dominant microsatellite (SSR) markers.
Microarray (DNA chip) approaches to transcriptomics allow an entire transcriptome to be analyzed at a specific developmental point, under a particular stimulus, or in a particular tissue. The construction of such arrays is usually based on very extensive EST collections, if not complete genome sequences. Jouannic et al. (2005) generated five different cDNA libraries from oil palm male and female inflorescence shoot apices and zygotic embryos. A total of 2,411 valid EST sequences were thus obtained through unidirectional systematic sequencing. This EST database is a first step towards gene discovery and cDNA array-based expression analysis in oil palm. Such resources are now being generated at a large scale in oil palm as part of an industrial consortium called OPGP (Oil Palm Genome Project), although only limited public access to these resources is anticipated (Mayes et al., 2009).
Recent advances have been made in the exploration of epigenetic mechanisms underlying the mantled somaclonal variation in clonally micropropagated oil palms. Indeed, Rival et al. (2008) showed that the genome-wide hypomethylation previously described in mantled material was not explained by a decrease in expression levels of the de novo or maintenance DNA methyltransferases, a paradox which has been previously reported in tumor cells.
Research is now focusing on DNA methylation around candidate marker genes. Indeed, orthologues of the MADS-box genes involved in the formation of floral organs have been identified in the genome of oil palm (Adam et al., 2006; Syed Alwee et al., 2006), and research has shown that oil palm B-type MADS-box genes display differential transcript levels between normal and abnormal inflorescence tissues (Adam et al., 2007). A range of genes with altered expression in abnormal tissues has recently been identified through the use of subtractive PCR (SSH) and subsequent macroarray hybridization (Beule et al., unpublished data). Investigating the methylation patterns of these target genes will pave the way for understanding the epigenetic mechanisms underlying the induction and maintenance of somaclonal variation in oil palm regenerants.
Since the initial transformation of oil palm with a marker gene was reported by Parveez et al. (2000), genetic engineering in oil palm has progressed at the slow pace imposed by public acceptance, biosafety issues, and the genetic complexity of desired traits (from the production of bioplastics to the introduction of resistance to major pests and diseases). Subhi et al. (2010) have recently identified the ubiquitin extension protein (uep1) gene as a constitutively expressed gene in oil palm. The 5′ region of uep1 functions as a constitutive promoter in oil palm, can drive GUS expression in all tissues tested, and can also be used in dicot systems.

Oil palm physiology and nutrition benefit from biotechnologies
Within the context of expected climatic changes, it is of paramount importance to understand the mechanisms underlying the phenotypic plasticity of oil palm and thus its capacity to withstand drier climates and produce oil on poorer soils with reduced or no use of fertilizers (Zechendorf, 2009). In order to study the dynamics of plant metabolism under stress and unravel the regulatory mechanisms in place, it is important to combine the traditional, more descriptive physiological approaches with the techniques of functional genomics, namely the high-throughput methods for transcriptomic, proteomic, metabolomic, and ionomic analysis. Using this integrated analysis, it would be possible to study the dynamics of plant metabolism in the context of the plant system as a whole (Chaves et al., 2009). Recent advances have been made in the understanding and further modeling of oil palm productivity through the analysis of phenological and growth responses to seasonal and interannual climatic variability (Legros et al., 2009). Now is the time to integrate the activity of specific key genes into these models.
In developing resistance to biotic stresses, the main problem lies in dealing with the highly variable nature of biological agents, be they pathogens or insects. Predicting which genes would confer durable resistance remains a key challenge (Leung, 2008). For abiotic stresses, a high degree of genotype × environment (G × E) interaction makes it difficult to assess the causal relationship between genotype and phenotype. Thus, the challenge is to define phenotypes that are a true reflection of the genotypic differences, and to find the right genes/phenotypes that work well in target environments. This difficulty is exemplified in the study of traits responsible for tolerance to drought.
A number of selection criteria have been developed for fast, reliable screening of water-use efficiency (WUE) and its determinants, such as stomatal conductance and 13C/12C isotope discrimination. This approach recently proved useful in the understanding of carbon use during the passage from heterotrophy to autotrophy in oil palm leaves (Lamade et al., 2009). Interestingly, a molecular linkage map with several major QTLs for 13C/12C isotope discrimination in rice was published by Price et al. (2002), so any QTL candidate gene regions identified in related species could potentially be anchored to the rice genetic map to provide comparative information on possible homologue location in the oil palm. Pennisi (2008) published a short review of research underway to identify genes underlying drought resistance in plants. When expression patterns of 1,500 genes were studied in Arabidopsis plants grown under drought conditions, 40 appeared to be involved in drought adaptation. Thus, there is a palette of genes from which breeders and crop scientists will select for drought tolerance.
The doubling of agricultural food production worldwide over the past four decades has been associated with a 7-fold increase in the use of nitrogen (N) fertilizers. As a consequence, both the recent and future intensification of the use of N fertilizers in agriculture already has and will continue to have major detrimental impacts on the diversity and functioning of the non-agricultural neighboring bacterial, animal, and plant ecosystems. In a recent review, Hirel et al. (2007) described how our understanding of the physiological and molecular controls of nitrogen assimilation under varying environmental conditions in crops has been improved through the use of combined approaches, mainly based on whole-plant physiology, quantitative genetics, and forward and reverse genetics approaches. It is of major importance to identify the critical steps controlling plant N use efficiency (NUE).
Moll et al. (1982) defined NUE as being the yield per unit of available N in the soil (including the residual N present in the soil and the fertilizer). This NUE can be divided into two processes: uptake efficiency (NupE; the ability of the plant to remove N from the soil as nitrate and ammonium ions) and utilization efficiency (NutE; the ability to use N to produce yield). Studies of the whole-plant N response will be essential to elucidate the regulation of NUE and to provide selection criteria for breeders and monitoring tools for planters, with the aim of designing efficient and reasoned fertilization protocols. Results from Arabidopsis research have improved our understanding of the relationship between N availability, N uptake, and root development (Walch-Liu et al., 2006; Hirel et al., 2007). Since N uptake is one of the most critical NUE components under N-limiting conditions in a number of crops, the transfer of knowledge to crop plants such as oil palms is more than realistic.
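
The decomposition of NUE into its two components can be written compactly. The symbols below are introduced here for illustration only and are not taken from the text: Y is the yield, N_s the N available in the soil (residual N plus fertilizer), and N_t the total N taken up by the plant.

```latex
\[
\mathrm{NUE} \;=\; \frac{Y}{N_s}
\;=\; \underbrace{\frac{N_t}{N_s}}_{\mathrm{NupE}}
\;\times\;
\underbrace{\frac{Y}{N_t}}_{\mathrm{NutE}}
\]
```

As a purely hypothetical numerical example, a plot with N_s = 100 kg N/ha, N_t = 60 kg N/ha, and Y = 3000 kg/ha would have NupE = 0.6, NutE = 50 kg yield per kg N taken up, and NUE = 0.6 × 50 = 30 kg yield per kg available N. The same NUE can therefore result from very different combinations of uptake and utilization, which is why the two components are worth measuring separately.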

Oil palm ionomics is just around the corner
Lahner et al. (2003) described the ionome as including all the metals, metalloids, and nonmetals present in an organism, extending the term metallome to cover biologically significant nonmetals such as nitrogen, phosphorus, sulfur, selenium, chlorine, and iodine. Because the ionome is involved in such a broad range of important biological phenomena, including electrophysiology, signaling, enzymology, osmoregulation, and transport, its study promises to yield new and significant biological insight (Salt, 2004). An understanding of the ionome and how it interacts with other cellular systems such as the genome, proteome, metabolome, and environment is integral to the full understanding of how plants integrate their organic and inorganic metabolisms (figure 4). Several recent papers (Baxter, 2009) have described high-throughput elemental profiling studies of how the ionome responds to the environment or explored the genetics that control the ionome. When combined with new genotyping technologies, ionomics provides a rapid way to identify genes that control elemental accumulation in plants. Inductively coupled plasma spectroscopy, either mass spectroscopy (ICP-MS) or optical emission spectroscopy (ICP-OES), allows for the simultaneous measurement of dozens of elements. Improvements in these techniques over the last few decades have overcome obstacles that limited previous efforts to detecting only the most severe ionomic differences. Measuring multiple elements allows researchers to explore the dynamics of the ionome as a whole, not just as individual elements in isolation. It also makes gene identification experiments more efficient, which should allow new genetic mapping techniques to identify hundreds of new loci that control this complex system. To fully understand the complex regulation of the ionome, we will need to find the genes that control the accumulation and distribution of each element.
Iron chlorosis is one of the major abiotic stresses affecting crops in calcareous soils and leads to a reduction in growth and yield. In a recent study, Forner-Giner et al. (2009) examined the differential gene expression induced by iron deficiency in the susceptible citrus rootstock Poncirus trifoliata (L.) Raf. The genes identified are putatively involved in cell wall modification, in determining photosynthesis rate and chlorophyll content, and in reducing oxidative stress.
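
Elemental profiles of the kind produced by ICP-MS are commonly summarized, element by element, as a mean concentration and a relative standard deviation (%RSD = 100 × SD / mean) across plants. The following is a minimal sketch of that summary step only. The element list and all concentration values are hypothetical, and the real pipeline described by Lahner et al. (2003) also involves normalization against analytical standards, which is not reproduced here.

```python
from statistics import mean, stdev


def percent_rsd(values):
    """Relative standard deviation: sample SD as a percentage of the mean."""
    return 100.0 * stdev(values) / mean(values)


def summarize_ionome(profiles):
    """Summarize per-plant elemental profiles.

    profiles: list of dicts, one per plant, mapping element symbol ->
              concentration (all plants must report the same elements).
    Returns a dict mapping element -> (mean concentration, %RSD).
    """
    summary = {}
    for element in profiles[0]:
        values = [plant[element] for plant in profiles]
        summary[element] = (mean(values), percent_rsd(values))
    return summary


# Hypothetical concentrations for three plants (arbitrary units).
example_profiles = [
    {"Fe": 90.0, "Zn": 40.0},
    {"Fe": 100.0, "Zn": 50.0},
    {"Fe": 110.0, "Zn": 60.0},
]

for element, (avg, rsd) in summarize_ionome(example_profiles).items():
    print(f"{element}: mean = {avg:.1f}, %RSD = {rsd:.1f}")
```

A putative ionomic mutant would then be flagged when its value for one or more elements falls well outside the spread observed in wild-type plants grown alongside it.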

From molecular biology to bioinformatics… and back
The challenge facing agricultural research is to accelerate biological discovery to address some of our most critical food and bioenergy challenges. With the recent sequencing of many plant species including major crops, the development of both physical (in vivo) and informational (in silico) resources has entered a new phase.
Bioinformatics has emerged as a new field of science aimed at: i) acting as a repository of the information generated and providing links to the associated physical resources on which that information is based; ii) allowing the repository to be searched for specific resources or information; iii) developing linkages between different types of information (sequences, map positions, alleles within a population, related genes) and developing tools to facilitate the discovery of further links (gene prediction, protein interactions, etc.). To this end, many tools have been developed to generate and improve databases in order to provide simple access, searching facilities, and tools for gene discovery. The sophistication of these interfaces continues to improve (Mayes et al., 2005). In the short term, the quantity of data is likely to continue to expand exponentially, and ways need to be found to interrogate such databases intelligently, so that the output remains comprehensible.
The present paper is aimed at stressing the importance of biotechnology in the global effort for increasing and maintaining sustainability for the oil palm. For the (still small) community of oil palm biotechnologists, the challenge is very exciting as amazingly powerful tools are becoming available every day. Bioinformatics is emerging as a new science dedicated to the sorting and assembling of DNA and RNA sequences. This in silico work is indispensable, given the exponential amount of data accumulated today by high-throughput sequencing projects.
If biotechnologists want to play their crucial part in the sustainability challenge for oil palm, they will have to integrate bioinformatics data with biological questions arising from field experiments, in order to provide adequate answers to end-users: the oil palm farmers from tropical regions of the world.

Figure 4. High-throughput ionomics (Salt, 2004). Putative mutants and wild-type Arabidopsis plants are grown together with known ionomic mutants, used as positive controls, under standardized conditions. Plants are uniformly sampled, digested in concentrated nitric acid, diluted, and analyzed for numerous elements using inductively coupled plasma mass spectroscopy (ICP-MS). Raw ICP-MS data are normalized using analytical standards and calculated weights based on wild-type plants (Lahner et al., 2003). Data are processed using custom tools and stored in a searchable, World Wide Web-accessible database. Ionomic analysis can also be applied to other plants with available genetic resources, including rice and maize. Elements in the Periodic Table highlighted in black boxes represent those elements analyzed during our ionomic analyses using ICP-MS, elements highlighted in green are essential for plant growth, and those in red represent non-essential trace elements. The table represents Arabidopsis (Col 0) shoot and seed ionomes; all elements are presented in mg/g dry weight. Data represent the average shoot concentrations from 60 individual plants, and seed from 12 individuals, ± SD as percentage of average (%RSD).