Gene banks for wild and cultivated sunflower genetic resources

Modern breeding of sunflower (Helianthus annuus L.), which started 100 years ago, increased the number and the diversity of cultivated forms. In addition, for more than 50 years, wild sunflower and other Helianthus species have been collected in North America where they all originated. Collections of both cultivated and wild forms are maintained in gene banks in many countries where sunflower is an important crop, with some specificity according to the availability of germplasm and to local research and breeding programmes. Cultivated material includes land races, open pollinated varieties, synthetics and inbred lines. The majority of wild accessions are ecotypes of wild Helianthus annuus, but 52 other species of Helianthus and a few related genera are also represented. The activities of three gene banks, in the USA, France and Serbia, are described in detail, supplemented by data from seven other countries. Past and future uses of the genetic resources for environmental adaptation and breeding are discussed in relation to genomic and improved phenotypic knowledge of the cultivated and wild accessions available in the gene banks.


Introduction
Preservation of important crop species, cultivars, landraces, and crop wild relatives (CWR) provides the basis for a sustainable agricultural system and ensures the security of our global food supply (Campbell et al., 2010). Collection of germplasm is the first step in conserving genetic resources. The mission of gene banks is to serve as the central repository for maintaining the accessions and related information, to distribute seeds for conducting germplasm-related research and to encourage the use of CWR for crop improvement and product development. Germplasm resources can be categorized as in situ resources (maintained as wild populations and landraces in natural habitats) or ex situ resources (accessions preserved in gene banks). In situ resources are conserved in natural conditions where they continue to evolve in response to their environment. This is the ideal way to maintain populations, but such natural populations are often at greater risk of encroachment by human activities than populations maintained in gene banks. The most important advantage of ex situ conservation is that it prioritizes both availability and preservation of the resource.
Crop genetic resources consist of the total genetic variability in the crop or within sexually compatible species (Holden et al., 1993). The crop genetic variability can be further divided into three pools: primary, secondary, and tertiary. The primary gene pool comprises the crop itself (landraces, cultivars and populations) and the closely related wild species that lack crossing barriers. The secondary gene pool consists of many of the more distantly related crop wild relatives that produce partially fertile hybrids with the crop but require little or no manipulation for crossing. The tertiary gene pool comprises the most distantly related CWR, which require special techniques such as bridge species or embryo rescue to be used for breeding. Although many members of the secondary and tertiary gene pools may appear to be unfit for use in breeding programs, they may contain useful genetic variation that will protect crops against newly emerging pests in the future. Since we cannot predict with an acceptable level of confidence the occurrence, severity, or even the nature of future stresses, germplasm with the widest possible range of genetic diversity should be preserved for future breeding purposes. Hopefully, the present gene bank collections have captured the broadest genetic diversity possible, preserving it for future generations, since many of the accessions have no immediate value.
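The three-pool scheme amounts to a simple decision rule, which can be sketched in Python as follows. This is only an illustration: the function name, the fertility labels ("full", "partial") and the boolean flag are hypothetical conventions introduced here, not fields used by any gene bank database.

```python
from enum import Enum


class GenePool(Enum):
    PRIMARY = "primary"
    SECONDARY = "secondary"
    TERTIARY = "tertiary"


def classify_gene_pool(hybrid_fertility: str, needs_special_techniques: bool) -> GenePool:
    """Assign a taxon to a gene pool from two crossing attributes (hypothetical).

    hybrid_fertility: "full" if hybrids with the crop are fully fertile
        (no crossing barriers), "partial" if they are only partially fertile.
    needs_special_techniques: True if crossing requires bridge species,
        embryo rescue or similar methods.
    """
    # Taxa that can only be crossed with special techniques are tertiary,
    # whatever the fertility of the resulting hybrids.
    if needs_special_techniques:
        return GenePool.TERTIARY
    if hybrid_fertility == "full":
        return GenePool.PRIMARY
    if hybrid_fertility == "partial":
        return GenePool.SECONDARY
    raise ValueError(f"unknown fertility label: {hybrid_fertility!r}")


# A wild relative that crosses freely with the crop falls in the primary pool.
print(classify_gene_pool("full", False).value)
```

In practice the boundaries are less sharp than this rule suggests, since cross compatibility can vary between accessions of the same species.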
For over 100 years, CWR have been undeniably beneficial for modern agriculture, by providing breeders with a diverse pool of potentially useful genetic resources (Hajjar and Hodgkin, 2007). Anderson et al. (2004) predicted that emerging plant diseases and agricultural pests will become more common. Furthermore, some authors (Palmgren et al., 2015) have suggested that modern crops have lost properties through domestication and breeding that their CWR possessed to tolerate emerging pests and ever-changing climate.
The total area of the sunflower (Helianthus annuus L.) crop worldwide in 2019 was estimated at 25.9 million ha in 72 different countries (USDA, 2019). Globally, oilseed sunflower accounts for 90% of the value of the sunflower crop, with the remaining 10% coming from confectionary sunflower (Hladni, 2016). Oilseed sunflower accounts for up to 12% of the worldwide production of vegetable oils, ranking fourth after palm, soybean and canola oil (Rauf et al., 2017).
Compared to the other main temperate crops, cultivated sunflower is a recent crop. It experienced a domestication bottleneck that narrowed its genetic base but the large number of sunflower CWR makes it possible to mine a vast genetic pool for crop improvement. The primary centre of origin of the genus Helianthus in North America makes collection and preservation possible with minimal difficulties. This genus contains 53 species of which 14 are annuals and 39 perennials (Schilling, 2006; Stebbins et al., 2013). They are found from the Atlantic coast to the Pacific and from Canada to Mexico. Since the sunflower CWR and their pathogens and insect pests have been subjected to co-evolution over a long period, the wild species constitute an important source of resistance genes. In addition, knowledge of particular habitats and adaptations of wild species can help to identify potential sources of tolerance genes in ecotypes which survive in areas where abiotic challenges exist (Seiler et al., 2017). Through adaptation, genetic diversity goes hand in hand with habitat diversity, which can often be the key to identifying potential sources of genes.
While H. annuus was first domesticated in North America, the first active scientific breeding was in Russia at the end of the 19th century and first half of the 20th century. V.S. Pustovoit developed open pollinated varieties from local populations present in different parts of the former Soviet Union, especially the Mariupol region in Ukraine (for details, see Supplementary data and references). A large part of present-day cultivated genetic resources come from these Russian open pollinated varieties, but not all. Wild H. annuus and cultivated sunflower can be crossed with no loss of fertility and the first important genetic resource that formed the base of the restorer pool came from an unintended cross of Canadian material (developed from that of Russian emigrants) with wild H. annuus in a winter nursery in Texas (Putt, 1964). This difference in origins between the CMS (PET1, Leclercq, 1969) maintainer "female" pool from open pollinated Russian varieties and restorers with some wild H. annuus genome may have led to the heterosis observed, one of the reasons for the rapid success of sunflower hybrids since 1970.
In spite of the fact that the Russian Institute VNIIMK was the only important breeding centre in the world until the 1960s, spread of the sunflower crop in the last 100 years led to a wide variety of oil, confectionary and ornamental types with large differences in physiological and harvesting maturity, height, and disease resistances. Thus, in addition to the old open pollinated varieties, genetic resources with these characteristics are largely represented by inbred lines, which are easier to maintain. The present-day collections of cultivated resources in different countries probably cover most of the available variation, with some specialization according to the interests and locations of research institutes. These resources are important for quite rapid modification of modern cultivated material since they provide sources of disease resistance, oil content, different maturities, and other agronomic characters which can be easily introduced into elite parental lines or hybrids without loss of the most important agronomic traits.
Since sunflower is truly a global crop, grown in a large number of countries, gene banks are maintained in different parts of the world, with varying numbers and diversity of sunflower accessions. Some information about the different sunflower gene banks can be obtained from the GENESYS portal, which is a collaboration between Bioversity International on behalf of System-wide Genetic Resources Program of the CGIAR (Consultative Group on International Agricultural Research), the Global Crop Diversity Trust and the Secretariat of the International Treaty on the Plant Genetic Resources for Food and Agriculture (link: https://www.genesys-pgr.org/). However, some of the information in this database is outdated and incomplete. The aim of this paper is to provide an update of the main sunflower gene banks around the world, the numbers of accessions they contain and their specificities, and to provide in-depth details of accessions held in the USDA (USA), INRAE (France) and IFVC (Serbia) gene banks. The discussion will consider how genomics and modern phenotyping methods can improve our knowledge of the resources, mining of the traits of interest, and their importance to provide adequate material for research and breeding in the future.

Details of the main gene banks
The US, French and Serbian gene banks are described in detail as examples. Other gene banks (Argentina, Bulgaria, Germany, India, Romania, Russia, Spain) are described more succinctly and details are given in supplementary documents. Table 1 gives the basic information provided by curators for all these collections. There are also collections in China and Turkey.

USA – USDA Gene bank
The USDA sunflower gene bank is part of the U.S. National Plant Germplasm System (NPGS) and is currently housed in Ames, Iowa, at the North Central Regional Plant Introduction Station (NCRPIS), a partnership between the USDA and Iowa State University. The NCRPIS is one of the original US Plant Introduction Stations and the first one to become operational. The cultivated sunflower collection has been in Ames since the station opened in 1948. The wild sunflower collection was started in Texas in the late 1970s at the USDA Bushland location and was transferred to Ames in 1985 along with a more modest wild collection from the USDA location in Davis, CA. An overarching NPGS goal is for the gene banks to maintain and distribute their collections to researchers and educators worldwide. The standard distribution amount for sunflowers is 100 seeds per accession. Because sunflower is native to North America, it is relatively uncomplicated to add germplasm from wild populations in the US and Canada; therefore, the USDA sunflower gene bank has the goal to have as comprehensive a collection of the wild Helianthus species as possible, sampling populations from across the geographic range of each species. To attain that goal, the gene bank continues to add accessions from unrepresented regions or unusual habits within the standard ranges of each species. No sunflower breeding takes place at the Ames, IA gene bank location.

Cultivated collection
The first donations to the USDA cultivated sunflower collection were all open pollinated populations from international programmes. The collection also contains landrace material: accessions collected in the south-western US and received in the 1970s, and a large donation of Spanish landrace material received in 1990. The first inbred lines came to the collection in the late 1960s from the breeding program at Texas A&M University, USA. By 1980, the USDA sunflower breeding program established in Fargo, ND was developing and releasing inbred lines which then became part of the collection. Sunflower breeding has a long history of introgression of wild traits to improve the crop and much of the material developed by the USDA group in Fargo contains introgressed wild genetic segments. The majority of these lines are highly inbred although a group of 56 accessions received in 2018 are partial inbreds or populations. The collection also contains 360 pre-bred lines developed at the University of British Columbia from crossing a cultivated line with 11 different wild annual species, resulting in partially inbred breeding material. For details, see: https://www.ag.ndsu.edu/fss/ndsu-varieties/usda-sunflower-inbred-lines.

Wild Helianthus collection
The majority of the wild accessions in the gene bank are wild H. annuus (1057). The collection also includes 636 accessions of other wild annual species as well as 904 accessions of wild perennial taxa. The only extant wild Helianthus taxon not represented in the collection is H. niveus ssp. niveus, which is endemic to Baja California, Mexico. The majority of the accessions were collected within the United States except for 10 H. annuus accessions collected in Mexico, 24 in Canada, and 56 samples of naturalized H. annuus populations from Australia. Of the accessions of the other wild annual species, eight were collected in Mexico and Canada, and 16 were collected as naturalized populations in Australia, Argentina, Moldova, Mozambique and the former Soviet Union. Thirty-six perennial accessions were collected in Canada and five originated from the former Soviet Union. The USDA collection also includes samples from four Asteraceae genera related to Helianthus: Phoebanthus, Tithonia, Verbesina and Viguiera. Helianthus tuberosus accessions (90) are maintained as seeds, not as clones, due to the costs of maintaining tubers. Maintaining the collection in the field in Ames is problematic due to rodent activity in addition to issues due to the invasive nature of the plants. The majority of the H. tuberosus accessions represent samples from wild populations.

France – INRAE gene bank
The sunflower gene bank (Centre de Ressources Biologiques [CRB]) at INRAE, Toulouse (https://www6.toulouse.inra.fr/lipm/Recherche/Genetique-et-Genomique-du-Tournesol/CRB-Tournesol), was started after the transfer of INRA research activities at Clermont-Ferrand and Montpellier to Toulouse in 2006-2015. The entries from Clermont-Ferrand resulted from collection and breeding of cultivated populations and inbred lines especially for resistance to diseases, particularly downy mildew and Sclerotinia head rot. The material from Montpellier included many accessions of wild H. annuus and other Helianthus species and a quite large proportion of cultivated lines obtained from crosses of these wild ecotypes with cultivated sunflower followed by selection for a wide range of characters. The aim of the INRAE sunflower gene bank is to maintain, develop and distribute genetic resources of Helianthus.

Two thousand and three hundred inbred cultivated sunflower lines
These lines were bred by INRA between 1963 and 2012, or received (in variably homozygous states) during exchange programmes with many different research institutes. There are 1500 lines of French origin; the others come mainly from the USA (340 lines), Bulgaria (88 lines), Russia (87 lines) and Romania (82 lines), but also from 16 other countries. There are 1690 maintainers of CMS PET1 (B lines), their CMS form being available for 560 of them, and 640 lines are male fertility restorers. This material is stored at 4°C and multiplied under bags about every 10 years. About half, chosen to represent the complete collection, make up the "sunflower network collection", which is multiplied by INRAE with the help of private sunflower breeders (Caussade Semences, MAS Seeds, Corteva, RAGT, Soltis, Syngenta). The INRA breeding programme for parental lines that could be used commercially terminated in 2012 and this collection is now the basis for genetic and genomic studies (e.g., association mapping, markers, gene identification) for most of the economically important characters in sunflower. Therefore, much of this material is available for research or breeding programmes under an MTA. The list is available at https://urgi.versailles.inra.fr/siregal/siregal/grc.do.
First studies of the available variability and identification of representative core collections started in 2006. Coque et al. (2008) found three groups, one of B lines and two groups of restorers, one of which included some B lines. Mandel et al. (2011, 2013) used the first INRA core collection together with USDA lines and also distinguished restorer and maintainer lines. Using the INRA lines, Cadic et al. (2012) defined three groups but with some lines that were not well classified. The whole collection has now been genotyped with SNP arrays and/or re-sequenced, which should make it possible to eliminate duplicates and to develop improved core collections.

Four hundred and three cultivated sunflower populations
These are samples of open pollinated varieties (OPV) and land races together with pools or synthetics developed as a base for breeding programmes. Collection of this type of material started in 1962 with samples of Russian OPV imported directly from Russia or via Eastern Europe for trials to determine if they would be suitable for cropping in France. The accessions come from 34 countries, with 72 from Russia, 54 from France, 34 from Romania, 27 from the USA, 26 from Spain and 26 from Morocco. As in the case of inbred lines, these populations are maintained at 4°C, but samples have also been frozen at −18°C. They are multiplied about every 10 years, mostly under 50-100 m² insect-proof cages by the Helianthus network mentioned above. They are also listed on the same site as the inbred lines (https://urgi.versailles.inra.fr/siregal/siregal/grc.do) and are also available with an MTA. Mangin et al. (2017) studied 102 of these populations and showed that variability is structured firstly by oil content (oil vs. confectionary) and then by the presence or absence of restorer genes and by flowering date.

Six hundred and sixty-five wild Helianthus accessions
The majority of these accessions (369) are wild H. annuus, but there are 66 accessions of other annual Helianthus species and 230 accessions of perennial species. There are also five representatives of genera close to Helianthus: Verbesina, Tithonia, Viguiera and Simsia. Most of the accessions came from USDA but some were obtained during international research collaboration. For example, 216 accessions, including 79 H. annuus, were received from IFVC, Novi-Sad, although 64 of them originated from the USDA collection. Thirty-four entries, mostly perennials, were received from Eastern Europe, especially Russia and Bulgaria. A total of 262 accessions came from European or Australian research centres, but again 173 originally came from USDA. A small number of accessions were collected outside North America, seven from South America and 20 from other continents. The first few accessions were sent by C. Heiser to P. Leclercq at INRA, Clermont-Ferrand in 1964, one sample of H. petiolaris (numbered 6312) being the origin of the CMS PET1. However, most of the accessions were introduced, maintained and studied at INRA Montpellier by H. Serieys between 1975 and 2008. They were transferred to Toulouse in 2010-2012. In order for this collection to be useable in research programmes, the Toulouse gene bank has undertaken regeneration of seed stocks. Maintenance and multiplication are carried out under insect-proof netting cages, with cross-pollination between several plants of each accession. Accessions are regenerated every ten years and may be distributed using an MTA when seed stocks are sufficient.
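The composition of the wild collection can be checked with a few lines of Python. The sketch below uses only the three counts quoted above and assumes that the five accessions of related genera are counted separately, as the section total of 665 suggests; the dictionary keys are labels chosen here for illustration.

```python
# Accession counts quoted in the text for the INRAE wild Helianthus collection.
inrae_wild = {
    "wild H. annuus": 369,
    "other annual species": 66,
    "perennial species": 230,
}

total = sum(inrae_wild.values())

# Percentage of the collection in each group, rounded to one decimal place.
shares = {group: round(100 * n / total, 1) for group, n in inrae_wild.items()}

for group, n in inrae_wild.items():
    print(f"{group}: {n} accessions ({shares[group]}%)")
print(f"total: {total}")
```

The three groups sum to 665, so wild H. annuus makes up a little over half of the collection and perennials about a third.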
Many interspecific crosses were made at Montpellier between 1975 and 2000, mostly with annual species, providing up to 20 sources of CMS (Serieys and Christov, 2005) and one source of resistance to broomrape (from H. debilis). Some successful crosses were made with the hexaploids H. resinosus and H. pauciflorus (rigidus), in a search for good resistance to Sclerotinia, and with H. mollis, a perennial diploid, to obtain sunflower with sessile leaves (Faure et al., 2002). Wild H. annuus accessions provided several sources of CMS that are difficult to restore and also original genes providing resistance to downy mildew (Pecrix et al., 2018).
The INRAE collection of cultivated forms of Jerusalem artichoke, which dates from collection and breeding programmes at Clermont-Ferrand and Rennes in the period 1960-1970, and is at present maintained at INRAE Montpellier, will be transferred to Toulouse in 2020. Since tubers need to be planted each year, costs are high, and the transfer required a specific support programme and collaboration with the experimentation unit at Toulouse to undertake the maintenance of this material. The clones were characterised for their morphology and phenology at Clermont-Ferrand and Rennes, and more recently, at Montpellier, for a certain number of SSR markers. The INRAE collection was also included as part of a comprehensive phenotyping and SNP genotyping study (Bock et al., 2018), which showed that wild and cultivated Jerusalem artichoke are distinct genetically and phenotypically, and that invasive Jerusalem artichoke genotypes have arisen multiple times, with independent ancestry from wild and cultivated material.

Serbia – IFVC Gene bank
The main goal of this collection was to use wild species in cultivated sunflower breeding. They are mostly used at IFVC-NS for disease resistance breeding, but also to work on oil quality, CMS, herbicide tolerance and new phenotype traits. Interspecific crosses and early generation prebreeding line development are a part of the gene bank activities. Besides evaluation for breeding, the work on wild sunflowers expanded with further collecting efforts, maintenance improvements and diversity estimates (Škorić, 2008). The collection was established in 1980 and started with 59 accessions obtained from the collection at INRA, Montpellier (Atlagić and Terzić, 2015). During the period from 1980 to 1991, five collecting trips were conducted in cooperation with the USDA-ARS stations in Bushland, TX and Fargo, ND. Varying numbers of species (1-37) and populations (52-384) were collected during each trip covering 6 to 21 US states per expedition. The first joint collection mission outside the USA was conducted in 1994 in several provinces of Canada. Seeds of the collected species were divided for the collections in USA and Serbia.
Additional seed samples were obtained from collection trips to Montenegro in 1991 and USA in 2001, as well as by exchange from other gene banks.
During establishment of the collection, the method of sowing had to be adjusted to obtain plants from collected seeds. Poor overwintering in Serbian climatic conditions caused the loss of 11 perennial species, while the inappropriate chernozem soil type contributed to the loss of four annual species. The most common problem was low self-fertility due also to the limited number of plants for each regeneration and to manual pollination (Atlagić and Terzić, 2015). The collection in Novi Sad now contains 21 perennial and eight annual species, represented by 332 and 185 accessions respectively. Each year, 60 to 80 accessions of annual species are grown for seed regeneration. Seed reserves are kept in cold storage at +4°C and vary from a few seeds to several thousand per accession (Tab. S1). Most of the perennial species are maintained as a living collection in quarantine fields, while some also have seed available (Tab. S2). Detailed information about the collected accessions including passport data can be accessed online on the IFVCNS site: https://www.ifvcns.rs/kolekcija-divljeg-suncokreta/wild.html. For both annuals and perennials, accessions are available with an MTA when seed stocks are sufficient. Details of wild species are given in Tables S1 and S2, additional references in Supplementary data: Supplementary references 1.
F1 interspecific hybrids derived from hybridization between perennial species and cultivated sunflower are also maintained at IFVCNS. Plants usually have very low pollen viability, but they can be reproduced vegetatively, and have been grown in the field using clonal maintenance for more than 30 years.
Research was first concentrated on evaluation of the most economically important traits. Based on the findings, intensive interspecific crosses were made between wild species and cultivated sunflower. Oil quantity was determined for all accessions (Ćuk, 1982), while resistance to Phomopsis stem canker and Sclerotinia was determined for selected species (Škorić and Rajčan, 1992). The prebreeding programme produced interspecific lines for which combining ability was evaluated (Mihaljčević, 1988) while restorer genes and CMS sources were also determined (Škorić et al., 1988). Cytogenetic analyses were used to monitor and optimize the prebreeding process by studying the number and structure of chromosomes, microsporogenesis and pollen viability (Atlagić, 2004). More recently, efforts were focused more on collection maintenance. Smaller numbers of "targeted" hybrid combinations were made to transfer disease resistance from wild to cultivated sunflower with the help of biotechnology. The species in the collection have been characterised using both phenotypic and molecular markers. Large variability was found not only between the species but also among accessions within a single species (Saftić-Panković et al., 2005). Wild species were screened for resistance to broomrape, Macrophomina, powdery mildew etc. and the collection was also monitored for natural infection and resistance in field conditions (Terzić et al., 2010; Tančić et al., 2012). The effect of seed ageing on germination was studied on all annual species in the collection (Terzić, 2018). The sunflower association mapping population (UGA-SAM1) described by Mandel et al. (2011) was used to evaluate the current crop ontology and usefulness of qualitative traits for the discrimination of genotypes (Terzić et al., 2019). A detailed morphological evaluation of both the wild and cultivated germplasm is in progress (Lazarević et al., 2016).
All the characterization efforts significantly increased the value of the collection making it more attractive to breeders. Nonetheless, diversity of the species also resulted in frequent problems for their usage, related to cross incompatibility, embryo abortion, sterility and reduced fertility of interspecific hybrids. Variable cross compatibility was found when hybridizing different accessions of the same species with cultivated sunflower. Despite the difficulties, seven annual and 14 perennial species have been crossed with cultivated sunflower using conventional hybridization methods (Atlagić, 2004). The interspecific lines obtained have then been selected for desired traits and used for creation of new elite germplasm (Hladni et al., 2018). The whole prebreeding program resulted in direct use of the wild germplasm, justifying the long-term commitment and investments in collection maintenance and evaluations. The wild species thus remained a constant source of specific genes and variability for cultivated sunflower breeding.

Other important gene banks
The Vavilov Research Institute for Plant Genetic Resources (VIR, Russia) is the oldest gene bank for sunflower, with the first entries in 1922. The sunflower collection at VIR totals 2730 accessions, with 2288 accessions of cultivated sunflower (Helianthus annuus L.) and 442 of wild sunflowers belonging to 24 species (of which five are annual and 19 are perennial species). Viable accessions are maintained in the active (+4°C) and long-term (−10°C) collections. Studies of sunflower genetic diversity in field and laboratory conditions conducted for many years resulted in the definition of collections with variation for particular traits and for their genetic control. The collection of cultivated sunflower contains local varieties and landraces and cultivars from national and international breeding programmes. It also includes the first CMS-based hybrids which are conserved as hybrid populations and lines (supplementary data 2, supplementary references, Tab. S3).
DAI-General Toshevo (Bulgaria) maintains a collection of wild annual and perennial Helianthus species. The first wild sunflower accessions were obtained from the USDA-ARS, Bushland, Texas and planted in the 1970s. A second group of wild sunflower accessions was obtained in 1983. However, a major part of the collection of wild sunflower species was obtained in the course of collections made in collaboration with USDA in the USA in 1999 and 2004. The DAI collection includes accessions maintained in the field and in cold chambers at +4°C and at −14°C (long term preservation). Perennial species are permanently grown in the field as a living collection, and annuals are planted each growing season. The DAI collection includes 215 accessions from 30 perennial Helianthus species (Tab. S1), 175 accessions from seven annual Helianthus species (Tab. S2), 30 different CMS sources and some species from different genera of the Asteraceae family. These accessions are important initial material for research and breeding for resistance to abiotic and biotic stress factors and variation in fatty acid content (Supplementary data 1, Supplementary references 2).
The INTA (Argentina) sunflower gene bank at Manfredi (Cordoba) is part of the National Network of Genetic Resources. The first accessions grown at the station were OPV incorporated in 1948. The wild sunflower collection is made up of introductions from the USA and naturalized wild collections in Argentina. The first inbred lines were incorporated in 1980 from the INTA breeding programme and other sources. The main objective of this gene bank is to maintain, characterize and promote the use of genetic resources for research and breeding (the INTA programme is also at Manfredi).
At ICAR (India), ICAR-IIOR collects, maintains, evaluates and supplies the germplasm to all the centres under the sunflower network in India while ICAR-NBPGR facilitates import of germplasm and maintenance of the material multiplied (Tab. S4).
The INIA (Spain) gene bank contains an important collection of confectionary type landraces collected from many parts of Spain (listed in Tab. S5). The germplasm has been evaluated for seed quality traits and molecular diversity (Supplementary references 4).
The other gene banks with details in Table 1 are: IPK, Germany and NARDI, Romania.

Use of CWR up to the present
During the domestication process, sunflower certainly lost many CWR traits and, perhaps in part due to the domesticated ideotype with a single large head and to production practices, is now vulnerable to many biotic and abiotic stresses. Sunflower CWR possess genetic diversity useful for breeding a productive, nutritious, and resilient crop. In a survey of the introduction of genes from CWR in 13 crops of major importance to global food security from the mid-1980s to 2005, Hajjar and Hodgkin (2007) reported that sunflower CWR had contributed seven traits for the crop, including disease and insect resistance, tolerance to abiotic factors, male fertility restoration and cytoplasmic male sterility, placing sunflower fifth among the crops surveyed. This survey did not include an eighth trait, tolerance to the herbicides imidazolinone and sulfonylurea, which is now widely deployed and also gives partial control of the parasitic weed broomrape (Alonso et al., 1998; Škorić, 2012). Tolerance to imidazolinone and sulfonylurea herbicides was discovered in a wild H. annuus population from Kansas in the 1990s (Al-Khatib et al., 1998).
Sunflower CWR have been very important in the global development of sunflower as the fourth largest oilseed crop. Annual losses to diseases, weeds, and insects in the mid-1990s were estimated at $1.36 billion (Hesley, 1999). At that time, diseases accounted for losses of $642 million, weeds $489 million, and insects $229 million. Prescott-Allen and Prescott-Allen (1986) estimated that sunflower CWR contributed 25.9% of the annual value of the crop, in monetary terms about $185 million (Tyack and Dempewolf, 2015) or as much as $267-384 million (Hunter and Heywood, 2011).
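The loss figures quoted for the mid-1990s can be cross-checked with a short calculation. The sketch below uses only the three component values given in the text (in millions of US dollars); the variable names are chosen here for illustration.

```python
# Estimated annual losses in the mid-1990s, in millions of US dollars,
# as quoted in the text (Hesley, 1999).
losses_musd = {"diseases": 642, "weeds": 489, "insects": 229}

total_musd = sum(losses_musd.values())

# Share of the total attributable to each cause, in percent.
shares = {cause: round(100 * v / total_musd, 1) for cause, v in losses_musd.items()}

print(f"total: ${total_musd / 1000:.2f} billion")
for cause, pct in shares.items():
    print(f"{cause}: {pct}%")
```

The three components add up to the quoted total of $1.36 billion, with diseases accounting for nearly half of the losses.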
The PET1 CMS used as the female parent for the commercial production of sunflower hybrids is the most valuable trait obtained from the CWR, followed by disease resistance. Sunflower CWR have been a valuable resource of resistance genes since the beginning of scientific breeding programmes and have become particularly useful since the development of hybrids (reviewed by Seiler, 2010, 2012). Resistance genes for rust, downy mildew, Verticillium wilt, Alternaria leaf spot, powdery mildew, Phomopsis stem canker, Sclerotinia wilt/rot, and broomrape have been introduced into the crop from CWR (Seiler et al., 2017). Dempewolf et al. (2017) reviewed more than 350 publications using sunflower CWR, with 210 concerning resistance to biotic stresses, 60 male fertility restoration, 40 resistance to abiotic stresses, 25 quality characters, 10 phenological traits and 5 agronomic traits. For biotic stresses, the majority were for disease resistance, with 12 of the 14 annual species cited tending to provide single dominant resistance genes, most notably for rust, downy mildew, and Verticillium wilt, with annual H. argophyllus the most frequently cited species. Only 10 of the 39 perennial species were cited, with H. tuberosus in first place, mostly contributing polygenic resistance to Phomopsis and Sclerotinia (Seiler, 2010). Genomic studies confirm these reports, showing that: about 10% of the cultivated genome derives from the secondary gene pool, with 4.7% from H. argophyllus; introgressions from wild species are enriched for disease resistance genes; there are only traces of genetic material from perennials.
Sunflower CWR contain considerable variability for tolerance to abiotic stresses such as drought, salinity, heat, flooding, low nutrients, and heavy metals (Ortiz, 2015). Helianthus paradoxus has salinity tolerance which has been successfully transferred to cultivated sunflower (Miller, 1995; Miller and Seiler, 2003). Hajjar and Hodgkin (2007) suggested that hybrids developed using this trait could give a yield increase of 25% in saline soils. CWR also provide the opportunity to study physiological processes involved in the survival mechanisms of desert-inhabiting species; for example, Bowsher et al. (2016) studied the desert-adapted Helianthus niveus ssp. tephrodes endemic to the Algodones Dunes in California. Helianthus argophyllus has been extensively used for drought tolerance breeding. Baldini and Vannozzi (1998), Baldini et al. (1993) and Martin et al. (1992) reported that interspecific lines obtained by divergent selection for physiological traits from H. argophyllus had higher water use efficiency, better drought susceptibility index, and higher harvest index under drought conditions than conventional sunflower lines.

From one cultivated sunflower reference genome to reference genome for all the CWR
By the end of 2019, more than 1100 plant genomes (from more than 500 species) had been sequenced (https://www.ncbi.nlm.nih.gov/genome/browse#!/overview/) and projects are underway to sequence more plant genomes in the near future, such as the 10,000 Plant Genome Sequencing Project (Twyford, 2018). Sequencing the sunflower genome (Badouin et al., 2017) was a challenging project, and a research network between Canada, France and USA was started a decade ago with this objective (Kane et al., 2011). The size of the genome (approx. 3.6 Gb, 17 chromosome pairs) and the numerous repetitive sequences made the assembly very difficult. Sequencing technologies are evolving rapidly, producing more, cheaper, longer and higher-quality sequences. However, for a long time, sequences were too short to differentiate and compare the long and highly similar repeats. In 2015, the RSII PacBio sequencer produced sequences longer than the known repeats from the sunflower genome and made it possible to sequence the 17 chromosomes of the inbred line XRQ. The raw data did not by themselves resolve the complexity of the sunflower genome. A long and complex process of bioinformatic computation was needed to compare the millions of sequence fragments and assemble them into chromosomes. Improvement of bioinformatic software also plays an important role in genomics. As an example, at the same time as long-read sequences were being used for the sunflower genome, NRGene developed software permitting genome sequences to be compiled using short-read fragments (Lu et al., 2015). This approach made it possible to obtain a complete sequence of the genome of the line HA412 of similar quality to that obtained for the line XRQ.
To date, three high-quality genomes are available (XRQ, HA412 and the restorer line PSC8). By comparing these three genomes, we found several large inversions. In addition, a pan genome for cultivated sunflower based on re-sequencing of 287 cultivated sunflower lines revealed extensive gene presence/absence variation (Owens et al., 2019). These structural variations had been very difficult to identify previously without such genomic resources. Single nucleotide polymorphisms are the molecular markers most used either to map traits or to characterize the diversity of sunflower germplasm. Many genotyping methods are available but the most used for high-throughput genotyping are genotyping by sequencing (Elshire et al., 2011; Narum et al., 2013; Baute et al., 2016; Celik et al., 2016; Badouin et al., 2017; Todesco et al., 2019) or genotyping arrays, such as the two AXIOM® arrays developed and used in sunflower either for mapping traits (Louarn et al., 2016; Duriez et al., 2019) or for describing the diversity of germplasms (Mangin et al., 2017), or the Illumina® Infinium iSelect used for diversity analysis and genomic prediction (Livaja et al., 2016).
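As an illustration of how gene presence/absence variation can be summarised once a pan genome is available, the following minimal Python sketch splits genes into those present in every line ("core") and those missing from at least one line ("variable"). The gene names, the three-line toy table and the function name `pav_summary` are hypothetical and are not taken from the studies cited above, which may define presence and absence differently.

```python
def pav_summary(pav):
    """Split genes into core and variable sets from a presence/absence table.

    `pav` maps a gene identifier to a list of booleans, one per sequenced
    line (True = the gene was detected in that line).
    """
    if not pav:
        raise ValueError("presence/absence table is empty")
    core, variable = [], []
    for gene, calls in pav.items():
        if not any(calls):
            # A gene absent from every line cannot belong to the pan genome.
            raise ValueError(f"{gene} is absent from every line")
        if all(calls):
            core.append(gene)
        else:
            variable.append(gene)
    return {
        "core": core,
        "variable": variable,
        "variable_fraction": len(variable) / len(pav),
    }


# Hypothetical toy data: four genes scored in three cultivated lines.
toy_pav = {
    "gene_a": [True, True, True],
    "gene_b": [True, False, True],
    "gene_c": [True, True, True],
    "gene_d": [False, False, True],
}

summary = pav_summary(toy_pav)
print(summary["variable"])           # ['gene_b', 'gene_d']
print(summary["variable_fraction"])  # 0.5
```

In a real data set the table would hold tens of thousands of genes scored across hundreds of lines, and the presence calls themselves would come from read-mapping thresholds that this sketch does not model.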
There are many examples of structural variations (Saxena et al., 2014), some of them being involved in plant traits. More recently, it has been shown in sunflower that structural variations are involved in adaptive traits to the environment. However, there are no efficient tools for the high-throughput genotyping of these structural variations, even if a recent report used the AXIOM® array in maize (Mabire et al., 2019).
The International Consortium on Sunflower Genomics, grouping four public research institutes and eight private partners, aims to sequence reference genomes for wild sunflowers and wild relatives. The genomes of the CWR are more complex than those of cultivated sunflower. They are highly heterozygous, and it is more difficult to obtain the two haplotypes of the chromosomes for the diploid species. Some wild relatives have more complex ploidy, and their genomes can be larger, up to 11.21 Gb in H. agrestis (Qiu et al., 2018).
Recent advances in sequencing technologies have been successfully used to sequence the genome of two new cultivated sunflower lines in only a few weeks (SUNRISE project), which should make it possible: to obtain one reference genome for all Helianthus species in the coming years; to describe the diversity within each species more precisely; to compare the gene contents and allelic diversity of the germplasms; to use all information for mapping new traits or adaptive genes to improve understanding of evolution or to breed new varieties adapted to their changing abiotic and biotic environment.
Further new bioinformatic tools will make it possible: to compare genomes and to identify polymorphisms (structural variations included); to visualize many genomes at the same time in genome browsers.

High-throughput phenotyping of CWR resources for research and breeding
Usually, CWR genetic resources are phenotypically characterized by a series of manually acquired traits comprising phenology, architecture, ligule colour, leaf colour and shape, and seed traits including colour, number, weight, oil content and composition. These data are time-consuming and expensive to acquire and therefore are usually only complete for a subset of a collection. In addition, these traits are strongly subject to environmental effects such as temperature, water deprivation or spatial variation within field nurseries. These factors considerably impair selection of the most promising accessions to start pre-breeding programmes and to integrate genetic innovation into cultivated or elite material.
To break this bottleneck and improve utilization of gene banks, in the last decade, high-throughput phenotyping platforms have been developed to measure plant traits efficiently while controlling or precisely characterizing growing conditions. Glasshouse-based platforms allow a very fine control of the environment (light, water, nutrients; Cobb et al., 2013; Granier et al., 2006), and have been successfully used to conduct genetic association studies (Cabrera-Bosquet et al., 2012) and to derive plant trait ranges that can be entered in crop models (Lenz-Wiedemann et al., 2010; Steduto et al., 2009).
In field conditions, phenotyping has been mainly developed using unmanned aerial vehicles (UAV) as they are portable, adaptable to different sensors, low-cost and suitable for all crops and stages (Zhao et al., 2019). When equipped with digital cameras, UAVs can be used to estimate canopy surface and biomass (Liebisch et al., 2015); with multispectral cameras or hyperspectral sensors, to characterize physiological processes such as chlorophyll fluorescence or nitrogen levels (Camino et al., 2018); and with thermal imaging, to assess plant water status (Gómez-Candón et al., 2016; Gonzalez-Dugo et al., 2015). For short crops, such as wheat, ground-based phenotyping robots, often termed "Phenomobiles", have also been developed (Madec et al., 2017; Qiu et al., 2019). In addition to cameras similar to those on UAVs, they can carry heavy and energy-demanding sensors such as LiDAR and additional lights to be independent of natural sunlight, which varies in quality and quantity. This facilitates and improves the quality of subsequent image analysis, although a limitation of these ground-based vehicles is their limited suitability for tall crops such as sunflower.
Different high-throughput phenotyping technologies have been developed and tested for sunflower but very few have been published so far. In controlled environments, different tests have been performed on the Phenotoul-TPMP platform (Maviane-Macia et al., 2019) to assess biotic and abiotic stresses on sunflower. This platform is based on a conveyor system in a greenhouse where plants are automatically imaged in visible, fluorescence and near-infrared spectra. Using the RGB images and the IPSO Phen software, INRAE teams successfully characterized the responses of sunflower genotypic panels to drought stress at early stages (SUNRISE project) and to Orobanche infection (Plant2Pro Phenor project). Exploitation of these methodological developments will certainly follow shortly.
In semi-controlled conditions, still in the frame of the SUNRISE project, the Phenotoul-Heliaphen phenotyping platform was developed by Gosseau et al. (2019) to study the effects of different drought scenarios at vegetative and post-flowering stages of sunflower. This platform allowed the authors to impose different drought intensities precisely and to characterize the yield sensitivity of sunflower varieties to post-flowering stress, but seed weight still had to be measured manually, which is time-consuming. At earlier stages, they characterized leaf expansion and transpiration thresholds of different genotypes that were subsequently used to simulate yields with the SUNFLO crop model (Casadebaig et al., 2011). Using Heliaphen-grown plants, Gélard et al. (2016, 2017, 2018) applied structure from motion techniques and multi-view stereo imaging to reconstruct and analyse 3D point clouds of adult sunflower plants. They successfully extracted individual leaves, petioles and stem dimensions and orientations on time-series and characterized the differences in genotypic responses to drought (Gélard et al., 2018). Future developments of the high-throughput phenotyping Heliaphen platform will require the automatic acquisition of adequate images for this 3D reconstruction software and its automatic implementation to allow researchers to exploit the facility in full.
Field-based phenotyping has also been tested on sunflower by different public and private teams; however, few results are publicly available. One example of application for both nursery and phenotyping is plant counting using UAV, as illustrated in the Delair and MAS Seeds project (www.delair.aero). This can be used to assess plant emergence as a phenotypic response to early stresses or to assess density as a proxy of plot quality for further phenotyping. A similar approach was conducted in the SUNRISE project using UAV visible images for plant counting and flowering time. In addition, multi-spectral images could be used to estimate NDVI, which will make it possible to study the evolution of radiation interception during the entire crop cycle. In collaboration with the Phenome EMPHASIS project, ground vehicles could also be tested on sunflower. The Phenomobile V2 was first used on sunflower trials on the Phenotoul-Agrophen platform. It is equipped with multispectral cameras, LiDARs and RGB cameras, and methodological developments are currently underway to estimate leaf area index and aboveground biomass more accurately and dynamically.
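For readers unfamiliar with NDVI, it is conventionally computed from near-infrared (NIR) and red reflectance as (NIR - Red) / (NIR + Red). The short Python sketch below applies this standard definition to per-pixel reflectances and averages it per plot for a series of flights. The reflectance values, flight labels and function names are hypothetical and are not data or code from the projects mentioned above, and the sketch does not attempt the further step of converting NDVI into radiation interception.

```python
def ndvi(nir, red):
    """Return NDVI = (NIR - Red) / (NIR + Red) for one pixel.

    Returns 0.0 when both reflectances are zero to avoid dividing by zero.
    """
    total = nir + red
    if total == 0:
        return 0.0
    return (nir - red) / total


def mean_plot_ndvi(nir_pixels, red_pixels):
    """Average NDVI over the pixels of one plot."""
    if len(nir_pixels) != len(red_pixels):
        raise ValueError("NIR and red bands must have the same number of pixels")
    if not nir_pixels:
        raise ValueError("plot contains no pixels")
    values = [ndvi(n, r) for n, r in zip(nir_pixels, red_pixels)]
    return sum(values) / len(values)


def ndvi_time_series(flights):
    """Map each flight label to the mean NDVI of the plot on that flight."""
    return {label: mean_plot_ndvi(nir, red) for label, (nir, red) in flights.items()}


# Hypothetical reflectances (NIR pixels, red pixels) for one plot on three flights.
flights = {
    "day_20": ([0.30, 0.34], [0.20, 0.18]),
    "day_50": ([0.55, 0.60], [0.10, 0.08]),
    "day_90": ([0.40, 0.42], [0.15, 0.16]),
}

series = ndvi_time_series(flights)
for label, value in series.items():
    print(f"{label}: {value:.2f}")
```

With these toy values, the plot-level NDVI rises between the first and second flights and declines by the third, which is the kind of seasonal trajectory a time series of multispectral flights is meant to capture.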
The development of high-throughput phenotyping platforms has allowed the acquisition of large amounts of images and 3D point clouds but, as in many cases, the main bottleneck is the development of sunflower-specific image analysis pipelines taking into account its leaf architecture and the position of the head. This will require new expertise in image analysis and also the collaboration of experts for the different traits and diseases. Software and hardware tools to manipulate these data must also be constructed. In particular, a sunflower crop ontology needs to be developed with international collaboration, and an open phenotype database developed. HTPP-produced data could then follow the FAIR principles and be accessible as on the PhenoDB INRAE database (https://sunrise.toulouse.inra.fr/phenoDB/) for further long-term usage and characterization of CWR.
In addition to the development of high-throughput genotyping tools that are widely used to characterize the CWR genomic diversity, the emerging field of phenomics is starting to develop more robust and precise phenotyping methods for sunflower that will allow the characterization of entire collections and therefore both help their conservation and accelerate their exploitation for breeding and help to understand the genetic control of their tremendous phenotypic diversity.

Use of cultivated sunflower and CWR resources for environmental adaptation
While sunflower breeding for abiotic tolerance has mainly focused on drought and salt tolerance to date, heat and flooding tolerance are likely to be of increasing importance in the future due to higher temperatures, as well as more frequent extreme climate events (Coumou and Rahmstorf, 2012). A recent study of the impacts of heat stress on reproductive traits in cultivated and wild H. annuus showed that the latter is much more tolerant of heat stress. Surprisingly, the most tolerant populations were from wetter rather than hotter environments, with invasive populations of wild H. annuus showing the greatest tolerance. Populations of wild H. annuus from wet habitats also exhibit tolerance to flooding (Torres and Diedenhofen, 1981). However, considerable variation for flooding tolerance can be found in the cultivated sunflower gene pool, so it is unlikely to be necessary to tap wild resistance alleles for this trait (Gao et al., 2019). Likewise, tolerance to heavy metal soils has been reported in wild species, most notably H. exilis (Sambatti and Rice, 2007), but substantial variation already exists among sunflower cultivars (Rizwan et al., 2016). On the other hand, various wild sunflowers from sand dune habitats, such as H. anomalus, H. neglectus, H. niveus ssp. tephrodes, and dune ecotypes of H. petiolaris, are tolerant of low nutrient soils (Donovan et al., 2010), a trait that appears to have been lost in cultivated sunflowers. Introduction of such a trait to cultivated sunflowers could reduce fertilizer usage when sunflower is grown in nutrient-poor soils.

Use of cultivated sunflower and CWR resources in breeding
In the breeding and germplasm enhancement program of the USDA, use of CWR has recently focused on the primary and secondary gene pools, in particular, the annual wild species and open pollinated varieties. Many of USDA's activities are a continuation of previous uses of CWR, mostly centred on diversification of sources of resistance to disease. Introgressions from H. praecox, H. argophyllus, and H. petiolaris have been associated with Sclerotinia basal stalk rot resistance (Qi et al., 2016; Talukder et al., 2019) and were combined with downy mildew (caused by Plasmopara halstedii) resistance previously recovered from the primary CWR (Hulke et al., 2010). Similar work in diversification of downy mildew resistance sources has come from the secondary CWR pool as well, with the independent genes Pl18, Pl20, and Pl35 recently introgressed from H. argophyllus. New work of similar methodology is ongoing with Sclerotinia head rot resistance. Recently, however, USDA has diversified the number of breeding projects that have involved CWR. This year, USDA released two germplasm lines (HA 488 and HA 489) that provided, for the first time, resistance to the red sunflower seed weevil (Smicronyx fulvus LeConte) and the banded sunflower moth (Cochylis hospes Walsingham), respectively. In seeming contradiction to the idea that such resistance would be derived from germplasm that originates within the range of the pest (Seiler et al., 2017), the resistance was found in open pollinated varieties that were developed in South Eastern Europe (DeGreef et al., 2020; Wronski et al., 2020). However, it is unknown if the European varieties had a recent ancestor from North America, where these pests are common. Occasionally, these resources contain genetic surprises. In the case of the banded sunflower moth resistance work, a source of very low saturated fat composition in the seed oil was also discovered, which was released separately as HOLS4.
USDA has also used CWR in recent years to expand the maturity range of the breeding program. CWR resources from the primary gene pool were used to develop early maturing oilseed types for use in Canada and "double crop" systems in the central and southern U.S. (Hulke and May, 2017). Double cropping is the growing of a winter grain, which is promptly harvested, followed by a summer crop, which allows harvesting of two crops in a year. This work is ongoing and was recently expanded to include development of full-season sunflower lines to be used in the central and southern U.S., which are regions with much longer growing seasons than Fargo, ND.
The use of CWR in the sunflower breeding programme at IFVCNS, Serbia, has a long tradition dating back to 1980, when the collection was first established. In the past two decades, a significant number of inbred lines have been created by interspecific crosses. These lines, together with the wild populations, represent a valuable resource of useful alleles that are abundantly used in the IFVCNS breeding program for increasing genetic variability, introducing resistance genes to economically significant pathogens and parasitic plants, increasing tolerance to herbicides and altering the architecture of the plant. Current research on CWR is focused on the discovery of new resistance genes to the more virulent populations of Orobanche cumana and Macrophomina phaseolina (Tassi) Goid. In order to discover new resistance genes to Orobanche, both CWR per se (Terzić et al., 2010; Jocković et al., 2018) and inbred lines of cultivated sunflower originating from interspecific crosses have been tested. As a result, resistance genes to new races of O. cumana were recently detected in inbred lines originating from crosses with H. deserticola (Hladni et al., 2012), H. tuberosus and H. divaricatus (Cvejić et al., 2014). The resistance genes from these sources have been subsequently mapped, and the developed molecular markers will expedite their transfer into commercial lines (Imerovski et al., 2013, 2016, 2019). Interestingly, Orobanche cumana is not found in the centre of sunflower origin, but CWR have proven to be a useful source of resistance genes to this parasite, which hampers sunflower production across Europe. A similar workflow was used to find resistance to Macrophomina phaseolina. First, CWRs were tested per se to narrow down the sources of potential resistance genes that could be used in the breeding program (Tančić et al., 2012). Later, inbred lines originating from interspecific crosses with potential donors were screened to determine if they could be used as a source of tolerance to this pathogen (Tančić Živanov et al., 2020).
Apart from being a reservoir of diverse resistance genes, CWR have served as donors of some valuable agronomic traits as well. For example, wild H. annuus was the donor of an early flowering time locus that enabled researchers at IFVCNS to develop ultra-early hybrids with vegetation periods of less than 90 days. These types of hybrids are particularly well suited for areas where sunflower is grown as a second crop (i.e., after winter crops). Additionally, the ultra-early hybrids could be critical for sunflower production in northern Europe and Russia, where in the last decade sunflower production has been expanding due to climate change.

Cultivated and wild H. annuus vs. CWR
While much of the focus of this paper has been on the benefits of CWRs, which are undeniable, it is important to keep in mind that wild species are genetically and chromosomally divergent from the domesticated sunflower. As a consequence, in addition to beneficial alleles targeted by breeders, introgressions may introduce maladaptive alleles or deleterious structural variants into the cultivated gene pool, a phenomenon known as linkage drag. For example, H. petiolaris, which is the wild donor of several components of the hybrid production system, differs from the cultivated sunflower by 6-8 translocations and 50-60 inversions (Ostevik et al., 2019). These rearrangements hamper introgression from much of the genome and, if successfully introgressed into cultivars, can introduce genetic load and reduce recombination rates. Gene presence/absence variation, which affects 27% of the genes in the cultivated sunflower pan genome, is also often associated with introgressions from wild species. Such linkage drag could be reduced by focusing pre-breeding efforts on wild H. annuus, which is geographically widespread, highly genetically and ecologically variable (Kane and Rieseberg, 2007), and fully interfertile with cultivated sunflower. Kantar et al. (2015) showed that wild H. annuus encompasses about 60% of the ecological niche space of all species. Thus, when traits required to improve the sunflower crop are not found in cultivated material, the first step should be to search in wild H. annuus. Studies of the other Helianthus species will be most important for research concerning characters for which they are the only source and to understand evolution and adaptation of the Helianthus genome.