Prospects for marker-associated selection in tomato Solanum lycopersicum L.

This review gives a brief description of tomato, one of the main vegetable crops grown in Siberia. It summarizes data on the main directions of tomato breeding, such as resistance to various pathogens, the nutritional properties of fruits, and the timing of fruit ripening and storage. A separate section is devoted to the use of various types of DNA markers for constructing detailed genetic maps of tomato, which, along with whole-genome sequencing data, can be used to screen for genes responsible for breeding traits. Most of these traits, especially specific resistance to particular pathogens, were transferred to the cultivated tomato by crossing with wild species; therefore, special attention is paid to identifying and tagging genes for resistance to the viral, fungal and bacterial pathogens that occur in Western Siberia and adjacent areas. Another important aspect for breeding is the nutrient content of tomato fruits, including carotenoids, vitamins, sugars and organic acids. Recently, thanks to modern sequencing technologies, SNP genotyping and new bioinformatic approaches, it has become possible to establish the genetic cascades that determine the biochemical composition of tomato fruits and to identify key genes that can be used in the future for marker-assisted selection for nutritional value. Finally, genetic studies on the optimal timing of fruit ripening under particular climatic conditions and on prolonged fruit storage without loss of quality are discussed.


Introduction
Tomato, Solanum lycopersicum L., is the second most important vegetable crop after cabbage. It belongs to the family Solanaceae, which comprises approximately 100 genera and 2500 species, including several other plants of agronomic importance (potato, eggplant, pepper, tobacco). In 2012, through the efforts of the international consortium for sequencing the tomato genome, the genomes of the cultivar Heinz 1706 and of the wild ancestor of the tomato, Solanum pimpinellifolium L., were completely sequenced (DOI 10.1038/nature11119). The tomato (2n = 2x = 24) has a relatively compact genome of 950 Mb. It contains about 35,000 genes and underwent two rounds of whole-genome triplication (120 million and 70 million years ago) in the course of evolution; the second round took place before the divergence of tomato and potato. It is believed that polyploidization promoted the neofunctionalization of genes responsible for fruit ripening and chemical composition, leading to the formation of the fleshy fruit in tomato, which is of great importance for seed dispersal (Howe, Smallwood, 1982). Sequencing data are available through the Sol Genomics Network (SGN) website (http://solgenomics.net). Tomato fruits are rich in vitamins A and C, a number of minerals and other biologically active substances (BAS), including the antioxidant lycopene (Rao A.V., Rao L.G., 2007).
The homeland of the tomato is South America, where its wild and semi-cultivated forms are still found. In the middle of the 16th century, the tomato reached Europe through Spain and Portugal and was first used as an ornamental plant, since its fruits were considered inedible. At the end of the 18th century, the tomato appeared in Russia, where it was also first cultivated for decorative purposes. It became a vegetable crop thanks to the agronomist A.T. Bolotov, who developed a seedling method of cultivation and a method of post-harvest ripening (ripening of green fruits after they are picked).

DNA markers
Currently, the availability of complete genomic sequences (see above) makes it possible to search effectively for genes responsible for valuable traits, as well as for the corresponding DNA markers for marker-assisted selection (MAS) of new forms of tomato. A large number of such markers have been developed, including RFLP (restriction fragment length polymorphism) markers (Tanksley et al., 1992) and PCR-based markers such as RAPD (randomly amplified polymorphic DNA), AFLP (amplified fragment length polymorphism) and SSR (simple sequence repeats) (Saliba-Colombani et al., 2000; Ohyama et al., 2009). To date, SNP (single nucleotide polymorphism) markers are the technology of choice. Within this technology, several methodological approaches have been successfully tested on tomato, such as EST-based SNP analysis for high-throughput genotyping (Shirasawa et al., 2010) and large-scale genomic sequencing to identify SNPs that affect protein function (http://plant1.kazusa.or.jp/tomato/). Polymorphic markers for tomato genomic selection have also been developed on the basis of DArT (Diversity Arrays Technology) (Van Schalkwyk et al., 2012).
However, it should be noted that, despite the many DNA markers developed, practical tomato breeding currently relies mainly on markers for qualitative traits, such as specific resistance to pathogens. As for quantitative trait loci (QTLs), the use of the corresponding markers is still hindered by their weak linkage with the traits, low polymorphism, undesirable pleiotropic effects, and the lack of validation on a diverse set of lines and varieties (Foolad, Panthee, 2012). In this regard, the development of new, effective molecular markers suitable for a wide range of varieties and populations remains a relevant problem.

Main directions of tomato breeding in Western Siberia
Tomato is a thermophilic crop, and the climate of Western Siberia does not always favor its productivity. In addition, the tomato is susceptible to numerous infectious diseases. This implies the need for new varieties and hybrids that give high yields and combine a set of economically valuable traits, such as resistance to pathogens, a ripening date that fits a short growing season, long shelf life, etc. MAS makes it possible to select for many traits simultaneously and to reduce the time needed to obtain new varieties significantly (2-3 times) compared with classical breeding. However, no tomato variety or hybrid has yet been obtained in Siberia using MAS. It therefore seems relevant to summarize the main results obtained worldwide for this crop with the help of MAS, focusing on the directions that match the conditions of Western Siberia and adjacent territories.

Tomato resistance to pathogens
Most of the resistance genes were identified in wild species and then introduced into the cultivated tomato by crossing (Foolad, Panthee, 2012). In Siberia, fungal diseases are the most important tomato diseases, namely late blight, leaf mould (in greenhouses), septoria blight (in the field), fusarium wilt and verticillium wilt. Bacterial spot and bacterial canker are the most common bacterial diseases. Viral diseases are less relevant for Siberia, although epiphytotics occur in some years.

Resistance to fungal diseases
Late blight, caused by the oomycete Phytophthora infestans, is one of the most devastating diseases of tomato in regions with high humidity and a cool climate, leading to yield losses of up to 100 %. Losses can take the form of reduced yield or lower fruit quality, for example low specific weight or reduced shelf life. Because of its large economic impact, the pathology and genetics of this disease have been studied intensively for many years. Three main resistance genes were identified in the wild tomato S. pimpinellifolium: Ph-1, Ph-2 and Ph-3, which were mapped on chromosomes 7, 10 and 9, respectively (Black et al., 1996; Moreau et al., 1998). The strongest resistance gene, Ph-3, provides incompletely dominant resistance to a wide range of P. infestans isolates (Chunwongse et al., 2002). Analysis of its primary structure showed that it encodes a CC-NBS-LRR (coiled-coil, nucleotide-binding site, leucine-rich repeat) protein that belongs to the extensive NBS-LRR class of plant R genes. However, even this gene does not provide resistance to the most aggressive P. infestans isolates. In these cases, the most effective solution was the combination of two genes, Ph-2 and Ph-3, which were successfully transferred to a number of commercial varieties using specially developed CAPS markers (Robbins et al., 2010; Zhang et al., 2014). Work on the isolation and analysis of new late blight resistance genes continues. In particular, a number of QTLs carrying resistance genes have been identified but not yet precisely localized (Merk, Foolad, 2012; Panthee et al., 2017).
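As a rough illustration of how marker data might support the combination of Ph-2 and Ph-3, the sketch below filters plants by their genotype calls at two marker loci. The plant IDs, the "RR"/"RS"/"SS" call coding and the function name `select_pyramided` are hypothetical and are not taken from the cited studies; real CAPS scoring would start from banding patterns on a gel. The expected fraction at the end assumes an F2 population derived from a double heterozygote with independent Mendelian segregation, which is plausible here only because the two genes were mapped on different chromosomes.

```python
from fractions import Fraction

# Hypothetical marker calls per plant: "RR" (homozygous resistant allele),
# "RS" (heterozygous) or "SS" (homozygous susceptible allele) at each locus.
plants = {
    "P01": {"Ph-2": "RR", "Ph-3": "RS"},
    "P02": {"Ph-2": "SS", "Ph-3": "RR"},
    "P03": {"Ph-2": "RR", "Ph-3": "RR"},
    "P04": {"Ph-2": "RS", "Ph-3": "SS"},
}


def select_pyramided(calls, loci, require_homozygous=False):
    """Return IDs of plants carrying the resistance allele at every locus."""
    selected = []
    for plant_id, genotype in sorted(calls.items()):
        keep = True
        for locus in loci:
            call = genotype.get(locus, "")
            if require_homozygous:
                keep = keep and call == "RR"
            else:
                keep = keep and "R" in call
        if keep:
            selected.append(plant_id)
    return selected


loci = ["Ph-2", "Ph-3"]
carriers = select_pyramided(plants, loci)
fixed = select_pyramided(plants, loci, require_homozygous=True)

# Assumption: F2 from a double heterozygote, unlinked loci, so each locus is
# homozygous resistant with probability 1/4 independently of the other.
expected_fixed_fraction = Fraction(1, 4) ** len(loci)

print(carriers, fixed, expected_fixed_fraction)
```

Under these assumptions only about one F2 plant in 16 is expected to be fixed for both genes, which is one reason early marker screening of seedlings can save effort compared with phenotypic testing of every plant.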
Fusarium wilt. Fusarium oxysporum is a soil fungus that causes wilt disease in tomato. It affects all plant tissues and can persist for a long time as chlamydospores in the soil and in plant residues without losing virulence. Three races of this fungus have been identified so far; in Russia, race 1 causes the most damage in greenhouses, while race 2 occurs on some farms (Ignatova, 2001). Gene I, which provides high resistance to race 1, and gene I-2, which gives resistance to races 1 and 2, were mapped on the short and long arms of chromosome 11, respectively (Ori et al., 1997; Scott et al., 2004). These genes have been used most often in breeding for resistance to Fusarium; recently, however, race 3 has become very common, and the corresponding resistance gene has been mapped in detail on chromosome 7 (Lim et al., 2008). Linked PCR markers are available for each of the three genes; the markers of resistance to races 1 and 3 are the most effective (Barillas et al., 2008; Arens et al., 2010).
Fusarium root rot, a variant of Fusarium wilt, is caused by another strain of F. oxysporum. Resistance was found in an induced mutant of S. peruvianum, and the single resistance gene, Frl, was mapped on chromosome 9 near the Tm-2² gene (Vakalounakis et al., 1997). RAPD markers for Frl were subsequently developed (Tanyolac, Akkale, 2010); however, to date there are few commercial varieties and lines resistant to this disease.
Leaf mould occurs almost worldwide and most often affects plants grown in greenhouses. Affected leaves, flowers and young fruits turn yellow and then dry out. The causal agent is Cladosporium fulvum, a highly contagious facultative saprotroph. More than 20 major resistance genes have been identified and mapped on different chromosomes (Wang et al., 2007). In Russia, the most effective resistance genes, Cf-2, Cf-5, Cf-6 and Cf-9, give resistance to races 1, 3 and 4 of the fungus; however, because new races keep appearing, at least two genes must be combined (Ignatova, 2001). Although a number of PCR markers have been associated with Cf genes (Grushetskaya et al., 2007; Wang et al., 2007; Truong et al., 2011), there are no data on their use in breeding.
Verticillium wilt is a widespread disease characterized by wilting, leaf discoloration and leaf fall, and necrosis of the vascular tissues and root system. It is caused by Verticillium dahliae and V. albo-atrum. In tomato, resistance to Verticillium is controlled by the Ve locus, which was mapped on the short arm of chromosome 9 and consists of two linked genes, Ve-1 and Ve-2, each of which provides resistance to certain pathogen races (Kawchuk et al., 2001; Fradin et al., 2009). PCR markers have been obtained that discriminate between Verticillium-tolerant and Verticillium-sensitive forms of tomato (Acciarri et al., 2007; Arens et al., 2010).

Resistance to bacterial pathogens
Bacterial canker, caused by the rod-shaped bacterium Clavibacter michiganensis, is a common tomato disease worldwide and one of the most difficult to control. Infection occurs through mechanically damaged tissues, and greenhouse tomatoes are most at risk. Mapping in crosses between S. lycopersicum and the resistant accession S. habrochaites LA 407 made it possible to identify and accurately map two major QTLs, on chromosomes 2 (Rcm2.0) and 5 (Rcm5.1), which are responsible for 68 % of the variation in trait expression (Kabelka et al., 2002; Coaker, Francis, 2004). Markers have been reported (Coaker, Francis, 2004); however, there is no information on their use.
Bacterial spot is a common disease of tomato (especially in Western Siberia and Kazakhstan) caused by four species of rod-shaped bacteria of the genus Xanthomonas (races T1-T5). It is characterized by spotting of leaves, stems and fruits, accompanied by leaf fall, reduced fruit size and incomplete fruit ripening, which leads to yield losses of up to 100 %. Chemical control is not effective enough because the pathogen develops resistance and has multiple routes of infection. Resistance to the pathogen has been found in a number of S. lycopersicum accessions, as well as in wild species; however, its use is greatly complicated by the diversity of pathogen races and the complex nature of the resistance. In many cases, resistance is race-specific, but some genotypes exhibit multiple, quantitative resistance that depends on external conditions. For example, the resistance of the S. lycopersicum line Hawaii 7998 to race T1 ranges from reduced symptoms in the field to a hypersensitive response (HR) in the greenhouse. This response is provided by three independent genes, Rx-1, Rx-2 (chromosome 1) and Rx-3 (chromosome 5) (Wang et al., 1994; Yu et al., 1995). The involvement of the Rx-3 locus was confirmed most reliably, and markers were developed for it, including the CAPS marker L3-L1, which has been used in breeding (Yang, Francis, 2005). The same line has strong HR-based resistance to race T3 (both in the field and in the greenhouse), which is controlled by the Rx-4 gene mapped on chromosome 11 (Wang et al., 2011).

Resistance to viruses
Tomato mosaic virus (ToMV) is one of the most stable viruses; crop losses from ToMV infection reach 50 % or more. The disease is characterized by a mottled (mosaic) coloration of leaves, stems and fruits, followed by their deformation and fading. ToMV is highly contagious and is transmitted by mechanical contact, as well as by insects (thrips, aphids, etc.). Three major resistance genes have been found in tomato: Tm-1, Tm-2 and Tm-2² (Ohmori et al., 1996; Sobir et al., 2000; Scott, 2007). The first gene, located on chromosome 5, inhibits the synthesis of viral RNA by suppressing the viral RNA replicase (Meshi et al., 1988). The Tm-2 and Tm-2² genes, located on chromosome 9, block cell-to-cell movement of the virus and also cause HR (Meshi et al., 1989). The highest efficiency is observed when all three dominant genes are combined in the homozygous or heterozygous state (Puchalsky, 2007). PCR markers have been developed for each of them (Dax et al., 1998; Sobir et al., 2000; Arens et al., 2010).
Tomato spotted wilt virus. The disease is caused by the tomato spotted wilt virus (TSWV). It reduces crop yields (by over 50 %) and impairs product quality. TSWV has an extremely wide range of host plants, which creates a high risk of infection. Eight major resistance genes are known, including the dominant genes Sw-1a, Sw-1b, Sw-5, Sw-6 and Sw-7 and the recessive genes sw-2, sw-3 and sw-4 (Stevens et al., 1992). The most effective TSWV resistance gene, Sw-5, is located on the long arm of chromosome 9; since it is not race-specific, it is often used in practical breeding. However, there is a risk that new TSWV strains will overcome Sw-5; virulence to this resistance gene has been reported in several countries (Scott, 2007). A large number of PCR markers have been developed to detect Sw-5 (Smiech et al., 2000; Langella et al., 2004; Garland et al., 2005).

Fruit size and color and the content of biologically active substances
The "uniform ripening" trait is determined by the genetic locus uniform ripening (u), which controls the amount and distribution of chlorophyll in immature fruits (Bohn, Scott, 1945). The dominant allele U determines normal, uneven ripening, in which the upper part of the immature fruit is dark green and the lower part is light green. Plants homozygous for the recessive u allele (u/u) produce uniformly ripening fruits that, when immature, have the same pale green color on all sides. Early breeding selected for such forms of tomato because their ripe fruits have a uniform red color. In 2012, the U locus was localized to the short arm of chromosome 10 by genetic mapping, and the candidate gene GLK2 was identified, which encodes the Golden 2-like transcription factor, a regulator of chloroplast development (Powell et al., 2012). The authors sequenced this gene in varieties with the U/U and u/u genotypes and found that in the former the GLK2 gene encodes a complete regulatory protein 310 amino acids long, whereas the u allele gives rise to a non-functional protein because of a premature stop codon resulting from the insertion of one nucleotide. Genetic transformation showed that this mutation, which inactivates the GLK2 gene, is responsible for the uniform coloring phenotype and the associated decrease in the number of chloroplasts in fruits. This, in turn, lowers the level of photosynthesis and significantly reduces the content of soluble solids in the fruit juice. As a result, cultivated forms of tomato with the u/u genotype have poorer taste and nutritional qualities than the ancestral forms. In 2017, the journal Science published an article by D. Tieman et al. (2017), in which more than 300 modern and traditional tomato varieties were analyzed using genomic sequencing and chemical analysis.
In this work, 28 compounds responsible for the organoleptic qualities of tomato were identified, and a genome-wide association study (GWAS) was then used to search for SNPs associated with the concentrations of these compounds. As a result, several major genes responsible for tomato flavor were identified. For example, the Lin5 gene encodes an extracellular invertase that catalyzes the hydrolysis of sucrose to glucose and fructose. The alleles of this gene responsible for the alternative characteristics of modern and wild or old-fashioned varieties (low sugar content and large fruits vs. high sugar content and small fruits) differ by only one SNP, which leads to the substitution Asn→Asp. Another example is the E8 gene, which regulates the synthesis of ethylene, the ripening hormone. In the overwhelming majority of modern varieties, this hormone has increased activity, which leads to higher concentrations of the unpleasant-smelling compounds methyl salicylate and guaiacol than in the old varieties, while the "beneficial" aromatic substances are less concentrated. Three SNPs were identified in the regulatory regions of the E8 gene that appear to be responsible for these differences (Tieman et al., 2017).
The most important BAS of tomato fruits include carotenoids, a class of 40-carbon hydrocarbons represented by orange, red and yellow pigments synthesized in various plant organs. These substances are involved in a variety of physiological processes, including plant growth and development and responses to external stimuli. To date, the biosynthesis genes, as well as the transcription factors and hormones that regulate carotenoid metabolism under the influence of external factors, have been established (Liu et al., 2015). In particular, key regulatory genes have been identified that determine the concentration of lycopene, the most common carotenoid antioxidant of ripe tomatoes. This substance is considered an important biologically active component of the human diet that reduces the risk of cancer and cardiovascular diseases (Ford, Erdman, 2012). Recently, genome editing was used to increase lycopene synthesis in tomato fruits fivefold by knocking out genes responsible for the conversion of lycopene to β- and α-carotene (Li et al., 2018).
Specific polymorphisms responsible for particular varietal characteristics of tomato fruit color have been identified. The dark red color of the Black Cherry variety is caused by a frameshift mutation in the coding part of the lycopene β-cyclase gene, which leads to a loss of protein function. A similar mutation, which introduces a stop codon and results in a shortened Psy1 phytoene synthase protein, underlies the yellow color of fruits (Aflitos et al., 2014).
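To make the effect of such a frameshift concrete, the following minimal sketch translates a short coding sequence before and after a single-nucleotide insertion. The 21-nucleotide sequence is an invented toy example, not a fragment of the lycopene β-cyclase or Psy1 gene, and the helper names `translate` and `insert_base` are likewise hypothetical; only the standard genetic code is taken as given.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(cds):
    """Translate codon by codon and stop at the first stop codon."""
    protein = []
    for start in range(0, len(cds) - 2, 3):
        amino_acid = CODON_TABLE[cds[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def insert_base(cds, offset, base):
    """Insert one base after the first `offset` nucleotides."""
    return cds[:offset] + base + cds[offset:]


# Toy coding sequence (hypothetical): ATG GCT GAT AAG CTT GGA TAA
wild_type = "ATGGCTGATAAGCTTGGATAA"
# Inserting a single T after the second codon shifts the reading frame,
# so the third codon is now read as TGA, a stop codon.
mutant = insert_base(wild_type, 6, "T")

print(translate(wild_type))  # MADKLG
print(translate(mutant))     # MA
```

In this toy case the shifted frame meets a stop codon immediately; in a real gene the position of the first in-frame stop after the insertion depends on the downstream sequence.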
The shape and size of the tomato fruit correlate with the number of seed chambers (locules). Two QTLs, lc and fas, have the greatest effect on these traits and can act synergistically, leading to an extremely high number of locules (Cong et al., 2008; Munos et al., 2011). Fas has the strongest effect (the number of locules varies from 2 to more than 6), while lc has a weaker effect (3-4 locules). Two SNPs, T→C and A→G, are associated with lcʰ, the allele for a high number of locules. Analysis of the primary structure of the lc gene showed that all two-locule tomato varieties carry the lcˡ allele, and varieties with three or four locules carry the lcʰ allele. The Fas gene encodes a YABBY-like transcription factor (Cong et al., 2008). The fasʰ allele arose through the inversion of a 294-kbp region on chromosome 11, which inactivated the Fas gene by spatially separating exons 1 and 2 (Huang, van der Knaap, 2011).

Features of plant development and fruit ripening
Determinacy. Indeterminate tomato plants are best suited to greenhouse conditions. They are characterized by continuous growth and even fruit ripening over several months. For field conditions in Siberia, determinate genotypes are more suitable; their main distinguishing feature is the termination of shoot growth after the formation of 2-6 inflorescences. Such genotypes are, as a rule, early maturing, which prevents yield losses caused by the short growing season.
Determinacy is controlled by the regulatory gene SP (SELF PRUNING), which controls the transition from the vegetative to the generative stage of development and is homologous to the Arabidopsis gene FT (FLOWERING LOCUS T) (Pnueli et al., 1998). Determinate plants have the sp/sp genotype, and indeterminate plants have the SP/− genotype. There are at least six SP genes in the tomato genome. For one of them, SP5G, a photoperiod-dependent mechanism of action has been established (Soyk et al., 2017). Like the FT gene, SP5G belongs to the flowering repressors. Under long days, its expression is induced to a high level, which suppresses flowering until the onset of short days (the indeterminate, wild-type phenotype). In cultivated tomato of the determinate type, this effect of long days on expression is reduced because of mutations in this gene. Using CRISPR/Cas9 genome editing, it was possible to obtain a null allele of SP5G and thereby restore a determinate phenotype characterized by early flowering and increased productivity (Soyk et al., 2017).
Genes for delayed fruit ripening. Pleiotropic genes responsible for delayed fruit ripening were identified in tomato earlier: alcobaca (alc), ripening inhibitor (rin) and non-ripening (nor) (Garg et al., 2008). In plants homozygous for these genes, the shelf life of fruits increased by 250-500 %, and the fruits were less prone to decay. However, such genotypes did not become widespread commercially because of the accompanying traits of pale color and poor taste. The fruits of heterozygous plants also had an increased shelf life (intermediate between the parental forms) and resistance to decay, but at the same time had a color and taste acceptable to consumers. In addition, these plants had an increased yield, and the lycopene and dry matter content, fruit consistency and ascorbic acid content were intermediate between those of the parents. As a result, forms carrying the alc, nor and rin genes are widely used in commercial tomato varieties in many countries (Garg et al., 2008).
In 2002, Science published an article devoted to the rin gene. This gene is located on the short arm of chromosome 5 and encodes a MADS-box transcription factor that regulates many developmental genes, including those associated with ethylene biosynthesis. The alc and nor genes were also cloned and analyzed (Moore et al., 2002). The alc gene (synonym: DFD, delayed fruit deterioration) has several advantages for breeding, since it has a smaller negative effect on fruit quality, color, aromatic properties and resistance to bacterial diseases (Garg et al., 2008). The recessive alc mutation is caused by a nonsynonymous T→A substitution at position 317 of the coding sequence, which leads to the Val→Asp substitution (Casals et al., 2012). Using CRISPR/Cas9, the ALC allele was replaced with the alc allele by homologous recombination in one of the varieties (Yu et al., 2017).
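The reported alc substitution can be checked for internal consistency with a little codon arithmetic. The sketch below assumes that position 1 is the first nucleotide of the start codon; the actual codon present in the gene is not given in the text, so the code only enumerates which valine codons would be compatible with a T→A change giving aspartate. The function names are hypothetical.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def locate_in_codon(cds_position):
    """Map a 1-based CDS position to (codon number, position in codon), both 1-based."""
    index = cds_position - 1
    return index // 3 + 1, index % 3 + 1


def substitution_outcomes(ref_aa, pos_in_codon, ref_base, alt_base):
    """For each codon of ref_aa with ref_base at the given position,
    return the amino acid encoded after the base is replaced by alt_base."""
    outcomes = {}
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == ref_aa and codon[pos_in_codon - 1] == ref_base:
            mutant = codon[:pos_in_codon - 1] + alt_base + codon[pos_in_codon:]
            outcomes[codon] = CODON_TABLE[mutant]
    return outcomes


codon_number, pos_in_codon = locate_in_codon(317)
outcomes = substitution_outcomes("V", pos_in_codon, "T", "A")
asp_compatible = sorted(c for c, aa in outcomes.items() if aa == "D")

print(codon_number, pos_in_codon)  # 106 2
print(asp_compatible)              # ['GTC', 'GTT']
```

Position 317 falls on the second base of codon 106, where every valine codon (GTN) carries a T, so a T→A change there yields GAN. Of these, GAT and GAC encode aspartate, which agrees with the Val→Asp substitution described above, while GAA and GAG would encode glutamate.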

Functional male sterility
The low genetic diversity caused by the mode of tomato reproduction (self-pollination) and the bottleneck effect during introduction makes successful tomato breeding very difficult. The American scientist C. Rick was the first to use introgression of genetic material from wild into cultivated tomato (Rick, 1960), and most tomato varieties have been obtained using hybridization.
In tomato, the production of hybrid seeds is laborious because flowers must be isolated and emasculated, so the use of lines with functional male sterility (FMS) is the most effective way to obtain hybrid seeds. FMS is caused by deviations in flower development and in tomato includes the following types: ex, ex-2, ps and ps-2 (Kuzemensky, 2004). The last type is the most widely used in tomato breeding. The stamens of ps-2 plants have a normal structure and fertile pollen grains, but the anthers do not open. The Ps-2 gene controlling this type of sterility was identified on chromosome 4 and isolated, and its primary structure was studied (Gorguet et al., 2009). It encodes the enzyme polygalacturonase, which affects the rigidity of the cell wall by digesting pectins. A single mutation that disrupts mRNA splicing and results in aberrant mRNA forms is responsible for the ps-2 phenotype. A number of markers have been developed for the Ps-2 gene: SNP (Gorguet et al., 2009), CAPS (Staniaszek et al., 2012), etc.

Conclusion
The complete sequencing of the tomato genome and the construction of high-resolution genetic maps laid the foundation for a fast and effective search for genes responsible for important breeding traits, as well as for the development of DNA markers for these genes that can be used in marker-assisted selection of new forms of tomato. Especially relevant for a temperate climate are markers of traits such as resistance to a number of common pathogens of various kinds and the content of valuable biologically active substances (for example, carotenoids, including lycopene, and sugars), as well as markers of genes that determine optimal, early fruit ripening under a short summer and the risk of autumn frosts. To date, key genes responsible for these traits have been identified and characterized, which makes it possible, on the basis of molecular markers, to develop crossing and selection strategies for these genes, to pyramid them, and to modify them in a targeted way using modern genome editing methods.