Genetic polymorphism of poplars from the Moscow region based on high-throughput sequencing of ITS

Poplars are widely used in the landscaping of Moscow because of their ability to effectively purify the air of harmful impurities and to release large amounts of oxygen. The genus Populus is characterized by a high level of intraspecific polymorphism and by the presence of natural interspecific hybrids. The aim of our work was to evaluate the genetic diversity of poplars growing in the city of Moscow by high-throughput sequencing of the internal transcribed spacers of 45S rRNA genes (ITS sequences). ITS sequencing of 40 poplar plants was performed on the Illumina platform (MiSeq), and about 3 000 reads were obtained per sample on average. Bioinformatics analysis was performed using CLC Genomics Workbench. The studied set of poplars had a high level of genetic diversity: the number of single nucleotide polymorphisms (SNPs) detected in each genotype relative to the reference ITS1 and ITS2 sequences of P. trichocarpa varied from 4 to 44. We showed that even trees planted on the same territory and, probably, at the same time had significant genetic differences. It can be speculated that highly polymorphic plant material was used for planting poplars in Moscow. For some SNP sites, several nucleotide variants were found in the same individual, and the ratio of the variants differed between sites. We assume that a ratio close to 50/50 is observed in interspecific hybrids and results from genetic differences in the ITS sequences between the maternal and paternal genotypes. For SNPs with a predominance of one variant, the presence of paralogs among the numerous genomic copies of ITS sequences is more likely. The results of our work provide a framework for applying molecular genetic markers to identify Populus species and interspecific hybrids, to determine the origin of a number of natural hybrids, and to monitor the diversity of the genus Populus in the city of Moscow.

Moscow is one of the largest megalopolises in the world, with a developed infrastructure and more than 12 million inhabitants, which is associated with unfavorable ecological conditions in the city. To improve the situation, effective landscaping of the city is necessary. Poplar is actively used in the landscaping of Moscow because of its ability to purify the air of pollutants and to release large amounts of oxygen.
The genus Populus, according to the Eckenwalder classification (Eckenwalder, 1996), includes 29 species distributed predominantly in the Northern Hemisphere. Poplars are dioecious wind-pollinated plants, which leads to high intraspecific diversity (Rae et al., 2007). Various poplar species are known to cross easily, forming natural interspecific hybrids (Roe et al., 2014; Jiang et al., 2016), which makes it difficult to identify their taxonomic status. The genome of P. trichocarpa was sequenced in 2006 and was the first tree genome to be sequenced (Tuskan et al., 2006). It has been shown that the nucleotide sequences of the internal transcribed spacers (ITS) of 45S ribosomal RNA (rRNA) genes are efficient for evaluating genetic polymorphism, for taxonomic classification, and for determining phylogenetic relationships in poplars (Hamzeh, Dayanandan, 2004). The ITS region includes the highly variable ITS1 and ITS2 sequences located on either side of the highly conserved sequence encoding 5.8S rRNA. ITS sequences, unlike chloroplast and mitochondrial markers, are inherited from both parents and are highly variable, while the procedure for their amplification is standardized (Poczai, Hyvonen, 2010). All of the above promotes the active use of ITS sequences for plant barcoding (Li et al., 2011).
ITS sequences are present in many copies in a genome, and different ITS paralogs may occur in one individual. This requires special attention in data analysis and may even prevent reliable data from being obtained by Sanger sequencing (Hollingsworth et al., 2011). High-throughput sequencing can overcome these difficulties because hundreds of ITS copies are sequenced for each individual and sample preparation does not require cloning. In the present work, high-throughput sequencing of ITS was performed and the genetic polymorphism of poplars growing in the city of Moscow was evaluated.

Materials and methods
Plant material was collected during poplar flowering in the south and north of the city of Moscow. Young leaves were frozen in liquid nitrogen and stored at -70 °C. DNA isolation was performed as described previously (Melnikova et al., 2014). DNA quality was evaluated by electrophoresis in a 1 % agarose gel. DNA concentration was measured on a Qubit 2.0 fluorometer (Life Technologies, USA). For further work, a test set of DNA from 40 poplar plants was used.
A two-stage polymerase chain reaction (PCR) was used to prepare DNA libraries for high-throughput sequencing. The first stage included amplification of the selected regions of the genome and the addition of universal sequences to the amplicons; at the second stage, the sequences necessary for high-throughput sequencing and dual indexes for sample identification were added. To amplify the ITS region, we used the primers proposed by Hsiao and White (White et al., 1990; Hsiao et al., 1995) (see Figure) with the universal adapters added. For the second PCR, Nextera XT v2 primers were used (Table 1). Primers were designed according to the recommendations of the Illumina protocol (https://support.illumina.com/content/dam/illumina-support/documents/documentation/chemistry_documentation/16s/16s-metagenomic-libraryprep-guide-15044223-b.pdf).
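The construction of a first-stage primer (universal overhang plus locus-specific part) can be sketched as follows. This is only an illustration: the overhang sequences are those given in the Illumina 16S library preparation guide, whereas the locus-specific primers shown (ITS1 and ITS4 of White et al., 1990) are assumed for the example and are not necessarily the primers used in this work. The function name `build_stage1_primer` is hypothetical.

```python
# Overhang adapter sequences from the Illumina 16S library preparation guide.
FORWARD_OVERHANG = "TCGTCGGCAGCGTCAGATGTGTATAAGAGACAG"
REVERSE_OVERHANG = "GTCTCGTGGGCTCGGAGATGTGTATAAGAGACAG"

# ASSUMPTION: illustrative locus-specific primers (ITS1 and ITS4, White et al., 1990).
# The exact primers used in the study are not specified here.
LOCUS_FORWARD = "TCCGTAGGTGAACCTGCGG"
LOCUS_REVERSE = "TCCTCCGCTTATTGATATGC"


def build_stage1_primer(overhang: str, locus_primer: str) -> str:
    """Join a universal overhang (5' end) with a locus-specific primer (3' end)."""
    for name, seq in (("overhang", overhang), ("locus_primer", locus_primer)):
        if not seq or set(seq.upper()) - set("ACGT"):
            raise ValueError(f"{name} must be a non-empty A/C/G/T sequence")
    return overhang.upper() + locus_primer.upper()


forward_primer = build_stage1_primer(FORWARD_OVERHANG, LOCUS_FORWARD)
reverse_primer = build_stage1_primer(REVERSE_OVERHANG, LOCUS_REVERSE)
print(forward_primer)
print(reverse_primer)
```

The overhang is placed at the 5' end so that the second PCR can anneal to it and attach the index and sequencing adapter sequences.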
The CLC Genomics Workbench software package (Qiagen, USA) was used for bioinformatics analysis of the data. The reads were mapped to the ITS sequence of P. trichocarpa (GenBank: AJ006440.1), whose genome is the reference for Populus. The parameters were as follows: window length, 11; maximum number of gaps and mismatches, 2; minimum average quality of surrounding bases, 15; minimum quality of central base, 20; minimum coverage, 500; minimum paired-end coverage, 0; maximum coverage, 20 000; minimum variant frequency, 20 % or 50 reads.
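The coverage and frequency thresholds listed above can be illustrated with a minimal sketch. It is not the CLC Genomics Workbench algorithm: the `Site` record and the `passes_filters` function are hypothetical, the read counts are invented for illustration, and the phrase "20 % or 50 reads" is read here as "either criterion is sufficient", which is an assumption.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; how they are combined is an assumption of this sketch.
MIN_COVERAGE = 500
MAX_COVERAGE = 20_000
MIN_VARIANT_FREQ = 0.20   # 20 %
MIN_VARIANT_READS = 50


@dataclass
class Site:
    position: int        # 1-based position in the reference ITS sequence
    coverage: int        # total reads covering the site
    variant_reads: int   # reads supporting the non-reference nucleotide


def passes_filters(site: Site) -> bool:
    """Return True if the site would be reported as a variant under the assumed rules."""
    if not (MIN_COVERAGE <= site.coverage <= MAX_COVERAGE):
        return False
    frequency = site.variant_reads / site.coverage
    # ASSUMPTION: "20 % or 50 reads" means either criterion is enough.
    return frequency >= MIN_VARIANT_FREQ or site.variant_reads >= MIN_VARIANT_READS


# Invented example sites, not data from the study.
sites = [
    Site(position=101, coverage=3000, variant_reads=1450),  # kept: about 48 %
    Site(position=215, coverage=3000, variant_reads=30),    # dropped: 1 % and fewer than 50 reads
    Site(position=330, coverage=400, variant_reads=200),    # dropped: coverage below 500
]
kept = [s.position for s in sites if passes_filters(s)]
print(kept)
```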

Results
We performed high-throughput sequencing of ITS of 40 poplar plants growing in the city of Moscow. The read length was 250 nucleotides (paired-end reads), and, on average, about 3 000 reads were obtained for each sample. A bioinformatics analysis of the ITS sequences was carried out. The results are presented in Table 2 and Supplementary Materials 1.
The investigated set of trees was characterized by a high level of genetic diversity: the number of detected single nucleotide polymorphisms (SNPs) varied from 4 to 44 relative to the ITS sequences of P. trichocarpa (GenBank: AJ006440.1). One subgroup of trees (numbers 17-28) had been planted on one territory and, probably, at the same time, on both sides of a pedestrian road. Table 2 shows that even this group of plants is extremely heterogeneous: the number of detected SNPs varied from 6 to 44.
For some SNP sites, more than one nucleotide variant was detected in the same individual. In some of these cases the ratio of the variants was close to 50/50, while in others the distribution was unequal (Supplementary Materials). It may be assumed that the 50/50 ratio is observed in hybrids and results from genetic differences in the ITS sequences between the paternal and maternal plants.
For SNPs with a significant prevalence of one nucleotide variant over the other, polymorphism among the numerous copies of ITS in the genome is more likely.
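The interpretation of variant ratios described above can be expressed as a small sketch. The function `classify_site` and the tolerance of 0.10 around the 50/50 ratio are hypothetical choices made for illustration; the text does not specify a numerical cutoff, and the labels indicate only which explanation is considered more likely, not a proven origin.

```python
def classify_site(major_reads: int, minor_reads: int, tolerance: float = 0.10) -> str:
    """Label a two-variant site by the read ratio of its variants.

    ASSUMPTION: `tolerance` (how far the minor fraction may fall below 0.5
    and still count as "close to 50/50") is an illustrative value.
    """
    total = major_reads + minor_reads
    if total == 0:
        raise ValueError("no reads at this site")
    minor_fraction = min(major_reads, minor_reads) / total
    if minor_fraction == 0:
        return "single variant"
    if minor_fraction >= 0.5 - tolerance:
        return "balanced (consistent with hybrid origin)"
    return "skewed (consistent with ITS paralogs)"


# Invented read counts, not data from the study.
print(classify_site(1500, 1450))
print(classify_site(2700, 300))
```

A fixed tolerance is the simplest choice; with real data, a cutoff that accounts for read depth at each site would probably be more appropriate.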

Discussion
Poplar is a model organism for biological research in trees (Jansson, Douglas, 2007). Over the last decades, numerous approaches have been developed and applied for the analysis of the poplar genome, the study of the interaction between genotype and environment, and the identification of inter- and intraspecific polymorphism in Populus (Jansson et al., 2010; Melnikova et al., 2017). Morphological analysis is actively used in studies of poplars growing in the city of Moscow and the Moscow region. High heterogeneity of poplar populations in Moscow and a widespread distribution of interspecific hybrids have been shown (Kostina, Nasimovich, 2014; Kostina et al., 2017).
In addition to morphological features, molecular markers are effective for evaluating plant diversity (Melnikova et al., 2009–2011; Khadeeva et al., 2011; Bolsheva et al., 2015). In our work, high-throughput sequencing of ITS was applied for the first time to assess the genetic diversity of poplars in Moscow. ITS sequences have already been used to study polymorphism and for barcoding of poplar species growing in western China, and the number of detected SNPs (38) was high (Feng et al., 2013), which is comparable to our data. It should be noted that the high-throughput sequencing of ITS performed in the present work gave a much more complete picture of the genetic polymorphism of poplars growing in Moscow than Sanger sequencing would. Thus, a high level of genetic diversity was revealed in the studied plants. It can be assumed that such heterogeneous poplar populations are highly adaptive and have an advantage in surviving ecologically unfavorable urban conditions. We also showed that extremely polymorphic plant material was probably used when planting poplars in Moscow, and that even trees planted at the same time on one limited territory differed considerably genetically.
The results of our work lay the foundation for developing molecular markers to identify the poplar species and interspecific hybrids growing in Moscow, as well as to determine the origin of a number of natural hybrids. In addition, recent studies have shown that poplar sex is genetically determined and that only a small percentage of trees, those with recombination in the sex-associated genome region, could change sex (Geraldes et al., 2015; Borkhert et al., 2017; McKown et al., 2017). These data open up new opportunities for developing molecular markers so that only male poplars, which do not produce fluff, are used in landscaping, while barcoding with ITS will make it possible to evaluate polymorphism and to maintain the diversity of populations adapted to unfavorable urban conditions.

Table 2. Single nucleotide polymorphisms of poplars growing in Moscow based on ITS sequences*
* ITS sequences of P. trichocarpa (GenBank: AJ006440.1) were used as references.