SSR-based evaluation of genetic diversity in populations of Agriophyllum squarrosum L. and Agriophyllum minus Fisch. & Mey. collected in South-East Kazakhstan

The development of informative polymorphic DNA markers for poorly studied genera is an important step in population analyses of living organisms, including those that play very important ecological roles in harsh environments, such as desert and semi-desert area. Examples of those poorly studied desert species are Agriophyllum squarrosum L. and Agriophyllum minus Fisch. & Mey. However, a recent RNA-sequencing project in A. squarrosum has proposed a large set of hypothetical SSR (simple sequence repeat) markers. In this work, 11 novel polymorphic SSRs were found due to the screening of 24 randomly selected SSRs for three populations of A. squarrosum and one population of A. minus. The analysis of 11 SSRs revealed 16 polymorphic loci in two Agriophyllum species, 8 polymorphic loci within three populations of A. squarrosum, and 6 polymorphic loci in the population of A. minus. Statistical analyses showed high interspecific, but relatively low intraspecific genetic diversity. The phylogenetic clusterization and population structure analysis have demonstrated a clear segregation of A. minus from A. squarrosum, as well as the separation of population 1 from populations 2 and 3 of A. squarrosum. Thus, we identified the set of novel and informative SSR markers suitable for the study of genetic diversity in Agriophyllum.


Introduction
Xerophytes and psammophytes (plants with adaptations to survive in water-deficiency environment) become the main plants in the desert region. Despite the harsh environmental conditions, many of these plants have adapted and became a part of complex and diverse desert ecosystems. Due to the wide distribution of desert territories, the study of its wild flora is very important for the ecological prediction and conservation of biodiversity. The Central Asia region, including Kazakhstan, is having numerous large and small sand deserts, such as Gobi, Taklamakan, Karakum, Kyzylkum and Moiynkym. Deserts and semi-deserts occupy more than half of Kazakhstan territory and keep growing (Issanova et al., 2015). The major representatives of wild desert flora here are herbaceous plants, subshrubs, shrubs and subtrees. The list of dominating species includes genera Artemisia, Salsola, Ferula, Arthrophytum, Calligonum, Ammodendron, Haloxylon, and Agriophyllum (Rachkovskaya et al., 2003). Many of them are endemic for the Central Asian and Kazakhstan deserts.
Agriophyllum squarrosum L. and Agriophyllum minus Fisch. & Mey. are important desert species in Kazakhstan. They belong to the tribe Corispermeae within the subfamily Chenopodioideae of the family Chenopodiaceae (Kühn, 1993). The genus Agriophyllum includes five species, and four of them, including A. squarrosum, grow in Kazakhstan (Ageeva et al., 1960). A. squarrosum is also widely spread in all Central Asia territory, Caucasus, and China. Local people in the sandy desert regions of China consume the seed of the species during periods of food shortage, and refer to the plant as 'shami' in Chinese, which translates as 'sand rice' . Sand rice is an example of psammophyte perfectly adapted for harsh desert environmental conditions. Morphologically it is a shrub-like plant with a height ranging from 20 to 100 cm. The stem of sand rice at the young plant stage is firm, branched, green, and covered with short hairs. Leaves are small, sessile, green, and usually linear. Small flowers are organized in a spike-like inflorescence. A. squarrosum blossoms in the late summer and early fall, after that ovoid shape seeds are formed. Seeds of sand rice are very light and covered by a thin husk. After ripening, the husk is cracked into two parts, and seeds are easily dispersed by the wind. The root system is represented by a long taproot penetrating deep into the soil to access stored moisture, and almost equally long lateral roots branching near the soil surface and helping to fix plant in loose sand. Although, there is a lack of information related to the structure of the sand rice's genome, the transcriptomic analysis of A. squarrosum (2n = 18) had showed presence of 67 741 unigenes and approximately 43 % of them were annotated .
Throughout history, sand rice was used for diverse purposes. Aboveground organs (stem and leaves) are eaten by both wild animals and livestock of farmers in arid and semi-arid regions, especially in Western Kazakhstan on camel pasture. Historically, nutritious seeds of sand rice were an alternative to cereals not surviving in hot deserts. In China and Mongolia, the local villagers consume sand rice seed in a variety of dishes. There are many reports about the rich nutrition value of A. squarro sum seeds close to its widely-used as food quinoa seeds . However, sand rice is not a domesticated plant species with several agricultural ly unfavorable traits, such as fragile spikes and light seeds. Still, the works on possible domestication of sand rice as a novel crop are reported . In addition to nutrition purposes, sand rice had found its application in medicine. Back in the days, it was used as antipyretic and analgesic medicine (Gong et al., 2012;Chen et al., 2014). It was reported that the extract of A. squarrosum decreases blood glucose levels in type 2 diabetic mice and has the potential for further medical researches (Saqier et al., 2019). Sand rice is useful in combat against shifting sands (Wen-Ming et al., 2004). Climate change and human activities led to the growth of sand desert areas and the migration of sand dunes to agricultural territories. The structure of the sand rice root system and its ability to form seed banks in active sand dunes allow the fixation of the sand surrounding the plant. Thus, A. squarrosum has a tremendous potential together with other psammophytes to be used in a large-scale sand fixation (Liu et al., 2007;Ma, Liu, 2008). This species is an interesting model for different studies of morphology and physiology of desert plants. For example, earlier A. squarrosum was already used for the study of growth under drought conditions (Mo et al., 1997;Huang et al., 2008) and for the study of fertilizer effect on psammophytes under different rainfall conditions (Yuan et al., 2019).
Endemic species with great economic potential like sand rice are an interesting subject for genetic and molecular researches. One of the most common methods utilized for the studies on biodiversity conservation, population and phylogenetic studies of wild plant species is the usage of molecular DNA markers (Nybom, 2004). Examples of successful application of the most common DNA markers in plants include random amplification of polymorphic DNA (RAPD) (Nybom, Bartish, 2000), amplified fragment-length polymorphisms (AFLP) (Zhang C. et al., 2018), and other nuclear and chloroplast DNA markers Alme rekova et al., 2018;Turuspekov et al., 2018). Nuclear ribosomal internal transcribed spacer (nrITS) region and five chloroplast DNA (cpDNA) fragments have been used earlier for the study of population dynamics of A. squarrosum in China (Qian et al., 2016). The maturase K (matK ) gene of the chloroplast genome and nrITS were used for comparison of A. squarrosum and A. minus populations in two regions of Kazakhstan . The literature survey suggests that there is a limited information on the study of Agriophyllum species by using SSR markers. However, a recent RNA-sequencing project of A. squarrosum populations in China has suggested several thousands of potential SSR markers for this species (Zhang J. et al., 2018).
In this study, we selected 24 SSR markers from this Agrio phyllum genome resequencing project and used them for the assessment of genetic diversity within and among populations of A. squarrosum, and between A. squarrosum and A. minus.

Materials and methods
Plant material. In total, leaf samples of four wild Agriophyl lum populations were collected in South-East Kazakhstan and used for the analysis (Table 1). The list included three populations of A. squarrosum and one population of A. minus sampled in Moyynkum desert of Almaty region in South-East Kazakhstan (Fig. 1). Population 1 of A. squarrosum was collected in 2016, while the populations 2 and 3 of A. squarrosum and population 1 of A. minus were collected in 2019. The distances between populations were at least four kilometers, and plants within the population were sampled in at least 50 meters apart.
DNA extraction, amplification and SSR marker assessment. Five young leaves from each sample were dried in sili ca gel. The total genomic DNA was extracted from dry leaf tissues using CTAB method (Doyle, 1991). The quality and concentration of extracted DNA were assessed via spectrophotometric test, and 1 % agarose gel electrophoresis.
Twenty-four pairs of SSR markers and their primers (Supplementary Table) 1 were selected from 6150 SSRs in the sequence of A. squarrosum genome reported by Zhang J. and co-authors (2018). SSR motives and expected sized of alleles were obtained from the same source.
Annealing temperature (T a ) for primer pairs and concentration of reagents in a PCR reaction mix were determined empirically. All successful PCR reactions were performed in total 16 μl volumes, including 4 mM of each dNTP, 2 mM of MgCl 2 , 6.4 mM of primer mix (forward + reverse), 1.6 U of Taq polymerase and 50 ng of DNA. The amplification was performed in Veriti Thermocycler (Applied Biosystems, Foster City, CA, USA) with an initial denaturation step at 94 °C for 3 minutes, followed by 40 cycles of 94 °C for 30 seconds, optimized T a °C (see Suppl. Table) for 45 seconds and 72 °C for 1.5 minutes. The final extension step was at 72 °C for 10 minutes. The PCR products were separated on 6 % polyacrylamide gel (PAG). The SSR profile image was  (Yeh et al., 1997); the polymorphism information content (PIC) (Botstein et al., 1980) was calculated as follows: PIC = Σ(1 -p 2 i ), where p i is the frequency of the i-th band or percentage of individuals in which the fragment is present. Principal coordinates analysis (PCoA) and unweighted pair group method with arithmetic mean (UPGMA) hierarchical clustering were performed based on genetic distances among populations and species. PCoA plot was made via GenAlEx (Genetic Analysis in Excel) ver. 6.5 software (Peakall, Smouse, 2012), UPGMA analysis was performed in R statistical software environment (R Core Team, 2018). Bayesian clustering was based on the use of the STRUCTURE software ver. 2.3.4 (Pritchard et al., 2000). The value of K was set from 1 to 5 with five iterations for each value of K. Both, length of burn-in period, and the number of Markov Chain Monte Carlo (MCMC) repeat after burn-in was set at 100 000. Web-tool STRUCTURE HARVESTER (Earl, von Holdt, 2011) based on Evanno's method (Evanno et al., 2005) was used to determine the best fit value of K. The molecular variance (AMOVA) test was calculated using GenAlEx software.

Performance of SSR markers in Agriophyllum species
Initially, 24 SSR markers were chosen for the analysis (see Suppl. Table), however, successful amplification was performed for 18 markers only. Five of 18 SSRs were multilocus markers, while other 13 SSRs were single-locus. In total, 18 markers allowed identification of 23 loci (Table 2), of which 16 were polymorphic and suitable for the analysis (Table 3).
The screening of sixteen SSR loci has allowed the identification of 43 alleles in the analysis of three populations of A. squarrosum and one population of A. minus. The molecular sizes of alleles in loci were ranged from 143 to 342 bp. The number of alleles in the study of all four populations varied from 2 to 4.
In total, eight polymorphic loci were observed in A. squar rosum, six polymorphic loci were found in A. minus, and 16 loci had demonstrated polymorphism between two species (see Table 3). In total, 20 alleles were identified exclusively in A. squarrosum, 16 alleles were exclusive for A. minus, and 7 alleles were found in both species. The average number of alleles per locus in polymorphic loci was 2.69 ± 0.70 for both species, 2.25 ± 0.16 for A. squarrosum, and 2.17 ± 0.17 for A. minus, respectively.
Nei's index and PIC value were calculated separately for polymorphic SSR loci of A. squarrosum, A. minus, and jointly for two Agriophyllum species (see Table 3). The genetic distance-based PCoA plot suggested a clear separation of A. minus from populations of A. squarrosum, as well as distinguishing of the population 1 from populations 2 and 3 in A. squarrosum (Fig. 2, a). The UPGMA tree had also demonstrated two clusters corresponding to A. squarrosum and A. minus. The portion of samples in population 1 of A. squar rosum was clustered together with populations 2 and 3, while the remaining part formed a separate subcluster (see Fig. 2, b). Bayesian distance-based analysis of the structure among two Agriophyllum species was congruent with the UPGMA clusterization and also indicated that K = 3 is an optimal number of groups in the study (see Fig. 2, c). AMOVA results for the set of data containing all Agriophyl lum samples revealed that the majority (88 %) of the genetic diversity in Agriophyllum exists between two species rather than within them (Table 4).

Genetic diversity in A. squarrosum populations
Part of this study was focused on the assessment of the genetic diversity within and between populations of A. squarossum. In total, 26 alleles were identified in eight SSR loci. The largest number of unique alleles was found in population 1 (8 alleles), while population 2 had only one unique allele (

Genetic diversity of Agriophyllum squarrosum and Agriophyllum minus in South-East Kazakhstan
The AMOVA for A. squarrosum had shown a higher molecular variance within populations than the variance among populations (Table 6).

Discussion
The availability of informative DNA markers is advantageous for the study of rare species, their population structure, and for the development of a proper conservation strategy to prevent their extinction (Adams, Turuspekov, 1998;Kramer, Havens, 2009). SSR markers are one of those types of DNA markers that reliably utilized in population structure analysis due to their hyper-variability, co-dominance, and high reproducibility (Aitken et al., 2005). Successful applications of SSR markers for the identification, classification, and taxonomy of Chenopodiaceae species have been reported previously (Borger et al., 2008;Prinz et al., 2009;Sampson, Byrne, 2012;Nachtigall et al., 2016). However, for the genus Agriophyllum, there were no previous studies with the use of SSRs. The situation was changed recently due to the RNAsequencing project of A. squarrosum accessions, and a large number of hypothetical SSR markers were suggested for population studies (Zhang J. et al., 2018). Hence, in this work 24 SSRs were randomly selected for the assessment of three populations of A. squarrosum and one population of A. mi nus collected in South-East Kazakhstan. The assessment of selected SSR markers resulted in the identification of 11 novel SSR markers that allowed the identification of 16 polymorphic loci in two Agriophyllum species (see Table 3). The large percentage of molecular variance (see Table 4) and a large  genetic distance (see Fig. 2) observed between A. squarrosum and A. minus indicated good discriminative power of studied SSR markers and confirmed a significant genetic difference reported between these species earlier (Genievskaya et al., 2018). Fifteen SSR loci used for both species were in intermediate diversity group with PIC values ranged between 0.32 and 0.50, and Nei's diversity index ranged between 0.39 and 0.56 (see Table 3). When species were analyzed separately, 8 out of 16 loci were polymorphic for A. squarrosum, and 6 loci were polymorphic for A. minus (see Table 3). The average diversity in A. minus was slightly higher than in A. squarrosum and lower than the average interspecies diversity (see Table 3). Thus, A. squarrosum and A. minus have maintained a low level of genetic diversity. It has been demonstrated that annuals like Agriophyllum or short-lived perennials usually demonstrate low levels of genetic diversity compared with long-lived and outcrossing species (Austerlitz et al., 2000). However, the limited amount of samples in studied species (particularly in A. minus), their close geographical locations, and a relatively small number of polymorphic SSR markers used for the analysis could influence obtained results.
The study of the genetic diversity within A. squarrosum suggested that 66 % of the total variation was within and 34 % between populations (see Table 6), indicating that the difference of population 1 from populations 2 and 3 was rather substantial. The presence of relatively high diversity within populations may be explained by the geographical proximity of A. squarrosum populations used in this study (see Table 1),  (Conner, Hartl, 2004). Populations 2 and 3 of A. squarrosum had demonstrated an extremely low level of genetic diversity in comparison with population 1 (see Table 5), but the fact that samples from each population clustered together using multiple analyses (PCoA, UPGMA, Bayesian clustering) (see Fig. 2) indicates that there is a high level of heterogeneity in the species. The grouping of 5 samples from population 1 on the UPGMA tree and on the STRUCTURE bar plot into separate subcluster in the entire species cluster may indicate the presence of some barriers between groups of samples within population 1 interfering gene flow. Therefore, even if no physical obstructions were found during the sample collection, probably, two groups in population 1 may be considered as subpopulations (Waples, Gaggiotti, 2006). This assumption is also supported by the relatively higher genetic diversity in population 1 comparing with the other two A. squarrosum populations (see Table 5).

Conclusions
In this study, sequences of 24 pairs of oligonucleotides for SSR markers were randomly selected from A. squarrosum genome resequencing project, and used for the study of genetic diversity in the genus Agriophyllum. Hence, it is the first report exploring the performance of novel SSR markers in the genetic analysis of this genus. The study revealed 16 polymorphic loci in eleven SSR markers using two Agriophyllum species, 8 polymorphic loci within three populations of A. squarrosum, and 6 polymorphic loci within the population of A. minus. Statistical analyses showed high interspecific, but relatively low genetic diversity in populations 2 and 3 of A. squarrosum. The phylogenetic clusterization and analysis and population structure analysis demonstrated clear segregation of A. minus from A. squarrosum, as well as the separation of population 1 of A. squarrosum form populations 2 and 3. As a result, we identified the set of novel and informative SSR markers suitable for the study of genetic diversity in Agriophyllum species. These results provide an important contribution to the population study and approaches for the development of conservation mechanisms for Agriophyllum species.