Genetic variation of the nuclear sequences of mitochondrial origin associated with retrotransposon Tv1 insertions in Drosophila species of the virilis group

Mitochondrial DNA sequences integrated into chromosomes are promising targets for designing genetic markers of phylogenesis and genomic instability. The mitochondrial genomes of D. virilis and other Drosophila species of the virilis group contain an (AT)n microsatellite in the spacer region between the atp6 and cox3 genes, and this microsatellite is one of the hallmarks of the virilis group. The nuclear genome of D. virilis contains many extended fragments of mitochondrial DNA, which in total are several times longer than the mitochondrial genome. These nuclear sequences of mitochondrial origin contain all types of mitochondrial sequences, including mitochondrial genes and the aforementioned microsatellite. The presence of the (AT)n microsatellite allows insertion of retrotransposon Tv1, which transposes into (AT)n microsatellites in a site-specific manner. A Tv1 insertion into (AT)n close to the atp6 or cox3 pseudogene produces a unique chimeric sequence, formed by retrotransposon Tv1 and the nuclear copy of atp6 or cox3, which can be detected in the genome by a PCR-based method. We applied this method to detect and analyze the nucleotide variability of the atp6 and cox3 pseudogenes associated with Tv1 insertions in a D. virilis cell culture and in the genomes of four Drosophila species of the virilis group: D. virilis, D. montana, D. borealis, and D. lacicola. In the passaged D. virilis cell culture, we discovered new events of mitochondrial sequence transfer to the nucleus and new Tv1 insertions that emerged during the passaging of this cell line. We found atp6 and cox3 pseudogenes associated with insertions of retrotransposon Tv1 in the nuclear genomes of all four Drosophila species of the virilis group. These chimeric sequences proved to be species-specific, which allows the species of the group to be identified. The age of the Tv1 insertion into the atp6 and cox3 pseudogenes is estimated at 1.50 Ma for D. virilis, 1.31 Ma for D. lacicola, and 1.56 Ma for D. borealis. A distinct situation was revealed for D. montana, in which Tv1 insertions with nearly identical 5′ and 3′ long terminal repeats (LTRs) were present in fly accessions from Europe and Asia. The age of this insertion is about 300 thousand years, and the insertion is absent from the D. montana fly line from North America.

The transfer of mitochondrial DNA into the cell nucleus has been found in all eukaryotic species studied so far (Bensasson et al., 2001; Richly, Leister, 2004; Hazkani-Covo et al., 2010). Fragments of mitochondrial DNA in chromosomes were named numts, an abbreviation of nuclear sequences of mitochondrial origin (Lopez et al., 1994). In the literature, this term is variously capitalized and italicized. We treat numt as a genetic term and write it in lowercase Roman letters. The numbers and lengths of numts vary among species, depending on the rates of their acquisition and loss.
The main source of data on the origin and variability of numts is the comparative analysis of complete genomes. The numbers of numts in the genomes of different fruit fly species vary by an order of magnitude (Rogers, Griffiths-Jones, 2012). Such a significant variation is likely to be determined by different rates of numt acquisition and loss. In Drosophila, the frequency of fixation of a new numt type in the genome is 0.75 copies per million years (Rogers, Griffiths-Jones, 2012). It is natural to assume that the rate at which new numt copies arise varies considerably among species.
Since insertions of fragments of mitochondrial DNA into chromosomal DNA cause mutations, a significant increase in the number of numts in the genome can be a marker of genomic instability. The frequency of mitochondrial DNA fragment insertion into chromosomal DNA varies in the course of evolution (Hazkani-Covo, Martin, 2017). It has been suggested that in most cases the appearance of new numts in the genome is related to the speciation process (Gunbin et al., 2017). In this connection, it is of interest to study the association of numts with retrotransposons, since the induction of transpositions of retrotransposons and numts can cause genomic instability. Numts are usually integrated into (AT)n microsatellite sites and are often flanked by retrotransposons, or by Alu repeats in humans (Tsuji et al., 2012). In Drosophila, 45% of numts are located close to retrotransposons of the LINE type and the LTR-containing type (Rogers, Griffiths-Jones, 2012).
The search for numts associated with retrotransposons in the complete genome of D. virilis revealed an insertion of retrotransposon Tv1 into the spacer region between the atp6 and cox3 numts. Retrotransposon Tv1 was found in the D. virilis genome and in the genomes of all Drosophila species forming the virilis group (Andrianov et al., 1999). Retrotransposon Tv1 integrates in a site-specific manner into the (AT)n microsatellite sequence, forming a direct duplication (AT)4 at the site of insertion (Andrianov et al., 2010). The presence of the (AT)n microsatellite favors Tv1 integration into this site. As a result of Tv1 insertion into the numt, a unique sequence arises, which can be detected in the genome by a PCR-based method. Using this approach, we analyzed the variability of numts associated with Tv1 insertions in D. virilis fly lines, in D. virilis cell culture, and in the genomes of four Drosophila species of the virilis group: D. virilis, D. montana Stone, Griffen, Patterson (1941), D. lacicola Patterson (1944), and D. borealis Patterson (1952). All numts associated with retrotransposon Tv1 in fly lines are located on the Y chromosome. Our data reveal new events of mitochondrial DNA transfer into chromosomes and new events of Tv1 retrotransposition in D. virilis cell culture. This finding led us to conclude that the emergence of new numt-Tv1 associations is a specific marker of genomic instability in D. virilis cell culture.

Materials and methods
The Drosophila lines and the cell culture used in this study are available upon request from the corresponding collections or from the authors of the article.
DNA isolation, PCR and sequencing of PCR fragments. Genomic DNA was isolated from Drosophila imagoes and cultured cells by the conventional phenol-chloroform extraction method (Sambrook et al., 1989). PCR amplification was performed on a template of total DNA isolated from an individual Drosophila imago or from 10^6 cells of the permanent cell culture. The primers used to amplify the atp6 and cox3 numts associated with the insertion of retrotransposon Tv1 and fragments of the mitochondrial genes atp6 and cox3 of Drosophila species of the virilis group are listed in the Table. PCR was performed on an Applied Biosystems PCR System 2700 thermocycler with a universal Encyclo Plus PCR kit (Evrogen, Moscow) as recommended by the manufacturer, in a reaction volume of 25 μL.
All conceivable structures of "chimeric" sequences resulting from the insertion of retrotransposon Tv1 into the (AT)n microsatellite between the atp6 and cox3 genes, in the forward and reverse orientations relative to the mitochondrial genes, are presented in Figure 1. Four types of sequences are possible, and they correspond to four types of experimental design for obtaining "chimeric" PCR fragments.
Experiment a1. The following pairs of primers were used to amplify the atp6 numts associated with the insertion of retrotransposon Tv1 in the forward orientation: (1) [...]
The results of electrophoretic fractionation of PCR fragments formed by atp6 and cox3 numts associated with the insertion of retrotransposon Tv1 are presented in Supplement 2. The sizes of the PCR fragments can be given only approximately, because atp6 and cox3 numts show indels in interspecific and, sometimes, intraspecific comparisons, and the length of the spacer sequence between atp6 and cox3 differs among species.
Cloning. PCR products were run in agarose gel, eluted, and purified with an elution kit (Zymoclean™ Gel DNA Recovery Kit, Zymo Research, USA) according to the manufacturer's recommendations. All PCR products were cloned prior to sequencing. PCR product cloning was performed using the pGEM®-T Easy Vector System according to standard protocols (Fermentas InsTAclone™ PCR Cloning Kit). The resulting clones were sequenced. At least three independent clones were sequenced for each PCR fragment. Sequencing of the amplification products was conducted with both primers on an ABI PRISM 3500 instrument using a BigDye® Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems, United States) according to the manufacturer's recommendations.
Phylogenetic analysis. Alignment of the resulting sequences and phylogenetic analysis were carried out in the MEGA6.06 program (Tamura et al., 2013). Dendrograms were constructed by the neighbor-joining (NJ) method with the Kimura evolutionary model. Bootstrap support was estimated from 1000 replicates. Indels were removed from the compared sequences prior to the construction of dendrograms.
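As an illustration of the kind of pairwise distance that underlies an NJ tree, the following minimal sketch computes a Kimura two-parameter distance between two aligned sequences while skipping columns that contain gaps. It assumes that the "Kimura evolutionary model" mentioned above is the two-parameter variant, which the text does not state explicitly; the function name and the example sequences are hypothetical, and the sketch is not a substitute for the MEGA analysis.

```python
import math

def k2p_distance(seq_a: str, seq_b: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences.

    Assumption: columns with a gap or an ambiguous base in either sequence
    are skipped, mimicking the removal of indels before tree construction.
    """
    purines = {"A", "G"}
    pyrimidines = {"C", "T"}
    compared = transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip indel or ambiguous columns
        compared += 1
        if a == b:
            continue
        if {a, b} <= purines or {a, b} <= pyrimidines:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    p = transitions / compared    # proportion of transitions
    q = transversions / compared  # proportion of transversions
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q)
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))

# Hypothetical toy alignment: one transition (A->G) in ten compared sites.
d = k2p_distance("ACGTACGTAC", "GCGTACGTAC")
```

With a single transition in ten sites, the corrected distance is slightly larger than the raw proportion of differences (0.1), which is the expected effect of the multiple-hit correction.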

Results and discussion
To provide a basis for further work, we characterized atp6 and cox3 numts and associated insertions of retrotransposon Tv1 in the genome of D. virilis (Fig. 2).
An experimental search for chimeric atp6 numt-Tv1 sequences in D. virilis fly lines of different geographical origins revealed the expected nucleotide sequence corresponding to the D01 map (see Fig. 2) in all fly lines examined. We compared five D. virilis lines of different geographic origins (see Supplement 1). They all contain the same numt. The nucleotide divergence of this atp6 numt from the atp6 mitochondrial gene of D. virilis is 0.09. The amino acid sequence contains 21 substitutions. This numt has no internal termination codons. The Tv1 insertions are also the same in different fly lines, except for the Tv1 insertion in line L160. The Tv1 copy associated with the atp6 numt in this line has several point nucleotide substitutions and two small deletions in the LTR sequence. The lengths of numts in the genome of D. virilis flies and in the permanent cell line are constant and equal to the length of the corresponding mitochondrial sequence. The genetic maps of all experimentally obtained numts of D. virilis associated with the insertion of Tv1 are presented in Supplement 3. Differences in the length of PCR fragments, observed only with the cell culture DNA template, are determined by differences in the length of the LTRs of retrotransposon Tv1. These differences in length are due to the presence or absence of 40-bp duplications. Only one type of association, atp6 numts with Tv1 in the direct orientation, was found in fly genomes, and there were no associations of cox3 numts with Tv1, whereas in the cell culture all four possible types of associations were identified (see Supplements 2 and 3). Consequently, they emerged in the cell culture during cultivation after the culture was established in 1979 (Braude-Zolotarjova et al., 1986). Unfortunately, the D. virilis cell line was established from a fly line that has been lost, so the two cannot be compared directly. However, since D. virilis is a nearly monomorphic species (Mirol et al., 2008), we can use extant fly lines for comparisons with the D. virilis cell line.
This raises the question of the origin of these numts and the origin of the Tv1 insertions. Theoretically, there are two possible sources of new numt insertions in the genome: mitochondrial DNA and previously arisen numts (Hazkani-Covo et al., 2003). To clarify this, we conducted a phylogenetic analysis of the variability of the obtained numts and the corresponding sequences of mitochondrial genes. The result of the comparison of mitochondrial genes with numts is shown in Figure 3.
We detected almost complete identity between the newly emerged numts in the cell culture and the sequence of the mitochondrial gene of D. virilis, and significant differences from the numts of D. virilis flies. Consequently, the numts that arise in the cell culture descend from mitochondrial DNA rather than from preexisting numts. The genome-derived numt associated with the insertion of Tv1 is among the most divergent and probably the oldest numt insertions in the D. virilis genome. Comparing the variability of numts with that of the associated copies of Tv1 shows how the ages of these insertions differ. To this end, we analyzed the nucleotide variability of the LTRs of the Tv1 copies inserted into the numts. The result of the phylogenetic analysis of Tv1 LTR variability is presented in Figure 4. In silico search revealed 12 types of Tv1 LTRs in the D. virilis genome. We also found two currently active types of Tv1 in the cell culture (Fig. 4). These types differ from the Tv1 copy associated with the numt in the genome. Hence, both the numts of D. virilis and the associated Tv1 LTRs in the cell culture are among the youngest sequences in their groups, and they arose during cell culturing. The ages of the numt and Tv1 insertions in the cell culture match.
To find out whether Tv1 insertions always occur in newly formed numts, we searched some Drosophila species of the virilis group for associations of numts with Tv1 according to the experimental design shown in Figure 1. Numts marked with Tv1 insertions were identified in D. virilis, D. borealis, D. lacicola, and D. montana. Genetic maps of the PCR fragments are presented in Supplement 4. Insertions of Tv1 in the direct orientation with respect to atp6 and cox3 were found in D. lacicola.
The nucleotide divergence of the cox3 numt from the sequence of the corresponding mitochondrial gene of the same species is 0.05, and for the two detected atp6 numts, the nucleotide divergences are 0.06 and 0.07. The numbers of amino acid substitutions for these numts are 6, 5, and 9, respectively. None of the nucleotide substitutions generated termination codons. The two D. lacicola atp6 numts are associated with different copies of Tv1 (GenBank accession numbers KX399470 and KX399471). As inferred from the comparison of these numts with each other, they diverged after the transfer to the nucleus. A total of 14 nucleotide substitutions were found: 5 in the first, 6 in the second, and only 3 in the third codon position. Note that in the mitochondrial genome most nucleotide substitutions fall in the third position. It is reasonable to suggest that the observed differences accumulated after the transfer of the sequences to the nuclear genome.
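The codon-position argument above can be made concrete with a short sketch that tallies substitutions by codon position for two aligned coding sequences. It assumes a gapless alignment of equal length that starts at the first position of a codon; the function name and the toy sequences are hypothetical and are not the D. lacicola data.

```python
def substitutions_by_codon_position(seq_a: str, seq_b: str) -> tuple:
    """Count nucleotide differences at codon positions 1, 2 and 3.

    Assumptions: both sequences are aligned without gaps, have equal
    length, and start at the first position of a codon.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    counts = [0, 0, 0]
    for index, (a, b) in enumerate(zip(seq_a.upper(), seq_b.upper())):
        if a in "ACGT" and b in "ACGT" and a != b:
            counts[index % 3] += 1  # index % 3 == 0 is codon position 1
    return tuple(counts)

# Hypothetical toy example with one difference at each codon position.
first, second, third = substitutions_by_codon_position("ATGAAACCC", "CTGAGACCT")
```

For the counts reported in the text (5, 6, and 3), only 3 of 14 substitutions fall in the third position, which is the pattern expected once a sequence no longer evolves under the selective constraints of a functional mitochondrial gene.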
Insertions of Tv1 in the reverse orientation relative to atp6 and cox3 were found in the genome of D. montana. The nucleotide divergence of the detected numts from the corresponding mitochondrial genes of this species is 0.09. This level of nucleotide divergence suggests that the numts are ancient. In the atp6 numt, we found 12 amino acid substitutions and 1 termination codon. The cox3 numt of the same species has 15 amino acid substitutions and 3 termination codons. In total, four lines of D. montana were analyzed. In the 1021.13, 20 OL8 and KR 1309 fly lines, atp6 and cox3 numts are associated with retrotransposon Tv1 in the reverse orientation. The 5′ and 3′ LTRs of this Tv1 are nearly identical, which suggests a recent insertion of retrotransposon Tv1 into the ancient numt. In D. montana line 1021.19 from North America, these numts were not found. In this line, we found another cox3 numt, associated with retrotransposon Tv1 in the reverse orientation. The nucleotide divergence of this numt is 0.05; it has seven amino acid substitutions and one termination codon.
An insertion of Tv1 in the reverse orientation with respect to cox3 was found in D. borealis. The nucleotide divergence of the detected numt from the corresponding mitochondrial DNA is also large, 0.12. There are 14 amino acid substitutions and 3 termination codons. In addition, D. borealis has an unusual insertion of retrotransposon Tv1, located at the beginning of the atp6 gene rather than between the atp6 and cox3 genes. The detected numt includes atp6 and cox3 sequences associated with Tv1 in the direct orientation. The nucleotide divergence of this atp6 numt from the D. borealis mitochondrial atp6 gene is 0.14; it has 31 amino acid substitutions and 1 termination codon. The nucleotide divergence of this cox3 numt from the D. borealis mitochondrial cox3 gene is 0.13; it has 11 amino acid substitutions and 2 termination codons. All the detected atp6 and cox3 numts associated with retrotransposon Tv1 are found only in males (see Supplement 2), which points to their location on the Y chromosome. It can be assumed that the Y chromosome is a preferred site for the preservation of numts, whereas in other parts of the genome these sequences are rapidly lost.
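The two descriptive statistics reported for each numt, nucleotide divergence and the number of termination codons, can be sketched as follows. The sketch assumes that divergence is the uncorrected proportion of differing sites and that termination codons are read in frame with the invertebrate mitochondrial genetic code, in which TAA and TAG are stops and TGA encodes tryptophan; the text does not specify either choice, and the function names and sequences are hypothetical.

```python
# Stop codons of the invertebrate mitochondrial genetic code (TGA codes for Trp).
MITO_STOP_CODONS = {"TAA", "TAG"}

def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected proportion of differing sites, skipping gap columns."""
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "ACGT" and b in "ACGT":
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differences / compared

def count_stop_codons(seq: str) -> int:
    """Count in-frame stop codons, assuming the sequence starts in frame."""
    seq = seq.upper()
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    return sum(codon in MITO_STOP_CODONS for codon in codons)

# Hypothetical toy sequences, not data from this study.
divergence = p_distance("ACGTACGTAC", "ACGTACGTAT")
stops = count_stop_codons("ATGTAAGGATAGTGA")
```

In the toy example, the reading frame contains TAA and TAG, which are counted, and TGA, which is not, because it is a sense codon in this genetic code.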
Phylogenetic analysis reveals characteristic differences in the time of emergence of numts and of Tv1 insertions (Figures 5 and 6). Figure 5 shows the phylogenetic reconstruction of the divergence of atp6 numts in Drosophila of the virilis group. In all examined cases, the divergence between numts and the modern forms of the mitochondrial genes dates back several million years.

Fig. 5. As an outgroup, we use atp6 of D. melanogaster. Each of the atp6 numts of D. virilis found during in silico analysis of the complete genome is indicated with a capital letter and two digits. The genetic maps of these numts are shown in Figure 2. The atp6 numt from the genome of D. virilis flies of the Dv40 line is marked with an arrow. The atp6 numts from the genome of the D. virilis cell culture isolated in our experiments are indicated with a curly brace. The first two characters in the names of the experimentally obtained numts indicate the type of experiment in which the nucleotide sequences were obtained. Nucleotide sequences were submitted to GenBank under accession numbers JX560766-JX560769 and KF669862-KF669864. Nucleotide sequences of mitochondrial genes of the corresponding Drosophila species of the virilis group were submitted to GenBank under accession numbers KX399463, FJ536196, FJ536199, FJ536203, and FJ536204. Mitochondrial sequences are marked with black circles.

Animal genetics • Vavilov Journal of Genetics and Breeding • 2018 • 22 • 7 • Association of mtDNA insertions and insertions of retrotransposon Tv1 in Drosophila

The ages of the insertions of retrotransposon Tv1 can be estimated by comparing different copies of LTRs associated with either the atp6 or the cox3 numts in a particular line of flies. The analysis is illustrated in Figure 6. If we assume the rate of fixation of nucleotide substitutions to be 0.016 substitutions per site per million years, the typical value for noncoding Drosophila sequences (Bowen, McDonald, 2001), the ages of the Tv1 insertions are estimated at 1.50 Ma for D. virilis, 1.31 Ma for D. lacicola, and 1.56 Ma for D. borealis, and at about 300 thousand years for the insertion in D. montana.
It is important to note that the ages of the numts and of the associated Tv1 insertion do not coincide in D. montana. Young copies of Tv1 are associated with the ancient numt. Therefore, the transfer of these elements occurred independently and at different times. This fact points to probable differences in the molecular mechanisms by which numt-Tv1 associations appear in somatic cells of the cell culture and in germline cells.
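The age estimate used here follows the formula given in the legend of Figure 6, T = d/2k, with k = 0.016 substitutions per site per million years. A minimal sketch of this calculation is shown below; the function name is hypothetical, and the input divergence of 0.048 is an illustrative value chosen for the example, not a figure reported in the text.

```python
def insertion_age_ma(divergence: float, rate: float = 0.016) -> float:
    """Estimate insertion age in million years as T = d / (2k).

    divergence: nucleotide divergence d between the compared LTR copies.
    rate: substitution rate k per site per million years (0.016 in the text).
    """
    if divergence < 0 or rate <= 0:
        raise ValueError("divergence must be >= 0 and rate must be > 0")
    return divergence / (2 * rate)

# Illustrative input only: a divergence of 0.048 corresponds to about 1.5 Ma.
age = insertion_age_ma(0.048)
```

The factor of 2 reflects that the two compared copies accumulate substitutions independently after the insertion, so the divergence between them grows at twice the per-lineage rate.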

Conclusions
We investigated the variability of atp6 and cox3 numts associated with site-specific insertions of retrotransposon Tv1 in Drosophila of the virilis group and in D. virilis permanent cell culture. The method of numt detection was based on the ability of retrotransposon Tv1 to transpose into the (AT)n microsatellite sequence and on the presence of this microsatellite in the spacer region between the atp6 and cox3 genes in the mitochondrial genomes of Drosophila of the virilis group. In the D. virilis cell line, we found new events of mitochondrial DNA transfer to the nucleus and new Tv1 insertions. Most of the new insertions of retrotransposon Tv1 in the cell culture occur in the newly emerged numts. As a result, the ages of retrotransposon insertions and numt insertions in the cell line are the same. The opposite situation was found in Drosophila species of the virilis group. Insertions of Tv1 occur in ancient numts in the genomes of flies. As a result, the ages of retrotransposon insertions and numts are different. The atp6 and cox3 numts associated with site-specific insertions of retrotransposon Tv1 are species-specific in Drosophila of the virilis group.

Acknowledgements
The work was supported by project AAAAA161161116101803 "Study of variability of autonomous genetic elements of insects and development of markers of genome instability", contract 011220160001.

Fig. 6. Reconstruction of the phylogenetic tree of LTRs of retrotransposons Tv1 associated with atp6 and cox3 numts in Drosophila of the virilis group.
The first two characters in the name of a sequence indicate the type of experiment in which the LTR sequence was obtained. The next two letters indicate the species of Drosophila: Mo for D. montana, Bo for D. borealis, La for D. lacicola, and Vi for D. virilis. The next letters indicate the Drosophila line name. The bootstrap support values are listed at the nodes of the phylogram. Trees are drawn to scale. The lengths of branches correspond to the frequencies of nucleotide substitutions per site. The age of Tv1 insertions is estimated by the formula T = d/2k, where T is the age in Ma, d is the nucleotide divergence, and k = 0.016 (Bowen, McDonald, 2001). The LTR sequences of Tv1 and the associated sequences of numts were submitted to GenBank under accession numbers KX399470-KX399482.