Correction of GenBank’s taxonomic entry error raises a new issue regarding intergeneric relationships among salangid fishes (Osmeriformes: Salangidae)
https://doi.org/10.18699/vjgb-25-29
Abstract
The GenBank database of publicly available nucleotide sequences is the largest genetic repository and provides vitally important resources for downstream applications in biology and medicine. Concerns about the reliability of GenBank data make it necessary to monitor the database for taxonomic entry errors. This report considers a case of mitochondrial genome (mitogenome) misidentification for a salangid fish of the genus Neosalanx (Osmeriformes, Salangidae). GenBank contains four complete mitogenome sequences attributed to N. taihuensis, with the accession numbers JX524196, KP170510, MH348204, and MW291630. The overall mean p-distance for these sequences is quite high (7.01 ± 0.14 %) but becomes 29-fold lower (0.24 ± 0.05 %) after excluding the MW291630 mitogenome. An analysis of all available nucleotide sequences of salangids shows that this inconsistency in the level of divergence between N. taihuensis mitogenomes is due to species misidentification: the MW291630 mitogenome does not belong to N. taihuensis but is, in fact, a mitogenome of N. jordani misidentified as N. taihuensis. The resolved taxonomic identity of the MW291630 mitogenome, together with an extended sample of species with investigated single-marker sequences, raises new issues regarding intergeneric relationships in salangid fishes. In particular, the obtained data do not support the synonymization of the genus Neosalanx with Protosalanx that was suggested in the last revision of the salangid classification. As the comparative analysis of interspecific and intergeneric divergences shows, Protosalanx is not a single clade containing all Neosalanx species. Instead, it consists of (at least) two evolutionarily distinct lineages, and the level of genetic divergence between them closely matches the mean divergence between the other salangid genera. Further analysis using nuclear genome-wide data is required to gain new insights into the evolution of salangid fishes.
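The divergence check described in the abstract can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and that sites with gaps or ambiguous bases are skipped pair by pair; the abstract does not state how the reported values were computed, so this handling is an assumption. The function names and the toy sequences are hypothetical and are not the actual mitogenomes.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped
    (pairwise deletion; an assumption, not taken from the source).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


def mean_p_distance(alignment):
    """Mean p-distance over all pairs of sequences in a name -> sequence dict."""
    pairs = list(combinations(alignment.values(), 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)


# Hypothetical toy alignment: three near-identical sequences and one outlier.
toy = {
    "seq_A": "ACGTACGTACGTACGTACGT",
    "seq_B": "ACGTACGTACGTACGTACGA",
    "seq_C": "ACGTACGTACGTACGTACGT",
    "seq_outlier": "ACGAACGAACGTTCGTACCT",
}
without_outlier = {k: v for k, v in toy.items() if k != "seq_outlier"}

mean_all = mean_p_distance(toy)
mean_trimmed = mean_p_distance(without_outlier)
print(f"mean p-distance, all sequences:    {mean_all:.4f}")
print(f"mean p-distance, outlier excluded: {mean_trimmed:.4f}")

# Fold change implied by the values reported in the abstract (in percent).
reported_fold = 7.01 / 0.24
print(f"reported fold change: {reported_fold:.1f}")
```

A sharp drop in the mean after removing one sequence, as in the toy data, is the kind of signal that flags a record for closer taxonomic scrutiny; on its own it does not identify which species the outlier belongs to.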
About the Author
E. S. Balakirev
Vladivostok, Russian Federation