Orthoweb: a software package for evolutionary analysis of gene networks
https://doi.org/10.18699/vjgb-24-95
Abstract
This article introduces Orthoweb (https://orthoweb.sysbio.cytogen.ru/), a software package for calculating evolutionary indices, including phylostratigraphic indices and divergence indices (Ka/Ks), for individual genes as well as for gene networks. The phylostratigraphic age index (PAI) estimates the evolutionary stage at which a gene emerged (and thus, indirectly, the approximate time of its origin, known as its “evolutionary age”) based on the analysis of orthologous genes across closely and distantly related taxa. Orthoweb also supports the calculation of the transcriptome age index (TAI) and the transcriptome divergence index (TDI), which weight gene age and sequence divergence, respectively, by expression level. These indices are important for understanding the dynamics of gene expression and its impact on the development and adaptation of organisms. Orthoweb includes optional analytical features as well, such as the ability to explore Gene Ontology (GO) terms associated with genes, facilitating functional enrichment analyses that link the evolutionary origins of genes to biological processes. It further offers tools for SNP enrichment analysis, enabling users to assess the evolutionary significance of genetic variants within specific genomic regions. A key feature of Orthoweb is its ability to integrate these indices with gene network analysis. The software provides visualization tools, such as gene network mapping and graphical representations of the distribution of phylostratigraphic indices across network elements, to support interpretation of complex evolutionary relationships. To streamline research workflows, Orthoweb includes a database of pre-calculated indices for numerous taxa, accessible via an application programming interface (API). This allows users to retrieve pre-computed phylostratigraphic and divergence data directly rather than recomputing them, reducing computational time and effort.
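As an illustration of how TAI and TDI relate to the per-gene indices, the sketch below follows the definitions commonly used in the phylostratigraphy literature: TAI is the expression-weighted mean of gene phylostrata, and TDI is the expression-weighted mean of Ka/Ks. It is not taken from the Orthoweb code base; the function name `weighted_index`, the gene identifiers, and all numeric values are hypothetical and serve only to show the arithmetic.

```python
def weighted_index(per_gene_value, expression):
    """Expression-weighted mean of a per-gene value.

    per_gene_value: dict gene -> phylostratum (for TAI) or Ka/Ks (for TDI)
    expression:     dict gene -> expression level in one sample or stage
    Genes missing from either dict are ignored.
    """
    shared = [g for g in per_gene_value if g in expression]
    total_expression = sum(expression[g] for g in shared)
    if total_expression == 0:
        raise ValueError("total expression of the shared genes is zero")
    weighted_sum = sum(per_gene_value[g] * expression[g] for g in shared)
    return weighted_sum / total_expression


# Hypothetical toy data: three genes in one developmental stage.
phylostratum = {"geneA": 1, "geneB": 5, "geneC": 12}      # larger = younger gene
ka_ks = {"geneA": 0.05, "geneB": 0.20, "geneC": 0.90}     # divergence index
expression = {"geneA": 100.0, "geneB": 50.0, "geneC": 10.0}

tai = weighted_index(phylostratum, expression)  # transcriptome age index
tdi = weighted_index(ka_ks, expression)         # transcriptome divergence index

print(f"TAI = {tai:.4f}")
print(f"TDI = {tdi:.4f}")
```

Under this definition, a lower TAI means the transcriptome of that sample is dominated by evolutionarily older genes, and a lower TDI means it is dominated by genes under stronger purifying selection.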
About the Authors
R. A. Ivanov
Russian Federation
Novosibirsk
A. M. Mukhin
Russian Federation
Novosibirsk
F. V. Kazantsev
Russian Federation
Novosibirsk
Z. S. Mustafin
Russian Federation
Novosibirsk
D. A. Afonnikov
Russian Federation
Novosibirsk
Y. G. Matushkin
Russian Federation
Novosibirsk
S. A. Lashin
Russian Federation
Novosibirsk