Vavilov Journal of Genetics and Breeding

THE SOLANUM TUBEROSUM KNOWLEDGE BASE: THE SECTION ON MOLECULAR-GENETIC REGULATION OF METABOLIC PATHWAYS

https://doi.org/10.18699/VJ18.325

Abstract

The rapid development of high-throughput genomic, transcriptomic, proteomic and metabolomic technologies has led to an information explosion in plant biology and agrobiology. To date, the number of scientific publications on just one of the most important agricultural crops, Solanum tuberosum L. (potato), has exceeded 1.5 million. Effective access to knowledge distributed over such a multitude of non-formalized natural-language texts requires dedicated computer-assisted intelligent methods of data mining (text mining). However, the literature contains no reports on applying intelligent methods of automatic knowledge extraction to publications on agricultural crops such as potato. We previously developed a pilot version of the SOLANUM TUBEROSUM knowledge base. SOLANUM TUBEROSUM is a computer platform for complex intelligent processing of large bodies of data, including (1) automatic analysis of scientific publications and databases to extract information on genetics, markers, breeding, diagnostics, protection and storage technologies for potato, (2) formalized representation of the extracted information in the knowledge base, (3) user access to these data, and (4) analysis and visualization of query results. The ontology of the SOLANUM TUBEROSUM knowledge base contains dictionaries of molecular genetic objects (proteins, genes, metabolites, microRNAs, biomarkers); phenotypic characteristics of potato varieties; potato diseases and pests; biotic and abiotic environmental factors; and potato agrobiotechnologies. This article describes the current version of the SOLANUM TUBEROSUM knowledge base, developed from an extensive analysis of scientific publications on the molecular-genetic regulation of metabolic pathways in potato and in model plant organisms (maize, rice, Arabidopsis thaliana). In total, about 9,000 full-text articles and more than 130,000 PubMed abstracts were analyzed.
Automatic analysis of the scientific publications identified more than 59,000 facts on molecular genetic interactions and genetic regulation, and analysis of factual databases revealed more than 380,000 such interactions in the examined organisms. Only about 3% of the extracted facts on molecular genetic interactions and genetic regulation were related to Solanum tuberosum L. It is therefore important to include information on well-studied model species when extracting information on the molecular-genetic regulation of metabolic processes: this allows orthologous genes in potato to be predicted and then identified and analyzed on the basis of homology. An associative network of the genetic regulation of starch biosynthesis in potato was reconstructed. It includes 33 metabolites, 36 proteins, 6 metabolic pathways and 132 interactions between them, of which 86 describe catalytic reactions and the remaining 46 describe regulatory events. The reconstructed network provides a basis for the search for target genes for directed mutagenesis and marker-assisted selection of potato varieties with specified starch properties. The trial version of the SOLANUM TUBEROSUM knowledge base is available at http://www-bionet.sysbio.cytogen.ru/and/plant/.
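To make the idea of an associative network more concrete, the following is a minimal Python sketch of one way such a network could be represented: typed nodes (metabolites, proteins, pathways) joined by typed interactions (catalytic or regulatory). It is an illustration only, not the data model of the SOLANUM TUBEROSUM knowledge base. The class and method names (`Node`, `Edge`, `AssociativeNetwork`, `regulators_of`) are hypothetical, and the two example interactions are textbook-level placeholders rather than records taken from the reconstructed network.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A network object; `kind` is e.g. 'metabolite', 'protein' or 'pathway'."""
    name: str
    kind: str


@dataclass(frozen=True)
class Edge:
    """A directed interaction; `kind` is 'catalysis' or 'regulation'."""
    source: Node
    target: Node
    kind: str


class AssociativeNetwork:
    """Hypothetical container for typed nodes and typed interactions."""

    def __init__(self):
        self.nodes = set()
        self.edges = []

    def add_edge(self, source, target, kind):
        # Register both endpoints and store the interaction.
        self.nodes.update((source, target))
        self.edges.append(Edge(source, target, kind))

    def node_counts(self):
        # Number of nodes per object type.
        return Counter(node.kind for node in self.nodes)

    def edge_counts(self):
        # Number of interactions per interaction type.
        return Counter(edge.kind for edge in self.edges)

    def regulators_of(self, target_name):
        # Names of all nodes with a regulatory interaction onto the target.
        return sorted({
            edge.source.name
            for edge in self.edges
            if edge.kind == "regulation" and edge.target.name == target_name
        })


# Illustrative placeholder entries (assumptions, not knowledge-base records).
net = AssociativeNetwork()
agpase = Node("AGPase", "protein")
adp_glucose = Node("ADP-glucose", "metabolite")
sucrose = Node("sucrose", "metabolite")
net.add_edge(agpase, adp_glucose, "catalysis")
net.add_edge(sucrose, agpase, "regulation")

print(net.node_counts())
print(net.edge_counts())
print(net.regulators_of("AGPase"))
```

With a structure of this kind, summary figures such as those quoted in the abstract (objects per type, catalytic versus regulatory interactions) reduce to simple counts over nodes and edges, and a search for candidate target genes can begin from the regulators of a chosen enzyme.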

About the Authors

T. V. Ivanisenko
Institute of Cytology and Genetics SB RAS
Russian Federation

Novosibirsk



O. V. Saik
Institute of Cytology and Genetics SB RAS
Russian Federation

Novosibirsk



P. S. Demenkov
Institute of Cytology and Genetics SB RAS
Russian Federation

Novosibirsk



V. K. Khlestkin
Institute of Cytology and Genetics SB RAS; Novosibirsk State University
Russian Federation


E. K. Khlestkina
Institute of Cytology and Genetics SB RAS; Novosibirsk State University
Russian Federation


N. A. Kolchanov
Institute of Cytology and Genetics SB RAS
Russian Federation

Novosibirsk



V. A. Ivanisenko
Institute of Cytology and Genetics SB RAS
Russian Federation

Novosibirsk





This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2500-3259 (Online)