ICGenomics: A PROGRAM COMPLEX FOR ANALYSIS OF SYMBOL SEQUENCES IN GENOMICS
Abstract
ICGenomics is a pilot program complex for the analysis of symbol sequences in genomics. It is designed for the storage, mining, and analysis of sequences in theoretical and applied genomics, and it enables wet-lab biologists to perform high-quality data processing in genomics, biomedicine, and biotechnology. ICGenomics implements both conventional and modern methods for processing, analyzing, and visualizing sequence data, including novel methods for processing raw high-throughput sequencing data. Examples include ChIP-seq analysis; functional annotation of gene regulatory regions in nucleotide and amino acid sequences; prediction of nucleosome positioning; and structural and functional annotation of proteins, including their allergenicity and evolutionary features. The application of ICGenomics to the analysis of genomic sequences of the parasite Opisthorchis felineus and to mouse and human ChIP-seq data is considered. The system is available at http://www-bionet.sscc.ru/icgenomics.
About the Authors
Y. L. Orlov
Russian Federation
A. O. Bragin
Russian Federation
I. V. Medvedeva
Russian Federation
K. V. Gunbin
Russian Federation
P. S. Demenkov
Russian Federation
O. V. Vishnevsky
Russian Federation
V. G. Levitsky
Russian Federation
D. Y. Oshchepkov
Russian Federation
N. L. Podkolodnyy
Russian Federation
D. A. Afonnikov
Russian Federation
I. Grosse
Russian Federation
N. A. Kolchanov
Russian Federation