Articles
The pilot program complex for analysis of symbol sequences in genomics, ICGenomics, has been designed for storage, mining, and analysis of sequences related to theoretical and applied genomics. ICGenomics enables wet-lab biologists to perform high-quality processing of data in the fields of genomics, biomedicine, and biotechnology. ICGenomics implements both conventional and modern methods for processing, analyzing, and visualizing sequence data. These include novel methods for processing raw high-throughput sequencing data. Examples are ChIP-seq analysis; functional annotation of gene regulatory regions in nucleotide and amino acid sequences; prediction of nucleosome positioning; and structural and functional annotation of proteins, including their allergenicity and evolutionary features. The application of ICGenomics to the analysis of genomic sequences of the parasite Opisthorchis felineus and to ChIP-seq data from mouse and human is considered. The system is available at http://www-bionet.sscc.ru/icgenomics.
By now, a huge body of experimental data on gene transcription regulation has been accumulated. Transcription is controlled by a great number of proteins acting at various steps of the process; thus, a diversity of regulatory mechanisms can be realized. This paper presents approaches to building a domain ontology, to the formalized description of the mechanisms of transcriptional regulation, and to the development, on this basis, of methods for integrating heterogeneous information on the regulation of gene expression. The pilot version of the knowledge base on the transcriptional regulation of eukaryotic genes includes: (1) a description of basic terms related to transcription regulation and the relationships between them; (2) a hierarchical classification of transcription regulators; (3) a classification of phases and steps of transcription; (4) a database of transcriptional regulators of three mammalian species (human, mouse, and rat); and (5) dictionaries for molecular processes involved in transcriptional regulation. The knowledge base is designed for information support of computer analysis of transcriptional regulatory mechanisms. Approaches to the reconstruction of eukaryotic transcriptional regulatory mechanisms with the new knowledge base are presented.
The RatDNA database is intended to support the development of experimental methods for basic molecular studies of human age-related diseases in rats involving microarray tests of gene expression. Despite the obvious correlation between life expectancy and heredity and numerous biomedical studies on aging, little is known about the genetic factors determining aging processes. People do not die of «healthy» aging: at any age, death is caused by conditions whose probability increases with age. Age-related macular degeneration (AMD) is the main cause of vision problems and sight loss in people over 50. Structural and functional changes in the retina characteristic of aging are similar to those observed at early stages of AMD. They underlie the pathogenesis of this disease but do not always lead to its development. The RatDNA database contains information on genes associated with age-related diseases, in particular AMD, and experimental data on their expression in tissues of a model rat strain. The database is available at http://pixie.bionet.nsc.ru/ratdna/rat/index.php.
A database on translational enhancers providing additional control of foreign gene expression at the mRNA translation level has been developed. It contains structured information on the presence of enhancers located within mRNAs, which control gene expression at the posttranscriptional stage. These data can be used to design genetic constructs for plant transgenesis. The database is based on the platform of the Sequence Retrieval System (SRS), allowing users to make a rapid search for enhancers with defined properties and retrieve corresponding nucleotide sequences. The database is available at http://wwwmgs.bionet.nsc.ru/mgs/dbases/trsig/
Existing theoretical approaches to investigating protein thermal stability are classified. Computer simulations make it possible to estimate both micro- and macroscopic properties of molecules. However, these approaches are limited in accuracy and require improvement, for example by accounting for molecular charge distribution. Methods that deal with rigid regions of proteins are also promising. They are computationally inexpensive and make it possible to determine directly the changes in molecular flexibility caused by mutated amino acid residues. However, the effect of the solvent and its thermodynamic properties cannot be estimated explicitly from the structure alone.
The prevalence of allergic diseases increased rapidly in the 20th century. Currently, many people in industrialized countries suffer from allergies. Therefore, analysis of the allergenic properties of proteins is an urgent task. The following factors were formerly hypothesized to determine the allergenicity of a protein: size, enzymatic properties, and similarity to human proteins. However, no analysis of the relationship between the allergenicity of proteins and the habitat of the organisms producing them has been conducted so far. We predict the allergenicity of proteins from the proteomes of more than 500 species of microorganisms. It is shown that the number of allergenic proteins in the proteomes of microorganisms is significantly associated with their pathogenicity, habitat, temperature conditions of the habitat, and oxygen demand.
This paper describes a RESTful Web service-based distributed software system, which focuses on the reconstruction of gene networks by integrating data from heterogeneous data sources, including databases of molecular-genetic interactions, metabolic and signaling pathways, gene networks, etc.
Mathematical modeling and analysis of complex molecular-genetic systems (MGS) are key challenges in the systems biology era. Addressing them requires special technologies and programming approaches that treat an MGS as an ensemble of dynamic, interconnected subsystems of simpler structure. We present an approach aimed at accelerating the reconstruction of mathematical models of complex MGS and their comprehensive analysis using high-performance computing techniques.
Embryo morphodynamics at early developmental stages of Arabidopsis thaliana was studied. As a first step, a pipeline was developed that leads from confocal microscopy and 3D tissue reconstruction to cell lineage tree reconstruction and numerical simulation of the mechanics of the growing embryo. Preliminary results of its use are presented.
An introduction is given to modeling dynamical systems with dynamical structure using L-systems. The application of L-systems is illustrated by models of plant tissue growth and of the control of state variable distribution in the growing tissue.
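As a minimal illustration of the rewriting mechanism behind L-systems (not the plant tissue models of the paper), the sketch below applies Lindenmayer's classic algae rules, A → AB and B → A, in parallel to every symbol of a string. The function name `rewrite` and the rule dictionary `ALGAE_RULES` are assumptions introduced here for illustration.

```python
def rewrite(axiom, rules, steps):
    """Apply the production rules to every symbol in parallel, `steps` times."""
    state = axiom
    for _ in range(steps):
        # Symbols without a rule are copied unchanged (identity production).
        state = "".join(rules.get(symbol, symbol) for symbol in state)
    return state


# Lindenmayer's algae system: A -> AB, B -> A (illustrative choice).
ALGAE_RULES = {"A": "AB", "B": "A"}

# Successive generations starting from the axiom "A".
generations = [rewrite("A", ALGAE_RULES, n) for n in range(5)]
for n, word in enumerate(generations):
    print(n, word)
```

The structure of the system (here, the number of symbols) changes at every step, which is what distinguishes this formalism from models with a fixed set of state variables.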
The results of the development of a high-throughput version of the software package Haploid Evolutionary Constructor (HEC), available at http://evol-constructor.bionet.nsc.ru, are presented. The software is used to simulate the functioning and evolution of prokaryotic communities. A parallel version of the software package was created using the MPI technology. The test was performed on a cluster of the Bioinformatics shared access center. The acceleration obtained was almost linear. The simulation time of complex bacterial communities was reduced from dozens of hours to several minutes.
This paper describes the development of an approach to the simulation of prokaryotic community activity and evolution and the software package «Haploid Evolutionary Constructor» (http://evol-constructor.bionet.nsc.ru). The initial model with ideal mixing (0D) is expanded to a spatially distributed model (1D). The 0D and 1D poisoner–prey prokaryotic community models are compared. It is shown that community stability is influenced by the spatial distribution of substrates and prokaryotic cells.
A new Internet resource supporting research in plant biotechnology is presented. The portal contains specialized modules (databases and software), allows users to combine these modules to solve various tasks, and can be extended by adding new modules. Currently, the resource contains a database of external information sources, a database of promoters for plant transgenesis, a database of translational enhancers for plant transgenesis, and the WheatPGE database supporting wheat breeding experiments. The resource is available at the ICG website (http://bioagrotech.bionet.nsc.ru/).
The BioinfoWF (Bioinformatics WorkFlow) system for automated generation of Web interfaces and Web services for bioinformatics programs is presented. Each program module used in the system has a metadescription in XML. The metadescriptions are used for automated generation of Web interfaces and Web services, which can then be used in bioinformatics workflows. Computational modules can be organized into workflows. The tool significantly simplifies the design and online publication of modules for bioinformatics data analysis and makes them available to the scientific community. The system is distributed under the GNU GPL. The source code and documentation for BioinfoWF are available at http://bioinfowf.bionet.nsc.ru.
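The general idea of generating a Web interface from an XML metadescription can be sketched as follows. This is a minimal sketch under stated assumptions, not the BioinfoWF implementation: the XML schema (`module` and `param` elements with `name`, `type`, `label`, and `default` attributes), the `/run/...` URL, and the function `build_form` are all hypothetical and were invented for this illustration.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical metadescription; the real BioinfoWF schema may differ entirely.
METADESCRIPTION = """
<module name="seqstat" title="Sequence statistics">
  <param name="sequence" type="text" label="Input sequence"/>
  <param name="window" type="int" label="Window size" default="10"/>
</module>
"""


def build_form(xml_text):
    """Turn a module metadescription into a simple HTML form (illustrative)."""
    module = ET.fromstring(xml_text.strip())
    module_name = escape(module.get("name"))
    title = escape(module.get("title", module.get("name")))
    lines = [f'<form action="/run/{module_name}" method="post">',
             f"  <h2>{title}</h2>"]
    for param in module.findall("param"):
        name = escape(param.get("name"))
        label = escape(param.get("label", param.get("name")))
        # Map the declared parameter type to an HTML input type.
        input_type = "number" if param.get("type") == "int" else "text"
        default = escape(param.get("default", ""))
        lines.append(
            f'  <label>{label} <input type="{input_type}" '
            f'name="{name}" value="{default}"></label>'
        )
    lines.append('  <input type="submit" value="Run">')
    lines.append("</form>")
    return "\n".join(lines)


FORM_HTML = build_form(METADESCRIPTION)
print(FORM_HTML)
```

Keeping the description of a module separate from its presentation means that the same metadescription could, in principle, also drive the generation of a Web service definition.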
We designed and manufactured a system for the detection of antibodies/antigens in biological fluids. This system records the kinetics of free immunodiffusion of fluorescent nanocomplexes inside the channels of a microfluidic module of a new-generation bioanalytical device. The system consists of four parallel excitation channels, a system for fluorescence detection, and an image-processing program. Antibody/antigen concentrations below 0.1 μg/ml can be detected in biological fluids.