
Vavilov Journal of Genetics and Breeding


Original Russian text: https://vavilovj-icg.ru/2024-year/28-8/

Vol 28, No 8 (2024)
https://doi.org/10.18699/vjgb-24-88

FROM THE EDITOR

GENOMICS AND TRANSCRIPTOMICS

 
Pages 808-821
Abstract

In this work, we performed, for the first time, a comprehensive bioinformatics analysis of 568 human genes that, according to the NCBI Gene database as of September 15, 2024, were associated with pain generation, perception and anesthesia. The SCN9A gene, which encodes the sodium voltage-gated channel α subunit 9 and is expressed in sensory neurons that signal tissue damage to the central nervous system, was the only gene involved in all the processes of interest at once, making it a hub gene. First, with our tool OrthoWeb, we estimated the phylostratigraphic age index (PAI) of each gene, that is, we identified the taxon of the most recent common ancestor of the organisms for which that gene has been sequenced. The mean PAI for all genes under study, including SCN9A as a hub gene for pain generation, perception, response and anesthesia, was 4. On the evolutionary scale of the Kyoto Encyclopedia of Genes and Genomes (KEGG), this value corresponds to the phylum Chordata, some of the most ancient members of which evolved the central and peripheral nervous systems. Next, with our tool ANDSystem, we found that phosphorylation of ion channels is central to pain generation, perception, response and anesthesia, and that the efficiency of signal transduction from the peripheral to the central nervous system depends on it. This conclusion is consistent with literature data on the key role of efficient signal transduction from the peripheral to the central nervous system in adjusting the human circadian rhythm through detection of the change from the dark of night to the light of day, in identifying the direction of a sound source by auditory brainstem nuclei, in generating the response to cold stress, and in physical coordination. We also identified 21 candidate SNP markers of significant SCN9A over- and underexpression. Finally, the ratio of SCN9A-upregulating to SCN9A-downregulating SNPs was compared with that for all known human genes as estimated by the 1000 Genomes Project Consortium. We found that SCN9A, as a hub gene for pain generation, perception, response and anesthesia, is subject to natural selection against its downregulation, which keeps the nervous system well informed about the status of the organism and the environment.
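The PAI definition given in this abstract (the taxon of the most recent common ancestor of all organisms in which a gene has been sequenced) can be illustrated with a minimal Python sketch. It is not OrthoWeb code. The lineage below is a hypothetical, truncated human lineage, and the 1-based numbering is an assumption chosen only so that Chordata receives the index 4 mentioned in the abstract.

```python
# Hypothetical, truncated human lineage (root first). The 1-based rank of a
# taxon in this list plays the role of the PAI; the real KEGG scale used by
# OrthoWeb may contain different or additional ranks.
HUMAN_LINEAGE = [
    "Cellular organisms", "Eukaryota", "Metazoa", "Chordata",
    "Vertebrata", "Mammalia", "Primates", "Homo sapiens",
]


def shared_prefix_length(lineage_a, lineage_b):
    """Count how many leading taxa two root-first lineages have in common."""
    count = 0
    for taxon_a, taxon_b in zip(lineage_a, lineage_b):
        if taxon_a != taxon_b:
            break
        count += 1
    return count


def phylostratigraphic_age_index(ortholog_lineages, reference=HUMAN_LINEAGE):
    """Return (index, taxon) of the most recent common ancestor of the
    reference species and every species that carries an ortholog."""
    depth = len(reference)  # no orthologs elsewhere: the gene is species-specific
    for lineage in ortholog_lineages:
        depth = min(depth, shared_prefix_length(reference, lineage))
    if depth == 0:
        raise ValueError("lineages do not share a root with the reference")
    return depth, reference[depth - 1]


# Illustrative input: orthologs found in a fish and in a lancelet.
zebrafish = ["Cellular organisms", "Eukaryota", "Metazoa", "Chordata",
             "Vertebrata", "Actinopterygii"]
lancelet = ["Cellular organisms", "Eukaryota", "Metazoa", "Chordata",
            "Cephalochordata"]
index, taxon = phylostratigraphic_age_index([zebrafish, lancelet])
```

With these two illustrative lineages the common ancestor is Chordata, so a lower index means an evolutionarily older gene.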

 
Pages 822-833
Abstract

ChIP-seq technology, which is based on chromatin immunoprecipitation (ChIP), maps a set of genomic loci (peaks) containing binding sites (BSs) for the investigated (target) transcription factor (TF). A TF may recognize several structurally different BS motifs. The multiprotein complex mapped in a ChIP-seq experiment includes the target TF and other “partner” TFs linked by protein-protein interactions, and not all of these TFs bind to DNA directly. Therefore, the BS motifs enriched in peaks may belong to either the target or the partner TFs. A de novo search approach is used to find enriched TF BS motifs in ChIP-seq data. For a pair of enriched BS motifs, either co-occurrence or mutually exclusive occurrence can be detected in a set of peaks: co-occurrence means that the two motifs are found more often in the same peaks, while mutually exclusive occurrence means that they are found more often in different peaks. We propose the MetArea software package to identify pairs of TF BS motifs with mutually exclusive occurrence in ChIP-seq data. MetArea was designed to predict the structural diversity of BS motifs of the same TF and the functional relation of BS motifs of different TFs. A functional relation between the motifs of two distinct TFs implies that the TFs are interchangeable within a multiprotein complex, which uses the BS of one or the other TF to bind directly to DNA in different peaks. MetArea estimates the recognition performance, pAUPRC (partial area under the Precision–Recall curve), for each of the two input single motifs, identifies the “joint” motif, and computes its performance as well. The goal of the analysis is to find pairs of single motifs A and B for which the accuracy of the joint A&B motif is higher than that of either single motif.
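The comparison of single and joint motifs by pAUPRC can be sketched as follows. This is a hedged illustration, not MetArea code: the recall cut-off `max_recall`, the trapezoidal integration, and the rule that a joint motif scores a sequence by the better of its two single-motif scores are all assumptions made for the example, since the abstract does not specify them.

```python
def precision_recall_points(scores, labels):
    """Return (recall, precision) points obtained by lowering the score
    threshold; sequences with tied scores are added together."""
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    total_positives = sum(labels)
    points, tp, fp, i = [], 0, 0, 0
    while i < len(ranked):
        j = i
        while j < len(ranked) and ranked[j][0] == ranked[i][0]:
            if ranked[j][1]:
                tp += 1
            else:
                fp += 1
            j += 1
        points.append((tp / total_positives, tp / (tp + fp)))
        i = j
    return points


def partial_auprc(scores, labels, max_recall=1.0):
    """Trapezoidal area under the Precision-Recall curve for recall in
    [0, max_recall]. Linear interpolation is a simplification."""
    points = precision_recall_points(scores, labels)
    area, prev_recall, prev_precision = 0.0, 0.0, points[0][1]
    for recall, precision in points:
        if recall > max_recall:
            # Cut the last segment at max_recall.
            fraction = (max_recall - prev_recall) / (recall - prev_recall)
            cut = prev_precision + fraction * (precision - prev_precision)
            area += (max_recall - prev_recall) * (prev_precision + cut) / 2
            break
        area += (recall - prev_recall) * (prev_precision + precision) / 2
        prev_recall, prev_precision = recall, precision
    return area


# Toy data: 4 peaks (label 1) and 4 background sequences (label 0).
labels = [1, 1, 1, 1, 0, 0, 0, 0]
motif_a = [0.9, 0.8, 0.1, 0.2, 0.3, 0.2, 0.1, 0.1]  # recognizes peaks 1-2
motif_b = [0.2, 0.1, 0.9, 0.8, 0.3, 0.1, 0.2, 0.1]  # recognizes peaks 3-4
# Assumed joint rule: best of the two single-motif scores per sequence.
joint = [max(a, b) for a, b in zip(motif_a, motif_b)]

area_a = partial_auprc(motif_a, labels)
area_b = partial_auprc(motif_b, labels)
area_joint = partial_auprc(joint, labels)
```

In this toy case the two motifs hit different peaks, so the joint motif outperforms both single motifs, which is the pattern the analysis looks for.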

 
Pages 834-842
Abstract

MitomiRs are a subclass of miRNAs whose specific functions are as yet unknown: mitochondrial miRNAs that are mainly derived from nuclear DNA and imported into mitochondria. Changes in the expression levels of mitomiRs are associated with some diseases. To identify the most pronounced characteristics that distinguish mitochondrial miRNAs from other miRNAs, we classified mitomiR sequences using the Random Forest algorithm. The analysis revealed, for the first time, a significant difference between mitomiRs and other miRNAs by the following criteria (in descending order of importance in the classification): mitomiRs are evolutionarily older (have a lower phylostratigraphic age index, PAI); have more targets and disease associations, including mitochondrial ones (two-sided Fisher’s exact test, average p-values 1.82×10^-89/1.13×10^-96 for all mRNA targets/diseases and 6.01×10^-22/1.09×10^-9 for mitochondrial ones, respectively); and belong to the class of “circulating” miRNAs (average p-value 1.20×10^-56). The identified differences between mitomiRs and other miRNAs may help uncover the mode of miRNA delivery into mitochondria, indicate the evolutionary conservation and importance of mitomiRs in the regulation of mitochondrial function and metabolism, and generally show that mitomiRs are not randomly encountered miRNAs. Information on 1,312 experimentally validated mitomiR sequences for three organisms (Homo sapiens, Mus musculus and Rattus norvegicus) is collected in the mitomiRdb database (https://mitomiRdb.org).

EVOLUTIONARY BIOLOGY

 
Pages 843-853
Abstract

SARS-CoV-2 is a virus for which an outstanding number of genome variants have been collected, sequenced and stored from sources all around the world. Raw data in FASTA format include 16.8 million genomes, each ≈29,900 nt (nucleotides) long, with a total size of ≈500 × 10^9 nt, or 465 GB. We suggest an approach to data representation and organization with which all of this can be stored losslessly in the random-access memory (RAM) of a common PC; just ≈330 MB is enough. Aligning all genomes against the initial Wuhan-Hu-1 reference sequence allows each genome to be represented as a data structure containing lists of point mutations, deletions and insertions. Our implementation of this data representation achieved a 1:1500 compression ratio (for comparison, compressing the same data with the popular WinRAR archiver gives only 1:62) and provides fast access to genomes and their metadata as well as fast comparisons between genome variants. With this approach implemented as a C++ program, we analyzed various properties of the set of SARS-CoV-2 genomes available in NCBI GenBank (for the period from December 24, 2019 to June 24, 2024). We calculated the distribution of the number of genomes with undetermined nucleotides (‘N’s) versus the number of such nucleotides in them, the number of unique genomes and of clusters of identical genomes, and the distribution of clusters by size (the number of identical genomes) and duration (the time interval between each cluster’s first and last genome). Finally, the evolution of the distributions of the number of changes (the editing distance between each genome and the reference sequence) caused by substitutions, deletions and insertions was visualized as 3D surfaces, which clearly show the process of viral evolution over 4.5 years with a time step of one week. This picture agrees well with phylogenetic trees (usually based on 3–4 thousand representative genome variants), but it is built from millions of genomes, shows more detail and is independent of the type of lineage/clade classification.
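The idea of storing each genome as lists of differences from a reference can be sketched in a few lines. The authors' implementation is in C++ and its data layout is not described in the abstract, so the Python structure below, including the names `GenomeDelta`, `encode` and `decode`, is a hypothetical illustration. It assumes a gapped pairwise alignment is already available and that no column has a gap in both sequences.

```python
from dataclasses import dataclass, field


@dataclass
class GenomeDelta:
    """Differences of one genome from the reference (0-based positions)."""
    substitutions: list = field(default_factory=list)  # (ref_pos, base)
    deletions: list = field(default_factory=list)      # (ref_pos, length)
    insertions: list = field(default_factory=list)     # (ref_pos, inserted_seq)


def encode(aligned_ref, aligned_query):
    """Build a GenomeDelta from two gapped, equal-length aligned strings."""
    delta = GenomeDelta()
    ref_pos = 0
    for ref_base, query_base in zip(aligned_ref, aligned_query):
        if ref_base == "-":  # insertion placed before reference position ref_pos
            if delta.insertions and delta.insertions[-1][0] == ref_pos:
                pos, seq = delta.insertions[-1]
                delta.insertions[-1] = (pos, seq + query_base)
            else:
                delta.insertions.append((ref_pos, query_base))
            continue
        if query_base == "-":  # deletion; extend the previous run if adjacent
            if delta.deletions and sum(delta.deletions[-1]) == ref_pos:
                pos, length = delta.deletions[-1]
                delta.deletions[-1] = (pos, length + 1)
            else:
                delta.deletions.append((ref_pos, 1))
        elif query_base != ref_base:
            delta.substitutions.append((ref_pos, query_base))
        ref_pos += 1
    return delta


def decode(reference, delta):
    """Losslessly rebuild the genome from the reference and its delta."""
    substitutions = dict(delta.substitutions)
    insertions = dict(delta.insertions)
    deleted = set()
    for pos, length in delta.deletions:
        deleted.update(range(pos, pos + length))
    out = []
    for pos, base in enumerate(reference):
        if pos in insertions:
            out.append(insertions[pos])
        if pos not in deleted:
            out.append(substitutions.get(pos, base))
    if len(reference) in insertions:  # insertion after the last reference base
        out.append(insertions[len(reference)])
    return "".join(out)


reference = "ACGTACGT"
delta = encode("ACGT-ACGT", "ACCTGA--T")  # toy alignment, not real data
restored = decode(reference, delta)
```

Because most genomes differ from the reference at only a few positions, each delta is tiny compared with a ≈29,900 nt sequence, which is what makes a high compression ratio plausible.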

 
Pages 854-863
Abstract

Phospholipases A2 (PLA2) are a superfamily of hydrolases that catalyze the hydrolysis of phospholipids and play a key role in many molecular processes in cells and in the organism as a whole. This family consists of 16 groups divided into six main types. PLA2 were first isolated from venom toxins and porcine pancreatic juice. These enzymes are currently of great interest, since a number of PLA2 have been shown to be involved in carcinogenesis. PLA2 enzymes have been characterized in detail in model organisms and humans. However, their presence and functional role in non-model organisms are poorly understood. Such poorly studied taxa include flatworms, a number of species of which are human parasites. Several PLA2 genes have previously been characterized in parasitic flatworms, and their possible role in parasite-host interaction has been shown. However, no systematic identification of PLA2 genes in this taxon has been carried out. This paper presents a search for and a comparative analysis of PLA2 sequences encoded in the genomes of flatworms. Forty-four species, two free-living and 42 parasitic, were studied. The analysis was based on identification of orthologous groups of protein-coding genes, taking into account the domain structure of the proteins. In flatworms, 12 of the 13 known types of animal A2 phospholipases were found, represented by 11 orthologous groups. Some phospholipases of several types fell into one orthologous group, while some types split into several orthogroups in accordance with their domain structure. Phospholipases A2 of the calcium-independent type, platelet-activating phospholipases from group G8 and lysosomal phospholipases from group G15 are represented in all large taxa of flatworms and in the vast majority of the species we studied. In free-living flatworms, PLA2 genes have multiple copies. In parasitic flatworms, on the contrary, gene loss occurs in individual taxa and is specific to particular groups or subfamilies of PLA2. An orthologous group of secreted phospholipases has been identified that is represented only in Digenea, and this family has undergone duplications in the genomes of opisthorchids. Interestingly, a number of experimental studies have previously shown the effect of Clonorchis sinensis proteins of this orthogroup on the cancer transformation of host cells. Our results made it possible, for the first time, to systematically identify PLA2 sequences in flatworms and demonstrated that their evolution is subject to the gene loss processes characteristic of parasite genomes in general. In addition, our analysis identified taxon-specific duplication and loss of PLA2 genes in parasitic organisms, which may be associated with their interaction with the host organism.

 
Pages 864-873
Abstract

Cholesterol is an essential structural component of cell membranes and a precursor of vitamin D and steroid hormones. Humans and other animal species can absorb cholesterol from food. Cholesterol is also synthesized de novo in the cells of many tissues. We have previously reconstructed the gene network regulating intracellular cholesterol levels, which included regulatory circuits involving transcription factors from the SREBP (Sterol Regulatory Element-Binding Proteins) subfamily. The activity of SREBP transcription factors is inversely related to the intracellular cholesterol level. This mechanism is implemented with the participation of the proteins SCAP, INSIG1, INSIG2, MBTPS1/S1P and MBTPS2/S2P. This group of proteins, together with the SREBP factors, is designated the “cholesterol sensor”. An elevated cholesterol level is a risk factor for cardiovascular diseases and may also be observed in obesity, diabetes and other pathological conditions. Systematizing information about the molecular mechanisms that control the activity of SREBP factors and cholesterol biosynthesis in the form of a gene network, and building new knowledge about this gene network as a single object, is extremely important for understanding the molecular mechanisms underlying predisposition to diseases. With the computer tool ANDSystem, we have built a gene network regulating cholesterol biosynthesis. The gene network included data on: (1) the complete set of enzymes involved in cholesterol biosynthesis; (2) proteins that function as part of the “cholesterol sensor”; (3) proteins that regulate the activity of the “cholesterol sensor”; (4) genes encoding proteins of these groups; (5) genes whose transcription is regulated by SREBP factors (SREBP target genes). The gene network was analyzed, and feedback loops that control the activity of SREBP factors were identified. These feedback loops involved the PPARG, NR0B2/SHP1, LPIN1, and AR genes and the proteins they encode. Analysis of the phylostratigraphic age of the genes showed that the ancestral forms of most human genes encoding the enzymes of cholesterol biosynthesis and the proteins of the “cholesterol sensor” may have arisen at early evolutionary stages: Cellular organisms (the root of the phylostratigraphic tree) and the stages of Eukaryota and Metazoa divergence. However, the mechanism of gene transcription regulation in response to changes in cholesterol levels may only have formed at later evolutionary stages, since the phylostratigraphic age of the genes encoding the transcription factors SREBP1 and SREBP2 corresponds to the stage of Vertebrata divergence.

 
Pages 874-881
Abstract

This article introduces Orthoweb (https://orthoweb.sysbio.cytogen.ru/), a software package developed for the calculation of evolutionary indices, including phylostratigraphic indices and divergence indices (Ka/Ks), for individual genes as well as for gene networks. The phylostratigraphic age index (PAI) allows the evolutionary stage of a gene’s emergence (and thus, indirectly, the approximate time of its origin, known as its “evolutionary age”) to be assessed from the analysis of orthologous genes across closely and distantly related taxa. Additionally, Orthoweb supports the calculation of the transcriptome age index (TAI) and the transcriptome divergence index (TDI). These indices are important for understanding the dynamics of gene expression and its impact on the development and adaptation of organisms. Orthoweb also includes optional analytical features, such as the ability to explore Gene Ontology (GO) terms associated with genes, facilitating functional enrichment analyses that link the evolutionary origins of genes to biological processes. Furthermore, it offers tools for SNP enrichment analysis, enabling users to assess the evolutionary significance of genetic variants within specific genomic regions. A key feature of Orthoweb is its ability to integrate these indices with gene network analysis. The software offers advanced visualization tools, such as gene network mapping and graphical representations of the phylostratigraphic index distributions of network elements, ensuring intuitive interpretation of complex evolutionary relationships. To further streamline research workflows, Orthoweb includes a database of pre-calculated indices for numerous taxa, accessible via an application programming interface (API). This feature allows users to retrieve pre-computed phylostratigraphic and divergence data efficiently, significantly reducing computational time and effort.
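For readers unfamiliar with the TAI and TDI, the following minimal Python sketch shows the commonly used definition of such indices as an expression-weighted mean of a per-gene value. It is an illustration under that assumption and does not use Orthoweb or its API; the function name and the toy numbers are hypothetical.

```python
def expression_weighted_index(gene_values, expression):
    """Expression-weighted mean of a per-gene evolutionary value.

    With phylostratigraphic age indices as gene_values the result is a
    transcriptome age index (TAI); with Ka/Ks ratios it is a transcriptome
    divergence index (TDI).
    """
    if len(gene_values) != len(expression):
        raise ValueError("one expression level is required per gene")
    total_expression = sum(expression)
    if total_expression <= 0:
        raise ValueError("total expression must be positive")
    weighted = sum(v * e for v, e in zip(gene_values, expression))
    return weighted / total_expression


# Toy sample: three genes with PAI 1, 4 and 5 and their expression levels.
pai = [1, 4, 5]
expression = [10.0, 10.0, 20.0]
tai = expression_weighted_index(pai, expression)
```

A lower TAI means that the transcriptome of the sample is dominated by evolutionarily older genes, which is why the index is compared across developmental stages or conditions.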

SYSTEMS COMPUTATIONAL BIOLOGY

 
Pages 882-896
Abstract

The metabolomic profiles of glioblastoma and surrounding brain tissue, comprising 17 glioblastoma samples and 15 peritumoral tissue samples, were thoroughly analyzed in this investigation. The LC-MS/MS method was used to analyze over 400 metabolites, revealing significant variations in metabolite content between tumor and peritumoral tissues. Statistical analyses, including the Mann–Whitney and Cucconi tests, identified several metabolites, particularly ceramides, that showed significant differences between glioblastoma and peritumoral tissues. Pathway analysis using the KEGG database, conducted with MetaboAnalyst 6.0, revealed a statistically significant overrepresentation of sphingolipid metabolism, suggesting a critical role of these lipid molecules in glioblastoma pathogenesis. Using computational systems biology and artificial intelligence methods implemented in a cognitive platform, ANDSystem, molecular genetic regulatory pathways were reconstructed to describe potential mechanisms underlying the dysfunction of sphingolipid metabolism enzymes. These reconstructed pathways were integrated into a regulatory gene network comprising 15 genes, 329 proteins, and 389 interactions. Notably, 119 out of the 294 proteins regulating the key enzymes of sphingolipid metabolism were associated with glioblastoma. Analysis of the overrepresentation of Gene Ontology biological processes revealed the statistical significance of 184 processes, including apoptosis, the NF-kB signaling pathway, proliferation, migration, angiogenesis, and pyroptosis, many of which play an important role in oncogenesis. The findings of this study emphasize the pivotal role of sphingolipid metabolism in glioblastoma development and open new prospects for therapeutic approaches modulating this metabolism.

 
Pages 897-903
Abstract

Technologies for the production of a range of compounds using microorganisms are becoming increasingly popular in industry. Highly productive strains whose metabolism is directed toward the synthesis of a specific desired product cannot be created without complex directed modifications of the genome guided by mathematical and computer modeling methods. One of the bacterial species actively used in biotechnological production is Corynebacterium glutamicum. Five whole-genome flux balance models already exist for it, which can be used for metabolism research and optimization tasks. The paper presents fluxMicrobiotech, a software module developed at the Institute of Cytology and Genetics of the Siberian Branch of the Russian Academy of Sciences, which implements a series of computational protocols designed for high-performance computer analysis of C. glutamicum whole-genome flux balance models. The tool is based on libraries from the opencobra community (https://opencobra.github.io) within the Python programming language (https://www.python.org), using the Pandas (https://pandas.pydata.org) and Escher (https://escher.readthedocs.io) libraries. It is configured to operate on a ‘file-in/file-out’ basis. The model, environmental conditions, and model constraints are specified as separate text table files. This allows one to prepare a series of files for each section, creating databases of test scenarios for variations of the model, or, conversely, to test a single model under a series of different cultivation conditions. Post-processing tools for modeling data are provided, with visualization of summary charts and metabolic maps.

 
Pages 904-917
Abstract

Drought is a critical factor limiting the productivity of bread wheat (Triticum aestivum L.), one of the key agricultural crops. Wheat adaptation to water deficit is ensured by complex molecular genetic mechanisms, including the coordinated work of multiple genes regulated by transcription factors and signaling non-coding RNAs, particularly microRNAs (miRNAs). miRNA-mediated regulation of gene expression is considered one of the main mechanisms of plant resistance to abiotic stresses. Studying these mechanisms requires computational systems biology methods. This work aims to reconstruct and analyze the gene network associated with miRNA regulation of wheat adaptation to drought. Using the ANDSystem software and the specialized Smart crop knowledge base adapted for wheat genetics and breeding, we reconstructed a wheat gene network responding to water deficit, comprising 144 genes, 1,017 proteins, and 21 wheat miRNAs. The analysis revealed that miRNAs primarily regulate genes controlling the morphogenesis of shoots and roots, which is crucial for morphological adaptation to drought. The key network components regulated by miRNAs are the MYBa and WRKY41 family transcription factors, the heat-shock protein HSP90, and the RPM1 protein. These proteins are associated with phytohormone signaling pathways and calcium-dependent protein kinases that are significant in plant adaptation to water deficit. Several miRNAs (MIR7757, MIR9653a, MIR9671 and MIR9672b) were identified that had not previously been discussed in the context of wheat drought adaptation. These miRNAs regulate many network nodes and are promising candidates for experimental studies aimed at enhancing wheat resistance to water deficiency. The results can be applied in breeding new wheat varieties with increased resistance to water deficit, which is of substantial importance for agriculture in the context of climate change.

 
Pages 918-926
Abstract

The rhizosphere (a narrow area of soil around plant roots) is an ecological niche within which beneficial microorganisms and pathogens compete with each other for organic carbon compounds and for the opportunity to colonize roots. The roots secrete rhizodeposits into the rhizosphere, which include border cells, products of root cell death and liquids secreted by living cells (root exudates). Border cells, which owe their name to their location in the soil next to the root (at the border of the root and soil), represent the terminal differentiation of columella and adjacent lateral root cap cells. Border cells can detach from the root cap surface both as single cells and as cell layers. Border cells are constantly supplied to the soil throughout plant life, and the type and intensity of border cell sloughing depend on both plant species and soil conditions. The factors that control the type of border cell release and its regulation have now been described in different plant species. Border cells are specialized for interaction with the environment; in particular, they are a living barrier between soil microbiota and roots. After border cells separate from the root tip, transcription of primary metabolism genes decreases, whereas transcription of secondary metabolism genes increases, as do the synthesis and secretion of mucilage containing these metabolites along with extracellular DNA, proteoglycans and other substances. The mucilage in which the border cells are embedded serves both to attract microorganisms that promote plant growth and to protect plants from pathogens. In this review, we describe interactions of border cells with various types of microorganisms and demonstrate their importance for plant growth and disease resistance.

 
Pages 927-939
Abstract

Parkinson’s disease (PD) and vascular parkinsonism (VP) are characterized by similar neurological syndromes but differ in pathogenesis, morphology, and therapeutic approaches. The molecular genetic mechanisms of these pathologies are multifactorial and involve multiple biological processes. To comprehensively analyze the pathophysiology of PD and VP, the methods of systems biology and gene network reconstruction are essential. In the current study, we performed metabolomic screening of amino acids and acylcarnitines in the blood plasma of three groups of subjects: PD patients, VP patients and a control group. Comparative statistical analysis of the metabolic profiles identified significantly altered metabolites in the PD and VP groups. To identify potential mechanisms of amino acid and acylcarnitine metabolism disorders in PD and VP, regulatory gene networks were reconstructed using ANDSystem, a cognitive system. Regulatory pathways to the enzymes converting the significant metabolites were traced from PD-specific genetic markers, VP-specific genetic markers, and the group of genetic markers common to the two diseases. Comparative analysis of molecular genetic pathways in the gene networks allowed us to identify both specific and non-specific molecular mechanisms associated with changes in the metabolomic profile in PD and VP. Regulatory pathways with potentially impaired function in these pathologies were discovered. The regulatory pathways to the enzymes ALDH2, BCAT1, AL1B1, and UD11 were found to be specific for PD, while the pathways regulating OCTC, FURIN, and S22A6 were specific for VP. The pathways regulating BCAT2, ODPB and P4HA1 were associated with genetic markers common to both diseases. The results deepen the understanding of pathological processes in PD and VP and can be used in diagnostic systems based on evaluation of the amino acid and acylcarnitine profile in the blood plasma of patients with PD and VP.

 
Pages 940-949
Abstract

To systematize and effectively use the huge volume of experimental data accumulated in bioinformatics and biomedicine, new ontology-based approaches are needed, including automated methods for semantic integration of heterogeneous experimental data, methods for creating large knowledge bases, and self-interpreting deep learning methods for analyzing large heterogeneous data. The article briefly presents the features of the subject area (bioinformatics, systems biology, biomedicine), formal definitions of ontologies and knowledge graphs, and examples of using ontologies for semantic integration of heterogeneous data, for creating large knowledge bases, and for interpreting the results of deep learning on big data. As an example of a successful project, the Gene Ontology knowledge base is described, which includes not only terminological knowledge and gene ontology annotations (GOA), but also causal influence models (GO-CAM). This makes it useful not only for genomic biology, but also for systems biology and for interpreting large-scale experimental data. An approach to building large ontologies using design patterns is discussed, with the ontology of biological attributes (OBA) as an example. Here, most of the classification, except for a small number of high-level concepts, is computed by automated inference from previously created reference ontologies. One of the main problems of deep learning is the lack of interpretability, since neural networks often function as “black boxes” unable to explain their decisions. This paper describes approaches to interpreting deep learning models and presents two examples of self-explanatory ontology-based deep learning models. (1) Deep GONet integrates Gene Ontology into a hierarchical neural network architecture in which each neuron represents a biological function. Experiments on cancer diagnostic datasets show that Deep GONet is easily interpretable and performs well in distinguishing cancerous from non-cancerous samples. (2) ONN4MST uses biome ontologies to trace the microbial sources of samples whose niches were previously poorly studied or unknown and to detect microbial contaminants. ONN4MST can distinguish samples from ontologically similar biomes, thus offering a quantitative way to characterize the evolution of the human gut microbial community. Both examples demonstrate high performance and interpretability, making them valuable tools for analyzing and interpreting big data in biology.

 
Pages 950-959
Abstract

Describing the path from a gene to a trait, the main task of many areas of biology, is currently being equipped with new methods that affect not only experimental techniques but also the analysis of results. The pleiotropic effect of a gene is due to its participation in numerous biological processes involved in different traits. The widespread use of genome-wide sequencing of transcripts and transcription factor (TF) binding regions has made the following tasks relevant: unveiling pleiotropic effects of TFs based on the functions of their target genes; compiling lists of TFs that regulate biological processes of interest; and describing how TFs function (their primary and secondary targets, higher-order targets, and TF interactions in the process under study). We have previously developed a method for the reconstruction of TF regulatory networks and proposed an approach that identifies which biological processes are controlled by these networks and how this control is exerted. In this paper, we have implemented the approach as PlantReg, a program available as a web service. The paper describes how the program works. The input consists of a list of genes and a list of TFs that are known or putative transcriptional regulators of these genes. As output, the program provides a list of biological processes enriched for these genes, along with information on which TFs control these processes and through which genes. We illustrated the use of PlantReg by deciphering the transcriptional regulation of processes initiated during the early salt stress response in Arabidopsis thaliana L. With PlantReg, we identified biological processes stimulated by the stress and the specific sets of TFs that activate each process. Using one of these processes (response to abscisic acid) as an example, we showed that salt stress mainly affects abscisic acid signaling and identified key TFs in this regulation. Thus, PlantReg is a convenient tool for generating hypotheses about the molecular mechanisms that control plant traits.
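The input/output scheme described in this abstract (a gene list and a TF list in, enriched processes and their regulators out) can be mimicked with a small over-representation sketch. This is not the PlantReg algorithm, which the abstract does not detail: the hypergeometric test, the uncorrected `alpha` threshold, the function names and all gene and TF identifiers below are assumptions for illustration.

```python
from math import comb


def hypergeom_upper_tail(k, population, annotated, drawn):
    """P(X >= k) for a hypergeometric draw without replacement."""
    upper = min(annotated, drawn)
    favourable = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return favourable / comb(population, drawn)


def enriched_processes(query_genes, background, annotations, tf_targets, alpha=0.05):
    """Return (process, p_value, {tf: regulated query genes}) for every process
    over-represented among the query genes. No multiple-testing correction
    is applied in this sketch."""
    background = set(background)
    query = set(query_genes) & background
    results = []
    for process, members in annotations.items():
        members = set(members) & background
        hits = query & members
        if not hits:
            continue
        p_value = hypergeom_upper_tail(
            len(hits), len(background), len(members), len(query)
        )
        if p_value <= alpha:
            regulators = {
                tf: sorted(hits & set(targets))
                for tf, targets in tf_targets.items()
                if hits & set(targets)
            }
            results.append((process, p_value, regulators))
    return sorted(results, key=lambda item: item[1])


# Hypothetical toy data: 10 background genes, 4 of them in the query list.
background = [f"g{i}" for i in range(1, 11)]
query = ["g1", "g2", "g3", "g4"]
annotations = {
    "response to abscisic acid": ["g1", "g2", "g3"],
    "other process": ["g4", "g5", "g6", "g7", "g8"],
}
tf_targets = {"TF_A": ["g1", "g2"], "TF_B": ["g3", "g9"], "TF_C": ["g10"]}
report = enriched_processes(query, background, annotations, tf_targets)
```

Each entry of `report` names a process together with the TFs that reach it and the genes through which they do so, which is the kind of output the abstract describes.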

 
Pages 960-973
Abstract

Although nitrogen fertilizers increase rice yield, their excess can impair plant resistance to diseases, particularly sheath blight caused by Rhizoctonia solani. This pathogen can destroy up to 50% of the crop, but the mechanisms underlying reduced resistance under excess nitrogen remain poorly understood. This study aims to identify potential marker genes for enhancing rice resistance to R. solani under excess nitrogen conditions. A comprehensive bioinformatics approach was applied, including differential gene expression analysis, gene network reconstruction, biological process overrepresentation analysis, phylostratigraphic analysis, and non-coding RNA co-expression analysis. The Smart crop cognitive system, ANDSystem, the ncPlantDB database, and other bioinformatics resources were used. Analysis of the molecular genetic interaction network revealed three potential mechanisms explaining the reduced resistance of rice to R. solani under excess nitrogen: the OsGSK2-mediated pathway, the OsMYB44-OsWRKY6-OsPR1 pathway, and the SOG1-Rad51-PR1/PR2 pathway. Potential markers for breeding were identified: 7 genes controlling rice responses to various stresses and 11 genes modulating the immune system. Special attention was given to key participants in regulatory pathways under excess nitrogen conditions. Non-coding RNA analysis revealed 30 miRNAs targeting genes of the reconstructed gene network. For two miRNAs (Osa-miR396 and Osa-miR7695), about 7,400 unique long non-coding RNAs (lncRNAs) with various co-expression indices were found. The top 50 lncRNAs with the highest co-expression index for each miRNA were highlighted, opening new perspectives for studying the regulatory mechanisms of rice resistance to pathogens. The results provide a theoretical basis for experimental work on creating new rice varieties with increased pathogen resistance under excessive nitrogen nutrition. This study opens prospects for developing innovative rice breeding strategies aimed at optimizing the balance between yield and disease resistance under modern agricultural conditions.

 
Pages 974-981
Abstract

Gene regulatory networks (GRNs), interpretable graph models of gene expression regulation, are a pivotal tool for understanding and investigating the mechanisms used by cells during development and in response to various internal and external stimuli. Historically, the first approach to GRN reconstruction was based on the analysis of published data (including data summarized in databases). Currently, the primary GRN inference approach is the analysis of omics (mainly transcriptomic) data, for which a number of mathematical methods have been adapted. Obtaining omics data for individual cells has made it possible to conduct large-scale molecular genetic studies at extremely high resolution. In particular, it has become possible to reconstruct GRNs for individual cell types and for various cell states. However, the technical and biological features of single-cell omics data require specific approaches to GRN inference. This review describes the approaches and programs used to reconstruct GRNs from single-cell RNA sequencing (scRNA-seq) data. We consider the advantages of scRNA-seq data over bulk RNA-seq data, as well as the challenges in GRN inference. We pay specific attention to state-of-the-art methods for GRN reconstruction from single-cell transcriptomes that incorporate other omics data, primarily transcription factor binding sites and open chromatin profiles (scATAC-seq), in order to increase inference accuracy. The review also considers the applicability of GRNs reconstructed from single-cell omics data to recovering and characterizing various biological processes. Future perspectives in this area are discussed.

 
982-992 215
Abstract

Neurocomputing technology is a field of interdisciplinary research and development widely applied in modern digital medicine. One of the problems of neuroimaging technology is the creation of methods for studying human brain activity in socially oriented conditions by using modern information approaches. The aim of this study is to develop a methodology for collecting and processing psychophysiological data that makes it possible to estimate the functional states of the human brain associated with the attribution of external information to oneself or other people. Self-reference is a person’s subjective assessment of information coming from the external environment as related to himself/herself. Assigning information to other people or inanimate objects is evaluating information as a message about someone else or about things. In modern neurophysiology, two approaches to the study of self-referential processing have been developed: (1) recording brain activity at rest, then questioning the participant for self-reported thoughts; (2) recording brain activity induced by self-assigned stimuli. In this paper, a technology was tested that combines registration and analysis of EEG with viewing facial video recordings. The novelty of our approach is the use of video recordings obtained in the first stage of the survey to induce resting states associated with recognition of information about different subjects in later stages of the survey. We have developed a software and hardware module, i.e. a set of related programs and procedures for their application, consisting of blocks that allow for a full cycle of registration and processing of psychological and neurophysiological data. 
Using this module, brain electrical activity (EEG) indicators reflecting individual characteristics of recognition of information related to oneself and other people were compared between groups of 30 Chinese (14 men and 16 women, average age 23.2 ± 0.4 years) and 32 Russian (15 men and 17 women, average age 22.1 ± 0.4 years) participants. We tested the hypothesis that differences in brain activity in functional rest intervals between Chinese and Russian participants depend on their psychological differences in collectivism scores. It was revealed that brain functional activity depends on the subject relevance of the facial video that the participants viewed between resting-state intervals. Interethnic differences were observed in the activity of the anterior and parietal hubs of the default-mode network and depended on the subject attribution of information. In Chinese, but not Russian, participants, significant positive correlations were found between the level of collectivism and spectral density in the anterior hub of the default-mode network in all experimental conditions for a wide range of frequencies. The developed software and hardware module is included in an integrated digital platform for conducting research in the field of systems biology and digital medicine.

BIOMEDICINE

 
993-1007 423
Abstract

In this part of the study, the first component of the concept of “natural genome reconstruction” is substantiated. It was shown with mouse and human model organisms that CD34+ hematopoietic bone marrow progenitors take up fragments of extracellular double-stranded DNA through a natural mechanism. It is known that the process of internalization of extracellular DNA fragments involves glycocalyx structures, which include glycoproteins/proteoglycans, glycosylphosphatidylinositol-anchored proteins and scavenger receptors. The bioinformatic analysis conducted indicates that the main surface marker proteins of hematopoietic stem cells belong to these groups of factors and contain specific DNA binding sites, including a heparin-binding domain and clusters of positively charged amino acid residues. A direct interaction of the CD34 and CD84 (SLAMF5) glycoproteins, markers of hematopoietic stem cells, with double-stranded DNA fragments was demonstrated using an electrophoretic mobility shift assay system. In cells negative for CD34, which also internalize fragments, concatemerization of the fragments delivered into the cell occurs. In this case, up to five oligonucleotide monomers containing 9 telomeric TTAGGG repeats are stitched together into one structure. Extracellular fragments delivered to hematopoietic stem cells initiate division of the original hematopoietic stem cell in such a way that one of the daughter cells becomes committed to terminal differentiation, and the second retains its low-differentiated status. After treatment of bone marrow cells with hDNAgr, the number of CD34+ cells in the colonies increases to 3 % (humans as the model organism). At the same time, treatment with hDNAgr induces proliferation of blood stem cells and their immediate descendants and stimulates colony formation (mouse, rat and humans as the model organisms). 
Most often, the granulocyte-macrophage lineage of hematopoiesis is activated as a result of processing extracellular double-stranded DNA. The commitment process is manifested by the appearance and repair of pangenomic single-strand breaks. The transition time in the direction of differentiation (the time it takes for pangenomic single-strand breaks to appear and to be repaired) is about 7 days. It is assumed that at the moment of initiation of pangenomic single-strand breaks, a “recombinogenic situation” ensues in the cell and molecular repair and recombination mechanisms are activated. In all experiments with individual molecules, recombinant human angiogenin was used as a comparison factor. In all other experiments, one of the experimental groups consisted of hematopoietic stem cells treated with angiogenin.

 
1008-1017 260
Abstract

Data on the genetics and molecular biology of diabetes are accumulating rapidly. This poses the challenge of creating research tools for rapid search, structuring and analysis of information in this field. We have developed a web resource, GlucoGenes®, which includes a database and an Internet portal of genes and proteins associated with high glucose (hyperglycemia), low glucose (hypoglycemia), and both metabolic disorders. The data were collected using text mining of the publications indexed in PubMed and PubMed Central and analysis of gene networks associated with hyperglycemia, hypoglycemia and glucose variability performed with ANDSystem, a bioinformatics tool. GlucoGenes® is freely available at: https://glucogenes.sysbio.ru/genes/main. GlucoGenes® enables users to access and download information about genes and proteins associated with the risk of hyperglycemia and hypoglycemia, molecular regulators with hyperglycemic and antihyperglycemic activity, genes up-regulated by high glucose and/or low glucose, genes down-regulated by high glucose and/or low glucose, and molecules otherwise associated with glucose metabolism disorders. With GlucoGenes®, an evolutionary analysis of genes associated with glucose metabolism disorders was performed. The results of the analysis revealed a significant increase (up to 40 %) in the proportion of genes with phylostratigraphic age index (PAI) values corresponding to the time of origin of multicellular organisms. Analysis of sequence conservation using the divergence index (DI) showed that most of the corresponding genes are highly conserved (DI < 0.6) or conserved (DI < 1). Analysis of single nucleotide polymorphisms (SNPs) in the proximal regions of promoters affecting the affinity of the TATA-binding protein identified 181 SNP markers in the GlucoGenes® database that can reduce (45 SNP markers) or increase (136 SNP markers) the expression of 52 genes. 
We believe that this resource will be a useful tool for further research in the field of molecular biology of diabetes.
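
The divergence index thresholds and SNP marker counts quoted above can be restated as a short Python sketch. Only the cut-offs (DI < 0.6, DI < 1) and the counts (45, 136, 181) come from the abstract. The function name, the toy gene identifiers and DI values, and the placeholder label for DI ≥ 1 are assumptions, and the sketch does not reproduce how GlucoGenes® computes DI.

```python
def classify_divergence(di):
    """Label a gene by its divergence index (DI) using the quoted cut-offs."""
    if di < 0.6:
        return "highly conserved"
    if di < 1:
        return "conserved"
    # The abstract gives no label for DI >= 1; this name is a placeholder.
    return "unclassified (DI >= 1)"


# Made-up DI values for three hypothetical genes.
toy_di = {"gene_1": 0.35, "gene_2": 0.82, "gene_3": 1.4}
labels = {gene: classify_divergence(di) for gene, di in toy_di.items()}

# Consistency check on the reported SNP marker counts.
reducing_markers, increasing_markers = 45, 136
total_snp_markers = reducing_markers + increasing_markers
```

The two SNP marker subsets sum to the 181 markers reported for the 52 genes.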

 
1018-1024 310
Abstract

A software information module of the experimental computer platform “EEG_Self-Construct” was developed and tested in the framework of this study. This module can be applied for identification of neurophysiological markers of self-referential processes based on the joint use of EEG and facial video recording to induce the brain’s functional states associated with participants’ personality traits. This module was tested on a group of non-clinical participants with varying degrees of severity of autistic personality traits (APT) according to the Broad Autism Phenotype Questionnaire. The degree of individual severity of APT is a quantitative characteristic of the difficulties that a person has when communicating with other people. Each person has some individual degree of severity of such traits. Patients with autism are found to have high rates of autistic traits. However, in some individuals, high rates of autistic traits are not accompanied by clinical symptoms. Our module allows inducing the brain’s functional states, in which the EEG indicators of people with different levels of APT significantly differ. In addition, the module includes a set of software tools for recording and analyzing brain activity indices. We have found that relationships between brain activity and the individual level of severity of APT in non-clinical subjects can be identified in resting-state conditions following recognition of self-referential information, while recognition of socially neutral information does not induce processes associated with APT. It has been shown that people with high APT scores have increased spectral density in the delta and theta ranges of rhythms in the frontal cortical areas of both hemispheres compared to people with lower APT scores. This could hypothetically be interpreted as an index of reduced brain activity associated with recognition of self-referential information in people with higher scores of autistic traits. 
The software module we are developing can be integrated with modules that allow identifying molecular genetic markers of personality traits, including traits that determine the predisposition to mental pathologies.

This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 2500-3259 (Online)