MOLECULAR AND CELL BIOLOGY
Aneuploidy is defined as the loss or gain of a whole chromosome or its region. Even at early stages of development, it usually leads to fatal consequences, including developmental defects/abnormalities and death. For a long time, it was believed that the disruption of gene balance results in pronounced effects at both the cellular and organismal levels, adversely affecting organism formation. It has been shown that the gene imbalance resulting from aneuploidy leads to proteotoxic and metabolic stress within the cell, reduced cell proliferation, genomic instability, oxidative stress, etc. However, some organisms have exhibited tolerance to aneuploidies, which may even confer adaptive advantages, such as antibiotic resistance in pathogenic fungal strains. A significant factor likely lies in the complexity of the tissue and organ organization of specific species. Polyploid organisms are generally more tolerant of aneuploidy, particularly those that have recently undergone whole-genome duplication. This review places special emphasis on the examination of sex chromosome aneuploidies in humans. In addition to primary effects, or cis effects (changes in the quantity of the transcripts of genes located on the aneuploid chromosome), aneuploidy can induce secondary or trans effects (changes in the expression levels of genes located on other chromosomes). The results of recent studies have prompted a reevaluation of the impact of aneuploidy on the structural-functional organization of the genome, transcriptome, and proteome of both the cell and the entire organism. Despite the fact that, in the cases of aneuploidy, the expression levels for most genes correlate with their altered copy numbers in the cell, there have been instances of dosage compensation, where the transcript levels of genes located on the aneuploid chromosome remained unchanged. 
The review presents findings from recent studies focused on mechanisms of dosage compensation that modify gene product quantities at post-transcriptional and post-translational levels, alleviating the negative effects of aneuploidy on cellular homeostasis. It also discusses the influence of extrachromosomal elements on the spatial organization of the genome and the changes in gene expression patterns resulting from their presence. Additionally, the review specifically examines cases of segmental aneuploidy and changes in copy number variants (CNVs) in the genome. It considers not only the implications of their composition but also their localization within the chromosome and in various compartments of the interphase nucleus. Addressing these questions could significantly contribute to enhancing cytogenomic diagnostics and establishing a necessary database for accurate interpretation of identified cases of segmental aneuploidy and CNVs in the genome.
The problem of interpreting genetic data from patients with inherited cardiovascular diseases remains relevant. To date, the clinical significance of approximately 40 % of variants in genes associated with inherited cardiovascular diseases is uncertain, which requires new approaches to the assessment of their pathogenetic contribution. A combination of the induced pluripotent stem cell (iPSC) technology and editing the iPSC genome with CRISPR/Cas9 is thought to be the most promising tool for clarifying variant pathogenicity. A variant of unknown significance in MYH7, p.Met659Ile (c.1977G>A), was previously identified in several genetic screenings of hypertrophic cardiomyopathy patients. In this study, the single nucleotide substitution was corrected with CRISPR/Cas9 in iPSCs generated from a carrier of the variant. As a result, two iPSC lines (ICGi019-B-1 and ICGi019-B-2) were generated and characterized using a standard set of methods. The iPSC lines with the corrected p.Met659Ile (c.1977G>A) variant in MYH7 possessed a morphology characteristic of human pluripotent cells, expressed markers of the pluripotent state (the OCT4, SOX2, NANOG transcription factors and SSEA-4 surface antigen), were able to give rise to derivatives of three germ layers during spontaneous differentiation, and retained a normal karyotype (46,XY). No CRISPR/Cas9 off-target activity was found in the ICGi019-B-1 and ICGi019-B-2 iPSC lines. The maintenance of the pluripotent state and normal karyotype and the absence of CRISPR/Cas9 off-target activity in the iPSC lines with the corrected p.Met659Ile (c.1977G>A) variant in MYH7 allow these lines to be used as an isogenic control for further studies of the variant pathogenicity and its impact on the development of hypertrophic cardiomyopathy.
PLANT GENETICS
Peach (Prunus persica (L.) Batsch) is one of the main agricultural stone fruit crops of the family Rosaceae. Modern breeding is aimed at improving the quality of the fruit, extending the period of its production, increasing its resistance to unfavorable environmental conditions and reducing the total cost of production of cultivated varieties. However, peach breeding is an extremely long process: it takes 10–15 years from hybridization of the parental forms to obtaining fruit-bearing trees. Research into peach varieties as donors of desirable traits began in the 1980s. The first version of the peach genome was presented in 2013, and its appearance contributed to the identification and localization of loci, followed by the identification of candidate genes that control the desired trait. The development of NGS has accelerated the development of methods based on the use of diagnostic DNA markers. Approaches that allow accelerating classical breeding processes include marker-oriented selection (MOS) and genomic selection. In order to develop DNA markers associated with the traits under investigation, it is necessary to carry out preliminary mapping of loci controlling economically desirable traits and to develop linkage maps. SNP-chip approaches and genotyping by sequencing (GBS) methods are being developed. In recent years, genome-wide association analysis (GWAS) has been actively used to identify genomic loci associated with economically important traits, which requires screening of large samples of varieties for hundreds and thousands of SNPs. Studies of the pangenome have shown the need to analyze a larger number of samples, since there is still not enough data to identify polymorphic regions of the genome.
The aim of this review was to systematize and summarize the major advances in peach genomic research over the last 40 years: linkage and physical map construction, development of different molecular markers, full genome sequencing for peach, and existing methods for genome-wide association studies with high-density SNP markers. This review provides a theoretical basis for future GWAS analysis in order to identify high-performance markers of economically valuable traits for peach and to develop genomic selection of this crop.
Yellow index is an important quality parameter of durum wheat cultivars, associated with carotenoid pigment content in grain and the level of carotenoid degradation during processing, and determining the yellow color of products made from durum wheat. Molecular markers of genes that influence carotenoid content can be used for fast identification of valuable genotypes and development of new high-quality durum wheat cultivars. The aim of the study was to investigate the domestic durum wheat gene pool using molecular markers of the yellow pigment synthesis (Psy-A1) and degradation (Lpx-B1) genes. Using two markers of the phytoene synthase Psy-A1 gene (PSY1-A1_STS and YP7A-2) and three markers of the lipoxygenase Lpx-B1 locus (Lpx-B1.1a/1b, Lpx-B1.1c and Lpx-B1.2/1.3), 54 durum wheat cultivars were studied for the first time. For 38 cultivars, yellow pigment content in grain was also assessed. The detected allelic variation of the phytoene synthase Psy-A1 and lipoxygenase Lpx-B1 genes was rather low. The most common Psy-A1 alleles among the studied cultivars were Psy-A1l for the PSY1-A1_STS marker and Psy-A1d for the YP7A-2 marker, identified in 51 cultivars and associated with high carotenoid content. According to the markers of the Lpx-B1 locus, haplotype II, associated with medium lipoxygenase activity, identified in 43 cultivars, was predominant. Haplotype III, associated with low enzyme activity, was identified in only three winter durum wheat cultivars (Donchanka, Gelios and Leucurum 21). Despite the predominance of allelic variants associated with increased carotenoid content and moderate lipoxygenase activity, the studied cultivars had different levels of yellow pigment content in grain, from low to high.
PLANT IMMUNITY AND PRODUCTIVITY
Flax (Linum usitatissimum) is an important agricultural crop grown for fiber and oil production, playing a key role in various industries such as production of paints, linoleum, food, clothes and composite materials. Fusarium wilt caused by the fungus Fusarium oxysporum f. sp. lini causes significant economic damage in flax cultivation. The spores of the fungus can persist in the soil for a long time, so obtaining resistant varieties is important. Here we used data on the resistance of 297 flax accessions from the collection of the Federal Center for Bast Crops in Torzhok (Russian Federation) to infection by the highly virulent fungal isolate MI39 in 2019–2021. Genotype resistance to infection was assessed by calculating the DSI index, a normalized proportion of genotypes with the same disease symptoms. The IIIVmrMLM program in Single_env mode was used to search for regions of the flax genome associated with resistance. The IIIVmrMLM model was designed to address methodological shortcomings in identifying all types of interactions between alleles, genes and environment, and to unbiasedly estimate their genetic effects. Being a multilocus MLM model, it estimates the effects of all genes as well as the effects of all interactions simultaneously. A total of 111 QTNs were found, of which 34 fell within the body of a known gene or were located in flanking regions within 1,000 bp. The genes into which the detected variants fell were associated with resistance to abiotic and biotic stresses, root, shoot and flower growth and development. Ten of the QTNs found mapped to regions of previously identified QTLs controlling the synthesis of palmitic, oleic, and other fatty acids. QTN Chr1_1706865/Chr1_1706872 and QTN Chr8_22542741 mark regions identified previously in an association search by the GAPIT program.
The allelic effect was confirmed for all the QTNs found: a Mann–Whitney test was performed, which confirmed significant differences between the DSI index value in carriers of the reference and alternative allele. An increase in the number of alleles with negative effects in the genotype leads to a statistically significant decrease in the DSI value for all three years of testing. The groups of varieties with a large number of alleles reducing the DSI index had the best resistance. A total of 5 varieties were selected from the collection for which the number of alleles reducing the DSI index value did not exceed the number of alleles with the opposite effect for all three years. These varieties can be used further in breeding programs.
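The allelic-effect check described above, a Mann–Whitney comparison of DSI values between carriers of the reference and alternative allele, can be illustrated with a minimal sketch. The DSI values below are hypothetical, and the normal approximation without tie or continuity correction is an assumption made for brevity; it is not the authors' implementation.

```python
from math import erfc, sqrt


def mann_whitney_u(ref_values, alt_values):
    """Return (U, two-sided p) for two independent samples.

    Assumption: normal approximation without tie or continuity
    correction, which is adequate only as an illustration.
    """
    n1, n2 = len(ref_values), len(alt_values)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    # U counts the pairs in which the reference value exceeds the
    # alternative one; tied pairs contribute one half.
    u = 0.0
    for r in ref_values:
        for a in alt_values:
            if r > a:
                u += 1.0
            elif r == a:
                u += 0.5
    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    # Two-sided p-value from the standard normal distribution.
    p_two_sided = erfc(abs(z) / sqrt(2))
    return u, p_two_sided


# Hypothetical DSI values (not taken from the study).
dsi_reference = [10, 12, 15, 18, 20]
dsi_alternative = [30, 35, 40, 45, 50, 55]
u_stat, p_value = mann_whitney_u(dsi_reference, dsi_alternative)
print(u_stat, p_value)
```

For real data with many tied DSI values or small groups, an exact or tie-corrected test from a statistics library would be the safer choice.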
Wheat is an extremely important and preferred source of human nutrition in many regions of the world. The production of biofortified colored-grain wheat varieties, which are known to contain a range of biologically active compounds, including anthocyanins, phenolic compounds, vitamins and minerals, reflects a worldwide trend toward increasing dietary diversity and improving diet quality through the development and introduction of diverse functional foods. The present work describes the genetic systems that regulate the biosynthesis and accumulation of anthocyanins in the pericarp and aleurone layer, the presence of which imparts purple, blue and black grain color. The review is devoted to the systematization of available information on the peculiarities of qualitative and quantitative content of anthocyanins, soluble and insoluble phenolic acids in wheat grain of different color, as well as on indicators of antioxidant activity of alcoholic extracts of grain depending on the content of anthocyanins and phenolic compounds. A huge number of studies have confirmed that these compounds are antioxidants, have anti-inflammatory activity and their consumption makes an important contribution to the prevention of a number of socially significant human diseases. Consumption of colored cereal grain products may contribute to an additional enrichment of bioactive compounds in human diet along with the usual sources of antioxidants. Special attention in the review is paid to the description of achievements of Russia’s breeders in developing promising varieties and lines with colored grain, which will be a key factor in expanding the opportunities of the domestic and international grain market.
INSECT GENETICS
The nucleolus is a large membraneless subnuclear structure, the main function of which is ribosome biogenesis. However, there is growing evidence that the function of the nucleolus extends beyond this process. While the nucleolus is the most transcriptionally active site in the nucleus, it is also the compartment for the location and regulation of repressive genomic domains and, like the nuclear lamina, is the hub for the organization of inactive heterochromatin. Studies in human and Drosophila cells have shown that a decrease in some nucleolar proteins leads to changes in nucleolar morphology, heterochromatin organization and declustering of centromeres. This work is devoted to the study of the effects of Novel nucleolar protein 3 (Non3) gene mutations in D. melanogaster on the organization of chromatin in the nucleus. Previously, it was shown that partial deletion of the Non3 gene leads to embryonic lethality, and a decrease in NON3 causes an extension of ontogenesis and formation of a Minute-like phenotype in adult flies. In the present work, we have shown that mutations in the Non3 gene suppress the position effect variegation (PEV) and increase the frequency of meiotic recombination. We have analyzed the classical heterochromatin markers in Non3 mutants and shown that the amount of the HP1 protein as well as the modification of the histone H3K9me2 do not change significantly in larval brains and salivary glands compared to the control in Western blot analysis. Immunostaining with antibodies to HP1 and H3K9me2 did not reveal a significant reduction or change in the localization patterns of these proteins in the pericentromeric regions of salivary gland polytene chromosomes either. We analyzed the localization of the HP1 protein in Non3 mutants using DNA adenine methyltransferase identification (DamID) analysis and did not find substantial differences in protein distribution compared to the control. 
In hemocytes of Non3 mutants, we observed changes in the morphology of the nucleolus and in the size of the region detected by anti-centromere antibodies, but this was not accompanied by declustering of centromeres and their untethering from the nucleolar periphery. Thus, the NON3 protein is important for the formation/function of the nucleolus and is required for the correct chromatin packaging, but the exact mechanism of NON3 involvement in these processes requires further investigations.
The achaete-scute complex (AS-C) is a locus approximately 90 kbp in length, containing multiple enhancers. The local expression of the achaete and scute genes in proneural clusters of Drosophila melanogaster imaginal discs results in the formation of a well-defined pattern of macrochaetae in adult flies. A wide variety of easily analyzed phenotypes, along with the direct connection between individual regulatory elements and the development of specific setae make this locus a classic model in developmental genetics. One classic AS-C allele is sc8, which arose as a result of the In(1)sc8 inversion. One breakpoint of this inversion lies between the ac and sc genes, while the second is in the pericentromeric heterochromatin of chromosome X, within satellite block 1.688. The heterochromatic position of the breakpoint raised the question of whether position effect variegation contributes to the disruption of normal locus function in the In(1)sc8 flies. However, conflicting results were obtained. Previously, we found that a secondary inversion, In(1)19EHet, arose spontaneously in one of the stocks of the In(1)sc8 BDSC line, transferring most of the heterochromatin from the ac gene to the 19E region of the X chromosome. Here, we demonstrate that the In(1)19EHet inversion leads to complete rescue of the number of posterior supraalar (PSA) and partial rescue of the number of dorsocentral (DC) macrochaetes observed in the original In(1)sc8 line. The same rescue of the macrochaetes pattern was observed when the In(1)sc8 inversion was introduced into a strain with the Su(var)3-906 position effect modifier. Combining the inversion with the Rif11 mutation, a conserved factor determining late replication and underreplication, does not restore the normal pattern of bristles.
Our data indicate that the phenotype of flies carrying the In(1)sc8 inversion, associated with a disturbance in bristle development, is determined by the effect of heterochromatin on the distal part of the locus. This model can be used to test the influence of various factors on the position effect variegation caused by heterochromatin. Another phenotypic manifestation of In(1)sc8, a decreased proportion of males in the offspring, was independent of the proximity of the distal part of AS-C to heterochromatin and was not affected by the Rif11 mutation.
HUMAN GENETICS
Bolgar was one of the most significant mediaeval cities in Eastern Europe. Before the Mongol conquest, it served as a major administrative centre of Volga Bulgaria, and after 1236, it temporarily functioned as the capital of the Golden Horde. Historical, archaeological, and paleoanthropological evidence indicates a mixed population of this city during the 13th–15th centuries; however, the contributions of specific ethnic groups to its genetic structure remain unclear. To date, there are no genetic data for this mediaeval group. For the first time, using massively parallel sequencing methods, we determined whole-genome sequences for three individuals from Bolgar who were buried in the early 14th century close to the so-called “Greek Chamber”. The average coverage of the studied genomes ranged from 0.5× to 1.5×. We identified the genetic sex of the individuals (two men and one woman) and performed a population genetic analysis. The authenticity of the DNA studied and the low level of contamination were confirmed, and the mitochondrial DNA haplogroups of all three individuals as well as the Y-chromosome haplogroups of two male individuals were determined. We used more than 2.7 thousand previously published DNA samples from representatives of ancient and modern populations to perform a comparative population-genetic analysis. Whole-genome data analysis employing uniparental markers (mitochondrial DNA and Y chromosome) and autosomal markers revealed genetic heterogeneity in this population. Based on PCA and f4-statistics analysis, a genetic connection was identified between one of the individuals (female) and modern Finno-Ugric peoples of the Volga-Ural region. Genomic analysis of the other two individuals suggests their Armenian origin and indicates migrant influx from the Caucasus or Anatolia.
The results align well with archaeological and paleoanthropological findings and significantly enhance them by reconstructing the contributions of the indigenous population to the formation of the mediaeval Bolgar population structure.
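The f4-statistics mentioned in this abstract can be illustrated with a minimal sketch. It assumes the textbook definition, f4(A, B; C, D) as the mean over sites of (pA − pB)(pC − pD), where p is the allele frequency in each population. The function name and the frequencies are hypothetical, and the block-jackknife standard errors used in real analyses are omitted.

```python
def f4_statistic(freq_a, freq_b, freq_c, freq_d):
    """Mean over sites of (pA - pB) * (pC - pD).

    Each argument is a list of allele frequencies (0..1) for the same
    ordered set of SNPs. A value near zero is consistent with (A, B)
    and (C, D) forming separate clades; this sketch computes no
    standard error, so it cannot test significance.
    """
    n_sites = len(freq_a)
    if n_sites == 0 or not (len(freq_b) == len(freq_c) == len(freq_d) == n_sites):
        raise ValueError("all populations need the same non-zero number of sites")
    total = sum(
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    )
    return total / n_sites


# Hypothetical allele frequencies at two SNPs (not data from the study).
pop_a = [0.1, 0.5]
pop_b = [0.3, 0.5]
pop_c = [0.2, 0.9]
pop_d = [0.6, 0.4]
f4_value = f4_statistic(pop_a, pop_b, pop_c, pop_d)
print(f4_value)
```

Swapping the first two populations flips the sign of the statistic, which is a quick sanity check on any implementation.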
One of the main etiological factors in the development of cervical cancer is infection with human papillomavirus (HPV). At the same time, the risk of developing a malignant process increases with an increase in viral load. The aim of this study was to investigate the transcription level of DNA repair and cell cycle control genes in the cervical epithelial cells of women with a clinically significant HPV viral load. The material for the study was DNA and RNA samples isolated from cervical epithelial cells in women. A total of 107 samples were analyzed. 55 women were HPV-positive (with a clinically significant viral load – more than 10^3 HPV genomes per 100 thousand human cells); the control group consisted of 52 HPV-negative women. All women were over 30 years old. The transcription level of the APEX1, ERCC2, CHEK2, TP53, TP73, CDKN2A, SIRT1 genes was determined using RT-PCR. It was shown that the detection frequency of the APEX1 and ERCC2 gene transcripts was increased in the group of women with a clinically significant viral load. The transcription level of all the studied genes did not differ between the control group and the group with clinically significant HPV concentrations. However, the transcription level of the TP53 and TP73 genes decreased with increasing viral load. In the control, a correlation between the transcription levels of genes involved in the functioning of the p53 protein was revealed. An increase in viral load during HPV infection is associated with a change in the coexpression of DNA repair and cell cycle control genes.
MEDICAL GENETICS
Precocious puberty (PP, OMIM 176400, 615346) is an autosomal dominant disorder caused by the premature reactivation of the hypothalamic-pituitary-gonadal axis. Genetic, epigenetic, and environmental factors play a decisive role in determining the timing of puberty. In recent years, genetic variants in the KISS1, KISS1R, MKRN3, and DLK1 genes have been identified as genetic causes of PP. The MKRN3 and DLK1 genes are imprinted, and therefore epigenetic modifications, such as DNA methylation, which alter the expression of these genes, can also contribute to the development of PP. The aim of this study is to determine the methylation index of the imprinting centers of the DLK1 and MKRN3 genes in girls with a clinical presentation of PP. The methylation index of the imprinting centers of the DLK1 and MKRN3 genes was analyzed in a group of 45 girls (age 7.2 ± 1.9 years) with a clinical presentation of PP and a normal karyotype using targeted massive parallel sequencing after sodium bisulfite treatment of DNA. The control group consisted of girls without PP (n = 15, age 7.9 ± 1.6 years). No significant age differences were observed between the groups (p > 0.8). Analysis of the methylation index of the imprinting centers of the DLK1 and MKRN3 genes revealed no significant differences between patients with PP and the control group. However, in the group of patients with isolated adrenarche, an increased methylation index of the imprinting center of the MKRN3 gene was observed (72 ± 7.84 vs 56.92 ± 9.44 %, p = 0.005). In the group of patients with central PP, 3.8 % of patients showed a decreased methylation index of the imprinting center of the DLK1 gene, and 11.5 % of probands had a decreased methylation index of the imprinting center of the MKRN3 gene. Thus, this study demonstrates that not only genetic variants but also alterations in the methylation index of the imprinting centers of the DLK1 and MKRN3 genes can contribute to the development of PP.
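As an illustration of how a methylation index can be derived from bisulfite sequencing reads, the sketch below assumes the common convention that unmethylated cytosines are read as T after conversion while methylated cytosines remain C. The index of a region is then the mean, over CpG sites, of C / (C + T) × 100. The function name and read counts are hypothetical, and the abstract does not state the exact formula used in the study.

```python
def methylation_index(cpg_counts):
    """Mean percentage of methylated reads across CpG sites.

    cpg_counts: list of (c_reads, t_reads) tuples, one per CpG site,
    where c_reads are reads retaining C (methylated) and t_reads are
    reads converted to T (unmethylated).
    """
    per_site = []
    for c_reads, t_reads in cpg_counts:
        depth = c_reads + t_reads
        if depth == 0:
            continue  # skip sites without coverage
        per_site.append(100.0 * c_reads / depth)
    if not per_site:
        raise ValueError("no covered CpG sites")
    return sum(per_site) / len(per_site)


# Hypothetical read counts for three CpG sites (not data from the study).
example_counts = [(70, 30), (60, 40), (0, 0)]
index_percent = methylation_index(example_counts)
print(index_percent)
```

Averaging per-site percentages weights every CpG equally; pooling reads across sites before dividing is an alternative that weights sites by coverage.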
Long non-coding RNAs (lncRNAs) and microRNAs (miRNAs) play important roles in all biological processes, including adipogenesis, lipid metabolism, and insulin response. Analyzing expression patterns of lncRNAs and miRNAs in human visceral fat tissue can enhance our understanding of their roles in metabolic disorders. Our research aims to investigate the expression of lncRNAs (ASMER1, SNHG9, P5549, P19461, and GAS5) and miRNAs (miR-26A, miR-222, miR-221, and miR-155) in visceral adipose tissues of individuals with abdominal obesity (n = 70) compared to their levels in non-obese participants (n = 31), using Real-Time PCR. Among the tested miRNAs, only miR-26A was significantly downregulated in the visceral adipose tissue of obese individuals, with no significant change in the expression of miR-26A in obese people with or without type 2 diabetes. Similarly, of the tested lncRNAs, only GAS5 showed significantly higher expression levels in obese patients with type 2 diabetes (T2D) (n = 10) compared to obese patients without T2D (n = 60). To test possible interactions between the analyzed non-coding RNAs, we used Spearman’s bivariate correlation test. GAS5 expression levels showed a weak negative correlation (p < 0.05, rs = 0.25) with miR-155 levels in obese patients only. Conversely, a strong positive correlation (p < 0.01, rs = 0.92) between SNHG9 and GAS5 was found in the non-obese group, with a weaker correlation in abdominally obese patients (p < 0.01, rs = 0.67); additionally, miR-26A and miR-155 levels were moderately correlated in the non-obese group (p < 0.05, rs = 0.47) and were found to correlate weakly in obese patients (p < 0.05, rs = 0.26). Our results showed that abdominally obese participants demonstrated higher expression levels of miR-26A in visceral adipose tissue and a significantly lower correlation between GAS5 and SNHG9 expression when compared to non-obese subjects.
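The Spearman coefficient (rs) reported above can be computed as the Pearson correlation of rank-transformed values, with tied values given their average rank. The sketch below is a minimal pure-Python illustration; the function names and expression values are hypothetical and are not data from the study.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical relative expression values for two transcripts.
expr_first = [1, 2, 3, 4]
expr_second = [10, 20, 15, 40]
rs_value = spearman_rs(expr_first, expr_second)
print(rs_value)
```

The sketch returns only the coefficient; the p-values quoted in the abstract would require an additional significance test.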
Centenary of the chromosome theory of inheritance
A rapid growth of the available body of genomic data has made it possible to obtain extensive results in genomic prediction and identification of associations of SNPs with phenotypic traits. In many cases, to identify new relationships between phenotypes and genotypes, it is preferable to use machine learning, deep learning and artificial intelligence, especially explainable artificial intelligence, capable of recognizing complex patterns. 80 sources were manually selected; while there were no restrictions on the release date, the main attention was paid to the originality of the proposed approach for use in genomic prediction. The article considers models for genomic prediction, convolutional neural networks, explainable artificial intelligence and large language models. Attention is paid to Data Augmentation, Transfer Learning, Dimensionality Reduction methods and hybrid methods. Research in the field of model-specific and model-independent methods for interpretation of model solutions is represented by three main categories: sensing, perturbation, and surrogate model. The considered examples reflect the main modern trends in this area of research. The growing role of large language models, including those based on transformers, for genetic code processing, as well as the development of data augmentation methods, are noted. Among hybrid approaches, the prospect of combining machine learning models and models of plant development based on biophysical and biochemical processes is emphasized. Since the methods of machine learning and artificial intelligence are the focus of attention of both specialists in various applied fields and fundamental scientists, and also cause public resonance, the number of works devoted to these topics is growing explosively.
Genetic diversity among biological entities, including populations, species, and communities, serves as a fundamental source of information for understanding their structure and functioning. However, many ecological and evolutionary problems arise from limited and complex datasets, complicating traditional analytical approaches. In this context, our study applies a deep learning-based approach to address a crucial question in evolutionary biology: the balance between sexual and asexual reproduction. Sexual reproduction often disrupts advantageous gene combinations favored by selection, whereas asexual reproduction allows faster proliferation without the need for males, effectively maintaining beneficial genotypes. This research focuses on exploring the coexistence patterns of sexual and asexual reproduction within a single species. We developed a convolutional neural network model specifically designed to analyze the dynamics of populations exhibiting mixed reproductive strategies within changing environments. The model developed here allows one to estimate the ratio of population members who originate from sexual reproduction to the clonal organisms produced by parthenogenetic females. This model assumes the reproductive ratio remains constant over time in populations with dual reproductive strategies and stable population sizes. The approach proposed is suitable for neutral multiallelic marker traits such as microsatellite repeats. Our results demonstrate that the model estimates the ratio of reproductive modes with an accuracy as high as 0.99, effectively handling the complexities posed by small sample sizes. When the training dataset’s dimensionality aligns with the actual data, the model converges to the minimum error much faster, highlighting the significance of dataset design in predictive performance.
This work contributes to the understanding of reproductive strategy dynamics in evolutionary biology, showcasing the potential of deep learning to enhance genetic data analysis. Our findings pave the way for future research examining the nuances of genetic diversity and reproductive modes in fluctuating ecological contexts, emphasizing the importance of advanced computational methods in evolutionary studies.