Original Russian text: https://vavilovj-icg.ru/2016-year/20-6/
INSECT GENETICS
Genomics
Short-read next-generation sequencing (NGS) has had a significant impact on modern genomics, genetics, cell biology and medicine, especially on metagenomics, comparative genomics, polymorphism detection, mutation screening, transcriptome profiling, methylation profiling, chromatin remodelling and many other applications. However, NGS is prone to errors, which complicate scientific conclusions. NGS technologies shear DNA molecules into a collection of numerous small fragments, called a ‘library’, which are then sequenced extensively in parallel. The sequenced overlapping fragments, called ‘reads’, are assembled into contiguous strings. The contiguous sequences are in turn assembled into genomes for further analysis. Computational sequencing problems are those arising from the numerical processing of sequenced samples. This processing involves procedures such as quality scoring, mapping/assembly and, surprisingly, error correction of the data. This paper reviews post-processing errors and the computational methods used to detect them. It also includes a dictionary of sequencing terms. We describe quality control of raw data and the errors arising when sequencing reads are aligned to a reference genome and assembled. Finally, we cover the identification of mutations (“variant calling”) in sequencing data and its quality control.
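As an illustration of the quality-scoring step mentioned above, the following minimal sketch decodes Phred quality characters into per-base error probabilities. It assumes the common Phred+33 FASTQ encoding; the function names are hypothetical and are not taken from the reviewed paper.

```python
def phred_scores(quality_string, offset=33):
    """Decode a FASTQ quality string into integer Phred scores (Q)."""
    return [ord(ch) - offset for ch in quality_string]


def error_probabilities(quality_string, offset=33):
    """Convert each Phred score Q into an error probability p = 10^(-Q/10)."""
    return [10 ** (-q / 10) for q in phred_scores(quality_string, offset)]


def expected_errors(quality_string, offset=33):
    """Sum of per-base error probabilities: the expected number of wrong bases."""
    return sum(error_probabilities(quality_string, offset))


if __name__ == "__main__":
    qual = "IIII+!"  # illustrative quality string, not real data
    print(phred_scores(qual))
    print(expected_errors(qual))
```

A read could then be discarded when its expected number of errors exceeds a chosen threshold; the threshold itself is a design choice that depends on the downstream analysis.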
Modern transplantology is in need of transplants. To solve this problem, the use of animal organs and tissues for grafting to humans (xenografts) has been proposed. However, the progress in this direction is hampered by the risk of zoogenous infection of recipients. With regard to economic and ethical criteria and to the anatomical and physiological similarity to humans, the pig is the best source of xenografts. The pig genome harbors type A porcine endogenous retroviruses (PERV), which can infect human cell lines in vitro. A population of Siberian minipigs was raised at the Institute of Cytology and Genetics just for xenografting. The goal of the present study is to analyze the copy numbers of PERV A in Siberian minipigs, their founder breeds Landrace and Large White, and wild boars. The copy numbers of PERVs have been determined by absolute measurement with SYBR Green dye. End-point dilutions of a sample with a known copy number have been used for reference. The PERV A copy numbers in standard samples of Siberian minipig DNA are 2.4, 3.6, and 4.9 per cell, which is consistent with data obtained by other scientists. Minipigs and wild boars show a significant difference in retrovirus copy numbers. Thus, the Siberian minipig genome has a considerable number of type A PERVs, conceivably pathogenic to humans. It is necessary to select animals with minimum PERV numbers in the genome for xenografting. The method of PERV A quantitation with SYBR Green allows detection of such animals and selection of Siberian minipigs for reduction of this index.
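Absolute quantification against a dilution series of known copy number is usually done with a standard curve, in which the threshold cycle (Ct) is linear in the logarithm of the starting copy number. The sketch below illustrates that general technique only; it is not the authors' pipeline, and the function names and dilution values are hypothetical.

```python
import math


def fit_standard_curve(copies, ct_values):
    """Least-squares fit of Ct = slope * log10(copies) + intercept."""
    xs = [math.log10(c) for c in copies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ct_values))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def copies_from_ct(ct, slope, intercept):
    """Invert the standard curve to estimate the starting copy number."""
    return 10 ** ((ct - intercept) / slope)


def amplification_efficiency(slope):
    """PCR efficiency implied by the slope (1.0 means perfect doubling per cycle)."""
    return 10 ** (-1 / slope) - 1


if __name__ == "__main__":
    # Hypothetical dilution series generated for an ideal reaction.
    ideal_slope = -1 / math.log10(2)
    standards = [1e2, 1e3, 1e4, 1e5]
    cts = [38.0 + ideal_slope * math.log10(c) for c in standards]
    slope, intercept = fit_standard_curve(standards, cts)
    print(amplification_efficiency(slope))
    print(copies_from_ct(cts[1], slope, intercept))
```

To express the result per cell, the estimated PERV copy number would still need to be divided by the number of genomes in the reaction, which requires a separate measurement not shown here.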
Analysis of regulatory sequences
On the basis of available ChIP-seq and ChIP-chip data obtained with antibodies against the GAGA and CNC transcription factors, genome-wide binding of these factors was mapped at hours 0–12 and 16–24 of Drosophila embryogenesis, as well as at the white prepupa stage. The bulk of GAGA and CNC binding falls into promoter regions and introns, with the maximal density of peaks in the vicinity of the transcription start site. Moreover, in both 0–12 and 16–24 hour old embryos GAGA and CNC are frequently co-localized, whereas at the white prepupa stage there is no co-localization of these factors on a genome-wide scale. To select a set of genes potentially co-regulated by GAGA and CNC, we studied their co-binding in annotated regulatory regions (promoter areas and segments corresponding to the 5’-UTR and 3’-UTR of mRNA). The results clearly demonstrate that the sets of genes characterized by co-binding of both factors vary greatly between stages. Of the 353 genes with overlapping GAGA and CNC binding loci in 0–12 hour old embryos and the 611 such genes in 16–24 hour old embryos, only 61 genes “belong” to both stages. As an explanation, we propose that different sets of target genes are regulated by combinations of various GAGA and CNC isoforms, which have distinct expression patterns during Drosophila embryogenesis. Functional annotation analysis of the genes in whose regulatory regions both GAGA and CNC were found at all investigated stages shows enrichment for genes controlling embryogenesis, neurogenesis and wing development. The data obtained suggest that GAGA and CNC interact during D. melanogaster embryogenesis.
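The degree to which the two gene sets differ can be summarized with simple overlap statistics. The sketch below is illustrative only: the function names are hypothetical, and the only inputs taken from the abstract are the three counts (353 genes, 611 genes, 61 shared).

```python
def overlap_from_counts(size_a, size_b, shared):
    """Overlap statistics computed from set sizes and the size of the intersection."""
    union = size_a + size_b - shared
    return {
        "shared": shared,
        "jaccard": shared / union,          # intersection over union
        "fraction_of_a": shared / size_a,   # share of set A found in set B
        "fraction_of_b": shared / size_b,   # share of set B found in set A
    }


def overlap_stats(genes_a, genes_b):
    """Same statistics, computed directly from two collections of gene identifiers."""
    set_a, set_b = set(genes_a), set(genes_b)
    return overlap_from_counts(len(set_a), len(set_b), len(set_a & set_b))


if __name__ == "__main__":
    stats = overlap_from_counts(353, 611, 61)
    for name, value in stats.items():
        print(name, round(value, 4))
```

With the reported counts, the shared genes make up roughly 17 % of the smaller set and under 7 % of the union, which is the quantitative sense in which the sets "vary greatly".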
We considered statistical features of the distribution of transcription factor binding sites in the mouse genome obtained by ChIP-seq experiments in embryonic stem cells. Clusters containing binding sites for four or more different transcription factors were defined, and their location relative to the regulatory regions of genes was described. Two types of site co-localization were found: clusters containing binding sites for Oct4, Nanog and Sox2, located in distal regions, and clusters containing binding sites for n-Myc and c-Myc, located mainly in the promoter regions of mouse genes. Analysis of new ChIP-seq data on the binding of the transcription factors Nr5a2 and Tbx3 in the same cell type confirmed the division of clusters of transcription factor binding sites into two types: those containing binding sites of pluripotency regulators (Oct4, Nanog, and others) and those that do not. A computer program was developed for statistical processing of gene locations and chromatin domains; it analyzes experimental site localization data obtained by ChIP-seq in the mouse and human genomes. Using this program, we revealed positional preferences of different types of transcription factor binding sites and calculated the distances between the nearest groups of Oct4, Nanog and Sox2 binding sites and of n-Myc and c-Myc binding sites. The presence of nucleotide motifs of transcription factor binding sites in the selected ChIP-seq regions was estimated, and the motifs were refined. A correlation between the presence of motifs and ChIP-seq binding intensity was shown. Computer methods for estimating the clustering of binding sites of different transcription factors in new ChIP-seq data were developed. The programs are available from the authors upon request.
It is known that characteristics of the mRNA 5’ untranslated region (5’ UTR) can influence the efficiency and specificity of translation initiation. Previous knowledge about 5’ UTR characteristics was obtained theoretically and in vitro for the mRNAs of individual genes, which did not allow systematic analysis of translationally important mRNA parameters. To identify these 5’ UTR characteristics, it is necessary to analyze their relationships with the translational activity of the corresponding mRNAs. Until recently, there were no genome-wide experimental data on translation efficiency. Thanks to ribosome profiling technology, such data have now been obtained for many eukaryotic mRNAs. It therefore seems possible to reveal translationally important mRNA parameters and to predict translation efficiency from nucleotide sequences. The aim of this study was to determine the translational significance of individual 5’ UTR characteristics using experimental ribosome profiling data. A statistical analysis was carried out to reveal relationships between nucleotide sequence characteristics of human and mouse mRNAs and ribosome profiling data. Several mRNA parameters influencing translation efficiency were the most significant and showed the same trends in all three samples analyzed: a purine at start codon context position –3, the presence of an upstream AUG and the concentration of complementary G+C nucleotides reduce translation efficiency, whereas the hexanucleotides CCGCCA (5’ UTR) and AAGAAA, AAGAAG, AAGCAG, AAAAAG (CDS) increase it. A toolkit for analyzing the importance of 5’ UTR characteristics and a program for predicting translation efficiency were developed on the basis of the BioUML platform.
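The 5’ UTR characteristics named above can be computed directly from a sequence. The following is a minimal sketch, not the BioUML toolkit described in the abstract: the function name, its arguments and the example sequence are hypothetical, and G+C content of the whole 5’ UTR is used here as a simple stand-in for the G+C-related parameter.

```python
def utr_features(mrna, cds_start):
    """Compute simple 5' UTR features for an mRNA.

    mrna      -- nucleotide sequence (DNA or RNA alphabet)
    cds_start -- 0-based index of the A of the annotated start codon
    """
    seq = mrna.upper().replace("T", "U")
    if seq[cds_start:cds_start + 3] != "AUG":
        raise ValueError("no AUG start codon at cds_start")
    utr = seq[:cds_start]
    # With the A of AUG numbered +1, position -3 is the third base upstream.
    minus3 = utr[-3] if len(utr) >= 3 else None
    gc = (utr.count("G") + utr.count("C")) / len(utr) if utr else 0.0
    return {
        "minus3": minus3,
        "purine_at_minus3": None if minus3 is None else minus3 in "AG",
        "upstream_aug": "AUG" in utr,
        "has_CCGCCA": "CCGCCA" in utr,
        "gc_content": gc,
    }


if __name__ == "__main__":
    # Hypothetical transcript: 6-nt 5' UTR followed by the start codon.
    print(utr_features("GCCACCATGGCT", 6))
```

Features of this kind could then be tabulated across transcripts and compared with ribosome profiling measurements, which is the kind of statistical analysis the abstract describes.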
SNP markers in biomedicine
The following heuristic hypothesis has been proposed: if an excess of a protein in several animal organs has been experimentally identified as a physiological marker of increased aggressiveness, and if a polymorphism (SNP) can cause overexpression of the human gene homologous to the animal gene encoding this protein, then this polymorphism can be a candidate SNP marker of social dominance, whereas deficient expression corresponds to subordination, and vice versa. Within this hypothesis, we analyzed 21 human genes (ADORA2A, BDNF, CC2D1A, CC2D1B, ESR2, FEV, FOS, GH1, GLTSCR2, GRIN1, HTR1B, HTR1A, HTR2A, HTR2C, LGI4, LEP, MAOA, SLC17A7, SLC6A3, SNCA, TH), which represent the functions of proteins known as physiological markers of aggressive behavior in animals: hormones and their receptors, biosynthetic enzymes and receptors of neurotransmitters, and transcription and neurotrophic factors. These proteins may play an important role in determining hierarchical relationships in social animals. Using our previously developed Web service SNP_TATA_Comparator (http://beehive.bionet.nsc.ru/cgi-bin/mgs/tatascan/start.pl), we analyzed 381 SNPs within the region [–70; –20] relative to the start of protein-coding transcripts, which is the region of all known TATA-binding protein (TBP) binding sites. The SNPs were taken from the dbSNP database, v.147. As a result, we found 45 and 47 candidate SNP markers of dominance and submission, respectively (e.g., rs373600960 and rs747572588). Within the framework of the proposed heuristic hypothesis and dbSNP v.147, we found statistically significant (α < 10⁻⁵) evidence of natural selection against deficient expression of genes that can affect the predisposition to dominate, as well as of selection in favor of both subordinate and dominant behavior as a norm of reaction of aggressiveness (difference not significant: α > 0.35).
The proposed hypothesis, the predicted candidate SNP markers and the observed regularities of natural selection in the human genome are discussed in comparison with published data, with regard to whether they can have any relation to social dominance in humans. It was concluded that these results require experimental verification.
A new approach to the search for regulatory SNPs (rSNPs) based on ENCODE project data from ChIP-seq and RNA-seq experiments was developed. The approach was successfully used to detect rSNPs associated with colorectal cancer susceptibility. First, we used raw sequence data from 15 independent ChIP-seq experiments run on the colorectal cancer cell line HCT-116, which allowed us to generate an initial pool of 7985 SNPs located in regulatory regions. For further selection of functional SNPs, we used ChIP-seq binding bias analysis and revealed 775 SNPs that are more likely to influence transcription regulation in HCT-116 cells. RNA-seq bias analysis in HCT-116 cells was then performed. As a result, we confirmed the functionality of 231 SNPs, which were classified as rSNPs. To select rSNPs potentially associated with colorectal cancer, we chose those in strong linkage disequilibrium with SNPs associated with this pathology according to GWAS and ClinVar data. Functional annotation analysis of the genes containing the selected rSNPs confirmed the involvement of the BAIAP2L1 and BUB3 genes in colorectal cancer predisposition. We also found two genes (RRAGD and FZD6) that play a role in the RAS/MAPK and WNT signaling pathways. Although the involvement of the RAS/MAPK and WNT pathways in colon cancer is well known, these two genes have not previously been considered as candidates. Moreover, we found 14 new candidate genes that are promising for further study of colorectal cancer predisposition.
Phylogenetics and evolution
CG-rich islands (CpG islands, or CGIs) are important functional elements in vertebrate genomes. In particular, they: a) initiate transcription as promoters in most (> 50 %) vertebrate genes, in some cases bidirectionally, owing to the self-complementarity of CG dinucleotides; b) form a global methylation landscape; c) act as a transcription “switch” via methylation. The degenerate nature of a CpG island (elevated CG composition) implies an increased probability of tandem repeats and palindromes within it. This work is devoted to the identification of tandem duplications of complete CpG islands in the human genome, i.e. of mega-monomers 400–5 000 bp in size. We found a range of inter- and intragenic tandem duplications of CpG islands. Intergenic CpG island duplication is mediated by CG-rich telomeric satellites, as well as by SINE elements. One of the most pronounced tandems is located on chromosome 19, known for its abundance of segmental duplications and gene expansion. We also highlight a unique genomic segment, the DXZ4 mega-satellite in the q arm of chromosome X, which also falls into the category of CpG islands that evolved through rounds of tandem duplication.
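The abstract does not state which criteria were used to call CpG islands. As a minimal sketch, the code below computes two statistics commonly used for this purpose, G+C content and the observed/expected CpG ratio, and applies widely cited thresholds (length of at least 200 bp, G+C content of at least 0.5, ratio of at least 0.6). The thresholds and function names are assumptions made for illustration, not the authors' settings.

```python
def cpg_stats(seq):
    """Return (gc_content, observed/expected CpG ratio) for a DNA sequence."""
    s = seq.upper()
    n = len(s)
    if n == 0:
        return 0.0, 0.0
    c_count = s.count("C")
    g_count = s.count("G")
    cpg_count = s.count("CG")
    gc_content = (c_count + g_count) / n
    if c_count == 0 or g_count == 0:
        return gc_content, 0.0
    # Expected CpG count under independence is (C * G) / N.
    obs_exp = cpg_count * n / (c_count * g_count)
    return gc_content, obs_exp


def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, G+C and observed/expected thresholds."""
    gc_content, obs_exp = cpg_stats(seq)
    return len(seq) >= min_len and gc_content >= min_gc and obs_exp >= min_obs_exp


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))
    print(looks_like_cpg_island("CG" * 150))
    print(looks_like_cpg_island("AT" * 150))
```

Detecting tandem duplications of whole islands, the actual subject of the paper, would require an additional self-alignment or repeat-finding step that is not sketched here.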
Allopolyploid organisms can be formed by hybridization between closely related plant species with similar genomes. It is believed that many plant species have passed through allopolyploidization, which played a significant role in the formation of the huge diversity of plants and in their high adaptive capacity. Thanks to whole-genome sequencing of a wide range of angiosperm species and comparative analysis of genome structure, the sequence of events that formed the genomes of modern plant taxa has been reconstructed. These studies have shown that many diploid species have passed through more than one cycle of polyploidization-diploidization. The purpose of this review is to summarize estimates of the proportion of genes that undergo changes due to allopolyploidization and to illustrate the variety of mechanisms underlying the functional divergence of homoeologous copies (orthologous genes in allopolyploid subgenomes). Changes in individual copies can be associated with epigenetic features of gene organization (the methylation status of the promoter region or the presence of copy-specific small interfering RNA) or can affect the structure of the coding or regulatory regions of the gene. Studies on artificial allopolyploid plants have shown widespread transcriptional dominance and changes in transcription level compared with the genes of the diploid parental forms. Studying the transcription of individual homoeologous gene copies made it possible to estimate the extent of complete suppression of homoeologous genes in newly synthesized (0.4–5.0 %) and natural (30 %) allopolyploids. On the whole, full or partial suppression affects up to 49 % of wheat genes.
Systems biology and simulations
The external chitinous skeleton of insects is unable to respond to stimuli; external signals are received by specialized receptors. Drosophila perceives tactile stimuli through its external sensory organs, the microchaetes and macrochaetes located on the head and back (notum). The microchaetes (hairs) are numerous and arranged in regular rows along the body. The macrochaetes (bristles) are few and strictly positioned on the head and notum, forming what is referred to as the bristle pattern. Bristles act as mechanoreceptors, providing balance for the flying Drosophila. The proper bristle pattern of an adult fly develops through several stages. The basic stage is the formation of a prepattern for the future bristles, represented by proneural clusters. The proneural clusters separate from the ectodermal cells in the imaginal discs of third instar larvae and early prepupae. They are induced by prepattern factors, which are transcription factors driving expression of their target genes in certain disc regions. Reconstruction and analysis of the gene network controlling prepattern development are described here for the first time, together with the principles underlying the arrangement and function of this network. The hierarchical structure of the network, its key components, and its regulatory circuits are identified. The network comprises 80 entities interconnected via 109 regulatory interactions. The key objects of the network, displaying the greatest connectivity with its other components, are the AS-C proneural proteins encoded by the achaete and scute genes, and the proteins Decapentaplegic (Dpp) and Wingless (Wg). The structure of the network is hierarchical and has at least three control levels. The network acts as a gene ensemble owing to the coordinated functioning of the regulatory circuits controlling the activities of the corresponding genes both within and between the levels.
The net effect of the network operation is activation of the AS-C proneural genes, whose expression distinguishes the cells of the proneural cluster from the surrounding ectodermal cells.
Glaucoma is a chronic and progressive disease that affects more than 60 million people worldwide. Primary open-angle glaucoma (POAG) is one of the most common forms of glaucoma. For example, about 2.71 million people in the USA had primary open-angle glaucoma in 2011. POAG is currently a major cause of irreversible vision loss. In patients with treated open-angle glaucoma, the risk of blindness has been reported to be about 27 %. It is known that the death of optic nerve cells can be triggered by mechanical stress caused by increased intraocular pressure, which induces neuronal apoptosis and is observed in patients with POAG. There is currently a large number of scientific publications describing proteins and genes involved in the pathogenesis of POAG, including neuronal apoptosis and the cell response to mechanical stress. However, the molecular-genetic mechanisms underlying the pathophysiology of POAG are still poorly understood. Reconstruction of associative networks describing the functional interactions between these genes/proteins, including biochemical reactions, regulatory interactions, transport, etc., requires methods of automated knowledge extraction from the texts of scientific publications. The aim of this work was to analyze associative networks describing the molecular-genetic interactions between proteins and genes involved in the cell response to mechanical stress (CRMS), neuronal apoptosis and the pathogenesis of POAG using ANDSystem, a tool we developed previously for automated text analysis. It was shown that genes associated with POAG are statistically significantly overrepresented among the genes involved in the interactions between CRMS and neuronal apoptosis compared with what would be expected by chance, which may explain how POAG leads to retinal ganglion cell death.
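The abstract does not name the statistical test used to show overrepresentation. One common choice for this kind of question is the hypergeometric (one-sided Fisher) test, sketched below under that assumption. The function name and the example numbers are hypothetical and are not data from the study.

```python
from math import comb


def hypergeom_enrichment_p(total_genes, annotated, selected, overlap):
    """P(X >= overlap) when `selected` genes are drawn without replacement
    from `total_genes`, of which `annotated` carry the annotation of interest."""
    if not (0 <= annotated <= total_genes and 0 <= selected <= total_genes):
        raise ValueError("annotated and selected must lie within total_genes")
    denominator = comb(total_genes, selected)
    upper = min(annotated, selected)
    numerator = sum(
        comb(annotated, k) * comb(total_genes - annotated, selected - k)
        for k in range(max(overlap, 0), upper + 1)
    )
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical numbers: 20000 genes in total, 300 disease-associated,
    # 150 genes in the interaction network, 12 of them disease-associated.
    print(hypergeom_enrichment_p(20000, 300, 150, 12))
```

A small p-value indicates that the overlap is larger than random sampling of genes would be expected to produce; the choice of the background gene set strongly affects the result.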
Studies of the last decade offer a new perspective on the possible link between aging and circadian rhythm. New data on the role of the NAD+-dependent histone deacetylase SIRT1 in integrating the regulatory pathways of circadian rhythms and metabolism, as well as data on a new function of NAD+ as a “metabolic oscillator”, open a promising direction in this area. In this paper we propose a modification and extension of the most detailed model of the circadian oscillator, developed by Kim and Forger (2012). We included an additional feedback loop of the oscillator involving the genes/proteins NAMPT and SIRT1 as well as NAM and NAD+. Regulation of NAMPT gene transcription by the transcription factor CLOCK/BMAL1 determines the corresponding rhythm of NAMPT mRNA and protein expression. Since the enzyme encoded by this gene is key to the pathway of NAD+ biosynthesis and recycling, a circadian rhythm is also characteristic of fluctuations in the level of this coenzyme and in the activity of the NAD+-dependent histone deacetylase SIRT1. Deacetylation of circadian oscillator components by this enzyme closes the feedback loop mediated through this pathway. In particular, the effects of SIRT1 on the circadian oscillator are enhanced degradation of the Per2 protein, increased transcription of the Bmal1 gene, and deacetylation of chromatin in the E-box regions of the regulatory regions of circadian oscillator genes, with subsequent suppression of transcription. We took all of these processes into account in our extended model of the circadian oscillator. Based on experimental data on age-related changes in SIRT1 activity and NAD+ level, we studied the effect of these changes on the functioning of the circadian oscillator. The simulations showed a decrease in the expression level of several circadian oscillator genes, in particular Bmal1 and Per2, in the older age groups.
In addition, our extended model predicted an increase in the period of oscillations. The results indicate that the decrease in SIRT1 activity due to an age-related disorder of NAD+ metabolism may be one of the causes of circadian oscillator dysfunction in the suprachiasmatic nuclei. Such disorders may disrupt circadian rhythms in the body as a whole.
The nuclear protein poly (ADP-ribose) polymerase-1 (PARP-1) plays an important role in the signaling and repair of DNA. PARP-1 catalyses covalent binding of poly (ADP-ribose) polymers with itself as well as with other acceptor proteins using NAD+ as a donor of ADP-ribose. Inhibitors of poly (ADP-ribose) polymerase have been shown to be effective in improvement of radiation therapy and chemotherapy of cancer in clinical testing. Development of new poly (ADP-ribose) polymerase-1 inhibitors based on derivatives of natural compounds such as NAD+ represents a novel and promising strategy. The structure of complex of human poly (ADP-ribose) polymerase-1 with NAD+ can be a starting point for rational design of small molecule inhibitors based on NAD+ derivatives. Indeed there is no crystal structure of complex poly (ADP-ribose) polymerase-1 with nicotinamide adenine dinucleotide (NAD+) available yet. In this work using molecular modeling approaches we have predicted NAD+ binding modes with PARP-1 at the donor binding site of the catalytic domain. Using structures of PARP-1 homologs in complex with NAD+ we predicted pharmacophore restraints of NAD+ binding to PARP-1. Based on clustering of PARP-1 conformations in complex with co-crystallized inhibitors and predicted pharmacophore restraints, we proposed several possible models of NAD+ binding to PARP-1 at the donor binding site of the catalytic domain. According to the predicted models, two conformations of pyrophosphate group of NAD+ in complex with PARP-1 at the donor binding site are possible. Validation of the proposed models of NAD+ binding with PARP-1 can be achieved by quantitative structure-activity analysis of NAD+ derivatives. We designed two NAD+ derivatives, which can be used for validation of predicted NAD+ binding models.
Experimental systems biology
Plant leaf pubescence is an important feature that is responsible for microclimate formation near the epidermis. It is involved in protection against adverse biotic and abiotic environmental factors. In Solanaceae, to which the potato Solanum tuberosum L. belongs, leaf pubescence appears as multicellular unbranched trichomes of diverse size and morphology. Pubescence in these plants promotes resistance to insect pests, in particular the Colorado potato beetle and aphids, which are carriers of viral diseases. In breeding and genetic experiments, there is a need to assess the intensity of leaf pubescence in potato plants. Micrographs are commonly used for this task: different types of trichomes on the leaf surface are counted to characterize the intensity of pubescence. This approach requires visual counting of trichomes under a microscope and is fairly laborious. This protocol describes a rapid technology for quantitative assessment of potato pubescence characteristics (the number of trichomes on the leaf surface and the average trichome length) for use in the genetics and breeding of this plant. It consists of sample preparation, digital imaging of leaf folds with an optical microscope in transmitted light, and subsequent automatic image processing with the LHDetect2 software.
Nonthermal effects of terahertz radiation on living objects are currently being studied intensely, as more sources of this radiation and devices employing it are being constructed. Terahertz radiation is increasingly used in security and inspection systems and in medical and scientific appliances owing to its low quantum energy, which does not cause the severe effects on organisms that radiation with higher quantum energies does. The aim of this study was to identify protein complexes participating in the response to terahertz radiation of the archaeon Halorubrum saccharovorum H3, isolated from an extreme natural environment. We developed a microfluidic system for irradiating bacterial and archaeal cultures with terahertz radiation and exposed H. saccharovorum to terahertz radiation at a wavelength of 130 μm and a power density of 0.8 W/cm² for 5 h. We identified proteins under- or overexpressed in response to terahertz radiation using 2D electrophoresis with subsequent MALDI-TOF mass spectrometry. A total of 16 differentially expressed protein fractions with at least 1.5-fold changes in expression level were detected. The data obtained suggest that Halorubrum cells respond to terahertz radiation by changes in the expression of gene products involved in translation regulation.
In this review, we summarize the latest data concerning the reactions of Escherichia coli to nonthermal terahertz radiation and the underlying molecular mechanisms. E. coli is the simplest and most convenient model object for studying the effects of terahertz radiation: both its genetics and metabolism are well studied, and it is easily amenable to genetic engineering, allowing one to create biosensors using promoters of genes activated by certain stress factors and the reporter GFP protein. Transformed E. coli cells containing biosensors can be used to visualize their reactions to terahertz radiation based on the intensity of GFP fluorescence. In this review, we present data on the response of certain E. coli stress response systems to terahertz radiation obtained by us, as well as by other authors. We discuss experimental results for E. coli/pKatG-GFP, E. coli/pCopA-GFP, and E. coli/pEmrR-GFP biosensors that are used to detect E. coli genetic networks responding to oxidative stress, copper ion homeostasis failures, and antiseptics, respectively. The obtained data indicate that exposure to nonthermal terahertz radiation induces E. coli gene networks of oxidative stress and copper ion homeostasis, but does not activate those responding to antibiotics, protonophores, or superoxide anions. The fact that E. coli/pKatG-GFP and E. coli/pCopA-GFP biosensors have different activation and reaction periods when exposed to terahertz radiation and natural inducers suggests that reactions of oxidative stress and copper ion homeostasis systems to terahertz radiation are specific.
HUMAN GENETICS
Retention of lactase activity in adulthood (lactase persistence) is one of the most important adaptive traits for human populations that consume fresh milk from domestic animals. At the molecular-genetic level, lactase persistence is determined by the presence of specific alleles at polymorphic sites in cis-regulatory elements of the LCT gene, located on chromosome 2q21. The elucidation of the molecular-genetic causes of lactase persistence has made this trait one of the most convenient for studying the mechanisms of human population adaptation to environmental conditions. However, the populations of many regions remain insufficiently investigated with respect to the genetic variability of the LCT locus. This paper presents the results of polymorphism analysis of a locus that includes the enhancer element of the LCT gene and its flanking regions in two Turkic-speaking populations from southern Siberia, Altaian Kazakhs and Khakasses. It was found that, among all the polymorphic variants associated with lactase persistence, the “European” allele LCT-13910T is the most characteristic of the Turkic-speaking populations of the Altai-Sayan region. The spread of the “European” allele LCT-13910T into the gene pool of the populations of southern Siberia could be related to migration waves of ancient herders from western Eurasia during the Bronze Age (3rd–2nd millennia BC). The lower LCT-13910T allele frequency and total frequency of its carriers in the Turkic-speaking populations of southern Siberia, compared with the majority of European populations and the Kazakhs of southern Central Asia, can be attributed to: (1) a significant influence on the gene pool of the Altai-Sayan populations from Eastern Eurasian populations, in which the LCT-13910T allele is rare; (2) a lesser adaptive significance of lactase persistence for south Siberian populations than for the populations of Europe.
Rare and unique SNPs in the locus under consideration that were found in the Altaian Kazakhs (LCT-13895G > C and LCT-13927C > G) and Khakasses (LCT-14011C > T) potentially play a role in regulation of LCT gene expression, because they are located within the enhancer, regulating activity of its promoter.
The aryl hydrocarbon receptor (AhR), a ligand-activated transcription factor, participates in a wide range of critical cellular events in response to endogenous signals or xenobiotic chemicals. 2,3,7,8-tetrachlorodibenzo-para-dioxin (TCDD) is an AhR ligand with a very high binding affinity for the receptor. TCDD is the most toxic of the dioxin xenobiotics and induces a broad spectrum of biological responses, including immunotoxicity and cancer. The ligand:AhR:ARNT complex functions as a transcription factor, binding to dioxin responsive element (DRE) sequences in the regulatory regions of target genes. Macrophages are key regulators of the innate immune response, as well as one of the first cell types to respond to chemical stress, so the study of the action of TCDD on these cells is important. Putative DREs were predicted using the SITECON software tool in the regulatory regions of the genes encoding the transcription factors REL, RELA and IRF1, which are expressed in macrophages. Nuclear extract and total RNA were isolated from U937 macrophages treated with 10 nM TCDD (or 0.1 % DMSO as a control) for 1, 3 and 6 hours. The binding of the TCDD:AhR:ARNT transcription complex from the nuclear extract to double-stranded oligonucleotides containing the putative DREs was studied by EMSA. The isolated RNA was used to study TCDD-mediated changes in gene expression levels by real-time PCR with SYBR Green I. The data obtained demonstrate the functional activity of the DREs in the IRF1, REL and RELA gene promoters via the AhR signaling pathway.
Ecological genetics
The Siberian silk moth (Dendrolimus superans sibiricus) is a very dangerous pest of coniferous trees, in particular larch and various pine species. Outbreaks of this pest lead to defoliation and forest destruction over a vast area of the Asian part of Russia. Many biological agents, such as viruses, pathogenic microorganisms and parasitoids, limit the growth of Siberian silk moth populations. Here we consider the non-pathogenic symbiotic bacterium Wolbachia, which is transmitted transovarially from mother to offspring. This symbiont is able to affect the biology of its host. In theory, Wolbachia can either restrain or promote population growth, which is why the Wolbachia-host relationship is of interest. Two samples from a Siberian silk moth population collected in 2014 and 2016 in the Khabarovsk area were studied for Wolbachia infection. We found a high Wolbachia prevalence in the population: all specimens in the 2014 sample were infected, and 90 % of the specimens in the 2016 sample were infected. Analysis of two loci from the MLST protocol revealed at least two distinct Wolbachia strains, namely ftsZ-36, fbpA-4 and ftsZ-22, fbpA-9. In this study, a possible role of Wolbachia in the symbiotic association with the Siberian silk moth and general approaches to investigating this symbiosis are discussed.
Fungi are among the major plant pathogens, so the investigation of plant resistance genes is an important task. Many wheat resistance genes have been identified in recent years. However, sequencing of the Triticum aestivum L. genome is still in progress, and the nucleotide sequences of most resistance genes are not yet known. In addition, studying allelic variants of resistance genes is important for a better understanding of the molecular mechanisms of their action. In this paper we present an information resource that accumulates data on sequenced genes of wheat and its relatives that confer resistance to diseases caused by fungal pathogens. The database (Pathogenesis-Related Genes, PRG) contains information on the chromosomal localization and functional activity of genes, their nucleotide sequences, and single nucleotide polymorphisms together with their associated effects. PRG provides data on the encoded proteins, pathogens and diseases, as well as on resistance gene expression patterns in response to pathogen inoculation, exposure to hormones and various external stimuli. It also cross-references related entries in the nucleotide sequence (GenBank) and protein (UniProt) databases. Information is entered into the database through annotation of scientific publications and manual curation. PRG currently holds data on 75 allelic variants of 66 resistance genes. The PRG database was developed on the SRS (Sequence Retrieval System) platform. This system supports complex queries and visualization tools and automatically generates a web interface that presents the information in table or text format. PRG may be useful for researchers studying plant biology or breeding new plant cultivars resistant to fungal diseases. It is available at http://srs6.bionet.nsc.ru/srs6bin/cgi-bin/wgetz?-page+top+-newId.
Mainstream technologies in genetics and cell biology
A zygote, the only totipotent cell of the developing organism, transforms into a complex multicellular entity with billions (in humans) of highly specialized cells and tissues. Most adult tissues are maintained by a combination of highly dynamic processes of senescence, apoptosis and rejuvenation, with new cells constantly arising from dispersed depots of stem cells. Studying individual cell fates and their intertwined relations thus aids in understanding ontogenetic development as well as pathogenic processes in the body. Direct observation of developing embryos uncovered the fate of single blastomeres in an ascidian and a nematode. In both cases, research benefited from the simplicity of these organisms: in the ascidian, each blastomere naturally has a unique signature, and in the nematode, the transparency of the worm’s 959-cell body allows every cell to be traced individually through development. In most cases, however, studying cell lineages and identifying stem cell subpopulations is a true challenge for investigators. To trace cell fate, novel methods were developed that introduce into cells special tags that are inherited through cell divisions. Every descendant of a marked cell bears the same tag and can easily be distinguished from unrelated neighboring cells. This review focuses on modern methods for cell tracing with dyes and with genetic constructs encoding protein reporters that mark cell lineages. Special attention is given to genome-integrated tags (genetic labeling), such as viral and cellular barcoding. One chapter of the review describes recent advances in CRISPR/Cas9-based cellular barcoding.
Transgenesis has become routine in modern biological studies. The most popular method for producing transgenic animals, pronuclear microinjection, frequently leads to host gene disruption due to random transgene integration. In this paper, we report our analysis of morphophysiological parameters of the transgenic mouse line GM9, in which a transgene designed for milk-specific expression of the human granulocyte-macrophage colony-stimulating factor (GM-CSF) gene was integrated into an intron of the Contactin 5 gene (Cntn5). We studied Cntn5 expression by RT-PCR and found that its expression in the brain, the primary organ of Cntn5 activity, was unperturbed. However, transgenic animals had fewer Cntn5 transcripts in other tissues, such as the kidney and heart. In addition, we observed a decreased amount of splice variants containing the Cntn5 exons that flank the transgene integration site. These data suggest that the transgene integration event might affect proper Cntn5 splicing in some tissues. Some publications suggest that certain polymorphisms in the Cntn5 gene are associated with obesity and arterial hypertension in humans. We evaluated core parameters of lipid metabolism and heart activity in mice homozygous and heterozygous for the Cntn5 mutation, using wild-type animals as controls. Our results show that homozygous mutant mice have lower body weight than controls and that this is caused by slower accumulation of fat tissue. Cntn5 mutants also exhibit abnormalities in blood circulation: homozygous Cntn5 mutants are characterized by higher blood pressure and heart rate, as well as faster blood flow in the tail vessels. Heterozygous animals showed intermediate values for all of these parameters.
Embryonic stem cells are commonly used to generate transgenic mice. After injection into a blastocyst, embryonic stem cells can contribute to the development of chimeric animals. Injection of genetically modified embryonic stem cells can lead to germline transmission of a transgene or genomic modification in chimeric mice. Such founders are used to produce transgenic mouse lines. Several projects are dedicated to the production of knock-out mouse lines (KOMP Repository, EUCOMM, Lexicon Genetics). Nevertheless, there is a need for complex genome modifications, such as large deletions, insertion of reporter genes into the 3’ regulatory sequence of a gene, or site-specific modifications of the genome. For this purpose, researchers need an embryonic stem cell line that is able to contribute to chimeric animal formation even after prolonged culture in vitro. Several lines of mouse embryonic stem cells were produced in the Laboratory of Developmental Genetics of the Institute of Cytology and Genetics SB RAS. We tested the DGES1 cell line (2n = 40, XY; 129S2/SvPasCrl genetic background) for chimeric mouse production at the Center for Genetic Resources of Laboratory Animals at ICG SB RAS. Embryonic stem cells were injected into 136 blastocysts (B6D2F1 genetic background), which were transplanted into CD-1 mice. Among 66 progeny, 15 were chimeric, 4 of which were more than 80 % chimeric as judged by coat color. All chimeras were males without developmental abnormalities. Ten of the 15 males were fertile. Microsatellite analysis of the progeny of chimeric mice revealed a contribution of the DGES1 embryonic stem cell line to gamete formation. Thus, the novel DGES1 embryonic stem cell line can be used efficiently for transgenic mouse production with B6D2F1 blastocysts and CD-1 recipients.
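The counts reported above can be turned into simple efficiency rates. The short sketch below does only that arithmetic. All input numbers come from the abstract, while the variable names and the choice of which ratios to report are illustrative.

```python
def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Counts as reported in the abstract.
blastocysts_injected = 136
progeny_born = 66
chimeras = 15
high_grade_chimeras = 4   # more than 80 % chimeric by coat color
fertile_chimeras = 10

rates = {
    "progeny per injected blastocyst": percent(progeny_born, blastocysts_injected),
    "chimeras among progeny": percent(chimeras, progeny_born),
    "chimeras per injected blastocyst": percent(chimeras, blastocysts_injected),
    "high-grade among chimeras": percent(high_grade_chimeras, chimeras),
    "fertile among chimeras": percent(fertile_chimeras, chimeras),
}

for label, value in rates.items():
    print(f"{label}: {value:.1f} %")
```

Reporting chimeras both per progeny and per injected blastocyst keeps the two denominators separate, since they answer different questions about the procedure's yield.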
Over the past few years, CRISPR/Cas techniques have revolutionized genome editing. Since the original paper on CRISPR/Cas9 genome editing, researchers have proposed numerous modifications of the key components of the CRISPR/Cas9 system to make it extremely efficient. Nowadays, CRISPR/Cas systems can be used not only to modify genomes, but also to control the expression levels of defined genes, visualize loci of interest within the nuclei of living cells, change the methylation status of mammalian CpG sites, and serve many other purposes. Owing to its extremely high efficacy and ease of use, the CRISPR/Cas system has been employed in a large number of studies in various areas of biology and biotechnology. We have recently published a review describing in detail the various CRISPR/Cas systems, the mechanisms of their functioning, and applications of the techniques. Despite the broad range of potential applications of CRISPR/Cas systems, they are mostly used for genome editing. However simple the system may be, there are a number of potential pitfalls in adopting it in CRISPR/Cas-naïve laboratory settings. In this article, we describe protocols for generating a CRISPR/Cas9 system. We start with a short description of the theoretical aspects underlying Cas9-mediated genome editing. Next, we describe a step-by-step protocol for guide RNA vector design and assembly, and several ways of evaluating the system qualitatively and quantitatively. Finally, we report genome editing protocols for the modification of embryonic stem cells and zygotes.
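As a rough illustration of the first step in guide RNA design, the sketch below lists candidate target sites for the commonly used Streptococcus pyogenes Cas9, which recognizes a 20-nucleotide protospacer immediately followed by an NGG PAM. This is not the protocol from the article: the function names and the demo sequence are hypothetical, and real design would also need off-target and efficiency scoring, which is omitted here.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string over A, C, G, T."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_protospacers(seq, length=20):
    """List (strand, start, protospacer, PAM) for every NGG-adjacent site.

    For the "-" strand, start is an index into the reverse complement,
    not into the original sequence.
    """
    seq = seq.upper()
    # A lookahead is used so that overlapping candidates are all reported.
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % length)
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits


# Hypothetical demo sequence: 20 A's followed by a TGG PAM.
demo_hits = find_protospacers("A" * 20 + "TGG")
for strand, start, protospacer, pam in demo_hits:
    print(strand, start, protospacer, pam)
```

Scanning the reverse complement as well matters because Cas9 can target either DNA strand, so a site with CCN upstream on the given strand is a valid candidate on the opposite strand.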
Current progress in cell biology is closely tied to the development of somatic cell reprogramming technology. This technology makes it possible to produce induced pluripotent stem cells (iPSCs) from human somatic cells, for instance, from skin cells. Like embryonic stem cells, iPSCs are pluripotent. The production of iPSCs has opened new horizons for patient-specific cell therapy. Many researchers consider iPSCs a real basis for future regenerative medicine. Producing iPSCs from a patient’s own cells, differentiating them into somatic cells, and transplanting them back into the same patient would avoid immunological rejection. In addition, CRISPR/Cas, a recently developed technology for directed genome modification, allows genetic mutations in iPSCs to be corrected. Thus, genetic mutations could be corrected in vitro, and after differentiation into the desired cell type, the cells could be transplanted into a patient. CRISPR/Cas could also be used to introduce practically any mutation into iPSCs to create disease-specific model cell lines that would facilitate studies of disease mechanisms and pharmaceutical drug testing. It is possible to turn off any gene or genes, to insert a genetic construct into a selected genomic region in order to switch genes on and off temporarily, and to remove chromosomal regions. Efficient use of iPSCs in biomedical research requires cell banks that are open to general use. Currently, there are no pluripotent stem cell lines in Russian Federation cell banks. Moreover, it is essential to develop standardized practices for the culture and storage of this cell type. This mini-review addresses the need for a pluripotent stem cell bank in the Russian Federation and provides a detailed description of such a bank and a recommended protocol for cell line deposition and use.
Virotherapy is one of the rapidly developing areas of cancer treatment; its advantage is the selective destruction of cancer cells with minimal damage to normal cells of the body. Orthopoxviruses are a promising basis for oncolytic drugs. They have a number of advantages over other viral vectors, one of which is a large genome capacity that allows genes encoding proteins with antitumor properties to be cloned into the viral genome. In this study, we compared the replicative properties of ten variants of vaccinia virus (strain LIVP of VACV) in a human glioblastoma cell culture. Some of these viruses carry additional genes, such as the genes encoding granulocyte-macrophage colony-stimulating factor, the apoptosis-inducing protein TRAIL, and green fluorescent protein. We also studied a virus with five virulence genes deleted (the genes encoding hemagglutinin, γ-interferon-binding protein, thymidine kinase, complement-binding protein and the Bcl2-like inhibitor of apoptosis), which has significantly lower reactogenicity and neurovirulence than the original LIVP strain of VACV. The data suggest that vaccinia virus variants with a defective thymidine kinase gene replicate most actively in glioblastoma cell culture.
Modeling of disorders
Because the renin-angiotensin system (RAS) plays a broad role in regulating fluid and electrolyte balance and arterial pressure, it is currently hypothesized that alterations in the systemic circulating or local tissue RAS are among the most important pathogenetic factors in the development of essential hypertension. The aim of the study was to investigate circulating and local tissue RAS activities in ISIAH rats with stress-induced arterial hypertension. We estimated the serum levels of renin, angiotensin-converting enzyme, angiotensin II and aldosterone by enzyme-linked immunosorbent assay, and measured the mRNA expression of RAS genes in kidney, adrenal and brain tissues by real-time polymerase chain reaction. The mRNA expression of the renin gene (Ren) in the ISIAH rats was significantly decreased compared to the normotensive WAG rats, but plasma renin concentrations did not differ. At the same time, the serum levels of angiotensin II and aldosterone in the ISIAH rats were elevated, which suggests the existence of an ectopic site of angiotensin synthesis. Expression of RAS genes in the adrenals of hypertensive rats was unchanged. By contrast, RAS gene expression was significantly increased in the brain tissues. The mRNA level of the Ren gene was increased in the hypothalamus, and the mRNA level of the Ace gene was increased in the brain stem of the ISIAH rats. This may indicate a local increase in RAS activity in the brain tissues of ISIAH rats. Nevertheless, the results of the study define the ISIAH rat strain as a model of human low-renin hypertension.
Autism spectrum disorders are a distinct group of disorders with a very high genetic component. Genetic screening has identified hundreds of mutations and other genetic variations associated with autism, and bioinformatic analysis of signaling pathways and gene networks has shown that many of the affected genes are involved in the functioning of synapses. A synapse is a site of electrochemical communication between neurons and an essential unit for learning and memory. Communication between neurons is plastic. The most prominent forms of synaptic plasticity are accompanied by changes in protein biosynthesis, both in the neuron body and in dendrites. Protein biosynthesis, or translation, is a carefully regulated process in which mTOR (mammalian or mechanistic target of rapamycin) plays a central role. Normally, mTOR-regulated translation is slightly inhibited, and in most cases mutational damage to at least one link of the mTOR signaling pathway increases translation and leads to impaired synaptic plasticity and behavior. Deregulation of local translation in dendrites is linked to the following monogenic autism spectrum disorders: neurofibromatosis type 1, Noonan syndrome, Costello syndrome, Cowden syndrome, tuberous sclerosis, fragile X syndrome, and Rett syndrome. The review considers the most important mutations leading to monogenic autism, as well as the possibility of mechanism-based treatment of certain autism spectrum disorders.