Optogenetic regulation of endogenous gene transcription in mammals

Despite the rapid development of approaches for precise control of exogenous gene transcription in time and space, designing systems that provide similarly tight regulation of endogenous gene expression is much more challenging. However, finding ways to control the activity of endogenous genes is necessary for further progress in safe and effective gene therapies and regenerative medicine. In addition, such systems are of particular interest for genetics, molecular biology, and cell biology. An ideal system should ensure tunable and reversible spatio-temporal control over the transcriptional activity of a gene of interest. Although there are drug-inducible systems for transcriptional regulation of endogenous genes, optogenetic approaches seem to be the most promising for gene therapy applications because, unlike drug-inducible systems, they are noninvasive and do not exhibit toxicity. Moreover, they do not depend on the diffusion rate or pharmacokinetics of a chemical inducer and exhibit fast activation-deactivation switching. Among optogenetic tools, long-wavelength light-controlled systems are preferable for use in mammalian tissues in comparison with tools utilizing shorter wavelengths, since far-red/near-infrared light has the maximum penetration depth due to lower light scattering caused by lipids and reduced tissue autofluorescence at wavelengths above 700 nm. Here, we review such light-inducible systems, which are based on synthetic factors that can be targeted to any desired DNA sequence and provide activation or repression of a gene of interest. The factors include zinc finger proteins, transcription activator-like effectors (TALEs), and the CRISPR/Cas9 technology. We also discuss the advantages and disadvantages of these DNA targeting tools in the context of light-inducible gene regulation systems.


Introduction
The dynamic nature of gene transcription is essential for cellular programming, homeostasis, environmental adaptation, development, and behavior of living organisms. Therefore, the design of new approaches to the regulation of gene expression is important for the development of gene therapy. A variety of drug-inducible systems (e.g., tetracycline and rapamycin) and light-sensitive tools enabling control of exogenous reporter genes are known (Gossen, Bujard, 1992; Gossen et al., 1995; Rivera et al., 1996; Yazawa et al., 2009; Kennedy et al., 2010; Wang et al., 2012; Müller et al., 2013; Kaberniuk et al., 2016; Redchuk et al., 2017, 2018). However, the development of technologies that modulate transcription of endogenous genes in the mammalian genome seems to be more challenging. Recently, some approaches that allow control of endogenous gene transcription were designed. They are based on zinc finger proteins, transcription activator-like effectors (TALEs), and the clustered, regularly interspaced short palindromic repeat (CRISPR)/CRISPR-associated 9 (Cas9) technology. The first two strategies require engineering of proteins with unique DNA-binding specificity. However, the protein engineering protocols are laborious and time-consuming. Additionally, such proteins are not always effective and often exhibit modest activities. In contrast to zinc finger- and TALE-based synthetic proteins, Cas9-based transcription factors are targeted to DNA sequences by guide RNA molecules (Jinek et al., 2012). The ability to define the sequence specificity of transcription factors by a simple exchange of short targeting sequences provides an opportunity to regulate the expression of numerous genes without the need for laborious protein engineering.
Here, we review the light-controlled systems for regulation of endogenous gene expression based on zinc finger proteins, TALEs, and nuclease-inactive Cas9 (dCas9). Unlike drug-inducible tetracycline and rapamycin systems, optogenetic systems are non-toxic, exhibit fast activation-deactivation switching and enable noninvasive control of animal physiology and behavior. Among all existing optogenetic systems, far-red- and near-infrared-light-controlled tools are favorable compared to tools utilizing shorter wavelengths, since long-wavelength light has the maximum penetration depth due to its low absorbance by hemoglobin, melanin, and water in mammalian tissues.

Systems based on zinc finger transcription factors
Zinc finger proteins are a major class of DNA-binding proteins that recognize specific DNA target sequences through one-to-one non-covalent interactions between individual amino acids of the recognition helix and individual nucleotide bases (Pavletich, Pabo, 1991; Klug, 2010). Since the separate zinc fingers function as independent modules, fingers with different specificities can be linked in series to generate arrays that recognize long DNA sequences. Many zinc finger domains targeting different sequences were engineered by rational design or high-throughput selection (Berg, 1988; Choo, Klug, 1994a, b).
The first zinc finger-based system for transcriptional regulation of endogenous genes was described in 1994 (Choo et al., 1994), when a three-finger peptide binding specifically to a unique 9-bp DNA sequence was created. This peptide repressed expression of the p190 Bcr-Abl oncogene due to a transcriptional blockage imposed by the sequence-specific binding of the peptide. Later, functional transcription factors were made by fusion of zinc finger DNA-binding modules with appropriate effector domains. For instance, synthetic transcriptional repressors were engineered by fusion of zinc fingers to a KOX1 repression domain from the KRAB zinc finger family (Papworth et al., 2003; Reynolds et al., 2003). To activate transcription of the genes of interest, fusions of the zinc finger modules to transcriptional activation domains such as NF-κB p65, herpes simplex virus VP16 and its tetrameric repeat VP64 were used (Beerli et al., 2000; Liu et al., 2001; Rebar et al., 2002). To allow temporal regulation of the activity of zinc finger transcription factors, their drug-inducible versions were created by adding a progesterone receptor ligand-binding domain (Dent et al., 2007), estrogen receptor homodimers or retinoid X receptor-α/ecdysone receptor heterodimers (Magnenat et al., 2008).
In addition to the constitutive and chemically induced approaches, a system of light-inducible transcription using engineered zinc finger proteins (LITEZ) was designed (Polstein, Gersbach, 2012, 2014). This system utilizes two blue light-inducible dimerizing proteins from Arabidopsis thaliana, Gigantea (GI) and the LOV domain of the FKF1 protein (Fig. 1).
The LOV domain was fused to three repeats of the VP16 activation domain, and GI was linked to a zinc finger protein and a nuclear localization signal (NLS). Two four-finger proteins and one six-finger protein that were well characterized in previous studies (Beerli et al., 2000; Perez et al., 2008) were used. A GFP reporter integrated into the genome of HEK293T cells was used as a model of an endogenous target gene. Blue light illumination (450 nm) initiated heterodimerization between GI and LOV, which resulted in the translocation of the LOV-VP16 fusion to the zinc finger protein target sequence and subsequent transcription activation of the endogenous GFP reporter. After 30 h of pulsed blue light illumination, a 4-fold increase in the GFP-positive cell count was observed in the illuminated samples (~16% GFP-positive cells) compared to control samples incubated in darkness (~4% GFP-positive cells). Additionally, the authors detected a 30% increase in the mean GFP fluorescence of illuminated cells compared to the control ones. Finally, it was shown that light-induced transcriptional activation by LITEZ is reversible and repeatable by modulation of the illumination time (Polstein, Gersbach, 2012, 2014).
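The fold-change figures quoted throughout this review are simple ratios of a readout in the lit state to the same readout in darkness. The following minimal Python sketch only illustrates that arithmetic using the approximate LITEZ values given above; the function names are hypothetical and are not taken from the cited studies.

```python
def fold_change(lit: float, dark: float) -> float:
    """Ratio of a readout in the lit state to the same readout in darkness."""
    if dark <= 0:
        raise ValueError("the dark (baseline) readout must be positive")
    return lit / dark


def fold_from_percent_increase(percent: float) -> float:
    """Convert a percent increase over baseline into a fold change."""
    return 1.0 + percent / 100.0


# Approximate LITEZ values quoted in the text: ~16% GFP-positive cells
# under illumination versus ~4% in darkness.
print(fold_change(16.0, 4.0))  # 4.0

# A 30% increase in mean fluorescence corresponds to roughly a 1.3-fold change.
print(round(fold_from_percent_increase(30.0), 2))
```

Expressing both readouts as fold changes makes clear that the fraction of responding cells (about 4-fold) rose far more than the mean fluorescence per cell (about 1.3-fold).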

Systems based on transcription activator-like effectors
The TALEs of the plant bacterial pathogen Xanthomonas represent another class of modular DNA-binding proteins. They recognize DNA by highly conserved tandem repeats, each 33-35 amino acids in length (Moscou, Bogdanove, 2009). These repeats specify nucleotides via unique repeat-variable diresidues (RVDs) at amino acid positions 12 and 13. There is a strong correlation between RVDs and the corresponding nucleotide in the TALE-binding site (Boch et al., 2009). This association makes it possible to design sequence-specific DNA-binding proteins, similarly to the construction of zinc finger transcription factors.
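The RVD-nucleotide correspondence can be illustrated by a short Python sketch that translates a target DNA sequence into an ordered list of RVDs, one repeat per base. The mapping used here (NI for A, HD for C, NN for G, NG for T) is the simplified code commonly cited in the TALE literature and is an assumption of this example rather than something specified in the text above; NN is also known to tolerate A, and the 5ʹ thymine that usually precedes natural TALE sites is not modeled. The function name is hypothetical.

```python
from typing import List

# Simplified, commonly cited RVD preferences (assumption of this sketch):
# one RVD per target base; NN is used for G although it can also bind A.
RVD_FOR_BASE = {"A": "NI", "C": "HD", "G": "NN", "T": "NG"}


def design_rvds(target: str) -> List[str]:
    """Return the RVD of each TALE repeat needed to read `target` 5' to 3'."""
    target = target.upper()
    unknown = sorted(set(target) - set(RVD_FOR_BASE))
    if unknown:
        raise ValueError(f"unsupported characters in target: {unknown}")
    return [RVD_FOR_BASE[base] for base in target]


print(design_rvds("TCGA"))  # ['NG', 'HD', 'NN', 'NI']
```

Because each repeat reads a single base, the length of the repeat array grows linearly with the length of the target site, which is why cloning of new TALE variants becomes laborious for long targets.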
Since cloning of new TALE variants is challenging due to a large number of repeat domains, a hierarchical ligation-based strategy was developed to overcome this problem (Zhang et al., 2011). To generate TALE-based artificial transcription factors, VP16 and VP64 activation domains as well as SID and KRAB repression domains were fused to TALEs (Geissler et al., 2011; Miller et al., 2011; Cong et al., 2012). Small-molecule-inducible TALE transcription factors were also described. For instance, a 2-3-fold upregulation of the target icam1 gene was observed in HeLa cells transfected with plasmids encoding TALE fusions with the ligand-binding domain of the chimeric single-chain retinoid X receptor-α/ecdysone receptor in response to ponasterone A treatment (Mercer et al., 2014).
In addition to the TALE-based systems described above, an optogenetic two-hybrid system of light-inducible transcriptional effectors (LITEs) was developed (Konermann et al., 2013). The LITE system consists of two components: a TALE fused to the blue-light-sensitive cryptochrome 2 (Cry2) protein from A. thaliana, and the interacting partner of Cry2, the CIB1 protein, fused to the VP64 activation domain. Blue light (466 nm) triggers heterodimerization of Cry2 and CIB1, recruiting the VP64 transcription activation domain to the promoter of the target gene (Fig. 2). A panel of different LITEs was developed and applied to upregulate expression of 28 genes (Hat1, Sirt1, Mchr1, Htr1b, etc.) in cultured mouse primary neurons. As a result, a 1.5-30-fold increase in mRNA levels was observed upon blue light (466 nm) illumination for 24 h compared to darkness (Konermann et al., 2013). Additionally, the LITE system was introduced into cultured mouse primary cortical neurons to control expression of the Grm2 gene, which resulted in an approximately 7-fold increase of its mRNA level after 24 h of blue light illumination compared to darkness. Moreover, application of the LITE system in the mouse prefrontal cortex in vivo caused a 2-fold increase in the Grm2 mRNA level after 12 h of light stimulation (473 nm) compared to GFP-only controls (Konermann et al., 2013).
A modification of this system, LITE2.0, allowed a six-fold reduction of the background Neurog2 activation compared with the original design, resulting in a 20-fold induction of the Neurog2 transcription level under 12 h of blue light (466 nm) illumination in Neuro 2a cells compared to darkness (Konermann et al., 2013). Further, a TALE-based epigenetic modifier (epiTALE) was developed by fusing Cry2 with an NLS and four repeats of the repressive histone effector SID, whereas the TALE was linked with an NLS and CIB1. As a result, a two-fold light-mediated transcriptional repression of Grm2 accompanied by a two-fold reduction in H3K9 acetylation at the targeted Grm2 promoter was observed in neurons after 24 h of light stimulation.
A set of 32 variants of the epiTALE system containing different repressive histone effector domains (e.g., histone deacetylases, methyltransferases, acetyltransferase inhibitors) was also developed. Of these, 23 variants caused a 2-3-fold repression of the Grm2 gene in primary neurons and 20 variants led to a 1.5-2-fold repression of the Neurog2 gene in Neuro 2a cells after 24 h and 12 h of blue light (466 nm) illumination, respectively (Konermann et al., 2013).

Systems based on the CRISPR/Cas9 technology
The third type of system for transcriptional regulation of endogenous genes in mammals is based on the CRISPR/Cas9 technology, which consists of the Cas9 nuclease protein and a single guide RNA (sgRNA) that allows the nuclease to bind a specific DNA sequence through RNA-DNA base pairing (Sternberg et al., 2014). The most commonly used Cas9, from Streptococcus pyogenes, requires a 5ʹ-NGG protospacer-adjacent motif (PAM) immediately adjacent to a 20-nt DNA target sequence in the genome (Nishimasu et al., 2014).
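The targeting requirement described above (a 20-nt target sequence immediately followed by a 5ʹ-NGG PAM) can be sketched as a simple sequence search. The Python example below is a minimal illustration under stated assumptions: it scans only the given strand, accepts only unambiguous A/C/G/T bases, and ignores every other consideration relevant to sgRNA design (off-target sites, position relative to the promoter, chromatin state). The function name and the example sequence are hypothetical.

```python
import re
from typing import List, Tuple


def find_spcas9_targets(seq: str, protospacer_len: int = 20) -> List[Tuple[int, str, str]]:
    """Find candidate targets followed directly by an NGG PAM on the given strand.

    Returns (start index, protospacer, PAM) tuples. A lookahead is used so that
    overlapping candidate sites are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical sequence: a 20-nt protospacer, an AGG PAM, and two flanking bases.
example = "GATTACAGATTACAGATTAC" + "AGG" + "TT"
for start, protospacer, pam in find_spcas9_targets(example):
    print(start, protospacer, pam)  # 0 GATTACAGATTACAGATTAC AGG
```

Retargeting therefore requires only a new 20-nt guide sequence matching a site found in this way, whereas the Cas9 protein itself is left unchanged.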
The dCas9 protein, which lacks nuclease activity, was used for programmable RNA-guided transcriptional regulation of diverse human genes. The first system utilizing the CRISPR/Cas9 technique for transcription regulation instead of genome editing in mammals was described in 2013 (Qi et al., 2013). Subsequently, several studies showed that dCas9 fused with effector domains allows repression or activation of target genes (Gilbert et al., 2013; Maeder et al., 2013; Perez-Pinera et al., 2013).
To enable precise spatiotemporal control of gene expression, a targeted photoactivation system based on dCas9 was developed (Nihongaki et al., 2015). This system consists of the photolyase homology region (PHR) of the blue light-sensitive Cry2 protein fused with the VP64 transcriptional activator domain, and dCas9 fused with the Cry2 binding partner, CIB1. Upon blue light (470 nm) illumination, Cry2 and CIB1 form heterodimers, resulting in recruitment of the VP64 effector domain to the target gene and, consequently, in transcriptional activation of the latter (Fig. 3, a).
To select the most effective light-inducible CRISPR/Cas9 transcription system, several variants of the constructs, differing in the type of activator domain (VP64 or p65), the number and location of NLSs, and the size of the CIB1 protein (full-length or truncated), were tested. As a result, a combination of the NLS-dCas9-CIB1(Δ308-334) and NLS×3-Cry2PHR-p65 fusion proteins was selected (Nihongaki et al., 2015). This optimized system was applied to HEK293T cells for the light-induced expression of the endogenous ascl1 gene. Expression of multiple sgRNAs in a single cell enabled synergistic light-induced activation of the gene (~50-fold in the lit state compared to darkness) in contrast to the use of individual sgRNAs (up to 10-fold). It was shown that 3 h of blue light (470 nm) illumination was enough for an approximately ten-fold induction of ascl1 transcription. Light-induced activation of transcription by this system was also reversible and repeatable. Additionally, the possibility of multiplexed photoactivation of different genes was demonstrated. To achieve that, HEK293T cells were co-transfected with the constructs NLS-dCas9-CIB1(Δ308-334) and NLS×3-Cry2PHR-p65 and with multiple sgRNAs targeting the myod1, nanog and il1rn genes. As a result, a 3-1000-fold increase in mRNA levels of these genes was observed, confirming that this system can be used for multiplexed photoactivation of user-defined endogenous genes (Nihongaki et al., 2015).
A similar strategy, the light-activated CRISPR/Cas9 effector (LACE) system, was developed for the dynamic regulation of endogenous genes (Polstein, Gersbach, 2015). The optimized LACE system consists of two components: CIBN-dCas9-CIBN, where CIBN is the N-terminal fragment of CIB1, and Cry2-VP64. Application of this system in HEK293T cells resulted in an 11- and 400-fold upregulation of il1rn gene transcription compared to darkness after 2 h and 30 h of blue light (450 nm) illumination, respectively. This system was also applied for simultaneous photoactivation of multiple human genes (hbg1/2, il1rn and ascl1) in HEK293T cells. As a result, mRNA levels of the studied genes were significantly greater in the lit state than in darkness (Polstein, Gersbach, 2015).
One more system, which utilizes dCas9 to downregulate transcription at endogenous genomic loci, was developed (Pathak et al., 2017). This system is based on the previously described light-inducible clustering of Cry2-tagged proteins (Ozkan-Dagliyan et al., 2013), which results in functional loss of their activity. This clustering property of Cry2 was used to block transcription of the targeted gene with light. To achieve that, the Cry2-dCas9-VP64 fusion protein and sgRNAs targeting the human il1rn promoter were designed. In darkness, a strong induction of the il1rn gene was observed in HEK293T cells co-transfected with plasmids encoding Cry2-dCas9-VP64 and sgRNAs. Blue light (450 nm) exposure for three days caused a four-fold reduction of il1rn transcription due to the formation of clusters consisting of multiple Cry2-dCas9-VP64 fusion proteins that are unable to bind the promoter region of the il1rn gene (Pathak et al., 2017).
Recently, a multicomponent far-red light (FRL)-activated CRISPR/dCas9 effector (FACE) system, which induces transcription of target genes upon FRL stimulation, was engineered (Shao et al., 2018). This system is based on the synthetic bacterial FRL-activated cyclic diguanylate monophosphate (c-di-GMP) synthase BphS (Ryu, Gomelsky, 2014), which converts GTP into c-di-GMP (see Fig. 3, b). Increased production of c-di-GMP causes dimerization of the FRL-dependent transactivator p65-VP64-NLS-BldD, where BldD is a transcription factor from Streptomyces coelicolor that is inactive in its monomeric state. Active BldD binds to its chimeric promoter PFRLx, initiating expression of the MS2-p65-HSF1 transactivator. Then, MS2 fused to the transactivation domains p65 and HSF1 is recruited by sgRNAs bearing the MS2 binding site, causing an induction of endogenous gene expression. Application of the FACE system for the separate and multiplexed regulation of the ttn, il1rn, ascl1, and rhoxf2 genes in HEK293 cells resulted in a high (about 100-450-fold) increase of relative mRNA levels in the lit state compared to darkness. Photoactivation of the endogenous ascl1 gene in HEK293 cells containing the FACE system and implanted into the dorsum of mice resulted in a 195-fold increase of its mRNA level. Additionally, tibialis posterior muscles of mice were electroporated with plasmids encoding components of the FACE system targeting the promoter regions of the lama1 and fst genes. Subsequent illumination with FRL caused a 2- and 5-fold induction of lama1 and fst transcription, respectively, compared to the control kept in darkness. Moreover, the FACE system was used to initiate functional neural differentiation of mouse induced pluripotent stem cells by FRL-induced activation of the neurog2 gene (Shao et al., 2018).

Conclusion
Control of the expression of genes of interest requires the development of molecular tools that precisely recognize specific DNA sequences in the context of the genome. Over the past 20 years, three main methods for the design of synthetic transcription factors recognizing any desired target DNA sequence were developed. Although the first two methods, based on zinc finger proteins and TALEs, have been widely successful in many applications, the development of CRISPR/Cas9-based technology was a breakthrough. The main advantage of this technology is that no laborious cloning is needed to obtain a site-specific DNA-binding protein. Additionally, this system enables simultaneous multiplexed control of user-defined endogenous genes. Combination of the CRISPR/Cas9-based transcription system with blue-light-inducible proteins allows rapid and reversible target gene activation by blue light (Nihongaki et al., 2015). However, long-wavelength light is preferable because of its maximum penetration depth in mammalian tissues. In this respect, the recently described FACE system, which combines the CRISPR/Cas9 technology with the FRL-activated bacterial photoreceptor BphS (Shao et al., 2018), is of special interest. Although this system is complex and multicomponent, it seems to be a good starting point for the development of reversible and tunable approaches to transcriptional regulation of endogenous genes for subsequent safe medical applications in humans.

Fig. 1. The principle of the LITEZ system. A synthetic protein consisting of four zinc finger domains (ZF) fused with the Gigantea (GI) protein and an NLS binds its recognition site upstream of the target gene. GI and LOV form heterodimers upon blue light (450 nm) illumination, resulting in the translocation of the transcriptional activation domain VP16 to the promoter of the target gene and transcription activation. Red region in DNA helix indicates zinc finger protein binding site. Asterisks denote intrinsic NLSs in GI and LOV.

Fig. 2. The principle of the LITE system. A TALE-Cry2 fusion protein binds to its recognition site upstream of the gene of interest. Blue light (466 nm) illumination induces heterodimerization between Cry2 and CIB1, translocating the VP64 effector domain to the target promoter. Red region in DNA helix indicates TALE binding site. Asterisks denote intrinsic NLSs in Cry2 and CIB1.

Fig. 3. The principle of the CRISPR/Cas9-based photoactivatable transcription systems. (a) A fusion protein dCas9-CIB1 directed by sgRNA binds the promoter region of the gene of interest. Under blue light (470 nm) illumination, CIB1 forms heterodimers with the photolyase homology region (PHR) of Cry2, causing recruitment of the VP64 transcriptional activation domain to the promoter region of the target gene; (b) The FACE system consists of the bacterial photoreceptor BphS, which is activated by far-red light (730 nm) and converts GTP into c-di-GMP. c-di-GMP is required for dimerization of the BldD transcription factor, resulting in its activation. The activated BldD transcription factor can turn on expression of the transgene encoding the MS2-NLS-p65-HSF1 fusion protein. The latter is recruited to the MS2 binding site located within sgRNA, inducing transcription activation of the target gene.