A method for detecting structural heterogeneity of transcription factor binding sites using alternative de novo models: the case of FOXA2
https://doi.org/10.18699/VJ21.002
Abstract
About the authors
A. V. Tsukanov
Russia
Novosibirsk
V. G. Levitsky
Russia
Novosibirsk
T. I. Merkulova
Russia
Novosibirsk