Genotype imputation in human genomic studies
https://doi.org/10.18699/vjgb-24-70
Abstract
Imputation is a method for recovering missing information about genetic variants that could not be genotyped directly using DNA microarrays or low-coverage sequencing. Imputation plays a key role in genome-wide association studies (GWAS). It substantially increases the number of variants analyzed, which improves the resolution of the method and makes data obtained in different cohorts and/or with different technologies more comparable, an important property for meta-analyses. During imputation, genotype information in the study sample, in which only a subset of genetic variants is known, is supplemented using a reference sample (reference panel) with more complete genotype data (most often the results of whole-genome sequencing). Imputation has become an integral part of human genomic research owing to the benefits it provides and to the growing availability of imputation tools and reference panel data. This review focuses on imputation in human genomic studies. The first section describes the technologies used to obtain human genotype information and characterizes the resulting data types. The second section presents the methodology of imputation, lists its stages and the corresponding software, and describes the most popular reference panels and the methods for assessing imputation quality. The review concludes with examples of the use of imputation in genomic studies of samples from Russia. This review shows the importance of imputation, explains how to perform it, and systematizes the results of its application using Russian samples as examples.
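The core idea of supplementing a partially genotyped sample from a reference panel can be illustrated with a deliberately simplified sketch. It is not the method of any particular imputation program: real tools rely on probabilistic haplotype models, whereas this toy version simply copies missing alleles from the single reference haplotype that best matches the target at the typed sites. The function name `impute_haplotype`, the 0/1 allele coding, and the example data are all assumptions made for illustration.

```python
from typing import List, Optional


def impute_haplotype(
    target: List[Optional[int]],
    reference_panel: List[List[int]],
) -> List[int]:
    """Fill untyped sites (None) in `target` using the closest reference haplotype.

    Toy nearest-haplotype rule, assumed for illustration only.
    """
    # Positions where the target was actually genotyped.
    typed = [i for i, allele in enumerate(target) if allele is not None]

    def mismatches(ref: List[int]) -> int:
        # Number of typed sites at which the reference disagrees with the target.
        return sum(ref[i] != target[i] for i in typed)

    # Reference haplotype with the fewest mismatches (first one wins on ties).
    best = min(reference_panel, key=mismatches)

    # Keep observed alleles, copy the rest from the best-matching reference.
    return [best[i] if allele is None else allele for i, allele in enumerate(target)]


# Hypothetical reference panel: three haplotypes, five variant sites each.
reference_panel = [
    [0, 0, 1, 0, 1],
    [1, 1, 0, 1, 0],
    [0, 1, 1, 0, 0],
]

# Hypothetical target haplotype typed at sites 0, 2 and 4 only.
target = [1, None, 0, None, 0]

imputed = impute_haplotype(target, reference_panel)
print(imputed)
```

In this example the second reference haplotype matches the target at every typed site, so its alleles are copied into the two untyped positions.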
Keywords
About the authors
A. A. Berdnikova
Russia
Novosibirsk
I. V. Zorkoltseva
Russia
Novosibirsk
Y. A. Tsepilov
Russia
Novosibirsk
E. E. Elgaeva
Russia
Novosibirsk