Evaluation of Population-Based Haplotype Phasing Algorithms

Sethi, Riccha

Datensatz

DATENSATZ AKTIONENEXPORT

Zur Ablage hinzufügen

Lokale TagsFreigabegeschichteDetailsÜbersicht

Freigegeben

Hochschulschrift

Evaluation of Population-Based Haplotype Phasing Algorithms

MPG-Autoren

/persons/resource/persons202171

Sethi, Riccha
International Max Planck Research School, MPI for Informatics, Max Planck Society;
Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society;

Externe Ressourcen

Es sind keine externen Ressourcen hinterlegt

Volltexte (beschränkter Zugriff)

Für Ihren IP-Bereich sind aktuell keine Volltexte freigegeben.

Volltexte (frei zugänglich)

Es sind keine frei zugänglichen Volltexte in PuRe verfügbar

Ergänzendes Material (frei zugänglich)

Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar

Zitation

Sethi, R. (2016). Evaluation of Population-Based Haplotype Phasing Algorithms. Master Thesis, Universität des Saarlandes, Saarbrücken.

Zitierlink: https://hdl.handle.net/11858/00-001M-0000-002C-41DA-7

Zusammenfassung

The valuable information in correct order of alleles on the haplotypes has many applications in GWAS studies and population genetics. A considerable number of computational and statistical algorithms have been developed for haplotype phasing. Historically, these algorithms were compared using the simulated population data with less dense markers which was inspired by genotype data from the HapMap project. Currently due to the advancement and reduction in cost of NGS, thousands of individuals across the world have been sequenced in 1000 Genomes Project. This has generated the genotype information of individuals from different ethnicity along with much denser genetic variations in them. Here, we have developed a scalable approach to assess state-of-the-art population-based haplotype phasing algorithms with benchmark data designed by simulation of the population (unrelated and related individuals), NGS pipeline and genotype calling. The most accurate algorithm was MVNCall (v1) for phase inference in unrelated individuals while DuoHMM approach of Shapeit (v2) had lowest switch error rate of 0.298 %(with true genotype likelihoods) in the related individuals. Moreover, we also conducted a comprehensive assessment of algorithms for the imputation of missing genotypes in the population with a reference panel. For this metrics, Impute2 (v2.3.2) and Beagle (v4.1) both performed competitively under different imputation scenarios and had genotype concordance rate of >99%. However, Impute2 was better in imputation of genotypes with minor allele frequency of <0.025 in the reference panel.