Calculating the SNP-effective sample size from an alignment

Haubold, B.; Wiehe, T.

doi:10.1093/bioinformatics/18.1.36

Journal Article

Citation

Haubold, B., & Wiehe, T. (2002). Calculating the SNP-effective sample size from an alignment. Bioinformatics, 18(1), 36-38. doi:10.1093/bioinformatics/18.1.36.

Cite as: https://hdl.handle.net/11858/00-001M-0000-0010-0FCF-C

Abstract

Motivation: The number of Single Nucleotide Polymorphisms (SNPs) detectable in an alignment is a function of the length and the number of the aligned sequences. The latter is called sample size. However, a typical alignment, for instance obtained as a BLAST-search result of a query sequence against an EST database, does not evenly cover the query sequence. Therefore, it is usually not clear what the actual sample size is.

Results: We present a method to calculate the effective sample size, called n(eff), for a given BLAST alignment. This method takes into account that multiple coverage contributes only logarithmically to the SNP yield of a given sequence stretch. We show that the effective sample size n(eff) is usually much smaller than would be expected for a given amount of coverage and illustrate this with two typical examples.
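
The logarithmic effect of coverage described above can be illustrated with a minimal Python sketch. It does not reproduce the published method, whose exact definition is not given in this abstract. It assumes the standard neutral-model result that the expected number of segregating sites in a sample of n sequences is proportional to a_n = 1 + 1/2 + ... + 1/(n-1), which grows roughly like ln(n). Under that assumption, the sketch defines the effective sample size of an unevenly covered query as the uniform coverage that would give the same expected SNP yield. The function names and the per-position coverage list are hypothetical.

```python
from typing import Sequence


def harmonic(n: int) -> float:
    """Return a_n = sum_{i=1}^{n-1} 1/i, which is 0 for n < 2."""
    return sum(1.0 / i for i in range(1, n))


def effective_sample_size(coverage: Sequence[int]) -> float:
    """Illustrative effective sample size (assumption, not the paper's code).

    coverage[j] is the number of aligned sequences covering query position j.
    The result is the n for which a_n equals the mean of a_c over all
    positions, with linear interpolation between neighbouring integers.
    """
    if len(coverage) == 0:
        raise ValueError("coverage must contain at least one position")

    # Mean per-position SNP yield, up to the constant mutation parameter.
    target = sum(harmonic(c) for c in coverage) / len(coverage)

    # Walk up the harmonic series until the next step would overshoot.
    n = 1
    a_n = 0.0
    while a_n + 1.0 / n <= target:
        a_n += 1.0 / n
        n += 1

    # a_{n+1} - a_n = 1/n, so the fractional part is (target - a_n) * n.
    return n + (target - a_n) * n
```

For example, a query of 100 positions with half covered 20-fold and half covered 2-fold has a mean coverage of 11, yet `effective_sample_size([20] * 50 + [2] * 50)` is well below that, because a_n is concave in n and the deep half of the alignment adds comparatively little.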