Free keywords:
chromatin immunoprecipitation (ChIP); CpG island; emulsion polymerase chain reaction (ePCR); GC bias; H3K9ac; microbial genomic DNA; next generation sequencing (NGS); PCR-free library preparation; sequencing depth; upscale PCR
Abstract:
Different types of sequencing bias have been described and
subsequently reduced for a variety of sequencing systems, with most
studies focusing on the widely used Illumina systems. Similar studies are
missing for the SOLiD 5500xl system, a sequencer that produced many
data sets available to researchers today. Describing and understanding
the bias is important for accurately interpreting and integrating these
published data in ongoing research projects. We report a
particularly strong GC bias for this sequencing system when analyzing a
defined gDNA mix of five microbes spanning a wide range of GC
contents (20-72%), compared with both the expected distribution and
Illumina MiSeq data from the same DNA pool. Because we observed this bias
even under PCR-free conditions, changing the PCR conditions during
library preparation, a common strategy for handling bias in the Illumina
system, was not relevant. The source of the bias appeared to be uneven
heat distribution during the SOLiD emulsion PCR (ePCR), which is used to
enrich libraries prior to loading, since performing ePCR in either small
pouches or 96-well plates reduced the GC bias. Sequencing of chromatin
immunoprecipitated DNA (ChIP-seq) is a common approach in epigenetics.
ChIP-seq of the mixed-source histone mark H3K9ac (acetylated histone H3
lysine 9), which is typically found at promoter regions and on gene bodies,
including CpG islands, resulted in a major loss of reads at GC-rich loci
(GC content ≥ 62%) when performed on a SOLiD 5500xl machine; this loss was
not explained by low sequencing depth. The loss was reduced by adapting
the ePCR.
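
The comparison of observed read distribution against the expected distribution can be illustrated with a minimal sketch. It is not the authors' analysis pipeline: the function names `gc_content` and `coverage_ratio` and the shape of the inputs are assumptions made for illustration. A ratio near 1 for every genome in a defined mix would indicate little bias, while ratios that fall as GC content rises would be consistent with the kind of GC bias described above.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Return the fraction of G and C among unambiguous A/C/G/T bases."""
    counts = Counter(seq.upper())
    gc = counts["G"] + counts["C"]
    total = gc + counts["A"] + counts["T"]
    # Sequences with no unambiguous bases are reported as 0.0.
    return gc / total if total else 0.0


def coverage_ratio(observed_reads: dict, expected_fraction: dict) -> dict:
    """Observed share of reads per genome divided by its expected share.

    observed_reads: hypothetical mapping of genome name -> mapped read count.
    expected_fraction: hypothetical mapping of genome name -> expected share
    of reads, derived from the known composition of the DNA mix.
    """
    total = sum(observed_reads.values())
    return {
        name: (observed_reads[name] / total) / expected_fraction[name]
        for name in expected_fraction
    }
```

To use it, each reference genome in the mix would be passed through `gc_content`, and the resulting values compared with the per-genome output of `coverage_ratio`.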