A dual transcript-discovery approach to improve the delimitation of gene 
features from RNA-seq data in the chicken model

Orgeur, Mickael; Martens, Marvin; Börno, Stefan T.; Timmermann, Bernd; Duprez, Delphine; Stricker, Sigmar

doi:10.1242/bio.028498

Journal Article

MPG Authors

Börno, Stefan T.
Sequencing (Head: Bernd Timmermann), Scientific Service (Head: Christoph Krukenkamp), Max Planck Institute for Molecular Genetics, Max Planck Society;

Timmermann, Bernd
Sequencing (Head: Bernd Timmermann), Scientific Service (Head: Christoph Krukenkamp), Max Planck Institute for Molecular Genetics, Max Planck Society;

Stricker, Sigmar
Research Group Development & Disease (Head: Stefan Mundlos), Max Planck Institute for Molecular Genetics, Max Planck Society;

External Resources

No external resources are listed for this record.

Full Texts (freely accessible)

Orgeur.pdf
(Publisher version), 1024KB

Supplementary Material (freely accessible)

No freely accessible supplementary material is available.

Citation

Orgeur, M., Martens, M., Börno, S. T., Timmermann, B., Duprez, D., & Stricker, S. (2018). A dual transcript-discovery approach to improve the delimitation of gene features from RNA-seq data in the chicken model. Biology Open, 7(1): bio.028498. doi:10.1242/bio.028498.

Citation link: https://hdl.handle.net/11858/00-001M-0000-002E-A53E-0

Abstract

The sequence of the chicken genome, like several other draft genome sequences, is presently not fully covered. Gaps, contigs assigned with low confidence and uncharacterized chromosomes result in gene fragmentation and imprecise gene annotation. Transcript abundance estimation from RNA sequencing (RNA-seq) data relies on read quality, library complexity and expression normalization. In addition, the quality of the genome sequence used to map sequencing reads and of the gene annotation that defines gene features must be taken into account. A partially covered genome sequence causes sequencing reads to be lost at the mapping step, while an inaccurate definition of gene features yields imprecise read counts at the assignment step. Both effects can significantly bias the interpretation of RNA-seq data. Here, we describe a dual transcript-discovery approach that combines a genome-guided gene prediction with a de novo transcriptome assembly. This dual approach increased the assignment rate of RNA-seq data by nearly 20% compared with using only the chicken reference annotation, thereby contributing to a more accurate estimation of transcript abundance. More generally, this strategy could be applied to any organism with a partial genome sequence and/or lacking a manually curated reference annotation in order to improve the accuracy of gene expression studies.