Fast Mapping of Short Sequences with Mismatches, Insertions and Deletions Using Index Structures

Hoffmann, Steve; Otto, Christian; Kurtz, Stefan; Sharma, Cynthia Mira; Khaitovich, Philipp; Vogel, Jörg; Stadler, Peter F.; Hackermüller, Jörg

Hoffmann, S., Otto, C., Kurtz, S., Sharma, C. M., Khaitovich, P., Vogel, J., Stadler, P. F., & Hackermüller, J. (2009). Fast Mapping of Short Sequences with Mismatches, Insertions and Deletions Using Index Structures. PLoS Computational Biology, 5(9): e1000502.

Item is public

Basic information

Item permalink: https://hdl.handle.net/11858/00-001M-0000-000E-C0B3-5
Version permalink: https://hdl.handle.net/11858/00-001M-0000-000E-C0B4-3

Genre: Journal article

Files

PLoS_Comput_Biol_2009_5_e1000502.pdf (Publisher version), 704 KB

File permalink:
https://hdl.handle.net/11858/00-001M-0000-000E-C0B2-7

File name:
PLoS_Comput_Biol_2009_5_e1000502.pdf

Description:
-

OA-Status:

Visibility:
Public

MIME type / checksum:
application/pdf / [MD5]

Copyright date:
-

Copyright information:
© 2009 Hoffmann et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

CC license:
-

Creators

Creators:
Hoffmann, Steve, Author
Otto, Christian, Author
Kurtz, Stefan, Author
Sharma, Cynthia Mira¹, Author
Khaitovich, Philipp, Author
Vogel, Jörg¹, Author
Stadler, Peter F.², Author
Hackermüller, Jörg, Author

Affiliations:
¹ Max-Planck Research Group RNA Biology, Max Planck Institute for Infection Biology, Max Planck Society, ou_1664150
² Max Planck Society, ou_persistent13

Content

Keywords: -

Abstract: With few exceptions, current methods for short read mapping make use of simple seed heuristics to speed up the search. Most of the underlying matching models neglect the necessity to allow not only mismatches, but also insertions and deletions. Current evaluations indicate, however, that very different error models apply to the novel high-throughput sequencing methods. While the most frequent error-type in Illumina reads are mismatches, reads produced by 454's GS FLX predominantly contain insertions and deletions (indels). Even though 454 sequencers are able to produce longer reads, the method is frequently applied to small RNA (miRNA and siRNA) sequencing. Fast and accurate matching in particular of short reads with diverse errors is therefore a pressing practical problem. We introduce a matching model for short reads that can, besides mismatches, also cope with indels. It addresses different error models. For example, it can handle the problem of leading and trailing contaminations caused by primers and poly-A tails in transcriptomics or the length-dependent increase of error rates. In these contexts, it thus simplifies the tedious and error-prone trimming step. For efficient searches, our method utilizes index structures in the form of enhanced suffix arrays. In a comparison with current methods for short read mapping, the presented approach shows significantly increased performance not only for 454 reads, but also for Illumina reads. Our approach is implemented in the software segemehl available at http://www.bioinf.uni-leipzig.de/Software/segemehl/.

Details

Language: eng - English

Dates: Published: 2009-09

Publication status: Published

Pages: -

Publishing info: -

Table of contents: -

Review type: Peer-reviewed

Identifiers (DOI, ISBN, etc.): eDoc: 442297
ISI: 000270800100020

Degree: -

Source 1

Title: PLoS Computational Biology

Source genre: Journal

Creators / editors:

Affiliations:

Publisher, place: -

Pages: -
Volume / Issue: 5 (9)
Sequence number: e1000502
Start / End page: -
Identifiers (ISBN, ISSN, DOI, etc.): ISSN: 1553-734X
