Efficient estimation of pairwise distances between genomes

Domazet-Lošo, Mirjana; Haubold, Bernhard

doi:10.1093/bioinformatics/btp590

Domazet-Lošo, M., & Haubold, B. (2009). Efficient estimation of pairwise distances between genomes. Bioinformatics, 25(24), 3221-3227. doi:10.1093/bioinformatics/btp590.

Item is released

Basic information

Item permalink: https://hdl.handle.net/11858/00-001M-0000-000F-D561-D
Version permalink: https://hdl.handle.net/11858/00-001M-0000-000F-D562-B

Genre: Journal article

Files

Domazet-Loso_2009.pdf (Publisher version), 152KB

File permalink:
-

Name:
Domazet-Loso_2009.pdf

Description:
-

OA-Status:

Visibility:
Restricted (Max Planck Institute for Evolutionary Biology, MPLM; )

MIME-Type / Checksum:
application/pdf

Technical metadata:

Copyright date:
-

Copyright info:
-

CC license:
-

Creators

Creators:
Domazet-Lošo, Mirjana¹, Author
Haubold, Bernhard¹, Author

Affiliations:
¹ Research Group Bioinformatics, Department Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Max Planck Society, ou_1445644

Content

Keywords: -

Abstract: Motivation: Genome comparison is central to contemporary genomics and typically relies on sequence alignment. However, genome-wide alignments are difficult to compute. We have, therefore, recently developed an accurate alignment-free estimator of the number of substitutions per site based on the lengths of exact matches between pairs of sequences. The previous implementation of this measure requires n(n–1) suffix tree constructions and traversals, where n is the number of sequences analyzed. This does not scale well for large n. Results: We present an algorithm to extract [formula] pairwise distances in a single traversal of a single suffix tree containing n sequences. As a result, the run time of the suffix tree construction phase of our algorithm is reduced from O(n²L) to O(nL), where L is the length of each sequence. We implement this algorithm in the program kr version 2 and apply it to 825 HIV genomes, 13 genomes of enterobacteria and the complete genomes of 12 Drosophila species. We show that, depending on the input dataset, the new program is at least 10 times faster than its predecessor. Availability: Version 2 of kr can be tested via a web interface at http://guanine.evolbio.mpg.de/kr2/. It is written in standard C and its source code is available under the GNU General Public License from the same web site. Supplementary information: Supplementary data are available at Bioinformatics online.
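
Illustration (not part of the original record): the abstract rests on two ideas, exact-match lengths between a pair of sequences and the n(n–1) cost of building one suffix tree per ordered pair. The sketch below is a naive, quadratic illustration of both, assuming plain Python strings as sequences. It is not the kr program, it uses no suffix tree, and it does not compute the substitution-rate estimator itself. The function names `match_lengths`, `mean_match_length` and `tree_constructions_v1` are hypothetical.

```python
def match_lengths(query: str, subject: str) -> list[int]:
    """For each start position i in query, return the length of the longest
    prefix of query[i:] that occurs anywhere in subject (naive search)."""
    lengths = []
    for i in range(len(query)):
        k = 0
        # Extend the match by one character while it still occurs in subject.
        while i + k < len(query) and query[i:i + k + 1] in subject:
            k += 1
        lengths.append(k)
    return lengths


def mean_match_length(query: str, subject: str) -> float:
    """Average exact-match length over all start positions in query."""
    if not query:
        raise ValueError("query must not be empty")
    lengths = match_lengths(query, subject)
    return sum(lengths) / len(lengths)


def tree_constructions_v1(n: int) -> int:
    """Suffix tree constructions the abstract attributes to the previous
    implementation: one per ordered pair of sequences, n(n-1)."""
    return n * (n - 1)


if __name__ == "__main__":
    # Identical sequences give long matches; a substitution shortens them.
    print(match_lengths("ACGT", "ACGT"))
    print(match_lengths("ACGT", "ACTT"))
    print(mean_match_length("ACGT", "ACGT"), mean_match_length("ACGT", "ACTT"))
    # For the 825 HIV genomes mentioned in the abstract.
    print(tree_constructions_v1(825))
```

In this toy case a single substitution lowers the mean match length, which is the qualitative signal an estimator based on match lengths can exploit. For n = 825 the per-pair approach implies 679,800 tree constructions, against the single tree described for kr version 2.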

Details

Language(s): eng - English

Dates: Published: 2009-10-13

Publication status: Published

Pages: -

Publishing info: -

Table of contents: -

Peer review: -

Identifiers (DOI, ISBN, etc.): eDoc: 440827
DOI: 10.1093/bioinformatics/btp590
Other: 2732/S 39054

Degree: -

Source 1

Title: Bioinformatics

Source genre: Journal

Creator(s):

Affiliations:

Publisher, place: -

Pages: -
Volume / Issue: 25 (24)
Sequence number: -
Start / End page: 3221 - 3227
Identifiers (ISBN, ISSN, DOI, etc.): ISSN: 0266-7061 (print)
ISSN: 1367-4803 (print)
ISSN: 1367-4811 (online)
ISSN: 1460-2059 (online)
