Record

  Efficient estimation of pairwise distances between genomes

Domazet-Lošo, M., & Haubold, B. (2009). Efficient estimation of pairwise distances between genomes. Bioinformatics, 25(24), 3221-3227. doi:10.1093/bioinformatics/btp590.

Basic data

Record permalink: http://hdl.handle.net/11858/00-001M-0000-000F-D561-D
Version permalink: http://hdl.handle.net/11858/00-001M-0000-000F-D562-B
Genre: Journal article

Files

Domazet-Loso_2009.pdf (publisher version), 152 KB
File permalink: -
Description: -
Visibility: Restricted
MIME type / checksum: application/pdf
Technical metadata:
Copyright date: -
Copyright info: -
License: -

Creators

Creators:
Domazet-Lošo, Mirjana [1], author
Haubold, Bernhard [1], author
Affiliations:
[1] Research Group Bioinformatics, Department Evolutionary Genetics, Max Planck Institute for Evolutionary Biology, Max Planck Society, escidoc:1445644

Content

Keywords: -
Abstract: Motivation: Genome comparison is central to contemporary genomics and typically relies on sequence alignment. However, genome-wide alignments are difficult to compute. We have, therefore, recently developed an accurate alignment-free estimator of the number of substitutions per site based on the lengths of exact matches between pairs of sequences. The previous implementation of this measure requires n(n-1) suffix tree constructions and traversals, where n is the number of sequences analyzed. This does not scale well for large n. Results: We present an algorithm to extract all pairwise distances in a single traversal of a single suffix tree containing n sequences. As a result, the run time of the suffix tree construction phase of our algorithm is reduced from O(n²L) to O(nL), where L is the length of each sequence. We implement this algorithm in the program kr version 2 and apply it to 825 HIV genomes, 13 genomes of enterobacteria and the complete genomes of 12 Drosophila species. We show that, depending on the input dataset, the new program is at least 10 times faster than its predecessor. Availability: Version 2 of kr can be tested via a web interface at http://guanine.evolbio.mpg.de/kr2/. It is written in standard C and its source code is available under the GNU General Public License from the same web site. Supplementary information: Supplementary data are available at Bioinformatics online.
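
Illustrative note: the abstract describes a measure built from the lengths of exact matches between pairs of sequences, computed for n(n-1) ordered pairs. The Python sketch below shows only that raw ingredient in a naive way. It is not the kr algorithm: it uses plain substring search instead of suffix trees, it does not convert match lengths into substitutions per site, and all function names are hypothetical.

```python
from itertools import permutations


def longest_match_lengths(query: str, subject: str) -> list[int]:
    """For each start position in query, return the length of the longest
    substring beginning there that also occurs somewhere in subject."""
    lengths = []
    for i in range(len(query)):
        k = 0
        # Extend the candidate match one character at a time
        # while it still occurs in subject.
        while i + k < len(query) and query[i:i + k + 1] in subject:
            k += 1
        lengths.append(k)
    return lengths


def mean_match_length(query: str, subject: str) -> float:
    """Average exact-match length over all positions of query."""
    lengths = longest_match_lengths(query, subject)
    return sum(lengths) / len(lengths) if lengths else 0.0


def all_ordered_pairs(seqs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Naive all-against-all comparison: one computation per ordered pair,
    i.e. n(n-1) comparisons for n sequences."""
    return {
        (a, b): mean_match_length(seqs[a], seqs[b])
        for a, b in permutations(seqs, 2)
    }


if __name__ == "__main__":
    print(longest_match_lengths("ACGT", "ACTT"))  # [2, 1, 0, 1]
    toy = {"s1": "ACGTACGT", "s2": "ACGTTCGT", "s3": "TTTTACGA"}
    print(len(all_ordered_pairs(toy)))  # 6 ordered pairs for n = 3
```

More similar sequences share longer exact matches, so the mean match length grows as divergence shrinks. The per-pair loop in `all_ordered_pairs` mirrors the n(n-1) scaling that the abstract attributes to the earlier implementation.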

Details

Language(s): eng - English
Date: 2009-10-13
Publication status: Published in print
Pages: -
Place, publisher, edition: -
Table of contents: -
Review type: -
Identifiers: eDoc: 440827
DOI: 10.1093/bioinformatics/btp590
Other: 2732/S 39054
Degree type: -

Source 1

Title: Bioinformatics
Source genre: Journal
Creators:
Affiliations:
Place, publisher, edition: -
Pages: -
Volume / issue: 25 (24)
Article number: -
Start / end page: 3221 - 3227
Identifiers: ISSN: 0266-7061 (print)
ISSN: 1367-4803 (print)
ISSN: 1367-4811 (online)
ISSN: 1460-2059 (online)