de.mpg.escidoc.pubman.appbase.FacesBean
Deutsch
 
Hilfe Wegweiser Datenschutzhinweis Impressum Kontakt
  DetailsucheBrowse

Datensatz

DATENSATZ AKTIONENEXPORT

Freigegeben

Konferenzbeitrag

Better External Memory Suffix Array Construction

MPG-Autoren
http://pubman.mpdl.mpg.de/cone/persons/resource/persons44293

Dementiev,  Roman
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons44717

Kärkkäinen,  Juha
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45022

Mehnert,  Jens
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons45344

Sanders,  Peter
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

Externe Ressourcen
Es sind keine Externen Ressourcen verfügbar
Volltexte (frei zugänglich)
Es sind keine frei zugänglichen Volltexte verfügbar
Ergänzendes Material (frei zugänglich)
Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar
Zitation

Dementiev, R., Kärkkäinen, J., Mehnert, J., & Sanders, P. (2005). Better External Memory Suffix Array Construction. In Proceedings of the Seventh Workshop on Algorithm Engineering and Experiments and the Second Workshop on Analytic Algorithmics and Combinatorics (ALENEX/ANALCO 2005) (pp. 86-97). Philadelphia, USA: SIAM.


Zitierlink: http://hdl.handle.net/11858/00-001M-0000-000F-25F0-6
Zusammenfassung
Suffix arrays are a simple and powerful data structure for text processing that can be used for full text indexes, data compression, and many other applications in particular in bioinformatics. However, so far it looked prohibitive to build suffix arrays for huge inputs that do not fit into main memory. This paper presents design, analysis, implementation, and experimental evaluation of several new and improved algorithms for suffix array construction. The algorithms are asymptotically optimal in the worst case or on the average. Our implementation can construct suffix arrays for inputs of up to 4GByte in hours on a low cost machine where all previous implementations we are aware of would fail or take days. We also present a simple and efficient external algorithm for checking whether an array of indexes is a suffix array. As a tool of possible independent interest we present a systematic way to design, analyze, and implement \emph{pipelined} algorithms.