  Large Scale Genomic Sequence SVM Classifiers

Sonnenburg, S., Rätsch, G., & Schölkopf, B. (2005). Large Scale Genomic Sequence SVM Classifiers. In S. Dzeroski, L. de Raedt, & S. Wrobel (Eds.), ICML '05: the 22nd international conference on Machine learning (pp. 848-855). New York, NY, USA: ACM Press.


Files

SonRaeSch05b.pdf (full text), 309 KB
Name: SonRaeSch05b.pdf
Visibility: Public
MIME-Type: application/pdf

Creators

Creators:
Sonnenburg, S., Author
Rätsch, G.¹, Author
Schölkopf, B.²,³, Author
Affiliations:
¹ Friedrich Miescher Laboratory, Max Planck Society, Max-Planck-Ring 9, 72076 Tübingen, DE
² Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society
³ Max Planck Institute for Biological Cybernetics, Max Planck Society, Spemannstrasse 38, 72076 Tübingen, DE

Content

Abstract: In genomic sequence analysis tasks such as splice site recognition or promoter identification, large amounts of training sequences are available, and are indeed needed to achieve sufficiently high classification performance. In this work we study two recently proposed and successfully used kernels, namely the Spectrum kernel and the Weighted Degree (WD) kernel. In particular, we suggest several extensions using suffix trees and modifications of an SMO-like SVM training algorithm in order to accelerate the training of the SVMs and their evaluation on test sequences. Our simulations show that for the Spectrum kernel and the WD kernel, large-scale SVM training can be accelerated by factors of 20 and 4, respectively, while using much less memory (e.g. no kernel caching). The evaluation on new sequences is often several thousand times faster using the new techniques (depending on the number of support vectors). Our method allows us to train on sets as large as one million sequences.
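The two kernels studied in the abstract can be sketched in a few lines of Python. This is a naive illustration of their definitions, not the suffix-tree-accelerated training described in the paper; the function names are hypothetical, and the WD kernel weights follow the standard choice beta_k = 2(d - k + 1) / (d(d + 1)).

```python
from collections import Counter

def spectrum_kernel(x: str, y: str, k: int = 3) -> int:
    """Spectrum kernel: inner product of the k-mer count vectors of x and y."""
    cx = Counter(x[i:i + k] for i in range(len(x) - k + 1))
    cy = Counter(y[i:i + k] for i in range(len(y) - k + 1))
    # Only k-mers occurring in both sequences contribute to the inner product.
    return sum(cx[m] * cy[m] for m in cx if m in cy)

def weighted_degree_kernel(x: str, y: str, d: int = 3) -> float:
    """Weighted Degree kernel: counts k-mers matching at the same position,
    for k = 1..d, weighted by beta_k = 2(d - k + 1) / (d(d + 1))."""
    assert len(x) == len(y), "WD kernel is defined for equal-length sequences"
    total = 0.0
    for k in range(1, d + 1):
        beta = 2.0 * (d - k + 1) / (d * (d + 1))
        matches = sum(1 for i in range(len(x) - k + 1)
                      if x[i:i + k] == y[i:i + k])
        total += beta * matches
    return total
```

The positional matching is what distinguishes the WD kernel from the Spectrum kernel, which ignores where a k-mer occurs; the paper's contribution is computing these sums via suffix trees instead of the quadratic-time scan above.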

Details

Dates: 2005-08
Publication Status: Issued
Identifiers: BibTex Citekey: 3627
DOI: 10.1145/1102351.1102458

Event

Title: 22nd International Conference on Machine Learning (ICML 2005)
Place of Event: Bonn, Germany
Start-/End Date: 2005-08-07 - 2005-08-11

Source 1

Title: ICML '05: the 22nd international conference on Machine learning
Source Genre: Proceedings
 Creator(s):
Dzeroski, S, Editor
de Raedt, L, Editor
Wrobel, S, Editor
Publ. Info: New York, NY, USA: ACM Press
Start / End Page: 848 - 855
Identifier: ISBN: 1-59593-180-5