  Large Scale Genomic Sequence SVM Classifiers

Sonnenburg, S., Rätsch, G., & Schölkopf, B. (2005). Large Scale Genomic Sequence SVM Classifiers. In S. Dzeroski, L. de Raedt, & S. Wrobel (Eds.), ICML '05: the 22nd international conference on Machine learning (pp. 848-855). New York, NY, USA: ACM Press.


Files

SonRaeSch05b.pdf (full text), 309 KB
Name: SonRaeSch05b.pdf
Visibility: Public
MIME-Type: application/pdf

Creators

Creators:
Sonnenburg, S., Author
Rätsch, G.¹, Author
Schölkopf, B.²,³, Author
Affiliations:
¹ Friedrich Miescher Laboratory, Max Planck Society, Max-Planck-Ring 9, 72076 Tübingen, DE
² Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society
³ Max Planck Institute for Biological Cybernetics, Max Planck Society, Spemannstrasse 38, 72076 Tübingen, DE

Content

Abstract: In genomic sequence analysis tasks such as splice site recognition or promoter identification, large amounts of training sequences are available, and are indeed needed to achieve sufficiently high classification performance. In this work we study two recently proposed and successfully used kernels, namely the Spectrum kernel and the Weighted Degree (WD) kernel. In particular, we suggest several extensions using suffix trees and modifications of an SMO-like SVM training algorithm in order to accelerate the training of the SVMs and their evaluation on test sequences. Our simulations show that for the Spectrum kernel and the WD kernel, large-scale SVM training can be accelerated by factors of 20 and 4, respectively, while using much less memory (e.g. no kernel caching). The evaluation on new sequences is often several thousand times faster using the new techniques (depending on the number of support vectors). Our method allows us to train on sets as large as one million sequences.
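The two kernels studied in the abstract can be sketched in a few lines of Python. This is a naive illustration of their definitions, not the suffix-tree-accelerated training described in the paper; the function names are hypothetical, and the WD kernel weights follow the standard choice beta_k = 2(d - k + 1) / (d(d + 1)).

```python
from collections import Counter

def spectrum_kernel(x: str, y: str, k: int = 3) -> int:
    """Spectrum kernel: inner product of the k-mer count vectors of x and y."""
    cx = Counter(x[i:i + k] for i in range(len(x) - k + 1))
    cy = Counter(y[i:i + k] for i in range(len(y) - k + 1))
    # Only k-mers occurring in both sequences contribute to the inner product.
    return sum(cx[m] * cy[m] for m in cx if m in cy)

def weighted_degree_kernel(x: str, y: str, d: int = 3) -> float:
    """Weighted Degree kernel: counts k-mers matching at the same position,
    for k = 1..d, weighted by beta_k = 2(d - k + 1) / (d(d + 1))."""
    assert len(x) == len(y), "WD kernel is defined for equal-length sequences"
    total = 0.0
    for k in range(1, d + 1):
        beta = 2.0 * (d - k + 1) / (d * (d + 1))
        matches = sum(1 for i in range(len(x) - k + 1)
                      if x[i:i + k] == y[i:i + k])
        total += beta * matches
    return total
```

The positional matching is what distinguishes the WD kernel from the Spectrum kernel, which ignores where a k-mer occurs; the paper's contribution is computing these sums via suffix trees instead of the quadratic-time scan above.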

Details

Dates: 2005-08
Publication Status: Issued
Identifiers: BibTex Citekey: 3627
DOI: 10.1145/1102351.1102458

Event

Title: 22nd International Conference on Machine Learning (ICML 2005)
Place of Event: Bonn, Germany
Start-/End Date: 2005-08-07 - 2005-08-11

Source 1

Title: ICML '05: the 22nd international conference on Machine learning
Source Genre: Proceedings
 Creator(s):
Dzeroski, S, Editor
de Raedt, L, Editor
Wrobel, S, Editor
Publ. Info: New York, NY, USA: ACM Press
Start / End Page: 848 - 855
Identifier: ISBN: 1-59593-180-5