
Conference Paper

Class discovery in gene expression data: characterizing splits by support vector machines

MPS-Authors

Markowetz, Florian
Max Planck Society

von Heydebreck, Anja
Max Planck Society

Citation

Markowetz, F., & von Heydebreck, A. (2002). Class discovery in gene expression data: characterizing splits by support vector machines. In Between Data Science And Everyday Web Practice (pp. 662-669). Berlin: Springer Verlag.


Cite as: https://hdl.handle.net/11858/00-001M-0000-0010-8CBE-3
Abstract
We present a variation of ISIS, a class discovery method for microarray data described by von Heydebreck et al. (2001). The objective is to discover biologically relevant structures in the gene expression profiles of different tissue samples in an unsupervised fashion. The method searches for binary partitions of the set of samples that show clear separation. Mathematically, each class distinction is characterized by the size of the margin achieved by a support vector machine (SVM) separating the two classes. The method produces not just one partition, as most commonly used clustering algorithms do, but several mutually independent ones. The significance of the margin as a measure of class distinction is shown by comparison to random partitions of the samples. In three data sets from cancer gene expression studies, the SVM-margin approach succeeds in detecting relationships between the tissue samples, for example cancer subtypes. The known biological classes exhibit an exceptionally large value of the SVM margin. We compare the outcome of the SVM-margin method to a characterization of bipartitions of the samples based on Diagonal Linear Discriminant Analysis.
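The idea of scoring a bipartition by its SVM margin and comparing it to random partitions can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a hard-margin linear SVM, for which the margin equals half the distance between the convex hulls of the two classes. An SVM solver is replaced here by Gilbert's algorithm, which approximates that distance. The function names (`hull_distance`, `svm_margin`, `random_partition_margins`), the toy Gaussian data, and the choice to draw random partitions with the same class sizes as the candidate are all assumptions made for illustration. The search over candidate partitions described in the abstract is not shown.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def hull_distance(pos, neg, iters=500, tol=1e-9):
    """Approximate distance between the convex hulls of two point sets.

    Uses Gilbert's algorithm on the difference set hull(pos) - hull(neg).
    The result is an upper bound that tightens with more iterations and
    is 0 when the hulls overlap.
    """
    z = [a - b for a, b in zip(pos[0], neg[0])]  # a point of the difference set
    for _ in range(iters):
        # Vertex of the difference set that lies furthest in direction -z.
        a = min(pos, key=lambda p: dot(z, p))
        b = max(neg, key=lambda q: dot(z, q))
        d = [ai - bi - zi for ai, bi, zi in zip(a, b, z)]
        dd = dot(d, d)
        gap = -dot(z, d)  # non-negative; near 0 once z has minimum norm
        if dd == 0 or gap <= tol:
            break
        t = min(1.0, gap / dd)  # exact line search on the segment [z, a - b]
        z = [zi + t * di for zi, di in zip(z, d)]
    return dot(z, z) ** 0.5


def svm_margin(samples, labels):
    """Hard-margin linear SVM margin of a bipartition (labels are 0 or 1)."""
    pos = [x for x, y in zip(samples, labels) if y == 1]
    neg = [x for x, y in zip(samples, labels) if y == 0]
    return hull_distance(pos, neg) / 2.0


def random_partition_margins(samples, n_pos, n_random=50, seed=0):
    """Margins of random bipartitions with n_pos samples in class 1."""
    rng = random.Random(seed)
    n = len(samples)
    margins = []
    for _ in range(n_random):
        chosen = set(rng.sample(range(n), n_pos))
        labels = [1 if i in chosen else 0 for i in range(n)]
        margins.append(svm_margin(samples, labels))
    return margins


# Toy data (assumed): 12 samples x 8 genes, two groups that differ
# in the first 4 genes only.
rng = random.Random(0)
n_per_group, n_genes, n_informative = 6, 8, 4
samples = []
for group in (0, 1):
    for _ in range(n_per_group):
        samples.append([
            rng.gauss(10.0 * group if g < n_informative else 0.0, 1.0)
            for g in range(n_genes)
        ])
known = [0] * n_per_group + [1] * n_per_group

observed = svm_margin(samples, known)
random_margins = random_partition_margins(samples, n_pos=n_per_group, seed=1)
# Fraction of random partitions whose margin is at least as large.
fraction = sum(m >= observed for m in random_margins) / len(random_margins)

print("margin of the known partition:", round(observed, 3))
print("largest margin among random partitions:", round(max(random_margins), 3))
print("fraction of random margins >= observed:", fraction)
```

In this sketch, a partition that reflects real group structure should stand out with a margin well above those of the random partitions. On real expression data a soft-margin SVM from an established library would likely be a more robust choice than the hull-distance approximation used here.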