  Aggregation of Multiple Clusterings and Active Learning in a Transductive Setting

Arvanitopoulos-Darginis, N. (2012). Aggregation of Multiple Clusterings and Active Learning in a Transductive Setting. Master Thesis, Universität des Saarlandes, Saarbrücken.

Files

2011_Nikolaos_Darginis_Arvanitopoulos.pdf (any fulltext), 548KB

File permalink:
-
Name:
2011_Nikolaos_Darginis_Arvanitopoulos.pdf
Description:
-
OA status:
Visibility:
Restricted (Max Planck Institute for Informatics, MSIN)
MIME type / checksum:
application/pdf
Technical metadata:
Copyright date:
-
Copyright info:
-
License:
-

Creators

 Creators:
Arvanitopoulos-Darginis, Nikolaos (1), Author
Hein, Matthias (2), Advisor
Weikert, Joachim (2), Referee
Affiliations:
(1) International Max Planck Research School, MPI for Informatics, Max Planck Society, ou_1116551
(2) External Organizations, ou_persistent22

Content

Keywords: -
 Abstract: In this work we proposed a novel transductive method for learning from partially labeled data. Our main idea was to aggregate information obtained from several clusterings to infer the labels of the unlabeled data. While our method is not restricted to a specific clustering method, we chose to use in our experiments the normalized variant of 1-spectral clustering, which has been shown to produce better clusterings than the standard spectral clustering method in most cases. Our approach yielded results that were at least comparable to, and in some cases significantly better than, the best results obtained by state-of-the-art methods reported in the literature. Furthermore, we proposed a novel active learning framework that queries the labels of the most informative points, which help in classifying the unlabeled points. For the majority vote scheme we provided guarantees on the number of points that should be drawn from each cluster in order to infer the correct label of the cluster with high probability. Moreover, in the ridge regression scheme we proposed an algorithm that in each step selects the most uncertain point in terms of the prediction function of the classifier (the point that lies nearest the decision boundary of the classifier). In both cases, experimental results show the strength of our methods and confirm our theoretical guarantees. The results look very promising and open several interesting directions for future research. For the SSL scheme, it would be interesting to test the performance of several other clustering approaches, such as k-means, standard spectral clustering, and hierarchical clustering, and to combine them in one general method. Our intuition is that the algorithm should be able to select only the good clusterings that provide discriminative information for each specific problem.
Apart from ridge regression, it would be beneficial to experiment with other fitting approaches that produce sparse representations in our constructed basis. For the active learning framework, one interesting direction is to extend it to more general clusterings that take into account the hierarchical structure of the data. In that way, we would take advantage of the underlying hierarchy, and by adaptively selecting the pruning of the cluster tree we could (potentially) further improve our sampling strategy. Additionally, we believe that in the multi-clustering scenario substantial improvements of our algorithm are possible that take better advantage of the variation in the multiple clustering representations of the data. Finally, as our methods scale to large problems and partially labeled data occurs in many different areas, ranging from web documents to protein data, there is room for many interesting applications of the proposed methods.
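
The idea of aggregating several clusterings through a majority vote can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names `cluster_majority_labels` and `aggregate_clusterings`, the use of `None` for unlabeled points, and the rule of letting each clustering cast one vote per point are all assumptions made here for illustration. The clusterings themselves are taken as given (for example, produced by any clustering method).

```python
from collections import Counter


def cluster_majority_labels(clustering, labels):
    """Map each cluster id to the majority label among its labeled points.

    clustering: list of cluster ids, one per data point.
    labels: list of class labels, with None marking unlabeled points.
    Clusters that contain no labeled point are left out of the result.
    """
    votes = {}
    for cluster_id, label in zip(clustering, labels):
        if label is not None:
            votes.setdefault(cluster_id, Counter())[label] += 1
    return {cid: counts.most_common(1)[0][0] for cid, counts in votes.items()}


def aggregate_clusterings(clusterings, labels):
    """Predict a label for every point by voting across several clusterings.

    Assumption (hypothetical aggregation rule): each clustering votes for
    the majority label of the cluster a point falls into; the label with
    the most votes wins. Known labels are kept unchanged.
    """
    ballots = [Counter() for _ in labels]
    for clustering in clusterings:
        cluster_label = cluster_majority_labels(clustering, labels)
        for i, cluster_id in enumerate(clustering):
            if cluster_id in cluster_label:
                ballots[i][cluster_label[cluster_id]] += 1

    predictions = []
    for known, ballot in zip(labels, ballots):
        if known is not None:
            predictions.append(known)
        elif ballot:
            predictions.append(ballot.most_common(1)[0][0])
        else:
            # No clustering placed this point in a cluster with a labeled member.
            predictions.append(None)
    return predictions


# Six points, two of them labeled, and three clusterings of the same data.
labels = ["a", None, None, "b", None, None]
clusterings = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 2, 1, 1],
]
print(aggregate_clusterings(clusterings, labels))
```

In this toy example the third point is assigned to the "b" cluster by one clustering and to the "a" cluster by the other two, so the vote resolves it to "a". Points whose clusters never contain a labeled member stay unlabeled, which is the situation an active learning strategy would target by querying additional labels.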

Details

Language(s): eng - English
 Date: 2012
 Publication status: Published
 Pages: -
 Place, publisher, edition: Saarbrücken : Universität des Saarlandes
 Table of contents: -
 Review type: -
 Identifiers: BibTex Citekey: Arvanitopoulos-Darginis2011
 Degree: Master
