Goal-oriented Methods and Meta Methods for Document Classification and their 
Parameter Tuning

Sizov, Sergej; Siersdorfer, Stefan; Weikum, Gerhard; Evans, David A.; Gravano, Luis; Herzog, Otthein; Zhai, ChengXiang; Ronthaler, Marc


Sizov, S., Siersdorfer, S., & Weikum, G. (2004). Goal-oriented Methods and Meta Methods for Document Classification and their Parameter Tuning. In CIKM 2004: proceedings of the Thirteenth Conference on Information and Knowledge Management (pp. 59-68). New York, USA: ACM.

Item is released

Basic data

Record permalink: https://hdl.handle.net/11858/00-001M-0000-000F-2AAF-9
Version permalink: https://hdl.handle.net/11858/00-001M-0000-000F-2AB0-3

Genre: Conference paper

Files

cikm04-275-sizov.pdf (any full text), 202KB

File permalink: -

Name: cikm04-275-sizov.pdf

Description: -

OA status:

Visibility: Private

MIME type / checksum: application/pdf

Technical metadata:

Copyright date: -

Copyright info: -

License: -

External references

Creators

Creators:
Sizov, Sergej¹, Author
Siersdorfer, Stefan¹, Author
Weikum, Gerhard¹, Author
Evans, David A., Editor
Gravano, Luis, Editor
Herzog, Otthein, Editor
Zhai, ChengXiang, Editor
Ronthaler, Marc, Editor

Affiliations:
¹ Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018

Content

Keywords: -

Abstract: Automatic text classification methods come with various calibration parameters such as thresholds for probabilities in Bayesian classifiers or for hyperplane distances in SVM classifiers. In a given application context these parameters should be set so as to meet the relative importance of various result quality metrics such as precision versus recall. In this paper we consider classifiers that can accept a document for a topic, reject it, or abstain. We aim to meet the application's goals in terms of accuracy (i.e., avoid false acceptances or rejections) and loss (i.e., limit the fraction of documents for which no decision is made). To this end we investigate restrictive forms of Support Vector Machine classifiers and we develop meta methods that split the training data into subsets for independently trained classifiers and then combine the results of these classifiers. These techniques tend to improve accuracy at the expense of document loss. We develop estimators that help to predict the accuracy and loss for a given setting of the methods' tuning parameters, and a methodology for efficiently deriving a setting that meets the application's goals. Our experiments confirm the practical viability of the approach.
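The accept / reject / abstain idea in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the score values, the unanimity rule for combining classifiers, and the definition of accuracy as the share of correct decisions among decided documents are all assumptions made for illustration, and every function name below is hypothetical.

```python
from typing import List, Sequence, Tuple

ACCEPT, REJECT, ABSTAIN = 1, -1, 0


def restrictive_decision(score: float, threshold: float) -> int:
    """Decide only when the score is at least `threshold` away from zero."""
    if score >= threshold:
        return ACCEPT
    if score <= -threshold:
        return REJECT
    return ABSTAIN


def unanimous_vote(decisions: Sequence[int]) -> int:
    """Assumed meta rule: follow the base classifiers only if all agree."""
    if all(d == ACCEPT for d in decisions):
        return ACCEPT
    if all(d == REJECT for d in decisions):
        return REJECT
    return ABSTAIN


def accuracy_and_loss(decisions: Sequence[int],
                      labels: Sequence[int]) -> Tuple[float, float]:
    """Accuracy over decided documents; loss = fraction left undecided."""
    decided = [(d, y) for d, y in zip(decisions, labels) if d != ABSTAIN]
    loss = 1.0 - len(decided) / len(labels)
    if not decided:
        return float("nan"), loss
    accuracy = sum(d == y for d, y in decided) / len(decided)
    return accuracy, loss


# Made-up scores from three independently trained classifiers
# (one row per document); positive means "belongs to the topic".
scores: List[List[float]] = [
    [1.2, 0.9, 1.5],
    [0.1, -0.3, 0.4],
    [-1.1, -0.8, -1.4],
    [0.7, -0.9, 0.6],
    [-0.6, -0.7, -0.2],
]
labels = [ACCEPT, ACCEPT, REJECT, REJECT, REJECT]


def evaluate(threshold: float) -> Tuple[float, float]:
    """Accuracy and loss of the unanimous meta classifier at a threshold."""
    meta = [
        unanimous_vote([restrictive_decision(s, threshold) for s in row])
        for row in scores
    ]
    return accuracy_and_loss(meta, labels)


if __name__ == "__main__":
    single = [restrictive_decision(row[0], 0.0) for row in scores]
    print("single classifier:", accuracy_and_loss(single, labels))
    for t in (0.0, 0.5):
        print("meta, threshold", t, "->", evaluate(t))
```

On this toy data the first classifier alone decides every document but makes one error, while the unanimous combination makes no errors and leaves some documents undecided. Raising the threshold increases the undecided share further, which mirrors the accuracy-versus-loss trade-off the abstract describes.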

Details

Language(s): eng - English

Date: Modified: 2005-05-31; Published: 2004

Publication status: Published

Pages: -

Place, Publisher, Edition: New York, USA : ACM

Table of contents: -

Review method: -

Identifiers: eDoc: 231864
Other: Local-ID: C1256DBF005F876D-EE103EB2A21109DCC1256F9500333F20-SizovClass2004

Degree type: -

Event

Title: Untitled Event

Place of event: Washington D.C., USA

Start/end date: 2004-11-08

Title: CIKM 2004 : proceedings of the Thirteenth Conference on Information and Knowledge Management

Source genre: Proceedings

Creators:

Affiliations:

Place, Publisher, Edition: New York, USA : ACM

Pages: -
Volume / Issue: -
Article number: -
Start / end page: 59 - 68
Identifier: -
