Deutsch
 
Hilfe Datenschutzhinweis Impressum
  DetailsucheBrowse

Datensatz

DATENSATZ AKTIONENEXPORT
  MDL4BMF: Minimum Description Length for Boolean Matrix Factorization

Miettinen, P., & Vreeken, J.(2012). MDL4BMF: Minimum Description Length for Boolean Matrix Factorization (MPI-I-2012-5-001). Saarbrücken: Max-Planck-Institut für Informatik.

Item is

Basisdaten

einblenden: ausblenden:
Genre: Bericht
Latex : {MDL4BMF}: Minimum Description Length for Boolean Matrix Factorization

Dateien

einblenden: Dateien
ausblenden: Dateien
:
MPI-I-2012-5-001.pdf (beliebiger Volltext), 3MB
Name:
MPI-I-2012-5-001.pdf
Beschreibung:
-
OA-Status:
Sichtbarkeit:
Öffentlich
MIME-Typ / Prüfsumme:
application/pdf / [MD5]
Technische Metadaten:
Copyright Datum:
-
Copyright Info:
-
Lizenz:
-

Externe Referenzen

einblenden:

Urheber

einblenden:
ausblenden:
 Urheber:
Miettinen, Pauli1, Autor           
Vreeken, Jilles2, Autor           
Affiliations:
1Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018              
2External Organizations, ou_persistent22              

Inhalt

einblenden:
ausblenden:
Schlagwörter: -
 Zusammenfassung: Matrix factorizations—where a given data matrix is approximated by a prod- uct of two or more factor matrices—are powerful data mining tools. Among other tasks, matrix factorizations are often used to separate global structure from noise. This, however, requires solving the ‘model order selection problem’ of determining where fine-grained structure stops, and noise starts, i.e., what is the proper size of the factor matrices. Boolean matrix factorization (BMF)—where data, factors, and matrix product are Boolean—has received increased attention from the data mining community in recent years. The technique has desirable properties, such as high interpretability and natural sparsity. However, so far no method for selecting the correct model order for BMF has been available. In this paper we propose to use the Minimum Description Length (MDL) principle for this task. Besides solving the problem, this well-founded approach has numerous benefits, e.g., it is automatic, does not require a likelihood function, is fast, and, as experiments show, is highly accurate. We formulate the description length function for BMF in general—making it applicable for any BMF algorithm. We discuss how to construct an appropriate encoding, starting from a simple and intuitive approach, we arrive at a highly efficient data-to-model based encoding for BMF. We extend an existing algorithm for BMF to use MDL to identify the best Boolean matrix factorization, analyze the complexity of the problem, and perform an extensive experimental evaluation to study its behavior.

Details

einblenden:
ausblenden:
Sprache(n): eng - English
 Datum: 2012
 Publikationsstatus: Online veröffentlicht
 Seiten: 48 p.
 Ort, Verlag, Ausgabe: Saarbrücken : Max-Planck-Institut für Informatik
 Inhaltsverzeichnis: -
 Art der Begutachtung: -
 Identifikatoren: BibTex Citekey: MiettinenVreeken
Reportnr.: MPI-I-2012-5-001
 Art des Abschluß: -

Veranstaltung

einblenden:

Entscheidung

einblenden:

Projektinformation

einblenden:

Quelle 1

einblenden:
ausblenden:
Titel: Research Report
Genre der Quelle: Reihe
 Urheber:
Affiliations:
Ort, Verlag, Ausgabe: -
Seiten: - Band / Heft: - Artikelnummer: - Start- / Endseite: - Identifikator: ISSN: 0946-011X