English
 
Help Privacy Policy Disclaimer
  Advanced SearchBrowse

Item

ITEM ACTIONSEXPORT
  Model Order Selection for Boolean Matrix Factorization

Miettinen, P., & Vreeken, J. (2011). Model Order Selection for Boolean Matrix Factorization. In Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'11) (pp. 51-59). New York, NY: ACM. doi:10.1145/2020408.2020424.

Item is

Basic

show hide
Genre: Conference Paper
Latex : Model Order Selection for {Boolean} Matrix Factorization

Files

show Files
hide Files
:
op0629-miettinen.pdf (Any fulltext), 2MB
 
File Permalink:
-
Name:
op0629-miettinen.pdf
Description:
-
OA-Status:
Visibility:
Private
MIME-Type / Checksum:
application/pdf
Technical Metadata:
Copyright Date:
-
Copyright Info:
(c) ACM, 2011. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 17th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD-2011), http://doi.acm.org/10.1145/2020408.2020424.
License:
-

Locators

show

Creators

show
hide
 Creators:
Miettinen, Pauli1, Author           
Vreeken, Jilles2, Author
Affiliations:
1Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018              
2External Organizations, ou_persistent22              

Content

show
hide
Free keywords: -
 Abstract: Matrix factorizations---where a given data matrix is approximated by a product of two or more factor matrices---are powerful data mining tools. Among other tasks, matrix factorizations are often used to separate global structure from noise. This, however, requires solving the `model order selection problem' of determining where fine-grained structure stops, and noise starts, i.e., what is the proper size of the factor matrices. Boolean matrix factorization (BMF)---where data, factors, and matrix product are Boolean---has received increased attention from the data mining community in recent years. The technique has desirable properties, such as high interpretability and natural sparsity. But so far no method for selecting the correct model order for BMF has been available. In this paper we propose to use the Minimum Description Length (MDL) principle for this task. Besides solving the problem, this well-founded approach has numerous benefits, e.g., it is automatic, does not require a likelihood function, is fast, and, as experiments show, is highly accurate. We formulate the description length function for BMF in general---making it applicable for any BMF algorithm. We extend an existing algorithm for BMF to use MDL to identify the best Boolean matrix factorization, analyze the complexity of the problem, and perform an extensive experimental evaluation to study its behavior.

Details

show
hide
Language(s): eng - English
 Dates: 20112011
 Publication Status: Issued
 Pages: -
 Publishing info: -
 Table of Contents: -
 Rev. Type: -
 Identifiers: eDoc: 618993
DOI: 10.1145/2020408.2020424
URI: http://doi.acm.org/10.1145/2020408.2020424
Other: Local-ID: C1256DBF005F876D-9C688EF8F47E45D4C125796C0046EFA8-miettinen11model
 Degree: -

Event

show
hide
Title: 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
Place of Event: San Diego, CA, USA
Start-/End Date: 2011-08-21 - 2011-08-24

Legal Case

show

Project information

show

Source 1

show
hide
Title: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD'11)
  Subtitle : 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining
  Abbreviation : KDD 2011
Source Genre: Proceedings
 Creator(s):
Affiliations:
Publ. Info: New York, NY : ACM
Pages: - Volume / Issue: - Sequence Number: - Start / End Page: 51 - 59 Identifier: ISBN: 978-1-4503-0813-7