Bonsai: Growing Interesting Small Trees

Seufert, Stephan; Bedathur, Srikanta; Mestre, Julian; Weikum, Gerhard

doi:10.1109/ICDM.2010.86

Conference Paper

MPS-Authors

Seufert, Stephan
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Bedathur, Srikanta
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Mestre, Julian
Algorithms and Complexity, MPI for Informatics, Max Planck Society;

Weikum, Gerhard
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Citation

Seufert, S., Bedathur, S., Mestre, J., & Weikum, G. (2010). Bonsai: Growing Interesting Small Trees. In G. I. Webb, B. Liu, C. Zhang, D. Gunopulos, & X. Wu (Eds.), 10th IEEE International Conference on Data Mining (pp. 1013-1018). Los Alamitos, CA: IEEE Computer Society.

Cite as: https://hdl.handle.net/11858/00-001M-0000-000F-14D5-D

Abstract

Graphs are increasingly used to model a variety of loosely structured data such as biological or social networks and entity-relationships. Given this profusion of large-scale graph data, efficiently discovering interesting substructures buried within is essential. These substructures are typically used in determining subsequent actions, such as conducting visual analytics by humans or designing expensive biomedical experiments. In such settings, it is often desirable to constrain the size of the discovered results in order to directly control the associated costs. In this paper, we address the problem of finding cardinality-constrained connected subtrees in large node-weighted graphs that maximize the sum of weights of selected nodes. We provide an efficient constant-factor approximation algorithm for this strongly NP-hard problem. Our techniques can be applied in a wide variety of application settings, for example in differential analysis of graphs, a problem that frequently arises in bioinformatics but also has applications on the web.
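
To make the problem statement concrete, the following is a minimal brute-force sketch of the objective described in the abstract: among connected node sets of bounded size in a node-weighted graph, pick one with maximum total weight. It is not the constant-factor approximation algorithm from the paper, and it is exponential in the graph size, so it is only usable on toy inputs. The names `is_connected`, `best_small_tree`, `adj`, `weight`, and `k` are hypothetical, and reading the cardinality constraint as "at most k nodes" is an assumption. Any connected node set has a spanning tree, so choosing a connected node set is equivalent to choosing a subtree on those nodes.

```python
from itertools import combinations


def is_connected(nodes, adj):
    """Return True if `nodes` induces a connected subgraph of `adj`."""
    nodes = set(nodes)
    if not nodes:
        return False
    start = next(iter(nodes))
    seen = {start}
    stack = [start]
    # Depth-first search restricted to the chosen node set.
    while stack:
        u = stack.pop()
        for v in adj.get(u, ()):
            if v in nodes and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == nodes


def best_small_tree(adj, weight, k):
    """Exhaustively find a connected node set with at most k nodes
    (assumed reading of the constraint) that maximizes total node weight.

    adj:    dict mapping each node to an iterable of its neighbours
            (undirected graph, so edges appear in both directions)
    weight: dict mapping each node to a numeric weight
    """
    best_nodes, best_score = None, float("-inf")
    for size in range(1, k + 1):
        for subset in combinations(sorted(adj), size):
            if is_connected(subset, adj):
                score = sum(weight[v] for v in subset)
                if score > best_score:
                    best_nodes, best_score = set(subset), score
    return best_nodes, best_score


if __name__ == "__main__":
    # Toy graph: path a-b-c-d with an extra leaf e attached to b.
    adj = {
        "a": ["b"],
        "b": ["a", "c", "e"],
        "c": ["b", "d"],
        "d": ["c"],
        "e": ["b"],
    }
    weight = {"a": 1, "b": 2, "c": 5, "d": 4, "e": 3}
    print(best_small_tree(adj, weight, 3))
```

On this toy graph with `k = 3`, the best connected set is `{b, c, d}` with total weight 11. The combinatorial blow-up of this enumeration is the reason an efficient approximation algorithm, as the abstract describes, matters for large graphs.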