Keywords:
-
Abstract:
Redescription mining is a powerful data analysis tool that is
used to find multiple descriptions of the same entities. Consider
geographical regions as an example. They can be characterized by the
fauna that inhabits them on one hand and by their meteorological
conditions on the other hand. Finding such redescriptions, a
task known as niche-finding, is of great importance in biology.
However, current redescription mining methods can handle only
Boolean data. This restricts the range of possible applications or makes
discretization a prerequisite, entailing a possibly harmful loss of
information. In niche-finding, while the fauna can be naturally
represented using Boolean presence/absence data, the weather cannot.
In this paper, we extend redescription mining to real-valued data using
a surprisingly simple and efficient approach. We provide an extensive experimental
evaluation to study the behaviour of the proposed algorithm. Furthermore,
we show the statistical significance of our results using recent innovations
in randomization methods.