  Identifying Consistent Statements about Numerical Data with Dispersion-Corrected Subgroup Discovery

Boley, M., Goldsmith, B. R., Ghiringhelli, L. M., & Vreeken, J. (2017). Identifying Consistent Statements about Numerical Data with Dispersion-Corrected Subgroup Discovery. Retrieved from http://arxiv.org/abs/1701.07696.


Files

arXiv:1701.07696.pdf (Preprint), 3MB
Name:
arXiv:1701.07696.pdf
Description:
File downloaded from arXiv at 2017-07-10 11:56; significance of empirical results tested; additional illustrations; table of used notations
OA status:
Visibility:
Public
MIME type / checksum:
application/pdf / [MD5]
Technical metadata:
Copyright date:
-
Copyright info:
-

Creators

Creators:
Boley, Mario (1), Author
Goldsmith, Bryan R. (2), Author
Ghiringhelli, Luca M. (2), Author
Vreeken, Jilles (1), Author
Affiliations:
(1) Databases and Information Systems, MPI for Informatics, Max Planck Society, ou_24018
(2) External Organizations, ou_persistent22

Content

Keywords: Computer Science, Artificial Intelligence, cs.AI, Computer Science, Databases, cs.DB
Abstract: Existing algorithms for subgroup discovery with numerical targets do not optimize the error or target variable dispersion of the groups they find. This often leads to unreliable or inconsistent statements about the data, rendering practical applications, especially in scientific domains, futile. Therefore, we here extend the optimistic estimator framework for optimal subgroup discovery to a new class of objective functions: we show how tight estimators can be computed efficiently for all functions that are determined by subgroup size (non-decreasing dependence), the subgroup median value, and a dispersion measure around the median (non-increasing dependence). In the important special case when dispersion is measured using the average absolute deviation from the median, this novel approach yields a linear time algorithm. Empirical evaluation on a wide range of datasets shows that, when used within branch-and-bound search, this approach is highly efficient and indeed discovers subgroups with much smaller errors.
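
As a rough illustration of the class of objective functions the abstract describes, the following Python sketch scores a subgroup from its size, its median, and its average absolute deviation from the median. The product form and the names `aamd` and `dispersion_corrected_score` are assumptions made for this example; they are not taken from this record and should not be read as the authors' definition. The sketch only shows a function that is non-decreasing in subgroup size and non-increasing in dispersion, as the abstract requires.

```python
from statistics import median


def aamd(values):
    """Average absolute deviation of the values from their median."""
    m = median(values)
    return sum(abs(v - m) for v in values) / len(values)


def dispersion_corrected_score(subgroup, population):
    """Hypothetical objective (an assumption for illustration only):
    coverage * positive median shift * (1 - relative dispersion)."""
    coverage = len(subgroup) / len(population)
    # Only a median above the population median counts as a gain here.
    shift = max(0.0, median(subgroup) - median(population))
    pop_disp = aamd(population)
    rel_disp = aamd(subgroup) / pop_disp if pop_disp > 0 else 0.0
    # Clamp so that a very dispersed subgroup scores zero, not negative.
    return coverage * shift * max(0.0, 1.0 - rel_disp)


# Toy target values (made up for this example).
population = [1, 2, 3, 4, 10, 11, 12, 13]
tight = [10, 11, 12, 13]   # high median, low dispersion
spread = [4, 10, 12, 13]   # similar median, higher dispersion

tight_score = dispersion_corrected_score(tight, population)
spread_score = dispersion_corrected_score(spread, population)
```

With these toy values the two subgroups have the same size and nearly the same median, so the difference between `tight_score` and `spread_score` comes mainly from the dispersion term.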

Details

Language(s): eng - English
Date: 2017-01-26, 2017-04-23, 2017
Publication status: Published online
Pages: 28 p.
Place, publisher, edition: -
Table of contents: -
Review type: -
Identifiers: arXiv: 1701.07696
URI: http://arxiv.org/abs/1701.07696
BibTex Citekey: DBLP:journals/corr/BoleyGGV17
Degree type: -

Project information

Project name: NoMaD, The Novel Materials Discovery Laboratory
Grant ID: 676580
Funding program: Horizon 2020 (H2020)
Funding organization: European Commission (EC)
