de.mpg.escidoc.pubman.appbase.FacesBean
Deutsch
 
Hilfe Wegweiser Impressum Kontakt Einloggen
  DetailsucheBrowse

Datensatz

DATENSATZ AKTIONENEXPORT

Freigegeben

Forschungspapier

Causal Inference by Stochastic Complexity

MPG-Autoren
http://pubman.mpdl.mpg.de/cone/persons/resource/persons123435

Budhathoki,  Kailash
Databases and Information Systems, MPI for Informatics, Max Planck Society;

http://pubman.mpdl.mpg.de/cone/persons/resource/persons79525

Vreeken,  Jilles
Databases and Information Systems, MPI for Informatics, Max Planck Society;

Externe Ressourcen
Es sind keine Externen Ressourcen verfügbar
Volltexte (frei zugänglich)

arXiv:1702.06776.pdf
(Preprint), 773KB

Ergänzendes Material (frei zugänglich)
Es sind keine frei zugänglichen Ergänzenden Materialien verfügbar
Zitation

Budhathoki, K., & Vreeken, J. (2017). Causal Inference by Stochastic Complexity. Retrieved from http://arxiv.org/abs/1702.06776.


Zitierlink: http://hdl.handle.net/11858/00-001M-0000-002D-90F2-A
Zusammenfassung
The algorithmic Markov condition states that the most likely causal direction between two random variables X and Y can be identified as that direction with the lowest Kolmogorov complexity. Due to the halting problem, however, this notion is not computable. We hence propose to do causal inference by stochastic complexity. That is, we propose to approximate Kolmogorov complexity via the Minimum Description Length (MDL) principle, using a score that is mini-max optimal with regard to the model class under consideration. This means that even in an adversarial setting, such as when the true distribution is not in this class, we still obtain the optimal encoding for the data relative to the class. We instantiate this framework, which we call CISC, for pairs of univariate discrete variables, using the class of multinomial distributions. Experiments show that CISC is highly accurate on synthetic, benchmark, as well as real-world data, outperforming the state of the art by a margin, and scales extremely well with regard to sample and domain sizes.