Poster

A Maximum Entropy Approach to Semi-supervised Learning

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons83905

Erkan, AN
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society

http://pubman.mpdl.mpg.de/cone/persons/resource/persons83782

Altun, Y
Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society

Citation

Erkan, A., & Altun, Y. (2010). A Maximum Entropy Approach to Semi-supervised Learning.


Cite as: http://hdl.handle.net/11858/00-001M-0000-0013-BF4C-2
Abstract
The maximum entropy (MaxEnt) framework has been studied extensively in supervised learning. Here, the goal is to find a distribution p that maximizes an entropy function while enforcing data constraints, so that the expected values of some (pre-defined) features with respect to p approximately match their empirical counterparts. Using different entropy measures, different model spaces for p, and different approximation criteria for the data constraints yields a family of discriminative supervised learning methods (e.g., logistic regression, conditional random fields, least squares, and boosting). This framework is known as the generalized maximum entropy framework. Semi-supervised learning (SSL) has emerged in the last decade as a promising field that combines unlabeled data with labeled data to increase the accuracy and robustness of inference algorithms. However, most SSL algorithms to date involve trade-offs, e.g., in terms of scalability or applicability to multi-categorical data. We extend the generalized MaxEnt framework to develop a family of novel SSL algorithms. Extensive empirical evaluation on benchmark data sets that are widely used in the literature demonstrates the validity and competitiveness of the proposed algorithms.
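
The basic MaxEnt idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' semi-supervised algorithm: it assumes a finite outcome space, Shannon entropy, and exact (not approximate) moment constraints, and it uses the classic loaded-die example as toy data. The function names `gibbs` and `fit_maxent`, the step size, and the iteration count are illustrative choices, not taken from the poster. Under these assumptions the maximizer has the exponential form p(x) ∝ exp(λ · f(x)), and λ can be found by gradient descent on the convex dual log Z(λ) − λ · t.

```python
import math


def gibbs(features, lam):
    """Return p(x) proportional to exp(lam . f(x)) over a finite outcome space."""
    scores = [sum(l * f for l, f in zip(lam, fx)) for fx in features]
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def fit_maxent(features, targets, steps=2000, lr=0.1):
    """Gradient descent on the dual log Z(lam) - lam . targets.

    The gradient of the dual is E_p[f] - targets, so the iteration stops
    moving once the model expectations match the target expectations.
    """
    lam = [0.0] * len(targets)
    for _ in range(steps):
        p = gibbs(features, lam)
        expected = [
            sum(pi * fx[j] for pi, fx in zip(p, features))
            for j in range(len(targets))
        ]
        lam = [l - lr * (e - t) for l, e, t in zip(lam, expected, targets)]
    return gibbs(features, lam), lam


# Toy data (an assumption for illustration): a six-sided die whose
# average roll is constrained to be 4.5 instead of the uniform 3.5.
faces = [1, 2, 3, 4, 5, 6]
p, lam = fit_maxent([[x] for x in faces], [4.5])
mean = sum(pi * x for pi, x in zip(p, faces))
```

After fitting, `p` is a valid distribution whose mean matches the constraint, and it puts more mass on higher faces. Replacing the exact constraint with a penalized or relaxed one, or swapping the entropy measure, is the kind of variation the generalized framework covers; this sketch does not attempt that.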