




Semi-Supervised Discovery of Visual Attributes


Bahmanyar, Gholamreza
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society


Bahmanyar, G. (2011). Semi-Supervised Discovery of Visual Attributes. Master Thesis, Universität des Saarlandes, Saarbrücken.

Abstract

It has been shown that learning from high-level visual descriptions or visual properties of objects, also known as attributes, can play an effective role in recognition systems. However, two challenging problems remain open: first, how to scale the use of attributes to a large number of categories, and second, how to select attributes for a given recognition task.

This thesis focuses on these two open problems. We explore the automatic discovery of attributes from image and text data in order to scale the applicability of attributes to large collections of images and text documents. In addition, we determine which attributes are relevant for a recognition task by evaluating how well a recognition system can distinguish them.

We address these problems with two approaches. In the first approach, which is based on the work of Berg et al. (2010), we extract attributes from text on the web and rank them by how well they can be distinguished using a discriminatively trained SVM.

In contrast, the second approach uses a generative technique, namely a topic model, to discover textual attributes and the semantically correlated visual attributes based on co-occurrence statistics. To this end, three different models are proposed, which differ in how they leverage text-to-image relationships.

Both approaches are evaluated qualitatively and quantitatively. The qualitative evaluation shows that both approaches can discover human-understandable attributes. The quantitative evaluation shows that the two approaches perform comparably in discovering discriminative attributes. Furthermore, the generative model can localize all parts of a visual attribute using multiple patches, whereas the discriminative model shows only a predominant part of each visual attribute using a single patch.