
Released

Thesis

Semi-Supervised Discovery of Visual Attributes

MPS-Authors
http://pubman.mpdl.mpg.de/cone/persons/resource/persons71838

Bahmanyar, Gholamreza
Computer Vision and Multimodal Computing, MPI for Informatics, Max Planck Society;

Citation

Bahmanyar, G. (2011). Semi-Supervised Discovery of Visual Attributes. Master Thesis, Universität des Saarlandes, Saarbrücken.


Cite as: http://hdl.handle.net/11858/00-001M-0000-0010-12F4-1
Abstract
It has been shown that learning on high-level visual descriptions or visual properties of objects, also known as attributes, can play an effective role in recognition systems. However, two challenging problems remain open: first, how to scale the use of attributes to a large number of categories, and second, how to select attributes for a given recognition task.

In this thesis, we focus on these two open problems. We explore the automatic discovery of attributes from images and text data to scale the applicability of attributes to large collections of images and text documents. In addition, we determine which attributes are relevant for a recognition task by evaluating them based on how well they can be distinguished by a recognition system.

We address these problems with two approaches. The first approach, which is based on the work of Berg et al. (2010), extracts attributes from text on the web and ranks them by how well they can be distinguished using a discriminatively trained SVM.

In contrast, the second approach uses a generative technique, namely a topic model, to discover textual attributes and the semantically correlated visual attributes based on co-occurrence statistics. To this end, three different models are proposed which differ in how they leverage text-to-image relationships.

Both approaches are evaluated qualitatively and quantitatively. The qualitative evaluation shows that both can discover human-understandable attributes. The quantitative evaluation demonstrates comparable performance of the two approaches in terms of discovering discriminative attributes. Furthermore, the generative model can localize all parts of a visual attribute with multiple patches, whereas the discriminative model highlights only a predominant part of each visual attribute with a single patch.
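
The discriminative approach is only outlined at this level of detail. As a rough illustration of the ranking idea (a minimal Python sketch assuming precomputed image features and binary attribute labels mined from the accompanying text; names such as rank_attributes, features and attribute_labels are invented here, not taken from the thesis), the cross-validated accuracy of a linear SVM can serve as the score by which candidate attributes are ordered:

    # Illustrative sketch, not the thesis code: rank candidate attributes by how
    # well a linear SVM separates images with vs. without each attribute.
    import numpy as np
    from sklearn.svm import LinearSVC
    from sklearn.model_selection import cross_val_score

    def rank_attributes(features, attribute_labels):
        """features:         (n_images, d) array of visual descriptors.
        attribute_labels: dict mapping attribute name -> (n_images,) binary
        array indicating whether the attribute word co-occurs with the image."""
        scores = {}
        for name, y in attribute_labels.items():
            # skip attributes with too few positive or negative examples
            if y.sum() < 5 or (1 - y).sum() < 5:
                continue
            clf = LinearSVC(C=1.0, max_iter=5000)
            # cross-validated accuracy as a proxy for how visually
            # distinguishable the attribute is; higher = more discriminative
            scores[name] = cross_val_score(clf, features, y, cv=5).mean()
        # best-separated attributes first
        return sorted(scores.items(), key=lambda kv: -kv[1])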
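Likewise, the abstract names the generative technique only as a topic model over text-image co-occurrence statistics and does not spell out the three proposed models. A hypothetical, simplified illustration of that co-occurrence idea (explicitly not one of the thesis's models) is a plain LDA fitted to joint count vectors that concatenate text words and quantized visual words for each image-text pair:

    # Illustrative sketch: LDA over a joint vocabulary of text words and
    # quantized visual words, so that topics group co-occurring textual and
    # visual terms.
    import numpy as np
    from sklearn.decomposition import LatentDirichletAllocation

    def discover_joint_topics(text_counts, visual_counts, n_topics=20, seed=0):
        """text_counts:   (n_docs, n_text_words)   bag-of-words counts.
        visual_counts: (n_docs, n_visual_words) bag-of-visual-words counts."""
        X = np.hstack([text_counts, visual_counts])  # joint vocabulary
        lda = LatentDirichletAllocation(n_components=n_topics, random_state=seed)
        lda.fit(X)
        # each row of lda.components_ mixes text and visual words; the split
        # index lets a caller separate the textual and visual part of a topic
        return lda, text_counts.shape[1]

Inspecting the textual and visual halves of each fitted topic would then yield candidate groupings of co-occurring textual and visual terms, which is the basic intuition behind the generative approach described above.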