Robust Principal Component Analysis as a Nonlinear Eigenproblem

Podosinnikova, Anastasia

Thesis

MPG Authors

Podosinnikova, Anastasia
International Max Planck Research School, MPI for Informatics, Max Planck Society;

Citation

Podosinnikova, A. (2013). Robust Principal Component Analysis as a Nonlinear Eigenproblem. Master Thesis, Universität des Saarlandes, Saarbrücken.

Citation link: https://hdl.handle.net/11858/00-001M-0000-0026-CC75-A

Abstract

Principal Component Analysis (PCA) is a widely used tool for, e.g., exploratory data analysis, dimensionality reduction and clustering. However, it is well known that PCA is strongly affected by the presence of outliers and is thus vulnerable to both gross measurement error and adversarial manipulation of the data. This phenomenon motivates the development of robust PCA as the problem of recovering the principal components of the uncontaminated data. In this thesis, we propose two new algorithms, QRPCA and MDRPCA, for computing robust PCA components based on the projection-pursuit approach of Huber. While the resulting optimization problems are non-convex and non-smooth, we show that they can be efficiently minimized via the RatioDCA, using bundle methods or accelerated proximal methods for the interior problem. The key ingredient of the most promising algorithm (QRPCA) is a robust, location-invariant scale measure with breakdown point 0.5. Extensive experiments show that QRPCA is competitive with current state-of-the-art methods and outperforms other methods, in particular when the number of outliers is large.
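The projection-pursuit idea mentioned in the abstract can be illustrated with a minimal sketch: look for the unit direction along which a robust scale of the projected data is largest. The sketch below is not QRPCA, MDRPCA or the RatioDCA. It assumes the median absolute deviation (MAD) as a stand-in for a location-invariant scale with breakdown point 0.5, and it replaces the optimization with a crude search over candidate directions pointing from a robust center to each data point. The names mad_scale and robust_first_direction and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np


def mad_scale(z):
    """Median absolute deviation of a 1-D sample.

    Assumed stand-in for a robust scale: it is location invariant and has
    breakdown point 0.5. It is not necessarily the measure used in the thesis.
    """
    return np.median(np.abs(z - np.median(z)))


def robust_first_direction(X):
    """Return the candidate unit direction with the largest MAD of projections.

    Candidates are the normalized vectors from the coordinate-wise median to
    each data point (a simple heuristic, assumed here for illustration).
    """
    center = np.median(X, axis=0)
    D = X - center
    norms = np.linalg.norm(D, axis=1)
    keep = norms > 1e-12                      # drop points sitting on the center
    candidates = D[keep] / norms[keep, None]
    scales = np.array([mad_scale(X @ w) for w in candidates])
    return candidates[np.argmax(scales)]


# Synthetic data (hypothetical): inliers spread along the x-axis,
# plus a small cluster of gross outliers far away along the y-axis.
rng = np.random.default_rng(0)
inliers = rng.normal(0.0, [5.0, 0.2], size=(200, 2))
outliers = rng.normal([0.0, 50.0], 0.1, size=(20, 2))
X = np.vstack([inliers, outliers])

# Robust first direction from the projection-pursuit sketch.
w_robust = robust_first_direction(X)

# Classical first principal component: top eigenvector of the covariance.
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
w_classical = eigvecs[:, -1]
```

On data of this kind, the classical component is pulled toward the outlier cluster, while the MAD-based direction stays close to the axis along which the inliers vary. Comparing w_robust and w_classical shows the effect the abstract describes.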