Clustering Protein Sequence and Structure Space with Infinite Gaussian Mixture Models

Dubey, A; Hwang, S; Rangel, C; Rasmussen, CE; Ghahramani, Z; Wild, DL


Dubey, A., Hwang, S., Rangel, C., Rasmussen, C., Ghahramani, Z., & Wild, D. (2004). Clustering Protein Sequence and Structure Space with Infinite Gaussian Mixture Models. In Pacific Symposium on Biocomputing (PSB 2004) (pp. 399-410). Singapore: World Scientific Publishing.

Item status: Released


Basic data


Record permalink: https://hdl.handle.net/11858/00-001M-0000-0013-F3A7-5
Version permalink: https://hdl.handle.net/21.11116/0000-0005-53AE-A

Genre: Conference paper

Files

pdf2373.pdf (any full text), 182 KB

File permalink:
https://hdl.handle.net/21.11116/0000-0005-53AF-9

Name:
pdf2373.pdf

Description:
-

OA status:

Visibility:
Public

MIME type / checksum:
application/pdf / [MD5]


Copyright date:
-

Copyright Info:
-

License:
-

External references

External reference:
http://psb.stanford.edu/previous/psb04/ (table of contents), open access status unknown

Description:
-

OA status:

Creators

Creators:
Dubey, A, Author
Hwang, S, Author
Rangel, C, Author
Rasmussen, CE^{1, 2}, Author
Ghahramani, Z, Author
Wild, DL, Author

Affiliations:
1 Department Empirical Inference, Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497795
2 Max Planck Institute for Biological Cybernetics, Max Planck Society, ou_1497794

Content

Keywords: -

Abstract: We describe a novel approach to the problem of automatically clustering protein sequences and discovering protein families, subfamilies, etc., based on the theory of infinite Gaussian mixture models. This method allows the data itself to dictate how many mixture components are required to model it, and provides a measure of the probability that two proteins belong to the same cluster. We illustrate our methods with application to three data sets: globin sequences, globin sequences with known three-dimensional structures, and G-protein coupled receptor sequences. The consistency of the clusters indicates that our method is producing biologically meaningful results, which provide a very good indication of the underlying families and subfamilies. With the inclusion of secondary structure and residue solvent accessibility information, we obtain a classification of sequences of known structure which reflects and extends their SCOP classifications.

A supplementary web site containing larger versions of the figures is available at http://public.kgi.edu/~wild/PSB04

Details

Language(s):

Date: Published: 2004-01

Publication status: Published

Pages: -

Place, publisher, edition: -

Table of contents: -

Review type: -

Identifiers: BibTex Citekey: 2373

Degree type: -

Event

Title: Pacific Symposium on Biocomputing (PSB 2004)

Location: Waimea, HI, USA

Start/end date: 2004-01-06 - 2004-01-10

Source 1

Title: Pacific Symposium on Biocomputing (PSB 2004)

Source genre: Proceedings

Creators:

Affiliations:

Place, publisher, edition: Singapore: World Scientific Publishing

Pages: - Volume / Issue: - Article number: - Start / end page: 399 - 410 Identifier: -
