  Model Selection for Mixtures of Mutagenetic Trees

Yin, J., Beerenwinkel, N., Rahnenführer, J., & Lengauer, T. (2006). Model Selection for Mixtures of Mutagenetic Trees. Statistical Applications in Genetics and Molecular Biology, 5: 17.

Files

AG3_001.pdf (Any fulltext), 212KB
 
Name: AG3_001.pdf
Visibility: Private
MIME-Type: application/pdf
Copyright Info: ... For 5 years from the date of initial publication in Statistical Applications in Genetics and Molecular Biology, the author (or copyright holder if different) grants bepress an exclusive right to distribution of the article in digital form... Personal-use exceptions: The following uses are exempt from the exclusive 5-year period: ... a non-commercial open access repository, or publication site affiliated with the author's place of employment ...

Creators

Creators:
Yin, Junming (1), Author
Beerenwinkel, Niko (2), Author
Rahnenführer, Jörg (2), Author
Lengauer, Thomas (2), Author
Affiliations:
(1) International Max Planck Research School, MPI for Informatics, Max Planck Society, ou_1116551
(2) Computational Biology and Applied Algorithmics, MPI for Informatics, Max Planck Society, ou_40046

Content

Abstract: The evolution of drug resistance in HIV is characterized by the accumulation of resistance-associated mutations in the HIV genome. Mutagenetic trees, a family of restricted Bayesian tree models, have been applied to infer the order and rate of occurrence of these mutations. Understanding and predicting this evolutionary process is an important prerequisite for the rational design of antiretroviral therapies. In practice, mixture models of K mutagenetic trees provide more flexibility and are often more appropriate for modelling observed mutational patterns. Here, we investigate the model selection problem for K-mutagenetic trees mixture models. We evaluate several classical model selection criteria, including cross-validation, the Bayesian Information Criterion (BIC), and the Akaike Information Criterion (AIC). We also use the empirical Bayes method by constructing a prior probability distribution for the parameters of a mutagenetic trees mixture model and deriving the posterior probability of the model. In addition to the model dimension, we consider the redundancy of a mixture model, which is measured by comparing the topologies of trees within a mixture model. Based on the redundancy, we propose a new model selection criterion, which is a modification of the BIC. Experimental results on simulated and on real HIV data show that the classical criteria tend to select models with far too many tree components. Only cross-validation and the modified BIC recover the correct number of trees and the tree topologies most of the time. At the same optimal performance, the runtime of the new BIC modification is about one order of magnitude lower than that of cross-validation. Thus, this model selection criterion can also be used for large data sets for which cross-validation becomes computationally infeasible.
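
As a rough illustration of how two of the classical criteria named in the abstract penalize model dimension, the following minimal sketch computes the textbook AIC and BIC and picks the number of components K that minimizes each. It is not the authors' implementation: the function names, the log-likelihoods, the parameter counts, and the sample size are all made up for illustration, and the paper's redundancy-based BIC modification is not reproduced because the abstract does not give its formula.

```python
import math


def aic(log_likelihood, num_params):
    """Akaike Information Criterion (lower is better)."""
    return 2.0 * num_params - 2.0 * log_likelihood


def bic(log_likelihood, num_params, num_samples):
    """Bayesian Information Criterion (lower is better)."""
    return num_params * math.log(num_samples) - 2.0 * log_likelihood


def select_k(candidates, num_samples, criterion="bic"):
    """Return the K whose fitted model minimizes the chosen criterion.

    candidates maps K -> (maximized log-likelihood, number of free parameters).
    """
    def score(k):
        log_likelihood, num_params = candidates[k]
        if criterion == "aic":
            return aic(log_likelihood, num_params)
        return bic(log_likelihood, num_params, num_samples)

    return min(candidates, key=score)


# Made-up fits for K = 1, 2, 3 trees on 200 hypothetical samples.
hypothetical_fits = {1: (-520.0, 6), 2: (-500.0, 13), 3: (-492.0, 20)}
k_by_aic = select_k(hypothetical_fits, num_samples=200, criterion="aic")
k_by_bic = select_k(hypothetical_fits, num_samples=200, criterion="bic")
```

With these invented numbers, AIC prefers K = 3 while BIC prefers K = 2, because the BIC penalty per parameter grows with the logarithm of the sample size.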

Details

Language(s): eng - English
Dates: 2007-04-02; 2006
 Publication Status: Issued
 Rev. Type: Peer
 Identifiers: eDoc: 314620
Other: Local-ID: C125673F004B2D7B-7E0C9F42BD91FB39C12570500035F600-Yin05
Source 1

Title: Statistical Applications in Genetics and Molecular Biology
Source Genre: Journal
Volume / Issue: 5
Sequence Number: 17
Identifier: ISSN: 1544-6115