


It is evident that the use of appropriate models is essential if we are to be confident in the results of a phylogenetic analysis, and indeed, several strategies for model choice have been proposed in the context of phylogenetics. We refer the reader to Johnson and Omland, Posada and Crandall (b), and Posada for a detailed introduction and for an evaluation of how well these methods recover the model that generated the data.

Computer programs exist that implement these methods (Adachi and Hasegawa; Posada and Crandall). Among the available methods for model selection in phylogenetics, hierarchical likelihood ratio tests (hLRTs) are the most popular. However, here we argue that the hLRT approach is not the optimal strategy for model selection in phylogenetics, and that approaches like the Akaike Information Criterion (AIC) and Bayesian methods offer important advantages. In particular, the latter two allow for assessment of model selection uncertainty and for model averaging.

In other words, the set of models is misspecified. All models are wrong but some are useful (Box), and model selection is best seen as a way of approximating, rather than identifying, full reality (Burnham and Anderson). Statistical model selection is commonly based on William of Occam's principle of parsimony.


In statistical terms, this is a trade-off between bias (the distance between the average estimate and the truth) and variance (the spread of the estimates around the truth) (see figure). In addition, the computations typically require more time. So the question is how complex the model should be for a given problem.

Figure. The principle of parsimony. Model selection is more or less based on the trade-off between bias and variance versus the number of estimable parameters in the model.

The principle of parsimony tells us that as we increase the number of parameters in a model, the bias decreases but the variance increases. This principle underlies all model selection approaches.

However, this multidimensional integral can be very difficult to compute, and it is typically approximated using computationally intensive techniques like Markov chain Monte Carlo (MCMC; Gilks et al.).

Steel and Penny and Holder and Lewis provide an instructive discussion on joint and marginal estimation in the context of phylogenetics.

Figure. Hierarchical likelihood ratio tests (hLRTs). This figure illustrates an arbitrary hierarchy of LRTs for six different models. Within each LRT, the null model is depicted above the alternative model. There are six possible paths depending on the outcome of the individual LRTs, and each path results in the selection of a different model.

Two models are nested when one of them, the null model, is a special case of the other, the alternative model.

For example, the Jukes-Cantor model (JC69; Jukes and Cantor) is nested within the Kimura two-parameter model (K80; Kimura), because if we assume that transitions and transversions occur at the same rate, the K80 model reduces to the JC69 model. However, obtaining correct P-values for the LRT statistics can be difficult. LRTs implicitly assume that at least one of the models compared is correct, and when the models are misspecified these tests can often be incorrect (Foutz and Srivastava; Golden; Kent). Although proper LRTs can be constructed when models are wrong (Vuong), standard LRTs in phylogenetics are not robust to model misspecification (Zhang). In addition, when the sample size is small, the usual asymptotic approximation on which P-values are based no longer applies.
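As a rough illustration of a single LRT between nested models, the sketch below computes the usual statistic, twice the difference in maximized log-likelihoods, and refers it to a chi-square distribution with degrees of freedom equal to the number of extra free parameters. The log-likelihood values are hypothetical, the function names are ours, and the chi-square reference distribution is the standard asymptotic approximation, which is subject to the caveats just described.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(z)), 0.5   # exact tail for df = 1
    else:
        q, a = math.exp(-z), 1.0              # exact tail for df = 2
    # Recurrence for the regularized upper incomplete gamma function:
    # Q(a + 1, z) = Q(a, z) + z**a * exp(-z) / Gamma(a + 1)
    while 2 * a < df:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

def lrt(lnl_null: float, lnl_alt: float, extra_params: int):
    """Return the LRT statistic and its asymptotic chi-square P-value."""
    delta = 2.0 * (lnl_alt - lnl_null)
    return delta, chi2_sf(delta, extra_params)

# Hypothetical maximized log-likelihoods (not taken from any real data set).
LNL_JC69 = -2000.0
LNL_K80 = -1990.0

# K80 has one more free parameter than JC69, so df = 1.
delta, p_value = lrt(LNL_JC69, LNL_K80, extra_params=1)
print(f"LRT statistic = {delta:.2f}, P = {p_value:.2e}")
```

With these made-up numbers the null model (JC69) would be rejected at any conventional significance level.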

Furthermore, LRTs were designed for hypothesis testing, and although classical hypothesis testing is commonly used as a model selection strategy, it has been argued that hypothesis testing and model selection are distinct issues (Burnham and Anderson). A stepwise procedure like the hLRTs, in which we sequentially decide whether to add or remove certain parameters, is analogous to forward and backward selection in best-subset linear regression (Miller). As pointed out by Sanderson and Kim, we can identify several potential problems with the use of hLRTs for model selection in phylogenetics.

There exist situations in which an optimal model may not exist for the hLRT procedure. Even if an optimal model exists, it will always be a function of the significance level, and the outcome of the model choice procedure may vary accordingly. Although there are statistical procedures to correct for this effect (such as the Bonferroni correction; see Hochberg), here the tests are nonindependent, and the appropriate adjustment can be very complex (see also Shimodaira; Shimodaira and Hasegawa). The outcome of the hLRTs might also be affected by the starting model for the procedure (we need to select a starting point, usually the simplest or the most complex model in the set of candidate models).
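The dependence on the significance level can be seen in a minimal sketch of a forward stepwise procedure. This is a deliberately simplified linear chain of nested models, not the full hierarchy used in practice; the log-likelihoods are hypothetical, and the parameter counts cover only the substitution model (branch lengths are common to all models and cancel in the differences).

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variable with integer df."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 1:
        q, a = math.erfc(math.sqrt(z)), 0.5
    else:
        q, a = math.exp(-z), 1.0
    while 2 * a < df:
        q += math.exp(a * math.log(z) - z - math.lgamma(a + 1.0))
        a += 1.0
    return q

# (name, hypothetical maximized log-likelihood, free substitution parameters),
# ordered from simplest to most complex; each model is nested in the next.
MODELS = [
    ("JC69", -2000.0, 0),
    ("K80", -1990.0, 1),
    ("HKY85", -1985.0, 4),
    ("GTR", -1983.0, 8),
]

def forward_hlrt(models, alpha: float) -> str:
    """Start at the simplest model; move up while the null is rejected."""
    current = 0
    for nxt in range(1, len(models)):
        _, lnl0, k0 = models[current]
        _, lnl1, k1 = models[nxt]
        delta = 2.0 * (lnl1 - lnl0)
        if chi2_sf(delta, k1 - k0) < alpha:
            current = nxt      # null rejected: adopt the more complex model
        else:
            break              # null retained: stop climbing
    return models[current][0]

for alpha in (0.05, 0.01):
    print(f"alpha = {alpha}: selected {forward_hlrt(MODELS, alpha)}")
```

With these made-up values the procedure stops at HKY85 when alpha is 0.05 and at K80 when alpha is 0.01, so the same data lead to different models purely because of the chosen threshold.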

In addition, there are cases in which the hLRTs will not select the best model, according to their own criteria, among the candidate models. Indeed, these problems can have an impact on the analysis of real data sets, and we have analyzed a set of HIV sequences (Posada and Crandall, a) for illustrative purposes (Fig. 3). In Figure 3a we can see a case in which an optimal model does not exist, as all three models are rejected when compared with one of the other two. Also, note that increasing the significance level Fig. We cannot devise a hierarchy of hLRTs that overcomes all these problems at once, but better approaches exist than simple forward and backward selection (Miller).

Figure 3. Problems of hLRTs with a real data set. See text for further details. The data set analyzed is an alignment of 12 HIV-1 subtype D sequences of a fragment of nucleotides from the gag region (Posada and Crandall, a). K81uf is the Kimura model (Kimura) with unequal base frequencies.


Solid arrows indicate the outcome of the LRT performed, whereas discontinuous arrows indicate the outcome of a potential LRT not performed. P is the associated P-value of the LRTs. The underlined model is the starting point of the hLRT, the best model according to all LRTs is indicated with an asterisk, and the model selected is enclosed within a square.

Model selection is an integral part of Bayesian estimation (Gelfand; Raftery; Wasserman), and within this framework, different strategies exist to accomplish the same tasks. It is important to note that Bayes factors compare model likelihoods, or P(D | M), which are calculated by integrating, not maximizing, over all possible parameter values (except in empirical Bayesian approaches, where maximum likelihood estimates can be used instead).
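The difference between integrating and maximizing can be made concrete with a toy example that is not phylogenetic: k successes in n binomial trials, comparing a model with the success probability fixed at 0.5 (M0) against a model with a uniform prior on that probability (M1). The data, the models, and the function names are all assumptions chosen for illustration; the integral is approximated with a simple midpoint rule.

```python
import math

def binom_lik(theta: float, k: int, n: int) -> float:
    """Binomial likelihood of k successes in n trials."""
    return math.comb(n, k) * theta**k * (1.0 - theta) ** (n - k)

def marginal_lik_uniform(k: int, n: int, steps: int = 10_000) -> float:
    """P(D | M1): likelihood integrated over a Uniform(0, 1) prior (midpoint rule)."""
    h = 1.0 / steps
    return h * sum(binom_lik((i + 0.5) * h, k, n) for i in range(steps))

# Hypothetical data.
K, N = 14, 20

lik_m0 = binom_lik(0.5, K, N)            # M0 has no free parameter
marg_m1 = marginal_lik_uniform(K, N)     # integrated over the prior
max_m1 = binom_lik(K / N, K, N)          # maximized at the ML estimate

bayes_factor_10 = marg_m1 / lik_m0       # compares integrated likelihoods
max_lik_ratio_10 = max_m1 / lik_m0       # compares maximized likelihoods

print(f"Bayes factor (M1 vs M0):        {bayes_factor_10:.3f}")
print(f"Maximized likelihood ratio:     {max_lik_ratio_10:.3f}")
```

Because integration averages the likelihood over the whole prior rather than picking its peak, the Bayes factor supports the more complex model less strongly than the ratio of maximized likelihoods does; this is how the extra parameter is penalized without an explicit correction term.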

Bayes factors are already being used in the context of phylogenetics, for example to infer the occurrence of recombination events (Suchard et al.). A word is needed about model prior probabilities, P(M_i). Although models are commonly assigned equal prior probabilities, in phylogenetics we may have prior beliefs stating that some models are more probable than others. For example, we have enough information about the process of mitochondrial sequence evolution to believe that the JC69 model is less probable in this case than the HKY85 model with a gamma distribution for rates among sites (see Yang, a).

Ideally, this information should be reflected in the model priors, and considerable Bayesian research exists on eliciting prior information (Kadane and Wolfson; Madigan et al.). Fortunately, if the signal in the data, conveyed through the likelihood, is strong enough, then the prior distributions should not have a large influence on the posterior distribution. Indeed, posterior probabilities of trees are already being used to estimate phylogenies (Holder and Lewis; Huelsenbeck et al.).
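The interplay between model priors and the signal in the data can be sketched by applying Bayes' rule to a set of candidate models. The log marginal likelihoods and the prior values below are hypothetical, as is the function name; the calculation works on the log scale to avoid numerical underflow.

```python
import math

def posterior_model_probs(log_marglik: dict, prior: dict) -> dict:
    """P(M_i | D) proportional to P(D | M_i) * P(M_i), computed stably."""
    logs = {m: log_marglik[m] + math.log(prior[m]) for m in log_marglik}
    top = max(logs.values())
    weights = {m: math.exp(v - top) for m, v in logs.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

# Hypothetical log marginal likelihoods, ln P(D | M_i).
LOG_MARGLIK = {"JC69": -2010.0, "HKY85+G": -1990.0}

equal_prior = {"JC69": 0.5, "HKY85+G": 0.5}
skewed_prior = {"JC69": 0.9, "HKY85+G": 0.1}   # deliberately favors JC69

post_equal = posterior_model_probs(LOG_MARGLIK, equal_prior)
post_skewed = posterior_model_probs(LOG_MARGLIK, skewed_prior)

for label, post in (("equal", post_equal), ("skewed", post_skewed)):
    print(label, {m: round(p, 6) for m, p in post.items()})
```

With a difference of 20 log-likelihood units, even a prior that strongly favors JC69 leaves essentially all posterior probability on HKY85+G, which is the sense in which a strong signal overwhelms the model priors.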


When the priors for the parameters in the complex model are very diffuse, Bayesian approaches tend to support the null model, in contradiction to significance tests. If the diffuseness of these priors arises because of mere ignorance of the values these parameters can take, this conflict highlights a disadvantage of Bayesian approaches, especially in the case of the Bayesian Information Criterion (BIC; see below), which assumes flat, improper priors.
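This behaviour can be reproduced in a textbook setting that is unrelated to phylogenetics: testing whether a normal mean is zero when the variance is known. In the sketch below, the null model fixes the mean at 0 and the alternative places a Normal(0, tau2) prior on it; the data summary and all names are hypothetical.

```python
import math

def normal_pdf(x: float, var: float) -> float:
    """Density at x of a zero-mean normal distribution with variance var."""
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def bf01(xbar: float, n: int, sigma2: float, tau2: float) -> float:
    """Bayes factor for the null (mean = 0) against mean ~ Normal(0, tau2)."""
    se2 = sigma2 / n
    # Under the alternative the sample mean is marginally Normal(0, tau2 + se2).
    return normal_pdf(xbar, se2) / normal_pdf(xbar, tau2 + se2)

def two_sided_p(xbar: float, n: int, sigma2: float) -> float:
    """Classical two-sided P-value for the null hypothesis mean = 0."""
    z = abs(xbar) / math.sqrt(sigma2 / n)
    return math.erfc(z / math.sqrt(2.0))

# Hypothetical data summary: sample mean 0.25 from n = 100, known variance 1.
XBAR, N, SIGMA2 = 0.25, 100, 1.0

p_value = two_sided_p(XBAR, N, SIGMA2)
bf_moderate = bf01(XBAR, N, SIGMA2, tau2=1.0)
bf_diffuse = bf01(XBAR, N, SIGMA2, tau2=10_000.0)

print(f"P-value = {p_value:.4f}")
print(f"BF01 with tau2 = 1:      {bf_moderate:.3f}")
print(f"BF01 with tau2 = 10000:  {bf_diffuse:.3f}")
```

The P-value does not depend on the prior and rejects the null at the 5% level, whereas the Bayes factor shifts toward the null as the prior on the extra parameter becomes more diffuse.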

In any case, Jeffreys-Lindley's paradox illustrates the relevance, for good or for bad, of the priors we choose for the model parameters (Huelsenbeck et al.).

A collection of BIC statistics contains the same information as a collection of pairwise Bayes factors.


However, when choosing among several models, the BIC statistics are easier to interpret by visual inspection, as they allow for the simultaneous comparison of multiple models, so the best-fit models can be immediately identified. On the other hand, selecting the best-fit model from a collection of multiple pairwise Bayes factors could be more burdensome, and such a procedure might suffer from some of the problems described above for the hLRTs.

Nevertheless, the BIC approximation might not be appropriate when the posterior mode occurs at the boundary of the parameter space (Hsiao; Ota et al.).
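A minimal sketch of how BIC scores allow several models to be compared at once, using the standard definition BIC = -2 ln L + K ln n. The log-likelihoods are hypothetical, K counts only substitution-model parameters (branch lengths are shared by all models and cancel in the differences), and taking the sample size n to be the alignment length is an assumption made here for illustration.

```python
import math

def bic(lnl: float, k: int, n: int) -> float:
    """Bayesian Information Criterion: -2 ln L + K ln n."""
    return -2.0 * lnl + k * math.log(n)

# name -> (hypothetical maximized log-likelihood, free substitution parameters)
CANDIDATES = {
    "JC69": (-2000.0, 0),
    "K80": (-1990.0, 1),
    "HKY85": (-1985.0, 4),
    "GTR": (-1983.0, 8),
}
N_SITES = 1000  # assumed sample size (alignment length)

scores = {name: bic(lnl, k, N_SITES) for name, (lnl, k) in CANDIDATES.items()}
ranked = sorted(scores, key=scores.get)   # smallest BIC first

for name in ranked:
    delta = scores[name] - scores[ranked[0]]
    print(f"{name:6s} BIC = {scores[name]:9.2f}  difference from best = {delta:6.2f}")
```

All models are ranked in a single pass, and the difference between two BIC scores approximates minus twice the log of the corresponding Bayes factor, so no sequence of pairwise tests is needed.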


Recently, Minin et al. proposed assessing models through a penalty or loss function, related to how dissimilar the branch length estimates are across models, and picking the model with the minimum posterior loss. As expected, simulations suggested that models selected with this criterion result in slightly more accurate branch length estimates than those obtained under models selected by the hLRTs.

Once we have selected a model, it is very important that we are able to assess how confident we are in that selection (see Chatfield). We would like to be able to rank the models and to know whether the model selected is much better than the other candidate models.

At the same time, we should be interested to learn whether we would select the same model if several other independent samples were available. The assessment of model selection uncertainty has a long tradition within the Bayesian community, and posterior probabilities can be naturally used to take account of model uncertainty (Kass and Raftery; Madigan and Raftery). Although computing posterior probabilities can be hard and time consuming, in theory we could approximate those probabilities with the BIC.
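One common way to carry out this approximation, assuming equal prior probabilities for all candidate models, is to rescale the BIC scores so that the weight of model i is proportional to exp(-ΔBIC_i / 2), where ΔBIC_i is the difference from the smallest BIC. The scores below are hypothetical and the function name is ours.

```python
import math

def bic_posterior_probs(bic_scores: dict) -> dict:
    """Approximate P(M_i | D) from BIC scores, assuming equal model priors."""
    best = min(bic_scores.values())
    weights = {m: math.exp(-(s - best) / 2.0) for m, s in bic_scores.items()}
    total = sum(weights.values())
    return {m: w / total for m, w in weights.items()}

# Hypothetical BIC scores for four candidate models.
BIC_SCORES = {"JC69": 4000.0, "K80": 3986.9, "HKY85": 3997.6, "GTR": 4021.3}

approx_post = bic_posterior_probs(BIC_SCORES)
for name in sorted(approx_post, key=approx_post.get, reverse=True):
    print(f"{name:6s} approximate posterior probability = {approx_post[name]:.4f}")
```

The resulting probabilities give a ranking of the models together with a sense of how much better the selected model is than the rest, which is the information a single "best" model does not convey.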

We also need to be careful when interpreting the relative importance of parameters.