Model selection

In this example, we demonstrate the generalized order-restricted information criterion (GORIC), which is a modification of the Akaike information criterion (AIC). The GORIC can be used to evaluate competing hypotheses based on their fit (i.e., likelihood) and complexity (i.e., (in)equality constraints).

Consider the order-constrained hypothesis $H_1: \mu_{\text{No}} \leq {\mu_{\text{Physical}} = \mu_{\text{Behavioral}}} \leq \mu_{\text{Both}}$. To test this informative hypothesis, we evaluated it against the competing unconstrained hypothesis. However, instead of using the unconstrained-hypothesis as competing hypothesis, it is also possible to specify other order-constrained hypotheses. The GORIC can be used to evaluate a set of informative hypotheses. Suppose, we want to evaluate the following set of informative hypotheses:

\begin{align*} H_0&: \mu_{\text{No}} = \mu_{\text{Physical}} = \mu_{\text{Behavioral}} = \mu_{\text{Both}} \\ H_1&: \mu_{\text{No}} \leq {\mu_{\text{Physical}} = \mu_{\text{Behavioral}}} \leq \mu_{\text{Both}} \\ H_2&: \mu_{\text{No}} \leq \mu_{\text{Physical}} \leq \mu_{\text{Behavioral}} \leq \mu_{\text{Both}} \\ H_{\text{u}}&: \mu_{\text{No}} ~,~ \mu_{\text{Physical}} ~,~ \mu_{\text{Behavioral}} ~,~ \mu_{\text{Both}}. \end{align*}

Note that it is recommended to also include the unconstrained hypothesis $H_\text{u}$ in the set, to avoid drawing wrong conclusions. The model with the lowest GORIC value is the preferred one. To improve the interpretation, we will also compute the GORIC weights, which are comparable to the Akaike weights and reflect the support for each model in the set.

Step 1. - load the restriktor library

library(restriktor)

Step 2. - import data

The AngerManagement dataset is available in restriktor and can be called directly, see next step.

Step 3. - fit unconstrained ANCOVA model

First, we fit the unconstrained model using the \texttt{lm()} function. The model is identical to the one discussed in the introduction:

fit.ANOVA <- lm(Anger ~ -1 + Group, data = AngerManagement)

Step 4. - set up the constraint syntax for each hypothesis

First, we construct the syntax for each hypothesis, except for the unconstrained hypothesis:

myConstraints1 <- ' GroupNo = GroupPhysical = GroupBehavioral = GroupBoth '

myConstraints2 <- ' GroupNo < GroupPhysical = GroupBehavioral < GroupBoth '

myConstraints3 <- ' GroupNo < GroupPhysical < GroupBehavioral < GroupBoth '

Step 5. - fit the restricted models

restr1.ANOVA <- restriktor(fit.ANOVA, constraints = myConstraints1)
restr2.ANOVA <- restriktor(fit.ANOVA, constraints = myConstraints2)
restr3.ANOVA <- restriktor(fit.ANOVA, constraints = myConstraints3)
restr4.ANOVA <- restriktor(fit.ANOVA)

The fourth model is the unrestricted model. It is recommended to include this model to avoid selecting the wrong model.

Step 6. - the goric function

Finally, we use the goric() function to compute the log-likelihood, penalty (complexity), GORIC value, and the GORIC weight for each model. The input are the four fitted restriktor objects:

goric(restr1.ANOVA, restr2.ANOVA, restr3.ANOVA, restr4.ANOVA)
          model     loglik  penalty      goric  goric.weights
1  restr1.ANOVA  -93.40117  2.00000  190.80234        0.00001
2  restr2.ANOVA  -84.16211  2.89183  174.10788        0.02598
3  restr3.ANOVA  -80.48390  3.08324  167.13427        0.84912
4  restr4.ANOVA  -80.48390  5.00000  170.96780        0.12489

Step 7. - interpret the results

The model with the lowest GORIC value is the preferred one. Note that the GORIC value for the unconstrained model renders to the AIC. The last column, shows the GORIC weights and reflect the support of each model in the set. If we want to compare model restr2_ANOVA with model restr3_ANOVA we can examine the ratio of the two corresponding GORIC weights: 0.849/0.026 = 32.654. This means that model restr3_ANOVA is about 32.654 times more likely than model restr2_ANOVA. In addition, model restr2_ANOVA is about 6.792 (0.849/0.125) more likely than the unconstrained model restr4_ANOVA. Hence, we can concluded that model restr3_ANOVA is the preferred one.