In this example we consider the Exam dataset, which is built into restriktor. We will use these data to show how order constraints can be
imposed on the standardized regression coefficients of a linear model. The model
relates students' 'exam scores' (Scores, with a range of 38 to 82) to the 'averaged
point score' (APS, with a range of 18 to 28), the amount of 'study hours' (Hours,
with a range of 25 to 61), and 'anxiety score' (Anxiety, with a range of 13 to 91).
The assumption is that 'APS' is the strongest predictor, followed by 'study hours'
and 'anxiety scores', respectively. In symbols this informative hypothesis can be
formulated as:
$$ H_1: \beta_{APS} \geq \beta_{Hours} \geq \beta_{Anxiety}, $$
where $\beta$ denotes the regression coefficient. Since the hypothesis concerns which predictor is the strongest, we should be aware that the predictor variables are measured on different scales. Comparing the unstandardized coefficients might therefore lead to spurious conclusions, so the predictor variables should be standardized before entering the analysis. In what follows, we describe all steps needed to test $H_1$.
library(restriktor)
The Exam dataset ships with restriktor and can be accessed directly once the package is loaded; see the next step.
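To get a first impression of the data, the usual base R inspection functions can be used (output omitted here):
# inspect the first rows and the structure of the Exam data
head(Exam)
str(Exam)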
To obtain standardized regression coefficients, we need to standardize the predictor variables. This can be done in R by typing:
Exam$Hours_Z <- (Exam$Hours - mean(Exam$Hours)) / sd(Exam$Hours)
Exam$Anxiety_Z <- (Exam$Anxiety - mean(Exam$Anxiety)) / sd(Exam$Anxiety)
Exam$APS_Z <- (Exam$APS - mean(Exam$APS)) / sd(Exam$APS)
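Equivalently, the standardization can be done with the base R scale() function; the as.numeric() call below merely drops the matrix attributes that scale() returns:
# equivalent standardization using scale() (center and divide by the standard deviation)
Exam$APS_Z     <- as.numeric(scale(Exam$APS))
Exam$Hours_Z   <- as.numeric(scale(Exam$Hours))
Exam$Anxiety_Z <- as.numeric(scale(Exam$Anxiety))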
The linear model is estimated using the newly created standardized predictor variables:
fit.exam <- lm(Scores ~ APS_Z + Hours_Z + Anxiety_Z, data = Exam)
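Before imposing the order constraints, the unconstrained (ordinary least squares) estimates can be inspected in the usual way (output omitted here):
# unconstrained estimates, for comparison with the restricted estimates below
summary(fit.exam)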
The constraint syntax can be constructed by using variable names. To get the correct names, type
names(coef(fit.exam))
[1] "(Intercept)" "APS_Z" "Hours_Z" "Anxiety_Z"
Based on these variable names, we can construct the constraint syntax:
myConstraints <- ' APS_Z > Hours_Z > Anxiety_Z '
#note that the constraint syntax is enclosed within single quotes.
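If preferred, the chained inequality can also be written as two separate constraints, one per line within the quoted string, which should yield the same constraint matrix under restriktor's multi-line constraint syntax:
myConstraints <- ' APS_Z   > Hours_Z
                   Hours_Z > Anxiety_Z '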
The restricted estimates are computed using the restriktor() function. The first argument to restriktor() is the fitted linear model and the second argument is the constraint syntax.
restr.exam <- restriktor(fit.exam, constraints = myConstraints)
summary(restr.exam)
Call:
conLM.lm(object = fit.exam, constraints = myConstraints)
Restriktor: restricted linear model:
Residuals:
Min 1Q Median 3Q Max
-8.49329 -2.50810 -0.38974 3.13947 6.65373
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 61.00000 0.99898 61.0624 < 2.2e-16 ***
APS_Z 6.37488 1.50386 4.2390 0.0006254 ***
Hours_Z 5.00124 1.55043 3.2257 0.0052842 **
Anxiety_Z 1.95754 1.08967 1.7964 0.0913268 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.4676 on 16 degrees of freedom
Standard errors: standard
Multiple R-squared remains 0.860
Generalized Order-Restricted Information Criterion:
Loglik Penalty Goric
-56.0842 3.7181 119.6046
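Note that the reported GORIC value follows directly from the log-likelihood and the penalty, using the definition GORIC = -2(Loglik - Penalty):
$$ \text{GORIC} = -2\,(-56.0842 - 3.7181) = 119.6046. $$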
## By default, the iht() function prints a summary of all available
## hypothesis tests. A more detailed overview of a separate hypothesis
## test can be requested via the test argument, e.g.,
## iht(restr.exam, test = "A").
iht(fit.exam, constraints = myConstraints)
Restriktor: restricted hypothesis tests ( 16 residual degrees of freedom ):
Multiple R-squared remains 0.860
Constraint matrix:
(Intercept) APS_Z Hours_Z Anxiety_Z op rhs active
1: 0 1 -1 0 >= 0 no
2: 0 0 1 -1 >= 0 no
Overview of all available hypothesis tests:
Global test: H0: all parameters are restricted to be equal (==)
vs. HA: at least one inequality restriction is strictly true (>)
Test statistic: 98.4338, p-value: <0.0001
Type A test: H0: all restrictions are equalities (==)
vs. HA: at least one inequality restriction is strictly true (>)
Test statistic: 12.3847, p-value: 0.002534
Type B test: H0: all restrictions hold in the population
vs. HA: at least one restriction is violated
Test statistic: 0.0000, p-value: 1
Type C test: H0: at least one restriction is false or active (==)
vs. HA: all restrictions are strictly true (>)
Test statistic: 0.4862, p-value: 0.3167
Note: Type C test is based on a t-distribution (one-sided),
all other tests are based on a mixture of F-distributions.
The results provide strong evidence in favor of the informative hypothesis. The null hypothesis of the Type B test (all restrictions hold in the population) is not rejected in favor of the unconstrained (i.e., best fitting) hypothesis ($\bar{\text{F}}^{\text{B}}_{(0,1,2;\, 16)}$ = 0, p = 1). A test-statistic value of zero means that all constraints are in line with the data. In addition, the results of the Type A test show that its null hypothesis (all restrictions are equalities) is rejected in favor of the order-constrained hypothesis ($\bar{\text{F}}^{\text{A}}_{(0,1,2;\, 16)}$ = 12.38, p = .003).