In this example we consider the Exam dataset, which is built into restriktor. We will use these data to show how order constraints can be
imposed on the standardized regression coefficients of a linear model. The model
relates students' 'exam scores' (Scores, with a range of 38 to 82) to the 'averaged
point score' (APS, with a range of 18 to 28), the amount of 'study hours' (Hours,
with a range of 25 to 61), and 'anxiety score' (Anxiety, with a range of 13 to 91).
The assumption is that 'APS' is the strongest predictor, followed by 'study hours'
and 'anxiety scores', respectively. In symbols this informative hypothesis can be
formulated as:
$$ H_1: \beta_{APS} \geq \beta_{Hours} \geq \beta_{Anxiety}, $$
where $\beta$ denotes the regression coefficient. Since the hypothesis concerns which predictor is the strongest, we should be aware that the predictor variables are measured on different scales. Comparing the unstandardized coefficients might therefore lead to spurious conclusions, so the predictor variables should be standardized before entering the analysis. In what follows, we describe all steps needed to test $H_1$.
library(restriktor)
The Exam dataset ships with restriktor and can be accessed directly once the package is loaded; see the next step.
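To get a first impression of the data, the usual base R inspection functions can be used (output omitted here):
# inspect the first rows and the structure of the Exam data
head(Exam)
str(Exam)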
To obtain standardized regression coefficients, we need to standardize the predictor variables. This can be done in R by typing:
Exam$Hours_Z <- (Exam$Hours - mean(Exam$Hours)) / sd(Exam$Hours)
Exam$Anxiety_Z <- (Exam$Anxiety - mean(Exam$Anxiety)) / sd(Exam$Anxiety)
Exam$APS_Z <- (Exam$APS - mean(Exam$APS)) / sd(Exam$APS)
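Equivalently, the standardization can be done with the base R scale() function; the as.numeric() call below merely drops the matrix attributes that scale() returns:
# equivalent standardization using scale() (center and divide by the standard deviation)
Exam$APS_Z     <- as.numeric(scale(Exam$APS))
Exam$Hours_Z   <- as.numeric(scale(Exam$Hours))
Exam$Anxiety_Z <- as.numeric(scale(Exam$Anxiety))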
The linear model is estimated using the newly created standardized predictor variables:
fit.exam <- lm(Scores ~ APS_Z + Hours_Z + Anxiety_Z, data = Exam)
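Before imposing the order constraints, the unconstrained (ordinary least squares) estimates can be inspected in the usual way (output omitted here):
# unconstrained estimates, for comparison with the restricted estimates below
summary(fit.exam)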
The constraint syntax can be constructed by using variable names. To get the correct names, type
names(coef(fit.exam))
[1] "(Intercept)" "APS_Z" "Hours_Z" "Anxiety_Z"
Based on these variable names, we can construct the constraint syntax:
myConstraints <- ' APS_Z > Hours_Z > Anxiety_Z '
#note that the constraint syntax is enclosed within single quotes.
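If preferred, the chained inequality can also be written as two separate constraints, one per line within the quoted string, which should yield the same constraint matrix under restriktor's multi-line constraint syntax:
myConstraints <- ' APS_Z   > Hours_Z
                   Hours_Z > Anxiety_Z '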
The restricted estimates are computed using the restriktor() function. The first argument to restriktor() is the fitted linear model and the second argument is the constraint syntax.
restr.exam <- restriktor(fit.exam, constraints = myConstraints)
summary(restr.exam)
Call:
conLM.lm(object = fit.exam, constraints = myConstraints)
Restriktor: restricted linear model:
Residuals:
Min 1Q Median 3Q Max
-8.49329 -2.50810 -0.38974 3.13947 6.65373
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 61.00000 0.99898 61.0624 < 2.2e-16 ***
APS_Z 6.37488 1.50386 4.2390 0.0006254 ***
Hours_Z 5.00124 1.55043 3.2257 0.0052842 **
Anxiety_Z 1.95754 1.08967 1.7964 0.0913268 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 4.4676 on 16 degrees of freedom
Standard errors: standard
Multiple R-squared remains 0.860
Generalized Order-Restricted Information Criterion:
Loglik Penalty Goric
-56.0842 3.7181 119.6046
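Note that the reported GORIC value follows directly from the log-likelihood and the penalty, using the definition GORIC = -2(Loglik - Penalty):
$$ \text{GORIC} = -2\,(-56.0842 - 3.7181) = 119.6046. $$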
## By default, the iht() function prints a summary of all available
## hypothesis tests. A more detailed overview of a separate hypothesis
## test can be requested via the test argument, e.g.,
## iht(restr.exam, test = "A").
iht(fit.exam, constraints = myConstraints)
Restriktor: restricted hypothesis tests ( 16 residual degrees of freedom ):
Multiple R-squared remains 0.860
Constraint matrix:
(Intercept) APS_Z Hours_Z Anxiety_Z op rhs active
1: 0 1 -1 0 >= 0 no
2: 0 0 1 -1 >= 0 no
Overview of all available hypothesis tests:
Global test: H0: all parameters are restricted to be equal (==)
vs. HA: at least one inequality restriction is strictly true (>)
Test statistic: 98.4338, p-value: <0.0001
Type A test: H0: all restrictions are equalities (==)
vs. HA: at least one inequality restriction is strictly true (>)
Test statistic: 12.3847, p-value: 0.002534
Type B test: H0: all restrictions hold in the population
vs. HA: at least one restriction is violated
Test statistic: 0.0000, p-value: 1
Type C test: H0: at least one restriction is false or active (==)
vs. HA: all restrictions are strictly true (>)
Test statistic: 0.4862, p-value: 0.3167
Note: Type C test is based on a t-distribution (one-sided),
all other tests are based on a mixture of F-distributions.
The results provide strong evidence in favor of the informative hypothesis. The null hypothesis of the Type B test (all restrictions hold in the population) is not rejected in favor of the unconstrained (i.e., best fitting) hypothesis ($\bar{\text{F}}^{\text{B}}_{(0,1,2;\, 16)}$ = 0, p = 1). A test-statistic value of zero means that all constraints are in line with the data. In addition, the results of the Type A test show that its null hypothesis (all restrictions are equalities) is rejected in favor of the order-constrained hypothesis ($\bar{\text{F}}^{\text{A}}_{(0,1,2;\, 16)}$ = 12.38, p = .003).