Testing for effects in the linear model

In this example we consider the burns dataset, which is built-in restriktor. We will show how order constraints can be imposed between newly defined parameters. The data consist of mothers of 278 children. Boys represent 68.7% of the sample. The response variable is parental post-traumatic stress symptoms (PTSS). In addition, five predictor variables are included: a child’s gender (0 = boys, 1 = girls) and age, the estimated percentage total body surface area affected by second or third degree burns (i.e., TBSA, with a range of 1-72%) and parental guilt [0-4] and anger [0-4] feelings in relation to the burn event. The model relates PTSS to the five predictor variables and can be written as a linear function:

\begin{align*} \text{PTSS} \sim \beta_0 + \beta_1\text{gender} + \beta_2\text{age} + \beta_3\text{guilt} + \beta_4\text{anger} + \\ \beta_6\text{gender} \times \text{guilt} + \beta_7\text{gender} \times \text{anger} + \beta_8\text{gender} \times \text{TBSA} \end{align*}

where $\beta_0$ is the intercept, $\beta_1$ to $\beta_5$ are the regression coefficients for the main-effects and $\beta_6$ to $\beta_8$ are the regression coefficients for the interaction-effects.

We hypothesized that the gender-effect would increase for simultaneously higher levels of guilt, anger and TBSA. To test this informative hypothesis, we selected three different settings for guilt, anger and TBSA, namely a small, a medium and a large level. For illustrative reasons, we chose for the small level the values 0, 0, 1 for guilt, anger and TBSA respectively. For the medium level we chose their mean values which are 2.02, 2.06, and 8.35, respectively, and for the large level we chose 4, 4, and 20, respectively. Then, the resulting three effects (small, medium, large) can be calculated as follows respectively:

\begin{align*} \text{smallEffect} &= \beta_1 + 0\beta_6 + 0\beta_7 + 1\beta_8 \\ \text{mediumEffect} &= \beta_1 + 2.02\beta_6 + 2.06\beta_7 + 8.35\beta_8 \\ \text{largeEffect} &= \beta_1 + 4\beta_6 + 4\beta_7 + 20\beta_8. \end{align*}

Note that each effect reflects a mean difference between boys and girls. Then, the informative hypothesis can be expressed as:

$$ H: \text{smallEffect} \leq \text{mediumEffect} \leq \text{largeEffect} $$

Step 1. - load the restriktor library

library(restriktor)

Step 2. - import data

The burnsData dataset is build-in restriktor and can be called directly, see next step.

Step 4. - fit the unconstrained linear model

Based on outlier diagnostics we identified 13 outliers. The outliers were identified with robust Mahalanobis distances larger than the 99.5% quantile of a $\chi^2_8$. Therefore, we will use robust methods. The unconstrained robust linear model using MM-estimation can be fitted as follows:

fit.burns <- rlm(PTSS ~ gender*guilt + gender*anger + gender*TBSA + age,
                 data = Burns, method = "MM")

On the right-hand side of the regression operator (∼) we included the three interaction-effect using the * operator. The main-effects are in this way automatically included. Note that the interaction operator * is not an arithmetic operator.

Step 5. - set up the constraint syntax

A convenient feature of the restriktor constraint syntax is the option to define new parameters, which take on values that are an arbitrary function of the original model parameters. This can be done using the := operator. In this way, we can compute the desired effects and impose order constraints among these effects. Then, the constraint syntax might look as follows:

myConstraints <-  ' smallEffect  := gender + 0*gender.guilt +
                                             1*gender.TBSA  +
                                             0*gender.anger

                    mediumEffect := gender + 2.02*gender.guilt +
                                             8.35*gender.TBSA  +
                                             2.06*gender.anger

                    largeEffect  := gender + 4*gender.guilt +
                                            20*gender.TBSA  +
                                             4*gender.anger

                    smallEffect < mediumEffect < largeEffect '

#note that the constraint syntax is enclosed within single quotes.

Step 6. - fit the order-constrained linear model

Next, we fit the restricted robust linear model using the restriktor() function. The first argument to restriktor() is the fitted robust linear model and the second argument is the constraint syntax.

restr.burns <- restriktor(fit.burns, constraints = myConstraints)

Step 7. (optional) - summary of the constrained estimates

summary(restr.burns)

Call:
conRLM.rlm(object = fit.burns, constraints = myConstraints)

Restriktor: restricted robust linear model:

Residuals:
      Min        1Q    Median        3Q       Max 
-30.94695  -4.53513   0.24005   4.77227  21.22692 

Coefficients:
              Estimate Std. Error t value  Pr(>|t|)    
(Intercept)  22.085726   2.180040 10.1309 < 2.2e-16 ***
gender       -4.954861   3.594092 -1.3786 0.1691595    
guilt         3.677956   0.703553  5.2277  3.45e-07 ***
anger         1.822980   0.673018  2.7087 0.0071888 ** 
TBSA         -0.075327   0.062259 -1.2099 0.2273815    
age           0.149357   0.083494  1.7888 0.0747672 .  
gender:guilt -1.280034   1.265057 -1.0118 0.3125244    
gender:anger  4.570652   1.300216  3.5153 0.0005153 ***
gender:TBSA  -0.049427   0.117998 -0.4189 0.6756417    
smallEffect  -5.004288   3.560883 -1.4054 0.1610710    
mediumEffect  1.462300   1.015066  1.4406 0.1508619    
largeEffect   7.219077   3.689197  1.9568 0.0514035 .  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 7.7534 on 269 degrees of freedom
Standard errors: standard 
Multiple R-squared remains 0.218 

Generalized Order-Restricted Information Criterion:
   Loglik   Penalty     Goric 
-968.6979    9.4524 1956.3006 

Step 8. - test the informative hypothesis

## by default the iht() function prints a summary of all available
## hypothesis tests. A more detailed overview of the separate hypothesis
## tests can be requested by specifying the test = "" argument, e.g.,
## iht(restr.exam, test = "A").
iht(restr.burns)

Restriktor: restricted hypothesis tests ( 269 residual degrees of freedom ):


Multiple R-squared remains 0.218 

Constraint matrix:
   (Intercept) gender guilt anger TBSA age gender:guilt gender:anger
1:           0      0     0     0    0   0         2.02         2.06
2:           0      0     0     0    0   0         1.98         1.94
   gender:TBSA    op rhs active
1:        7.35    >=   0     no
2:       11.65    >=   0     no


Overview of all available hypothesis tests:

Global test: H0: all parameters are restricted to be equal (==)
         vs. HA: at least one inequality restriction is strictly true (>)
       Test statistic: 83.5894,   p-value: <0.0001

Type A test: H0: all restrictions are equalities (==) 
         vs. HA: at least one inequality restriction is strictly true (>)
       Test statistic: 5.3064,   p-value: 0.04542

Type B test: H0: all restrictions hold in the population
         vs. HA: at least one restriction is violated
       Test statistic: 0.0000,   p-value: 1

Type C test: H0: at least one restriction is false or active (==) 
         vs. HA: all restrictions are strictly true (>)
       Test statistic: 1.6422,   p-value: 0.05086

Note: Type C test is based on a t-distribution (one-sided), 
      all other tests are based on a mixture of F-distributions.

Step 9. - interpret the results

The results from hypothesis test Type B show that the informative hypothesis is not rejected in favor of the unconstrained hypothesis ($\bar{\text{F}}^{\text{B}}_{\text{MM}(0,1,2; 269)}$ = 0, p = 1). The results from hypothesis test Type A show that the null-hypothesis is rejected in favor of the order-constrained hypothesis ($\bar{\text{F}}^{\text{A}}_{\text{MM}(0,1,2; 269)}$ = 5.35, p = .044). This means that the inequality constraints are in line with the data. Hence, we can conclude that the data provide evidence that the gender-effect increases for higher levels of guilt, anger and TBSA.

Noteworthy, the non-robust results from hypothesis test Type A would have let to a different conclusion, namely that the null-hypothesis would not have been rejected in favor of the order-constrained hypothesis ($\bar{\text{F}}^{\text{A}}_{(0,1,2; 269)}$ = 3.76, p = .100). This clearly demonstrates that ignoring outliers may result in misleading conclusions.