In this example we consider the burns
dataset, which is built-in
restriktor. We will show how order constraints can be imposed between newly defined
parameters. The data consist of mothers of 278 children. Boys represent 68.7% of
the sample. The response variable is parental post-traumatic stress symptoms (PTSS).
In addition, five predictor variables are included: a child’s gender (0 = boys, 1 = girls)
and age, the estimated percentage total body surface area affected by second or
third degree burns (i.e., TBSA, with a range of 1-72%) and parental guilt [0-4]
and anger [0-4] feelings in relation to the burn event. The model relates PTSS to
the five predictor variables and can be written as a linear function:
\begin{align*} \text{PTSS} \sim \beta_0 + \beta_1\text{gender} + \beta_2\text{age} + \beta_3\text{guilt} + \beta_4\text{anger} + \\ \beta_6\text{gender} \times \text{guilt} + \beta_7\text{gender} \times \text{anger} + \beta_8\text{gender} \times \text{TBSA} \end{align*}
where $\beta_0$ is the intercept, $\beta_1$ to $\beta_5$ are the regression coefficients for the main-effects and $\beta_6$ to $\beta_8$ are the regression coefficients for the interaction-effects.
We hypothesized that the gender-effect would increase for simultaneously higher levels of guilt, anger and TBSA. To test this informative hypothesis, we selected three different settings for guilt, anger and TBSA, namely a small, a medium and a large level. For illustrative reasons, we chose for the small level the values 0, 0, 1 for guilt, anger and TBSA respectively. For the medium level we chose their mean values which are 2.02, 2.06, and 8.35, respectively, and for the large level we chose 4, 4, and 20, respectively. Then, the resulting three effects (small, medium, large) can be calculated as follows respectively:
\begin{align*} \text{smallEffect} &= \beta_1 + 0\beta_6 + 0\beta_7 + 1\beta_8 \\ \text{mediumEffect} &= \beta_1 + 2.02\beta_6 + 2.06\beta_7 + 8.35\beta_8 \\ \text{largeEffect} &= \beta_1 + 4\beta_6 + 4\beta_7 + 20\beta_8. \end{align*}
Note that each effect reflects a mean difference between boys and girls. Then, the informative hypothesis can be expressed as:
$$ H: \text{smallEffect} \leq \text{mediumEffect} \leq \text{largeEffect} $$
library(restriktor)
The burnsData
dataset is build-in restriktor and can be
called directly, see next step.
Based on outlier diagnostics we identified 13 outliers. The outliers were identified with robust Mahalanobis distances larger than the 99.5% quantile of a $\chi^2_8$. Therefore, we will use robust methods. The unconstrained robust linear model using MM-estimation can be fitted as follows:
fit.burns <- rlm(PTSS ~ gender*guilt + gender*anger + gender*TBSA + age,
data = Burns, method = "MM")
On the right-hand side of the regression operator (∼) we included the three interaction-effect using the * operator. The main-effects are in this way automatically included. Note that the interaction operator * is not an arithmetic operator.
A convenient feature of the restriktor constraint syntax is the option to define new
parameters, which take on values that are an arbitrary function of the original model
parameters. This can be done using the :=
operator. In this way, we can
compute the desired effects and impose order constraints among these effects. Then, the
constraint syntax might look as follows:
myConstraints <- ' smallEffect := gender + 0*gender.guilt +
1*gender.TBSA +
0*gender.anger
mediumEffect := gender + 2.02*gender.guilt +
8.35*gender.TBSA +
2.06*gender.anger
largeEffect := gender + 4*gender.guilt +
20*gender.TBSA +
4*gender.anger
smallEffect < mediumEffect < largeEffect '
#note that the constraint syntax is enclosed within single quotes.
Next, we fit the restricted robust linear model using the restriktor() function. The first argument to restriktor() is the fitted robust linear model and the second argument is the constraint syntax.
restr.burns <- restriktor(fit.burns, constraints = myConstraints)
summary(restr.burns)
Call:
conRLM.rlm(object = fit.burns, constraints = myConstraints)
Restriktor: restricted robust linear model:
Residuals:
Min 1Q Median 3Q Max
-30.94695 -4.53513 0.24005 4.77227 21.22692
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 22.085726 2.180040 10.1309 < 2.2e-16 ***
gender -4.954861 3.594092 -1.3786 0.1691595
guilt 3.677956 0.703553 5.2277 3.45e-07 ***
anger 1.822980 0.673018 2.7087 0.0071888 **
TBSA -0.075327 0.062259 -1.2099 0.2273815
age 0.149357 0.083494 1.7888 0.0747672 .
gender:guilt -1.280034 1.265057 -1.0118 0.3125244
gender:anger 4.570652 1.300216 3.5153 0.0005153 ***
gender:TBSA -0.049427 0.117998 -0.4189 0.6756417
smallEffect -5.004288 3.560883 -1.4054 0.1610710
mediumEffect 1.462300 1.015066 1.4406 0.1508619
largeEffect 7.219077 3.689197 1.9568 0.0514035 .
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 7.7534 on 269 degrees of freedom
Standard errors: standard
Multiple R-squared remains 0.218
Generalized Order-Restricted Information Criterion:
Loglik Penalty Goric
-968.6979 9.4524 1956.3006
## by default the iht() function prints a summary of all available
## hypothesis tests. A more detailed overview of the separate hypothesis
## tests can be requested by specifying the test = "" argument, e.g.,
## iht(restr.exam, test = "A").
iht(restr.burns)
Restriktor: restricted hypothesis tests ( 269 residual degrees of freedom ):
Multiple R-squared remains 0.218
Constraint matrix:
(Intercept) gender guilt anger TBSA age gender:guilt gender:anger
1: 0 0 0 0 0 0 2.02 2.06
2: 0 0 0 0 0 0 1.98 1.94
gender:TBSA op rhs active
1: 7.35 >= 0 no
2: 11.65 >= 0 no
Overview of all available hypothesis tests:
Global test: H0: all parameters are restricted to be equal (==)
vs. HA: at least one inequality restriction is strictly true (>)
Test statistic: 83.5894, p-value: <0.0001
Type A test: H0: all restrictions are equalities (==)
vs. HA: at least one inequality restriction is strictly true (>)
Test statistic: 5.3064, p-value: 0.04542
Type B test: H0: all restrictions hold in the population
vs. HA: at least one restriction is violated
Test statistic: 0.0000, p-value: 1
Type C test: H0: at least one restriction is false or active (==)
vs. HA: all restrictions are strictly true (>)
Test statistic: 1.6422, p-value: 0.05086
Note: Type C test is based on a t-distribution (one-sided),
all other tests are based on a mixture of F-distributions.
The results from hypothesis test Type B show that the informative hypothesis is not rejected in favor of the unconstrained hypothesis ($\bar{\text{F}}^{\text{B}}_{\text{MM}(0,1,2; 269)}$ = 0, p = 1). The results from hypothesis test Type A show that the null-hypothesis is rejected in favor of the order-constrained hypothesis ($\bar{\text{F}}^{\text{A}}_{\text{MM}(0,1,2; 269)}$ = 5.35, p = .044). This means that the inequality constraints are in line with the data. Hence, we can conclude that the data provide evidence that the gender-effect increases for higher levels of guilt, anger and TBSA.
Noteworthy, the non-robust results from hypothesis test Type A would have let to a different conclusion, namely that the null-hypothesis would not have been rejected in favor of the order-constrained hypothesis ($\bar{\text{F}}^{\text{A}}_{(0,1,2; 269)}$ = 3.76, p = .100). This clearly demonstrates that ignoring outliers may result in misleading conclusions.