HLM2 provides the data analyst with a means of checking the fit and distributional assumptions of the model by producing residual files for the level-1 and level-2 models. These files may be requested using the Basic Model Specifications dialog box. The level-1 and level-2 residual files will be written as SPSS, SAS, STATA, SYSTAT, or ASCII data files. In the case of SPSS and STATA, the residual files will be written out so that the respective packages may use them immediately. For the other formats, the files must be submitted to the respective package as command streams.

The level-1 residual file

The level-1 residual file will contain the level-1 residuals (the differences between the observed and fitted values), the fitted values, the square root of $\sigma^2$ (the level-1 variance), the values of the level-1 and level-2 predictors entered in the model, and those of other level-1 and level-2 variables selected by the user.

The level-2 residual file

This file will contain the EB residuals (see Equation 1.10 above), the OL residuals (see Equation 1.9 above), and the fitted values for each level-1 coefficient. By adding the OL residuals to the corresponding fitted values, the analyst can also obtain the OL estimate of the corresponding level-1 coefficient. The file also contains the EB estimate of each level-1 coefficient.

In addition, the file will contain Mahalanobis distances (which are discussed below), estimates of the total and residual standard deviations (log metric) within each unit, the values of the predictors used in the level-2 model, and any other level-2 variables selected by the user.

The residual file contains a single record per unit. The first variable in this file is the level-2 unit ID (here named l2ID), followed by the number of level-1 units within that level-2 unit (denoted by nj) and various summary statistics (chipct through mdrsvar, explained below). These are followed by the two EB residuals (ebintrcp and ebses); the two OL residuals (olintrcp and olses); and the fitted values, that is, the predicted values based on the estimated level-2 model (fvintrcp and fvses). Next are the EB coefficients (ecintrcp and ecses), which are the sum of the fitted values and the EB residuals. The posterior variances and covariances of the level-2 residuals are given next (pv00 for the posterior variance of the intercept residual, pv10 for the posterior covariance between the intercept residual and the slope residual, and pv11 for the posterior variance of the slope residual). Next are the corresponding posterior variances and covariances of the random intercept and coefficient (pvc00 for the posterior variance of the random intercept, pvc10 for the posterior covariance between the random intercept and the random slope, and pvc11 for the posterior variance of the random slope). Finally, the level-2 predictors used in the analysis, plus any additional level-2 predictors requested by the user for inclusion in the file, are given.
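
The relationships among the residual, fitted-value, and coefficient variables described above can be checked directly once a record has been read in. The following is a minimal sketch only: the numeric values are invented for illustration, the function name and the `ol_est_` prefix are hypothetical, and only the variable names (ebintrcp, olintrcp, fvintrcp, and so on) come from the description above.

```python
# Hypothetical record from a level-2 residual file. The numbers are made up
# for illustration; only the variable names follow the file layout above.
record = {
    "l2ID": 1, "nj": 47,
    "ebintrcp": -0.35, "ebses": 0.12,   # EB residuals
    "olintrcp": -0.90, "olses": 0.45,   # OL residuals
    "fvintrcp": 12.10, "fvses": 2.20,   # fitted values from the level-2 model
}

def coefficient_estimates(rec, suffixes=("intrcp", "ses")):
    """Rebuild EB and OL coefficient estimates from one residual-file record."""
    out = {}
    for s in suffixes:
        fitted = rec["fv" + s]
        # EB coefficient = fitted value + EB residual (ecintrcp, ecses)
        out["ec" + s] = fitted + rec["eb" + s]
        # OL estimate = fitted value + OL residual ("ol_est_" is a made-up name)
        out["ol_est_" + s] = fitted + rec["ol" + s]
    return out

est = coefficient_estimates(record)
```

If the file already contains ecintrcp and ecses, the rebuilt values can be compared against them as a quick consistency check on how the file was read.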

While most of this is straightforward, the information contained in the first set of variables for each unit merits elaboration. nj is the number of cases for level-2 unit $j$. It is followed by two variables, chipct and mdist. If we model level-1 coefficients, mdist is the Mahalanobis distance (i.e., the standardized squared distance of a unit from the center of a $v$-dimensional distribution, where $v$ is the number of random effects per unit). Essentially, mdist provides a single summary measure of the distance of a unit's EB estimates from its "fitted value." Note that the units in the residual file are sorted in ascending order by mdist.
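
To make the idea of a standardized squared distance concrete, the sketch below computes a generic squared Mahalanobis distance for the two-random-effect case (intercept and slope). It is an illustration under stated assumptions, not HLM2's own computation: the text above does not say which covariance matrix HLM2 standardizes by, so the `cov` argument here is a hypothetical input, and the function name is invented.

```python
def mahalanobis_sq(resid, cov):
    """Squared Mahalanobis distance resid' * inv(cov) * resid for v = 2.

    resid: (u0, u1), a unit's deviations from its fitted values.
    cov:   2x2 covariance matrix as nested pairs (hypothetical input; the
           matrix HLM2 actually uses is not specified in the text).
    """
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    u0, u1 = resid
    # The inverse of a 2x2 matrix is (1/det) * [[d, -b], [-c, a]].
    return (d * u0 * u0 - (b + c) * u0 * u1 + a * u1 * u1) / det

# Uncorrelated case: the distance is the sum of squared standardized residuals.
d_uncorrelated = mahalanobis_sq((2.0, 1.0), ((4.0, 0.0), (0.0, 1.0)))
# Correlated case.
d_correlated = mahalanobis_sq((1.0, 1.0), ((2.0, 1.0), (1.0, 2.0)))
```

With a diagonal covariance matrix the result reduces to the sum of squared z-scores, which is why the distance can be read as a single standardized summary across all random effects.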

If the normality assumption is true, then the Mahalanobis distances should be distributed approximately $\chi^2_{(v)}$. Analogous to univariate normal probability plotting, we can construct a Q-Q plot of mdist vs. chipct. The chipct values are the expected values of the order statistics for a sample of size $J$ (the number of level-2 units) selected from a population that is distributed $\chi^2_{(v)}$. If the Q-Q plot resembles a 45-degree line, we have evidence that the random effects are distributed $v$-variate normal. In addition, the plot will help us detect outlying units (i.e., units with large mdist values well above the 45-degree line). It should be noted that such plots are good diagnostic tools only when the level-1 sample sizes, nj, are at least moderately large. (For further discussion see Hierarchical Linear Models, pp. 274-280.)
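
The Q-Q comparison described above can be sketched without any statistical package when there are two random effects, because the chi-square distribution with 2 degrees of freedom has a closed-form quantile function. This is a rough illustration only: the plotting positions (i - 0.5)/J are a common approximation to expected order statistics and are an assumption here, not necessarily how HLM2 computes chipct, and all function names are hypothetical.

```python
import math

def chi2_quantile_v2(p):
    """Quantile of the chi-square distribution with 2 df.

    For v = 2 the CDF is F(x) = 1 - exp(-x / 2), so the inverse is closed form.
    """
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return -2.0 * math.log(1.0 - p)

def expected_quantiles(J):
    """Approximate expected order statistics for a sample of J units.

    Uses plotting positions (i - 0.5) / J, an assumed approximation.
    """
    return [chi2_quantile_v2((i - 0.5) / J) for i in range(1, J + 1)]

def qq_pairs(mdist):
    """Pair each sorted distance with its approximate chi-square(2) quantile."""
    ordered = sorted(mdist)
    return list(zip(expected_quantiles(len(ordered)), ordered))

# Made-up distances for four units, purely for illustration.
pairs = qq_pairs([3.1, 0.4, 1.2, 0.9])
```

Each pair is an (expected, observed) point; plotting them and comparing against the 45-degree line gives the diagnostic described above. For other values of v, a chi-square quantile routine from a statistics library would be needed in place of the closed form.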

After mdist, three estimates of the level-1 variability are given:
- The natural logarithm of the total standard deviation within each unit, lntotvar.
- The natural logarithm of the residual standard deviation within each unit based on its least squares regression, olsrsvar. Note that this estimate exists only for those units that have sufficient data to compute level-1 OLS estimates.
- The natural logarithm of the residual standard deviation from the final fitted fixed effects model, mdrsvar.

The natural log of each of these three standard deviations (with the addition of a bias-correction factor for varying degrees of freedom) is reported (see Hierarchical Linear Models, p. 219). We note that these statistics can be used as input for the V-known option in HLM2 in research on group-level correlates of diversity (Raudenbush & Bryk, 1987).
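
As a rough illustration of a bias-corrected log standard deviation, the sketch below assumes the correction takes the form ln(S) + 1/(2f), with approximate sampling variance 1/(2f), where f is the degrees of freedom behind the standard deviation S. That form is an assumption recalled from the dispersion literature cited above; the text here does not state the exact factor HLM2 applies, and the function name is hypothetical.

```python
import math

def log_sd_with_correction(sd, df):
    """Return (d, v): bias-corrected log SD and its approximate variance.

    Assumed form (not confirmed by the text above):
        d = ln(sd) + 1 / (2 * df)
        v = 1 / (2 * df)
    df is the degrees of freedom on which sd is based, for example nj - 1
    for a total within-unit standard deviation (also an assumption).
    """
    if sd <= 0:
        raise ValueError("sd must be positive")
    if df <= 0:
        raise ValueError("df must be positive")
    correction = 1.0 / (2.0 * df)
    return math.log(sd) + correction, correction

# Example: a unit with standard deviation 1.0 based on 10 degrees of freedom.
d, v = log_sd_with_correction(1.0, 10)
```

A statistic paired with a known sampling variance in this way is the kind of input a V-known analysis expects, which is why units with more degrees of freedom carry more weight there.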

An example of an SPSS version of a level-2 residual file is shown below. Only the data from the first ten units and the first eight variables are reproduced here. This file can be used to construct various diagnostic plots.
