The McKinney Homeless Research Project study (Hough et al., 1997; Hurlburt et al., 1996) was designed to evaluate the effectiveness of using Section 8 certificates to provide independent housing to the severely mentally ill homeless. These housing certificates, which require clients to pay 30% of their income toward rent, are meant to enable low-income subjects to choose and obtain independent housing in the community. Three hundred sixty-two clients took part in this longitudinal study, which employed a randomized factorial design. Clients were randomly assigned to one of two types of case management (comprehensive vs. traditional) and to one of two levels of access to independent housing (using Section 8 certificates). The project was restricted to clients diagnosed with a severe and persistent mental illness who were either homeless or at high risk of becoming homeless at the start of the study. Individuals' housing status was classified at baseline and at the 6-, 12-, and 24-month follow-ups. Here, we focus on examining the effect of access to Section 8 certificates on housing outcomes across time. At each time point, subjects' housing status was classified as either streets/shelters, community housing, or independent housing; a partial list of these data is given below in the form of a SuperMix spreadsheet file, named sdhouse.ss3.

The variables of interest are:
- ID is the subject ID (362 subjects in total).
- HOUSING represents the housing status at the time of interview: 0 = street, 1 = community, and 2 = independent.
- SECTION8 indicates the Section 8 group, with 1 representing those using Section 8 certificates, and 0 those without.
- TIME1 to TIME3 are three dummy variables for time effects, and denote whether a classification was made at baseline or at the 6-, 12-, or 24-month follow-up. At the 6-month follow-up, TIME1 = 1 and TIME2 = TIME3 = 0; at 12 months, TIME2 = 1 while TIME1 = TIME3 = 0; and at the 24-month follow-up, TIME3 = 1 and TIME1 = TIME2 = 0. With this coding scheme, the baseline serves as the reference group of classification. The coding structure is shown in Table 1 below.
- Three Section 8 by time interaction terms follow. SECT8T1 is the product of SECTION8 and TIME1, and SECT8T2 and SECT8T3 are the products of SECTION8 and TIME2 and TIME3 respectively.
- NOSECT8 indicates the non-Section 8 group, with 0 = no, and 1 = yes.
- TIME represents the linear time contrast. At baseline, TIME = 0, at 6 months, TIME = 1, at 12 months TIME = 2, and at 24 months TIME = 3.
- SEC8TIME is the product of SECTION8 and TIME.
Table 1: Coding of the dummy variables TIME1, TIME2, and TIME3

| Time point | TIME1 | TIME2 | TIME3 | TIME |
|---|---|---|---|---|
| baseline | 0 | 0 | 0 | 0 |
| 6 months | 1 | 0 | 0 | 1 |
| 12 months | 0 | 1 | 0 | 2 |
| 24 months | 0 | 0 | 1 | 3 |

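The coding scheme for the time dummies, the linear time contrast, and the Section 8 interaction terms can be sketched in a few lines of Python. This is only an illustration of how the coded variables relate to one another; the function name `code_row` and the dictionary layout are assumptions made for this sketch and are not part of SuperMix.

```python
# Illustrative sketch: derive the coded predictors for one observation.
# `code_row` and its argument names are hypothetical helpers.
MONTHS = (0, 6, 12, 24)  # baseline and the three follow-ups


def code_row(month, section8):
    """Return the coded predictors for a classification made at `month`."""
    if month not in MONTHS:
        raise ValueError(f"unexpected follow-up month: {month}")
    time1 = int(month == 6)
    time2 = int(month == 12)
    time3 = int(month == 24)
    time = MONTHS.index(month)  # linear contrast: 0, 1, 2, 3
    return {
        "SECTION8": section8,
        "TIME1": time1, "TIME2": time2, "TIME3": time3,
        "SECT8T1": section8 * time1,
        "SECT8T2": section8 * time2,
        "SECT8T3": section8 * time3,
        "NOSECT8": 1 - section8,
        "TIME": time,
        "SEC8TIME": section8 * time,
    }


row = code_row(24, 1)
print(row["TIME1"], row["TIME2"], row["TIME3"], row["SECT8T3"], row["TIME"])
```

At baseline all three dummies are 0, which is what makes baseline the reference time point.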
Values of 999 represent missing value codes for the housing status variable. Thus, some subjects are measured at all four time points and others at fewer time points. Data from time points with missing values are not used in the analysis; however, data from a subject's other time points, where nothing is missing, are used. Thus, for inclusion in the analysis, a subject's data (both the dependent variable and all explanatory variables used in a particular analysis) at a specific time point must be complete. The number of repeated observations per subject depends on the number of time points for which there are non-missing data for that subject.
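The observation-level handling of the 999 code can be illustrated as follows. The records are made up for this sketch (they are not taken from sdhouse.ss3), and the helper names are hypothetical; the point is only that incomplete time points are dropped while the subject's remaining time points are kept.

```python
# Illustrative sketch: drop incomplete time points, keep the rest.
MISSING = 999  # missing-value code for HOUSING

# Hypothetical records: (ID, TIME, HOUSING)
records = [
    (1, 0, 0), (1, 1, 1), (1, 2, MISSING), (1, 3, 2),
    (2, 0, 1), (2, 1, MISSING), (2, 2, MISSING), (2, 3, MISSING),
    (3, 0, MISSING), (3, 1, MISSING), (3, 2, MISSING), (3, 3, MISSING),
]


def complete_observations(rows):
    """Keep only the time points with an observed housing status."""
    return [row for row in rows if row[2] != MISSING]


def observations_per_subject(rows):
    """Count the usable time points for each subject."""
    counts = {}
    for subject_id, _time, _housing in complete_observations(rows):
        counts[subject_id] = counts.get(subject_id, 0) + 1
    return counts


counts = observations_per_subject(records)
print(counts)
```

In this toy data set, subject 3 has no usable time point and therefore contributes nothing to the analysis, just like the one subject in the study who provided no housing data.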
The observed sample sizes and response proportions by group are given in Table 2 below. These observed proportions indicate a general decrease in street living and an increase in independent living across time for both groups. The increase in independent housing, however, appears to occur sooner for the Section 8 group than for the control group. Regarding community living, across time there is an increase for the control group and a decrease for the Section 8 group.
Regarding missing data, further inspection of Table 2 indicates that there is some attrition across time; attrition rates of 19.4% and 12.7% are observed at the final time point for the control and section 8 groups, respectively. Also, one subject provided no housing data at any of the four measurement time points. Since estimation of model parameters is based on a full-likelihood approach, missing data are assumed to be "ignorable" conditional on both the explanatory variables and observed nominal responses (Laird, 1988). In longitudinal studies, ignorable nonresponse falls under Rubin's (1976) "missing at random" (MAR) assumption, in which the missingness depends only on observed data. In what follows, since the focus is on describing use of the nominal model in SuperMix, we will make the MAR assumption. A further approach, however, that does not rely on the MAR assumption (e.g., a mixed-effects pattern-mixture model as described in Hedeker & Gibbons (1997)) could be used.
Table 2: Observed sample sizes and response proportions by group

| Group | Status | Baseline | 6 months | 12 months | 24 months |
|---|---|---|---|---|---|
| Control | Street | 0.555 | 0.186 | 0.089 | 0.124 |
| | Community | 0.339 | 0.578 | 0.582 | 0.455 |
| | Independent | 0.106 | 0.236 | 0.329 | 0.421 |
| | n | 180 | 161 | 146 | 145 |
| Section 8 | Street | 0.442 | 0.093 | 0.121 | 0.120 |
| | Community | 0.414 | 0.280 | 0.146 | 0.228 |
| | Independent | 0.144 | 0.627 | 0.732 | 0.652 |
| | n | 181 | 161 | 157 | 158 |

In preparation for the subsequent analyses, the marginal response proportions can be converted to the two logits of the nominal regression model (i.e., $\log(p_1/p_0)$ and $\log(p_2/p_0)$, where $p_0$, $p_1$, and $p_2$ are the proportions for 0 = street, 1 = community, and 2 = independent housing). These logits are given in Table 3.
Table 3: Logits across time by group

| Group | Status | Baseline | 6 months | 12 months | 24 months |
|---|---|---|---|---|---|
| Control | Community vs. street | -.49 | 1.13 | 1.88 | 1.30 |
| | Independent vs. street | -1.66 | .24 | 1.31 | 1.22 |
| Section 8 | Community vs. street | -.07 | 1.10 | .19 | .64 |
| | Independent vs. street | -1.12 | 1.91 | 1.80 | 1.69 |
| Difference | Community vs. street | .42 | -.03 | -1.69 | -.66 |
| | Independent vs. street | .54 | 1.67 | .49 | .47 |

The logits clearly show the increase in community and independent housing, relative to street housing, at all follow-up time points (6, 12, and 24 months). In terms of group-related differences, these appear most pronounced at 6 months for independent housing and 12 months for community housing. While examination of these logits is instructive, the subsequent analyses will more rigorously assess the degree to which these logits vary by time and group.
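The conversion from the observed proportions in Table 2 to the two logits can be sketched as follows. The proportions are copied from Table 2; the dictionary layout and the function name `logits` are assumptions made for this illustration.

```python
import math

# Observed proportions from Table 2: (street, community, independent).
PROPORTIONS = {
    "Control": {
        0: (0.555, 0.339, 0.106), 6: (0.186, 0.578, 0.236),
        12: (0.089, 0.582, 0.329), 24: (0.124, 0.455, 0.421),
    },
    "Section 8": {
        0: (0.442, 0.414, 0.144), 6: (0.093, 0.280, 0.627),
        12: (0.121, 0.146, 0.732), 24: (0.120, 0.228, 0.652),
    },
}


def logits(street, community, independent):
    """Return (community vs. street, independent vs. street) logits."""
    return math.log(community / street), math.log(independent / street)


for group, by_month in PROPORTIONS.items():
    for month, props in by_month.items():
        com, ind = logits(*props)
        print(f"{group:9s} {month:2d} months: {com:6.2f} {ind:6.2f}")
```

The group differences are then simply the Section 8 logit minus the control logit at each time point.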
In this example, one random subject effect (i.e., a random subject intercept) is assumed, and the repeated housing status classifications are modeled in terms of the dummy-coded time effects (6-, 12-, and 24-month follow-ups compared to baseline), a group effect (Section 8 versus control), and group by time interaction terms.
Again, street housing is treated as the reference category because its code (0) is listed as the first response category.

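In the nominal case, we need to consider the values corresponding to the unordered multiple categories of the response variable. We thus assume that the $C + 1$ response categories are coded as $c = 0, 1, \ldots, C$. Let $p_{ijc}$ denote the probability that a response occurs in category $c$, conditional on the parameters $\boldsymbol{\beta}_c$ and $\mathbf{v}_{ic}$, where $y_{ij}$ denotes the value of the nominal variable associated with level-2 unit $i$ and level-1 unit $j$. Then

$$p_{ijc} = P(y_{ij} = c \mid \boldsymbol{\beta}_c, \mathbf{v}_{ic}) = \frac{\exp(z_{ijc})}{1 + \sum_{h=1}^{C} \exp(z_{ijh})}, \quad c = 1, \ldots, C,$$

$$p_{ij0} = P(y_{ij} = 0 \mid \boldsymbol{\beta}_c, \mathbf{v}_{ic}) = \frac{1}{1 + \sum_{h=1}^{C} \exp(z_{ijh})},$$

where $z_{ijc} = \mathbf{x}_{ij}'\boldsymbol{\beta}_c + \mathbf{w}_{ij}'\mathbf{v}_{ic}$. Here, $\mathbf{x}_{ij}$ is the explanatory variable vector and $\mathbf{w}_{ij}$ is the design vector for the random effects, both vectors being for the $j$-th level-1 unit nested within level-2 unit $i$. Correspondingly, $\boldsymbol{\beta}_c$ is a vector of unknown fixed regression parameters, and $\mathbf{v}_{ic}$ is a vector of unknown random effects for the level-2 unit $i$. The distribution of the random effects is assumed to be multivariate normal with mean vector $\mathbf{0}$ and covariance matrix $\boldsymbol{\Sigma}_c$. Notice that the regression coefficient vectors $\boldsymbol{\beta}_c$ and $\mathbf{v}_{ic}$ carry the $c$ subscript. Thus, for each of the explanatory variables and random effects, there will be $C$ parameters to be estimated. Additionally, the random effect variance-covariance matrix $\boldsymbol{\Sigma}_c$ is allowed to vary with $c$.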
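In the current example, the outcome variable HOUSING is coded 0, 1, and 2. Therefore

$$P(\text{HOUSING}_{ij} = c) = \frac{\exp(z_{ijc})}{1 + \exp(z_{ij1}) + \exp(z_{ij2})}, \quad c = 1, 2, \qquad P(\text{HOUSING}_{ij} = 0) = \frac{1}{1 + \exp(z_{ij1}) + \exp(z_{ij2})},$$

where for $c = 1$ (community housing)

$$z_{ij1} = \beta_{01} + \beta_{11}\,\text{SECTION8}_{i} + \beta_{21}\,\text{TIME1}_{ij} + \beta_{31}\,\text{TIME2}_{ij} + \beta_{41}\,\text{TIME3}_{ij} + \beta_{51}\,\text{SECT8T1}_{ij} + \beta_{61}\,\text{SECT8T2}_{ij} + \beta_{71}\,\text{SECT8T3}_{ij} + v_{i1}.$$

For $c = 2$ (independent housing)

$$z_{ij2} = \beta_{02} + \beta_{12}\,\text{SECTION8}_{i} + \beta_{22}\,\text{TIME1}_{ij} + \beta_{32}\,\text{TIME2}_{ij} + \beta_{42}\,\text{TIME3}_{ij} + \beta_{52}\,\text{SECT8T1}_{ij} + \beta_{62}\,\text{SECT8T2}_{ij} + \beta_{72}\,\text{SECT8T3}_{ij} + v_{i2}.$$

It is assumed that $v_{ic}$, $i = 1, \ldots, N$, are i.i.d. normal $N(0, \sigma^2_{c})$, $c = 1, 2$.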
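To make the link between the two linear predictors and the three category probabilities concrete, the following minimal sketch maps a pair of logits (response code 1 vs. 0 and code 2 vs. 0) to probabilities, with street housing as the reference category. The function name `category_probabilities` is an assumption of this sketch. As a rough check, the control-group baseline logits from Table 3 are fed back in, which should approximately recover the corresponding proportions in Table 2.

```python
import math


def category_probabilities(z):
    """Map the logits of the non-reference categories to probabilities.

    `z` holds one linear predictor per non-reference category; the
    returned list starts with the reference category (code 0).
    """
    denominator = 1.0 + sum(math.exp(value) for value in z)
    return [1.0 / denominator] + [math.exp(value) / denominator for value in z]


# Control group at baseline, logits taken from Table 3.
p_street, p_community, p_independent = category_probabilities([-0.49, -1.66])
print(round(p_street, 3), round(p_community, 3), round(p_independent, 3))
```

Because the reference category contributes the constant 1 to the denominator, the three probabilities always sum to one.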

Preparing the data
The model is fitted to the data in SDHOUSE.ss3. The first step is to create the ss3 file shown above from an Excel spreadsheet named SDHOUSE.xls. This is accomplished as follows:
- Use the File, Import Data File option to activate the display of an Open dialog box.
- Browse for the file SDHOUSE.xls in the Examples, Nominal folder.
- Select the file and click the Open button to return to the main SuperMix window, where the contents of the Excel spreadsheet are displayed as the SuperMix system file with default name SDHOUSE.ss3.

Setting up the analysis
The next step is to describe the model to be fitted. We use the SuperMix interface to provide the model specifications. From the main menu bar, select the File, New Model Setup option.

In this example, only the Configuration, Variables, and Advanced tabs of the Model Setup window that appears will be used. By default, the Configuration screen is displayed first.
Start by providing a title for the analysis in the Title 1 and Title 2 text boxes. Next, select the outcome variable HOUSING from the Dependent Variable drop-down list box and indicate the type of outcome as nominal using the Dependent Variable Type drop-down list box. Once this selection is made, the Categories grid is displayed, with distinct values of the categories in the text boxes as shown below. The subject identification variable is used to define the hierarchical structure of the data, and is selected as the Level-2 ID from the Level-2 IDs drop-down list box. The bottom right portion of the screen indicates that a marginal crosstabulation table of the nominal outcome variable (i.e. housing status) by SECTION8 is requested. This table provides purely descriptive information, and has no effect on the estimation of the model parameters.
Finally, we need to provide information on missing data in the SDHOUSE.ss3 file. Some of the values of the outcome variable HOUSING are missing, and a missing value code of 999 is used to indicate this. Click on the Missing Values Present drop-down list, and select the yes option. Enter the code 999 in the Missing Value for the Dependent Var text box that appears. Proceed to the Variables screen by clicking on this tab.

The Variables screen is used to specify the fixed and random effects to be included in the model. Start by selecting the explanatory (fixed) variables using the check boxes in the Available grid. The image below shows the completed selection of all the predictors. By default, the inclusion of both a fixed intercept coefficient and a random intercept at level-2 is assumed, as indicated by the checked boxes for Include Intercept in the Explanatory Variables and L-2 Random effects grids. As these selections correspond to the model we intend fitting to the data, no further changes are needed on this screen.

Click on the Advanced tab and request the use of 25 quadrature points for estimation using the Number of Quadrature Points text box. Also select non-adaptive quadrature as Optimization Method. Increasing the number of points increases the accuracy of the integration, though minimal change is usually observed beyond 10 points or so. For models with only one random effect, increasing the number of points does not slow the solution down excessively. Thus, using 25 points, while perhaps not necessary, provides a safe choice.
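The role of the quadrature points can be illustrated with a small sketch that averages the category probabilities over a standard normal subject effect. Several assumptions apply: the rule below is a simple fixed grid of equally spaced points weighted by the normal density, used as a stand-in because it needs no external library, and it is not the rule SuperMix uses internally; the fixed-effect values and standard deviations are hypothetical; and the random intercept is written as a contrast-specific standard deviation times a single standardized subject effect.

```python
import math


def category_probabilities(z):
    """Probabilities for the reference category and each logit in `z`."""
    denominator = 1.0 + sum(math.exp(value) for value in z)
    return [1.0 / denominator] + [math.exp(value) / denominator for value in z]


def normal_grid(n_points, limit=4.0):
    """Equally spaced nodes on [-limit, limit] with normalized N(0, 1) weights."""
    step = 2.0 * limit / (n_points - 1)
    nodes = [-limit + k * step for k in range(n_points)]
    density = [math.exp(-0.5 * x * x) for x in nodes]
    total = sum(density)
    return nodes, [d / total for d in density]


def marginal_probabilities(fixed_part, sd, n_points):
    """Average the conditional probabilities over the subject effect."""
    nodes, weights = normal_grid(n_points)
    marginal = [0.0] * (len(fixed_part) + 1)
    for theta, weight in zip(nodes, weights):
        z = [f + s * theta for f, s in zip(fixed_part, sd)]
        for c, p in enumerate(category_probabilities(z)):
            marginal[c] += weight * p
    return marginal


fixed_part = [1.0, 0.5]  # hypothetical fixed parts of the two logits
sd = [0.9, 1.6]          # hypothetical random-intercept standard deviations
for n_points in (10, 25):
    result = marginal_probabilities(fixed_part, sd, n_points)
    print(n_points, [round(p, 4) for p in result])
```

Comparing the two printed rows gives a feel for how much the approximation changes when the number of points is increased.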

Before running the analysis, the model specifications have to be saved. Select the File, Save option, and provide a name for the model specification file, for example sdhouse.mum. Run the analysis by selecting the Run option from the Analysis menu.

Portions of the output file sdhouse.out are shown below. The first part of the output file gives a description of the model specifications. This is followed by a data summary of the number of observations nested within each subject. The number of observations per subject (level-2 unit) ranges between 1 and 4.

The data summary is followed by descriptive statistics for all the variables included in the model. We note that the most frequent response was in category 2, i.e. "Independent" on the nominal outcome variable HOUSING, while 23% indicated that they were living on the street ( HOUSING = 0).

The crosstabulation of the variable SECTION8 by the response variable HOUSING, requested on the Configuration screen, is given next. Most of the classifications from subjects without Section 8 certificates indicated that the subject was living in community housing at the time of classification (SECTION8 = 0, HOUSING = 1). In the case of classifications from subjects with Section 8 certificates, most classifications showed the use of independent housing (SECTION8 = 1, HOUSING = 2).

Starting values of parameters are given next. Line 1 (mean) contains the starting values for the two intercepts (response code 1 vs. code 0, and response code 2 vs. code 0). The starting values given in the second and third lines (covariates) are for the coefficients of the covariates. The first seven values are those for SECTION8, TIME1, …, SECT8T3 for response code 1 vs. code 0; the last seven are for the same predictors, but for response code 2 vs. response code 0. The starting values for the variance components associated with the random level-2 intercepts are given in the final line (var. terms). For 21.61% of the subjects, no change in classification was observed over the time during which follow-ups were made.

The final results obtained with maximum marginal likelihood estimation are given next. Using 25 quadrature points per dimension, as requested on the Advanced tab, 32 iterations were required to obtain convergence. The log likelihood function value and deviance at convergence are included, and can be used to compare the current model with other models.

In terms of significance of the fixed effects, the time effects are observed to be highly significant. With the inclusion of the Time by Section 8 interaction terms, the time effects reflect comparisons between time points for the control group (i.e., SECTION8 coded as 0). Thus, subjects in the control group show increased use of both independent and community housing relative to street housing at all three follow-ups, as compared to baseline. Similarly, due to the inclusion of the interaction terms, the Section 8 effect is the group difference at baseline (i.e., when all time effects are 0). Using a .05 cutoff, there is no statistical evidence of group differences at baseline. Turning to the interaction terms, these indicate how the two groups differ in terms of comparisons between time points. Compared to controls, the increase in community versus street housing is less pronounced for section 8 subjects at 12 months (the estimate for SECT8T2 equals -1.92 in terms of the logit), but not statistically different at 6 months ( SECT8T1) and only marginally different at 24 months ( SECT8T3). Conversely, as compared to controls, the increase in independent versus street housing (response code 2 vs. code 0) is more pronounced for section 8 subjects at 6 months (the estimate equals 2.00 in terms of the logit), but not statistically different at 12 or 24 months.
In terms of community versus street housing (i.e., response code 1 versus 0), there is an increase across time for the control group relative to the Section 8 group. As the statistical test indicated, these groups differ most at 12 months. For the independent versus street housing comparison (i.e., response code 2 versus 0) there is a beneficial effect of Section 8 certificates at 6 months. Thereafter, the non-significant interaction terms indicate that the control group catches up to some degree. Considering these results of the mixed-effects analysis, it is seen that both groups reduce the degree of street housing, but do so in somewhat different ways. The control group subjects are shifted more towards community housing, whereas Section 8 subjects are more quickly shifted towards independent housing.
This differential effect of Section 8 certificates over time is completely missed if one simply analyzes the outcome variable as a binary indicator of street versus non-street housing (i.e., collapsing community and independent housing categories). In this case (not shown), none of the section 8 by time interaction terms are observed to be statistically significant. Thus, analysis of the three-category nominal outcome is important in uncovering the beneficial effect of Section 8 certificates.
Comparing the log-likelihood value from this analysis to one where there are no random effects (not shown) clearly supports inclusion of the random subject effect (likelihood ratio ). Expressed as intraclass correlations, and for community versus street and independent versus street, respectively. Thus, the subject influence is much more pronounced in terms of distinguishing independent versus street living, relative to community versus street living. This is borne out by contrasting models with separate versus a common random-effect variance across the two category contrasts (not shown) which yields a highly significant likelihood ratio favoring the model with separate variance terms. Also see the note on the use of the likelihood-ratio test in Section 3.6.
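The likelihood-ratio comparison described here amounts to doubling the difference between two log-likelihoods (equivalently, differencing the deviances). The sketch below uses hypothetical log-likelihood values, not the values from sdhouse.out, and assumes a chi-square reference distribution with 2 degrees of freedom because two variance terms are added; since variances are tested on the boundary of the parameter space, the resulting p-value is best read as a rough, conservative guide.

```python
import math


def likelihood_ratio(loglik_reduced, loglik_full):
    """Likelihood-ratio statistic: twice the gain in log-likelihood."""
    return 2.0 * (loglik_full - loglik_reduced)


def chi2_upper_tail_2df(statistic):
    """Upper-tail probability of a chi-square variate with 2 df.

    For 2 degrees of freedom the survival function is exp(-x / 2).
    """
    return math.exp(-statistic / 2.0)


# Hypothetical log-likelihoods for models without and with random effects.
loglik_no_random_effects = -1200.0
loglik_random_intercepts = -1150.0

statistic = likelihood_ratio(loglik_no_random_effects, loglik_random_intercepts)
print(statistic, chi2_upper_tail_2df(statistic))
```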
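From the above output, it follows that for a typical person (i.e., one whose random effects equal zero) at 24 months from baseline (TIME1 = TIME2 = 0, TIME3 = 1) with a Section 8 certificate (SECTION8 = 1)

$$\hat{z}_{c} = \hat{\beta}_{0c} + \hat{\beta}_{\text{SECTION8},c} + \hat{\beta}_{\text{TIME3},c} + \hat{\beta}_{\text{SECT8T3},c}, \quad c = 1, 2,$$

where $\hat{\beta}_{0c}$ is the estimated intercept and the remaining terms are the estimated coefficients of SECTION8, TIME3, and SECT8T3 for response code $c$ vs. code 0, so that

$$P(\text{community}) = \frac{\exp(\hat{z}_{1})}{1 + \exp(\hat{z}_{1}) + \exp(\hat{z}_{2})} = 0.2296$$

and

$$P(\text{independent}) = \frac{\exp(\hat{z}_{2})}{1 + \exp(\hat{z}_{1}) + \exp(\hat{z}_{2})} = 0.7124.$$

Therefore

$$P(\text{street}) = \frac{1}{1 + \exp(\hat{z}_{1}) + \exp(\hat{z}_{2})} = 0.0580.$$

The corresponding probabilities for a typical person without Section 8 certification are obtained using

$$\hat{z}_{c} = \hat{\beta}_{0c} + \hat{\beta}_{\text{TIME3},c}, \quad c = 1, 2.$$

From these values, it follows that

$$P(\text{community}) = 0.5420, \qquad P(\text{independent}) = 0.3691, \qquad P(\text{street}) = 0.0889.$$
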
There is therefore a higher expected proportion of subjects without Section 8 certificates who will be classified as "street/shelters" 24 months from baseline, than is the case for those with section 8 certificates.
Table 4 below is a summary of the predicted probabilities, calculated as described above. We conclude that the highest proportion of subjects without Section 8 certificates make use of community housing, whereas the highest proportion of subjects with certificates make use of independent housing.
Table 4: Predicted probabilities

| Section 8 certificate | Time from baseline (months) | P(street) | P(community) | P(independent) |
|---|---|---|---|---|
| no | 6 | 0.1552 | 0.6885 | 0.1563 |
| no | 12 | 0.0634 | 0.6764 | 0.2602 |
| no | 24 | 0.0889 | 0.5420 | 0.3691 |
| yes | 6 | 0.0420 | 0.2739 | 0.6841 |
| yes | 12 | 0.0522 | 0.1379 | 0.8099 |
| yes | 24 | 0.0580 | 0.2296 | 0.7124 |

The intracluster correlations for this analysis are given next. These are for community versus street, and independent versus street, respectively. For the independent versus street housing comparison (i.e., response code 2 versus 0), the intracluster correlation is higher.
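An intracluster correlation of this kind can be computed from a random-intercept variance as sketched below. The sketch assumes the usual latent-variable convention for logistic models, in which the level-1 residual variance is fixed at $\pi^2/3$; the two variance values are hypothetical and are not the estimates from sdhouse.out.

```python
import math

LEVEL1_VARIANCE = math.pi ** 2 / 3.0  # variance of the standard logistic


def intracluster_correlation(random_intercept_variance):
    """Share of the latent-scale variance attributable to subjects."""
    return random_intercept_variance / (
        random_intercept_variance + LEVEL1_VARIANCE
    )


# Hypothetical variance estimates for the two category contrasts.
for label, variance in (("community vs. street", 0.8),
                        ("independent vs. street", 2.5)):
    print(label, round(intracluster_correlation(variance), 3))
```

A larger random-intercept variance translates directly into a larger intracluster correlation, which is why the contrast with the larger variance term shows the stronger subject influence.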

The table below is a listing of the estimated intercorrelations between the parameter estimators. Inspection of the off-diagonal elements of the correlation matrix shows no evidence of serious collinearity.
