Scientific Software International (SSI) publishes statistical data analysis software: LISREL (structural equation model/SEM, survey generalized linear model/SGLIM), 
HLM (hierarchical linear modeling, multilevel model), SuperMix (mixed models, mixed-effects program, MIXREG, MIXOR, MIXNO and MIXPREG) and Item Response Theory/IRT (BILOG-MG, MULTILOG, PARSCALE)Scientific Software International (SSI) publishes statistical data analysis software: LISREL (structural equation model/SEM, survey generalized linear model/SGLIM), 
HLM (hierarchical linear modeling, multilevel model), SuperMix (mixed models, mixed-effects program, MIXREG, MIXOR, MIXNO and MIXPREG) and Item Response Theory/IRT (BILOG-MG, MULTILOG, PARSCALE)Scientific Software International (SSI) publishes statistical data analysis software: LISREL (structural equation model/SEM, survey generalized linear model/SGLIM), 
HLM (hierarchical linear modeling, multilevel model), SuperMix (mixed models, mixed-effects program, MIXREG, MIXOR, MIXNO and MIXPREG) and Item Response Theory/IRT (BILOG-MG, MULTILOG, PARSCALE)


  Data

To illustrate the application of the mixed-effects ordinal logistic regression model to longitudinal data, data from the NIMH Schizophrenia Collaborative Study on treatment related changes in overall severity are used. Specifically, Item 79 of the Inpatient Multidimensional Psychiatric Scale (IMPS; Lorr & Klett, 1966) was used. Item 79, "Severity of Illness," (IMPS79) was scored as: 1 = normal, or not at all ill; 2 = borderline mentally ill; 3 = mildly ill; 4 = moderately ill; 5 = markedly ill; 6 = severely ill; and 7 = among the most extremely ill. An ordinal mixed-effects model, with the seven ordered categories recoded into four, is given in the section dealing with ordinal variables. In the present example, scores have been recoded to a dichotomous variable, where scores up to, but excluding 3.5 were coded 0, and scores of 3.5 or higher were coded 1. The value "0" is associated with measurements classified as normal, borderline, mildly, or moderately mentally ill, while the value "1" was assigned to measurements corresponding to "markedly ill" through "most extremely ill." In this study, patients were randomly assigned to receive one of four medications: placebo, chlorpromazine, fluphenazine, or thioridazine. Since previous analyses (Longford, 1993, and Gibbons & Hedeker, 1994) revealed similar effects for the three anti-psychotic drug groups, they were combined in the present analysis. Finally, again based on previous analysis, to linearize the relationship of the IMPS79 scores over time, a square root transformation of time was chosen.

Data for the first 10 participants on most of the variables used in this section are shown below in the form of a SuperMix spreadsheet file, named schizx1.ss3. The variables of interest are:

o       Patient indicates the IDs for the 437 patients.

o       Imps79 represents the original score on Item 79 of the Inpatient Multidimensional Psychiatric Scale.

o       Imps79D is a recoded version of the same scale, but in binary form, with coding as discussed above.

o       Imps79O is also a recoded version of the same scale, but with the seven original categories reduced to four.

o       TxDrug indicates the treatment group, where 1 indicates membership in the treatment group, and 0 membership in the control group.

o       Week represents the time during the course of the study when a specific measurement was made, and ranges between 0 and 6.

o       SqrtWeek is the square root of Week.

o       Tx*SWeek is the product of the treatment group and the square root of Week.

In this data file, each subject's data consist of seven lines, these being the repeated measurements on seven occasions. Notice that there are missing value codes (每9) for some subjects at specific time points. The data from these time points will not be used in the analysis, but data from these subjects at other time points where there are no missing data will be used in the analysis. Thus, for inclusion into the analysis, a subject's data (both the dependent variable and all model covariates being used in a particular analysis) at a specific time point must be complete. The number of repeated observations per subject then depends on the number of time points for which there are non-missing data for that subject. The specification of missing data codes will be illustrated in the model specification section to follow.

   Model

Mathematical model

Continuous outcome

A general two-level model for a continuous response variable  depending on a set of  predictors  can be written in the form

           

where  denotes the level-2 units, and  the level-1 units. In this context,  represents the response of individual , nested within level-2 unit . The model shown here consists of a fixed and a random part. The fixed part of the model is represented by the vector product , where  is a typical row of the design matrix of the fixed part of the model with, as elements, a subset of the  predictors. The vector  contains the fixed, but unknown parameters to be estimated.  and  denote the random part of the model at levels 2 and 1 respectively. For example,  represents a typical row of the design matrix of the random part at level 2, and  the vector of random level-2 effects to be estimated. It is assumed that  are independently and identically distributed (i.i.d.) with mean vector  and covariance matrix . Similarly, the  are assumed i.i.d., with mean vector  and covariance matrix .

Within this hierarchical framework, the effects of TxDrug, SsqrtWeek, and Tx*SWeek can be used to predict the Imps79 score for the case where Imps79 is a continuous variable. The corresponding model may be expressed as

where  denotes the average expected Imps79 score, and  and  indicate the estimated coefficients associated with the fixed part of the model which contains the predictor variables TxDrug, SqrtWeek, and Tx*SWeek. The random part of the model is represented by  and , which denote the variation in average score over patients and between measurements nested within patients at the lowest level of the hierarchy.

Binary outcomes

In the current example, the outcome variable is Imps79D, which is of a binary nature. The original scores on Item 79 of the Inpatient Multidimensional Psychiatric Scale have been recoded to a dichotomous variable, where scores up to, but excluding 3.5 were coded 0, and scores of 3.5 or higher were coded 1. In this case, the predicted value of the outcome can be viewed as the predicted probability that Imps79D is 1. Due to this, predicted values outside the interval (0,1) would not be meaningful and a model constraining predicted values to lie within this interval would be appropriate, in contrast with the model for a continuous outcome (see above) where predicted values outside this interval would be interpretable. In addition, the assumption of normality at level 1 is not realistic, as the level-1 random effect can only assume one of two values: 0 or 1. This random effect can thus not have homogeneous variance.

In order to insure that the predicted value lies within the (0,1) interval, a transformation of the level-1 predicted probability can be used. For the binary case considered here, we have

           

where  represents the log of the odds of success, and (for the current model) can be expressed as

            

This transformation, commonly referred to as the logit link function, constrains  to lie in the interval (0,1).

 

  Analysis

Preparing the data

The model is fitted to the data in schizx1.ss3. The first step is to create the ss3 file schizx1.ss3 from an Excel spreadsheet named schizx1.xls. This is accomplished as follows:

o       Use the File, Import Data File option to activate the display of an Open dialog box.

o       Browse for the file schizx1.xls in the SuperMixEn Examples\Primer\Binary folder.

o       Select the file and click the Open button to return to the main SuperMix window, where the contents of the Excel spreadsheet are displayed as the SuperMix system file with default name schizx1.ss3.

Setting up the analysis

The next step is to describe the model to be fitted. We use the SuperMix interface to provide the model specifications. From the main menu bar, select the File, New Model Setup option.

The Model Setup window that appears has six tabs. In this example, three of these tabs are used in model specification.

As a first step, select the dichotomous outcome variable Imps79D from the Dependent Variable drop-down list box on the Configuration screen. Specify the type of outcome as binary using the Dependent Variable Type drop-down list box. Once this selection is made, the Categories grid is displayed. As there are missing data in the ss3 file for both outcome and potential predictors, set the Missing Values Present drop-down list box to true. Once this is done, the Missing Values for Dependent Var and Global Missing Value text boxes are displayed. Enter the value 每9 into both these boxes.

The patient identification variable is used to define the hierarchical structure of the data, and is selected as the Level-2 ID from the Level-2 IDs drop-down list box. Enter a title for the analysis in the Title 1 and Title 2 text boxes. Request a crosstabulation of the outcome variable and the predictor SqrtWeek by setting the Perform Crosstabulation and Crosstab Variable drop-down list boxes to yes and SqrtWeek respectively.

 

The Variables screen is used to specify the fixed and random effects to be included in the model. Start by selecting the explanatory (fixed) variables TxDrug, SqrtWeek, and Tx*SWeek by checking the check boxes in the E column of the Available grid.

After selecting all the explanatory variables, the random effect(s) at level 2 must be selected. By default, the model will include a random intercept, as indicated by the check box for Include Intercept in the L-2 Random Effects grid. The intercept is assumed to vary randomly over higher levels of the hierarchy, while the slopes of the predictors TxDrug, SqrtWeek and the interaction between treatment and the square root of the treatment time, Tx*SWeek, are assumed to be adequately described by common, fixed coefficients that do not vary across patients.

Next, click on the Advanced tab. This screen is used to specify additional settings for the case where the outcome variable is binary. Request the use of 20 non-adaptive quadrature points for estimation by entering the number 20 in the Number of Quadrature Points text box. No changes are made to the Unit Weighting  box. Use the default Bernoulli distribution and keep all the other default settings. The completed Advanced screen is shown below.

Before running the analysis, the model specifications have to be saved. Select the File, Save option, and provide a name for the model specification file. Run the analysis by selection the Run option from the Analysis menu.

  Discussion of results

Portions of the output file schizx1.out are shown below.

In the first section of the output file, a summary of the model specifications is provided. The use of a logistic response function (logit link function), with the assumption of a Bernoulli distribution of random effects, is indicated. This is followed by a summary of the number of observations nested within each patient. The Level 2 observations entry corresponds to the number of patients for whom data were included in the analysis. 

 

The data summary is followed by descriptive statistics for all the variables included in the model. We note that 23% of the patients had a value of 0 on the binary Imps79D score which indicates no or moderate signs of mental illness (Imps79D = 0).

The crosstabulation of the outcome variable and the predictor SqrtWeek requested on the Variables screen during model specification is listed next. We see that most of the measurements (1231) are from the higher, more ill, category of Imps79D. It is also noticeable that more data were obtained at the start, the fourth time point, and the end of the study than on the remaining occasions. The results for the model without any random effects serve as starting values for the iterative procedure.

The output summarizing the estimated parameters after convergence is shown next. Five iterations were required to obtain convergence. The estimates are shown in the column with heading Estimate, and correspond to the coefficients  and  in the model specification. No estimate of the level-1 variance is given, as there is no single level-1 variance for this model.

The expected log-odds of having a high score at the end of the study period (Imps79D score of 1) for a patient from the control group (that is, TxDrug = Tx*SWeek = SqrtWeek = 0) is represented by the estimated intercept of 5.3905. The negative and statistically significant coefficient for SqrtWeek  indicates that the probability of having a high score decreases with time and more so when the interaction between receiving treatment and time is taken into account (Tx*SWeek = 每1.0158, p = 0.0024).

 

As shown below, there is also evidence of significant random variation in the intercepts over patients.

 

Estimated unit-specific probabilities

To evaluate the expected effect of TxDrug, SqrtWeek, and Tx*SWeek on the predicted probability that the Imps79D score is equal to 1, we use the expression for the predicted log odds of success given earlier

For a typical measurement from any patient from the control group at the beginning of the study period (TxDrug = SqrtWeek = Tx*SWeek = 0) we have

For a typical measurement from any patient from the treatment group at the beginning of the study period,  can be expressed as

Using the estimates of  and  of 5.3875 and 每0.0246 respectively as obtained for the current analysis, we can calculate the subject-specific probability of an Imps79D score of 1 as

and

respectively.

On the other end of the scale in terms of time, we can consider patients from both control and treatment groups at the end of the study period. For both groups, this corresponds to a value of 2.45 on the variable SqrtWeek.

For a typical measurement from any patient from the control group at the end of the study period we have

For a typical measurement from any patient from the treatment group at the end of the study period,  can be expressed as

In Table 3.2, the predicted probabilities of a high post-treatment Imps79D score are given for patients from both experimental groups at selected time points during the study period.

Table 3.2: Predicted probability of a high post-treatment Imps79D score

Square root of week

(SqrtWeek)

Control group

Treatment group

0.00

0.9954

0.9953

1.73

0.9423

0.7335

2.45

0.8472

0.3104

We conclude that the diagnoses for both control and treatment groups improved over the seven week study period. There is, however, a marked decrease in  for the treatment group. Looking at the differences of probabilities at the end of the trial, we can draw the conclusion that the treatment effect is highly significant.

Estimated population-average probabilities

Table 3.2 contains estimated unit (patient) specific probabilities. To obtain population-average probabilities, the estimated - values are divided by the square root of the design effect. These adjusted -values are subsequently used in the computation of the probabilities. This aspect is discussed in detail in Section 3.6.

Finally, the random variation in intercept over patients is estimated at 4.4797.

In addition to the standard output file schizx1.out, the Write Bayes Estimates drop-down list box on the Configuration screen of the Model Setup window was used to request Bayes estimates for the individual random terms. This is done by selecting the means & (co)variances option from the Write Bayes Estimates drop-down list box. These estimates are written to the file schizx1.ba2. The first few lines of this file are shown below.

Four pieces of information per patient are given:

o       the level-3 ID (if any),

o       the number of the random effect,

o       the empirical Bayes estimate (), and

o       the estimated associated posterior variance of the patient estimate.

From the data, we know that the first patient, with ID = 1103, was a member of the treatment group. On the other hand, the patient with ID = 1107 was assigned to the control group. At the beginning of the study, the predictor SqrtWeek and hence the interaction term Tx*SWeek were equal to zero. In both cases, the observed Imps79D score at the beginning of the study was equal to 1. Their empirical Bayes estimates were 每0.7018 and 0.4787 respectively. For measurements from each of these patients, this implies

and

respectively.

Similarly, we find that at the end of the study

and

Using the empirical Bayes estimates for each patient, we can also obtain the predicted probability of improving.

This probability,  is obtained as . In Figures 3.11 and 3.12 below, the predicted probabilities of improving for all measurements are shown, with a linear regression line showing the mean probability for each group.

Figure: Predicted probability of improvement (control group) 

Figure: Predicted probability of improvement (treatment group)

For both groups, an increased probability of obtaining a predicted value of 0 on the outcome is observed. Recall that the value "0" is associated with measurements classified as normal, borderline, mildly, or moderately mentally ill, while the value "1" was assigned to measurements corresponding to "markedly ill" through "most extremely ill". However, the patients in the treatment group were much more likely to show improvement, as can be seen from the slopes of the regression lines for the groups in Figure 13.

Figure: Regression lines for control and treatment groups


Copyright © 2005-2013, Scientific Software International, Inc., All rights reserved.
P.O. Box 4728, Skokie, IL 60076-4728