Abstract:
Multilevel data structures are commonly encountered
in practice. Modelling such data
requires careful consideration of the association between
underlying variables at each level of the data structure, which
calls for effective univariate techniques prior to
modelling. However, no univariate tests are currently in use for
this situation. This paper presents the modification and
novel application of a test developed by Zhang and Boos for
testing the association between categorical variables measured
on clusters of observations, for examining initial association in
a multilevel framework. Zhang and Boos used an unpublished SAS/IML
programme to perform their test. This paper
presents an R function for applying the test, which will
be freely available to users, since R is open-source software.
The function is tested on a dataset from the medical field
pertaining to the respiratory disease severity of patients attending
several different clinics. The explanatory variables in
this study are Age, Gender, Duration and Symptom, while
the response variable, Diagnosis, indicates the severity of the
diagnosis made. The results indicate that when
the experimental units show low levels of correlation within
clusters with respect to a particular explanatory variable,
the test performs similarly to the standard Cochran-Mantel-
Haenszel (CMH) test. When the corresponding correlation is
high, the Generalized CMH (GCMH) test yields a smaller
p-value than the standard CMH test. Of the four variables,
only Symptom and Duration are significantly associated with
Diagnosis.
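
For orientation, the standard CMH test used above as the baseline can be sketched for the simplest case of one 2x2 table per stratum. This is an illustrative Python sketch only: it is neither the R function presented in this paper nor the Zhang and Boos generalized test, it omits the continuity correction, and it does not adjust for correlation within clusters. The function name `cmh_test` and the counts are hypothetical, and treating each clinic as a stratum with a binary explanatory variable and a binary response is an assumption made purely for the example.

```python
import math


def cmh_test(tables):
    """Standard CMH test (no continuity correction) for a list of 2x2 tables.

    Each table is ((a, b), (c, d)): rows are the two levels of a binary
    explanatory variable, columns the two levels of a binary response,
    with one table per stratum. Returns (statistic, p_value), where the
    statistic is referred to a chi-square distribution with 1 df.
    """
    diff_sum = 0.0  # sum over strata of observed minus expected count in cell a
    var_sum = 0.0   # sum over strata of the hypergeometric variance of cell a
    for (a, b), (c, d) in tables:
        n = a + b + c + d
        if n < 2:
            continue  # a stratum with fewer than 2 observations adds nothing
        row1, row2 = a + b, c + d
        col1, col2 = a + c, b + d
        diff_sum += a - row1 * col1 / n
        var_sum += row1 * row2 * col1 * col2 / (n * n * (n - 1))
    if var_sum == 0:
        raise ValueError("no stratum has variation in both margins")
    stat = diff_sum ** 2 / var_sum
    # Upper tail of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


# Hypothetical counts for two clinics (not data from this study).
hypothetical_clinics = [((8, 2), (2, 8)), ((8, 2), (2, 8))]
stat, p_value = cmh_test(hypothetical_clinics)
print(f"CMH statistic = {stat:.2f}, p-value = {p_value:.5f}")
```

Because this baseline treats observations within a stratum as independent, its p-value can differ from that of a test that accounts for within-cluster correlation.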