## Parametric and semi-parametric models for the analysis of proportions in presence of over/under dispersion

##### Abstract

Data in the form of proportions arise in toxicology (Weil, 1970; Williams, 1975)
and other similar fields (Crowder, 1978; Otake and Prentice, 1984). These proportions
often exhibit variation greater than predicted by a simple binomial model.
Several parametric models such as the beta-binomial (BB) (Skellam, 1948), the correlated
binomial (Kupper and Haseman, 1978) and the additive and multiplicative
binomial models (Altham, 1974) are available for analysing binomial data with over
dispersion. Of these, the correlated binomial and the additive binomial models are
identical. The superiority of the beta-binomial model for the analysis of proportions
has been shown by many authors (Paul, 1982; Pack, 1986).

The joint estimation of the mean and the dispersion or intraclass correlation
parameters is important for over/under dispersed binomial data. Computing
the maximum likelihood estimates is quite intensive, and the estimates are not robust to
variance misspecification. As an alternative, we consider several semi-parametric models
recently developed in the context of correlated binary data, which
require assumptions on the form of only the mean and variance. We study the large and
small sample efficiency of the estimates of the mean and the intraclass correlation parameters.
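
As a sketch of what a mean and variance specification of this kind usually looks like (the symbols below are assumptions of this note, not notation taken from the thesis), let $y_i$ be the number of responses among $n_i$ units in the $i$-th litter or cluster:

$$
E(y_i) = n_i \pi, \qquad \operatorname{Var}(y_i) = n_i \pi (1 - \pi)\,\{1 + (n_i - 1)\phi\}
$$

Here $\pi$ is the mean proportion and $\phi$ the intraclass correlation. Setting $\phi = 0$ recovers the binomial variance, $\phi > 0$ gives over dispersion, and small negative values of $\phi$ give under dispersion. The beta-binomial distribution has exactly these two moments, which is why moment-based methods are a natural alternative to its full likelihood.
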

An important problem is to compare proportions of a certain characteristic in
several groups. A common test in these types of studies is to compare the proportion
in a control group with that in a treatment group. A number of parametric and
non-parametric procedures are available for testing homogeneity of proportions in
the presence of over dispersion. Of these, the likelihood ratio test based on the
beta-binomial model has found prominence in the literature (Pack, 1986(a)). We
consider procedures for testing the homogeneity of proportions in the presence of
a common dispersion parameter. We develop C(α) (Neyman, 1959) or score type
tests (Rao, 1947) based on a parametric model, namely the extended beta-binomial
model (Prentice, 1986), and two semi-parametric models using the quasi-likelihood
(Wedderburn, 1974) and the extended quasi-likelihood (Nelder and Pregibon, 1987).
We also derive a C(α) test using empirical variance based on quasi-likelihood. These
procedures and a recent procedure by Rao and Scott (1992), based on the concept of
design effect and effective sample size, are compared through simulations in terms
of size, power and robustness to departures from the data distribution and from dispersion
homogeneity. To study robustness to departures from the data distribution,
i.e., departures from the beta-binomial distribution, we simulate data from the beta-binomial
distribution, the probit normal binomial distribution and the logit normal
binomial distribution.

Further, we develop C(α) tests for testing the assumption of a common dispersion
parameter based on semi-parametric models. In some cases the assumption of a
common dispersion parameter might not be tenable. A C(α) test is derived for
testing the homogeneity of proportions with unequal dispersion parameters based
on semi-parametric models.
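
The robustness study described above draws data from three over dispersed binomial models. A minimal sketch of such a data generator is given below, using only the Python standard library. It is not the simulation code of the thesis: the function names, the litter size of 10, and the parameter values (`pi`, `phi`, `mu`, `sigma`) are hypothetical choices made for illustration. In each model a litter-specific response probability is drawn first, and a binomial count is then drawn given that probability.

```python
import math
import random


def draw_probability(model, rng, pi=0.3, phi=0.1, mu=-0.85, sigma=0.5):
    """Draw one litter-specific response probability.

    pi, phi   : mean and intraclass correlation (beta-binomial only).
    mu, sigma : mean and s.d. on the link scale (logit/probit normal only).
    All default values are illustrative assumptions.
    """
    if model == "beta-binomial":
        # Beta(a, b) with mean pi and intraclass correlation 1 / (a + b + 1) = phi.
        a = pi * (1.0 - phi) / phi
        b = (1.0 - pi) * (1.0 - phi) / phi
        return rng.betavariate(a, b)
    z = rng.gauss(mu, sigma)
    if model == "logit-normal":
        return 1.0 / (1.0 + math.exp(-z))
    if model == "probit-normal":
        # Standard normal CDF evaluated at z.
        return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    raise ValueError(f"unknown model: {model}")


def simulate_group(model, litter_sizes, rng, **params):
    """Return a list of (responses, litter size) pairs for one group."""
    data = []
    for n in litter_sizes:
        p = draw_probability(model, rng, **params)
        y = sum(rng.random() < p for _ in range(n))  # binomial(n, p) draw
        data.append((y, n))
    return data


def summarize(data):
    """Mean proportion and ratio of observed to binomial variance.

    The ratio assumes all litters have the same size.
    """
    n = data[0][1]
    props = [y / m for y, m in data]
    mean = sum(props) / len(props)
    var = sum((p - mean) ** 2 for p in props) / (len(props) - 1)
    return mean, var / (mean * (1.0 - mean) / n)


if __name__ == "__main__":
    rng = random.Random(2024)
    for model in ("beta-binomial", "logit-normal", "probit-normal"):
        mean, ratio = summarize(simulate_group(model, [10] * 2000, rng))
        print(f"{model:14s} mean={mean:.3f} variance ratio={ratio:.2f}")
```

A variance ratio above 1 indicates over dispersion relative to the binomial. In a size or power study, a generator like this would be called once per group and per replicate, with the test statistics computed on each simulated data set.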

##### Citation

Doctor of Philosophy

##### Sponsorship

University of Nairobi

##### Publisher

Department of Mathematics and Statistics