- Title
- A modelling approach to the analysis of complex survey data
- Creator
- Dlangamandla, Olwethu
- ThesisAdvisor
- Chinomona, Amos
- ThesisAdvisor
- Baxter, Jeremy
- Subject
- Sampling (Statistics)
- Subject
- Linear models (Statistics)
- Subject
- Multilevel models (Statistics)
- Subject
- Logistic regression analysis
- Subject
- Complex survey data
- Date
- 2021-10-29
- Type
- Master's theses
- Type
- text
- Identifier
- http://hdl.handle.net/10962/192955
- Identifier
- vital:45284
- Description
- Surveys are an essential tool for collecting data, and most surveys use complex sampling designs. Complex sampling designs are used mainly to enhance the representativeness of the sample by accounting for the underlying structure of the population. This often results in data that are non-independent and clustered. Ignoring complex design features such as clustering, stratification, multistage sampling and unequal probability sampling may result in inaccurate and incorrect inference. An overview of design-based and model-based approaches to inference for complex survey data, and of the differences between them, is presented. This study adopts a model-based approach. The objective of this study is to discuss and describe the modelling approach to analysing complex survey data. This is done by introducing the principal inference methods under which data from complex surveys may be analysed. In particular, discussions on the theory and methods of model fitting for the analysis of complex survey data are presented. We begin by discussing unique features of complex survey data and explore appropriate methods of analysis that account for the complexity inherent in the survey data. We also explore the widely applied logistic regression modelling of binary data in a complex sample survey context. In particular, four forms of logistic regression models are fitted: generalized linear models, multilevel models, mixed effects models and generalized linear mixed models. Simulated complex survey data are used to illustrate the methods and models. Various R packages are used for the analysis. The results presented and discussed in this thesis indicate that a logistic mixed model with first and second level predictors fits better than a logistic mixed model with first level predictors only. In addition, a logistic multilevel model with first and second level predictors and nested random effects provides a better fit to the data than the other fitted logistic multilevel models. Similar results were obtained from fitting a generalized logistic mixed model with first and second level predictor variables and a generalized linear mixed model with first and second level predictors and nested random effects.
- Description
- Thesis (MSc) -- Faculty of Science, Statistics, 2021
- Format
- computer, online resource, application/pdf, 1 online resource (105 pages), pdf
- Publisher
- Rhodes University, Faculty of Science, Statistics
- Language
- English
- Rights
- Dlangamandla, Olwethu
- Rights
- Attribution 4.0 International (CC BY 4.0)
- Rights
- Open Access
File | Description | Size | Format
---|---|---|---
SOURCE1 | DLANGAMANDA-MSC-TR21-265.pdf | 447 KB | Adobe Acrobat PDF