Missing values: a closer look
- Authors: Thorpe, Kerri
- Date: 2017
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: http://hdl.handle.net/10962/d1017827 , vital:20798
- Description: Problem: In today’s world, missing values are more present than ever. Due to the ever-changing and fast paced global society in which we live, most business and research data produced around the world contain missing data. This means that locating data which is meticulously precise can be a hard task in itself, but at times may prove essential as the consequences of making use of incomplete data could be disastrous. The reasons for missing data cropping up in almost all forms of work are numerous and shall be discussed in this dissertation. For example, those being interviewed or polled may choose to simply ignore questions which are posed to them, recording equipment may malfunction or be misplaced, or organisers may not be able to locate the respondent in order to rectify the missing data. Whatever the reasons for data being incomplete, it is necessary to avoid having to use inefficient and incomplete data as a result from the above problems. Therefore, various strategies or methods have been developed in order to handle these missing values. It is important, however, that these strategies or methods are utilised effectively as missing data treatment can introduce bias into the analysis. This dissertation shall look at these and other problems in more detail by using a data set which consists of records for 581 children who were interviewed in 1990 as part of the National Longitudinal Survey of Youth (NLSY). Approach: As mentioned above, many strategies or methods have been developed in order to deal with missing values. More specifically, traditional methods such as complete case analysis, available case analysis or single imputation are widely used by researchers and shall be discussed herein. Although these methods are simple and easy to implement, they require assumptions about the data that are not often satisfied in practice. Over the years, more up to date and relevant methods, such as multiple imputation and maximum likelihood have been developed. These methods rely on weaker assumptions and contain superior statistical properties when compared to the traditional techniques. In this dissertation, these traditional methods shall be reviewed and assessed in SAS and shall be compared to the more modern techniques. Results: The ad hoc techniques for handling missing data such as complete case and available case methods produce biased parameter estimates when the data is not missing completely at random (MCAR). Single imputation techniques likewise produce biased estimates as well as result in the underestimation of standard errors. Although the expectation maximisation (EM) algorithm yields unbiased parameter estimates, the lack of convenient standard errors suggests that using this algorithm for hypothesis testing is not a good idea. Multiple imputation, however, yields unbiased parameter estimates and correctly estimates standard errors. Conclusion: Ignoring missing data in any analysis produces biased parameter estimates. Using single imputation to handle missing values is not recommended, as using a single value to replace missing values does not account for the variation that would have been present if the variables were observed. As a result, the variance will be greatly underestimated. The more modern missing data methods such as the EM algorithm and multiple imputation are preferred over the traditional techniques as they require less stringent assumptions and they also mitigate the downsides of the older methods.
- Full Text:
- Authors: Thorpe, Kerri
- Date: 2017
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: http://hdl.handle.net/10962/d1017827 , vital:20798
- Description: Problem: In today’s world, missing values are more present than ever. Due to the ever-changing and fast paced global society in which we live, most business and research data produced around the world contain missing data. This means that locating data which is meticulously precise can be a hard task in itself, but at times may prove essential as the consequences of making use of incomplete data could be disastrous. The reasons for missing data cropping up in almost all forms of work are numerous and shall be discussed in this dissertation. For example, those being interviewed or polled may choose to simply ignore questions which are posed to them, recording equipment may malfunction or be misplaced, or organisers may not be able to locate the respondent in order to rectify the missing data. Whatever the reasons for data being incomplete, it is necessary to avoid having to use inefficient and incomplete data as a result from the above problems. Therefore, various strategies or methods have been developed in order to handle these missing values. It is important, however, that these strategies or methods are utilised effectively as missing data treatment can introduce bias into the analysis. This dissertation shall look at these and other problems in more detail by using a data set which consists of records for 581 children who were interviewed in 1990 as part of the National Longitudinal Survey of Youth (NLSY). Approach: As mentioned above, many strategies or methods have been developed in order to deal with missing values. More specifically, traditional methods such as complete case analysis, available case analysis or single imputation are widely used by researchers and shall be discussed herein. Although these methods are simple and easy to implement, they require assumptions about the data that are not often satisfied in practice. Over the years, more up to date and relevant methods, such as multiple imputation and maximum likelihood have been developed. These methods rely on weaker assumptions and contain superior statistical properties when compared to the traditional techniques. In this dissertation, these traditional methods shall be reviewed and assessed in SAS and shall be compared to the more modern techniques. Results: The ad hoc techniques for handling missing data such as complete case and available case methods produce biased parameter estimates when the data is not missing completely at random (MCAR). Single imputation techniques likewise produce biased estimates as well as result in the underestimation of standard errors. Although the expectation maximisation (EM) algorithm yields unbiased parameter estimates, the lack of convenient standard errors suggests that using this algorithm for hypothesis testing is not a good idea. Multiple imputation, however, yields unbiased parameter estimates and correctly estimates standard errors. Conclusion: Ignoring missing data in any analysis produces biased parameter estimates. Using single imputation to handle missing values is not recommended, as using a single value to replace missing values does not account for the variation that would have been present if the variables were observed. As a result, the variance will be greatly underestimated. The more modern missing data methods such as the EM algorithm and multiple imputation are preferred over the traditional techniques as they require less stringent assumptions and they also mitigate the downsides of the older methods.
- Full Text:
Eliciting and combining expert opinion : an overview and comparison of methods
- Authors: Chinyamakobvu, Mutsa Carole
- Date: 2015
- Subjects: Decision making -- Statistical methods , Expertise , Bayesian statistical decision theory , Statistical decision , Delphi method , Paired comparisons (Statistics)
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:5579 , http://hdl.handle.net/10962/d1017827
- Description: Decision makers have long relied on experts to inform their decision making. Expert judgment analysis is a way to elicit and combine the opinions of a group of experts to facilitate decision making. The use of expert judgment is most appropriate when there is a lack of data for obtaining reasonable statistical results. The experts are asked for advice by one or more decision makers who face a specific real decision problem. The decision makers are outside the group of experts and are jointly responsible and accountable for the decision and committed to finding solutions that everyone can live with. The emphasis is on the decision makers learning from the experts. The focus of this thesis is an overview and comparison of the various elicitation and combination methods available. These include the traditional committee method, the Delphi method, the paired comparisons method, the negative exponential model, Cooke’s classical model, the histogram technique, using the Dirichlet distribution in the case of a set of uncertain proportions which must sum to one, and the employment of overfitting. The supra Bayes approach, the determination of weights for the experts, and combining the opinions of experts where each opinion is associated with a confidence level that represents the expert’s conviction of his own judgment are also considered.
- Full Text:
- Authors: Chinyamakobvu, Mutsa Carole
- Date: 2015
- Subjects: Decision making -- Statistical methods , Expertise , Bayesian statistical decision theory , Statistical decision , Delphi method , Paired comparisons (Statistics)
- Language: English
- Type: Thesis , Masters , MSc
- Identifier: vital:5579 , http://hdl.handle.net/10962/d1017827
- Description: Decision makers have long relied on experts to inform their decision making. Expert judgment analysis is a way to elicit and combine the opinions of a group of experts to facilitate decision making. The use of expert judgment is most appropriate when there is a lack of data for obtaining reasonable statistical results. The experts are asked for advice by one or more decision makers who face a specific real decision problem. The decision makers are outside the group of experts and are jointly responsible and accountable for the decision and committed to finding solutions that everyone can live with. The emphasis is on the decision makers learning from the experts. The focus of this thesis is an overview and comparison of the various elicitation and combination methods available. These include the traditional committee method, the Delphi method, the paired comparisons method, the negative exponential model, Cooke’s classical model, the histogram technique, using the Dirichlet distribution in the case of a set of uncertain proportions which must sum to one, and the employment of overfitting. The supra Bayes approach, the determination of weights for the experts, and combining the opinions of experts where each opinion is associated with a confidence level that represents the expert’s conviction of his own judgment are also considered.
- Full Text:
- «
- ‹
- 1
- ›
- »