Please use this identifier to cite or link to this item: http://archive.cmb.ac.lk:8080/xmlui/handle/70130/4030
Title: Automatic Word Clustering in Application of Open-Ended Response Categorization
Authors: Medagoda, N.P.K.
Issue Date: 2012
Citation: A Thesis submitted for the Degree of Master of Philosophy
Abstract: Open-ended questions are an essential part of survey questionnaires. They give researchers an opportunity to discover unanticipated information about the domain of study. However, they are difficult to process: the questions are unstructured, no possible answers are suggested, and respondents are free to answer in their own words. This thesis presents novel methods of categorizing such open-ended survey responses. A document clustering technique is employed to categorize the responses, and both supervised and unsupervised methods are tested. For responses that were not labelled at all, the author initially proposed an unsupervised algorithm based on hierarchical clustering to code the responses. The algorithm employs several natural language processing techniques to extract a classification of responses automatically. For responses that were partially labelled, Naive Bayes classification was proposed as the supervised solution. Two experiments were carried out to determine the accuracy of the proposed algorithms, and the results were promising. The hierarchical clustering based algorithm shows more than 70% accuracy when compared with the manually coded responses. The proposed Naive Bayes algorithm did not produce the expected results. Therefore, a Positive Naive Bayes algorithm was introduced, and it achieved an overall performance of 80%.
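Illustration (not part of the thesis): the unsupervised approach mentioned in the abstract groups unlabelled responses by hierarchical clustering. The sketch below shows the general idea only, under assumptions that do not come from the thesis: responses are reduced to word sets, compared by Jaccard distance, and merged by average linkage until the closest pair of clusters is farther apart than a threshold. The stopword list, the threshold value, the sample responses, and the function names `tokens`, `jaccard_distance`, and `cluster` are all hypothetical.

```python
from itertools import combinations

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "is", "are", "was", "were", "too", "and", "of", "to", "for"}


def tokens(response):
    """Lowercase the response, strip punctuation, and drop stopwords."""
    words = (w.strip(".,!?;:").lower() for w in response.split())
    return {w for w in words if w and w not in STOPWORDS}


def jaccard_distance(a, b):
    """1 - |a & b| / |a | b|; two empty sets are treated as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster(responses, threshold=0.5):
    """Average-linkage agglomerative clustering over word sets.

    Repeatedly merges the two closest clusters and stops once the
    closest pair is farther apart than `threshold`.
    """
    token_sets = [tokens(r) for r in responses]
    clusters = [[i] for i in range(len(responses))]

    def linkage(c1, c2):
        # Mean pairwise distance between members of the two clusters.
        dists = [jaccard_distance(token_sets[i], token_sets[j])
                 for i in c1 for j in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        (x, y), best = min(
            (((x, y), linkage(clusters[x], clusters[y]))
             for x, y in combinations(range(len(clusters)), 2)),
            key=lambda item: item[1],
        )
        if best > threshold:
            break
        # x < y always holds here, so deleting index y leaves x valid.
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]

    return [[responses[i] for i in sorted(c)] for c in clusters]


# Made-up sample responses, for illustration only.
responses = [
    "The price is too high",
    "Price too high for students",
    "Staff were friendly",
    "Friendly and helpful staff",
]
groups = cluster(responses, threshold=0.5)
for number, group in enumerate(groups, start=1):
    print(number, group)
```

Each resulting group would then be treated as one candidate response category. A practical system would likely add stemming and a fuller stopword list, and would need some way to choose the threshold.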
URI: http://archive.cmb.ac.lk:8080/xmlui/handle/70130/4030
Appears in Collections: MPhil/PhD theses

Files in This Item:
File: MPhil2013-NPK Medagoda.pdf
Size: 757.75 kB
Format: Adobe PDF

