Employability and Related Context Prediction Framework for University Graduands: A Machine Learning Approach

Wijayapala, Manushi Prabhavi; Premaratne, Lalith; Jayamanne, Imali T

Please use this identifier to cite or link to this item: http://archive.cmb.ac.lk:8080/xmlui/handle/70130/5221

Title:	Employability and Related Context Prediction Framework for University Graduands: A Machine Learning Approach
Authors:	Wijayapala, Manushi Prabhavi; Premaratne, Lalith; Jayamanne, Imali T
Keywords:	Machine Learning, Employability Prediction, Data Mining, Supervised Learning
Issue Date:	2016
Publisher:	International Journal on Advances in ICT for Emerging Regions
Citation:	Wijayapala, M.P., Premaratne, L. and Jayamanne, I.T. (2016) Employability and Related Context Prediction Framework for University Graduands: A Machine Learning Approach, International Journal on Advances in ICT for Emerging Regions 9(2) http://journal.icter.org/index.php/ICTer/article/view/217/56
Abstract:	In Sri Lanka (SL), graduands’ employability remains a national issue due to the increasing number of graduates produced by higher education institutions each year. Predicting the employability of university graduands can help mitigate this issue, since graduands can identify which qualifications or skills they need to strengthen in order to find a well-paid job in their desired field before they complete their degree. The main objective of the study is to investigate the plausibility of applying machine learning efficiently and effectively to predict the employability and related context of university graduands in Sri Lanka. To that end, it proposes an architectural framework consisting of four modules: employment status prediction, job salary prediction, job field prediction and job relevance prediction of graduands. The performance of classification algorithms is also compared under each prediction module. A series of machine learning algorithms, such as C4.5, Naïve Bayes and AODE, was evaluated on the Graduand Employment Census 2014 data. A pre-processing step is proposed to overcome the challenges embedded in graduand employability data, and a feature selection process is proposed to reduce computational complexity. Parameter tuning is also carried out to obtain optimized parameters. More importantly, the study uses several Sampling (oversampling, undersampling) and Ensemble (Bagging, Boosting, Random Forest) techniques, as well as a newly proposed hybrid approach, to overcome the limitations caused by the class imbalance phenomenon. For validation purposes, a wide range of evaluation measures was used to analyze the effectiveness of applying classification algorithms and class imbalance mitigation techniques to the dataset.
The experimental results indicated that Random Forest recorded the highest classification performance for three modules, where the selected best predictive models were obtained under the hybrid approach and had an area under the ROC curve interpreted as ‘Excellent’, while a C4.5 Decision Tree model under the Ensemble approach was selected as the best model for the remaining module (the Salary Prediction module).
URI:	http://archive.cmb.ac.lk:8080/xmlui/handle/70130/5221
Appears in Collections:	Department of Statistics
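
Illustrative note: the abstract mentions oversampling as a class imbalance remedy and the area under the ROC curve as an evaluation measure. The sketch below shows one plain-Python reading of both ideas. It is not the authors' implementation: the function names `random_oversample` and `roc_auc` are hypothetical, the data are made up rather than taken from the Graduand Employment Census, and the paper's hybrid approach is not reproduced here.

```python
import random
from collections import Counter


def random_oversample(rows, labels, seed=0):
    """Duplicate randomly chosen rows of each smaller class until
    every class has as many rows as the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


def roc_auc(scores, labels, positive=1):
    """Area under the ROC curve, computed as the probability that a
    random positive example scores higher than a random negative one
    (ties count as one half)."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Made-up toy data: one feature per row, label 1 = employed, 0 = not employed.
rows = [[3.1], [2.8], [3.6], [2.2], [3.9], [2.0]]
labels = [1, 1, 1, 1, 1, 0]

# Balance the classes by duplicating the single minority row.
bal_rows, bal_labels = random_oversample(rows, labels)
print(Counter(bal_labels))

# Hypothetical classifier scores for the original six rows.
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.5]
print(roc_auc(scores, labels))
```

In practice, resampling would be applied to the training split only, so that duplicated rows do not leak into the data used for evaluation.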

Files in This Item:

File	Description	Size	Format
217-701-1-PB.pdf		1.14 MB	Adobe PDF
