Selecting the most suitable classification algorithm for tiger beetle identification using morphometric data and habitat data

Abeywardhana, D. L.; Dangalle, C. D.; Nugaliyadde, A.; Mallawarachchi, Y.W.

Please use this identifier to cite or link to this item: http://archive.cmb.ac.lk:8080/xmlui/handle/70130/6447

Full metadata record

DC Field	Value	Language
dc.contributor.author	Abeywardhana, D. L.	-
dc.contributor.author	Dangalle, C. D.	-
dc.contributor.author	Nugaliyadde, A.	-
dc.contributor.author	Mallawarachchi, Y.W.	-
dc.date.accessioned	2022-02-02T08:43:01Z	-
dc.date.available	2022-02-02T08:43:01Z	-
dc.date.issued	2020	-
dc.identifier.citation	Abeywardhana D. L.; Dangalle C.D.; Nugaliyadde A.; Mallawarachchi Y.W. (2020), Selecting the most suitable classification algorithm for tiger beetle identification using morphometric data and habitat data, Proceedings of the Annual Research Symposium, 2020, University of Colombo, 41	en_US
dc.identifier.uri	http://archive.cmb.ac.lk:8080/xmlui/handle/70130/6447	-
dc.description.abstract	Habitat heterogeneity is a main factor in ecology which affects species diversity. Therefore, habitat details can be used as factors that influence species identification. Tiger beetles are highly habitat specific species. Different species of tiger beetles that have morphometric variation can be found restricted to different habitat types in temperate and tropical areas of the world. Therefore, habitat and morphometric data of tiger beetle species were used to develop a predictive model for the identification of tiger beetle species. Data gathered on grounddwelling tiger beetle species collected from 45 locations by Dangalle (2002-2015), and 150 locations by Thotagamuwa (2014 -2017) were used to construct the dataset required for the study. Then data pre-processing was done to convert nominal data to numerical data, detach records with missing data and correct imbalanced data. Data resampling was also done. Further, an evaluation was performed for feature selection to determine the most important attributes for species identification predictions. Finally, a dataset containing 468 records (individuals of species) having 13 attributes of 14 species was constructed and 351 records were selected for the training set while 117 records were selected for the test set. This dataset was fed to several multiclass algorithms belonging to both single (KNN, Naïve-Bayes, SVM) and ensemble classifiers (Gradient Tree Boosting, Extra Tree Classifier and Random Forest). The performance of each algorithm was evaluated by calculating accuracy values for each model and the results revealed that ensemble classifiers yield a higher accuracy than that of single classifiers. Hence it was proven that ensemble models have a positive effect on the overall quality of predictions, in terms of accuracy, generalizability and lower misclassification costs and are more stable than single classifiers. Further, when considering the different types of ensemble classification algorithms, bagging (averaging) ensemble classification algorithm performed better than boosting methods. When considering the two bagging ensemble classification algorithms - Ensemble Extra Tree Classifier and Random Forest algorithm, both revealed almost the same overall accuracy (85%) with less than 0.12% difference. Therefore, both ensemble classification algorithms are effective for species prediction using habitat and morphometric data. However, when considering the computational time with performance, Ensemble Extra Trees Classifier can be considered as the most suitable algorithm for the scenario.	en_US
dc.language.iso	en	en_US
dc.publisher	University of Colombo	en_US
dc.subject	Ensemble classifiers	en_US
dc.subject	single classifiers	en_US
dc.subject	tabular data	en_US
dc.subject	tiger beetles	en_US
dc.title	Selecting the most suitable classification algorithm for tiger beetle identification using morphometric data and habitat data	en_US
dc.type	Article	en_US
Appears in Collections:	Department of Zoology

Files in This Item:

File	Description	Size	Format
Selecting the most suitable classification algorithm for tiger beetle identification using.pdf		192.05 kB	Adobe PDF	View/Open

Show simple item record