Assessing National Reading Habits through Machine Learning: Insights from the Indonesian Reading Interest Rate Survey (2020–2023)

Abstract

Reading interest is a vital component of educational development, yet many regions face low engagement in reading activities. This study employs advanced machine learning methods to analyze and predict provincial reading interest trends in Indonesia (2020–2023). We performed classification and regression analyses using top-performing models, including CatBoost, LightGBM, XGBoost, Random Forest, ExtraTrees, k-Nearest Neighbors, and neural networks. Classification models categorized provinces by reading interest level with exceptional accuracy, reaching up to 100% on the held-out test set using an ensemble neural network. Regression models predicted continuous reading interest index scores precisely, achieving a root mean square error (RMSE) of around 1.0 on a 0–100 scale. Our findings demonstrate that modern machine learning approaches can effectively uncover underlying patterns in reading interest data, such as a notable decline in reading interest in 2021 coinciding with the COVID-19 pandemic (highlighting digital disruption effects). However, given the relatively small dataset (34 provinces over 4 years), these results should be interpreted with caution in terms of generalizability and granularity. Ensemble tree-based models and neural networks exhibited superior performance, capturing both linear and non-linear relationships in the data, whereas simpler methods (e.g., k-NN) underperformed. This aligns with prior research emphasizing the impact of digital media on reading habits and literacy development. By leveraging predictive analytics, educators and policymakers can proactively identify declines in reading interest and implement targeted interventions to foster sustained reading engagement in an increasingly digital world.
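To make the reported regression metric concrete, the following is a minimal sketch of how RMSE is computed on a 0–100 index scale. The function name `rmse` and the sample scores are hypothetical illustrations, not the authors' code or survey data; the only figures taken from the abstract are the 34 provinces and 4 survey years.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("inputs must be non-empty")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical index scores on the 0-100 scale (illustrative, not survey data).
actual_scores = [60.0, 62.0, 58.0, 65.0]
predicted_scores = [61.0, 61.0, 59.0, 64.0]

# Every prediction here is off by exactly one index point.
print(rmse(actual_scores, predicted_scores))

# Panel size implied by the abstract: 34 provinces observed over 4 years.
n_observations = 34 * 4
print(n_observations)
```

An RMSE near 1.0 therefore corresponds to a typical prediction error of roughly one index point. The small panel size is also why the abstract advises caution: any held-out test set drawn from it contains only a handful of province-year rows.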

Keywords

Reading Interest, Machine Learning, Educational Development, Predictive Analytics, Digital Disruption

Citation

Monika, W., Wijesundara, C., Sudiar, N., & Latiar, H. (2025). Assessing National Reading Habits through Machine Learning: Insights from the Indonesian Reading Interest Rate Survey (2020–2023). IT Journal Research and Development (ITJRD), 10(1), 35-52. https://doi.org/10.25299/itjrd.2025.24019
