Volume 1 (2025)

Journal Paper

Machine Learning-Based Prediction of Carbon Dioxide Emissions: A Comparative Analysis of Ensemble Models and Feature Importance Evaluation

Authors: Ruimin Ma, Qiong Li, Alexander Kovshov

Article 37

Abstract: Accurate forecasting of carbon dioxide (CO2) emissions is crucial for developing effective environmental policies and mitigating climate change. In this study, we apply four ensemble machine learning models (Random Forest, XGBoost, LightGBM, and CatBoost) to predict CO2 emissions using a dataset covering 107 countries from 2000 to 2020. We investigate the influence of key economic, social, environmental, and energy-related factors on CO2 emissions and assess the predictive performance of each model. To enhance interpretability, we use Permutation Importance to quantify the contribution of each feature in each model and to identify the most significant drivers of CO2 emissions. Our methodology integrates a time-window-based feature engineering approach, which allows us to capture temporal patterns in CO2 emission trends. Experimental results show that CatBoost delivers the highest overall predictive performance, benefiting from its Ordered Boosting mechanism and superior handling of categorical data. LightGBM and XGBoost also achieve strong results, with XGBoost showing notable advantages in controlling prediction bias. The feature importance analysis highlights the dominant role of energy-related factors, particularly electricity consumption from fossil fuels and renewables, in shaping CO2 emissions. Additionally, social and economic indicators, such as land area and GDP growth, exhibit varying levels of impact across models. This study underscores the efficacy of machine learning techniques in CO2 emissions forecasting and provides valuable insights into the underlying drivers of emissions. The findings contribute to advancing data-driven environmental policy-making.
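The two techniques named in the abstract, time-window feature construction and Permutation Importance, can be illustrated with a minimal sketch. It makes several assumptions that are not taken from the paper: the window length, the choice of lagged values plus their mean as features, the synthetic data, the hypothetical feature names, and a hand-written predictor standing in for a fitted ensemble model. It is not the authors' pipeline.

```python
import random
from statistics import mean


def window_features(series, window=3):
    """Turn a yearly series into (features, target) pairs.

    Each row holds the previous `window` values plus their mean; the target
    is the value that follows. The window length and the extra mean feature
    are illustrative assumptions.
    """
    rows, targets = [], []
    for t in range(window, len(series)):
        past = list(series[t - window:t])
        rows.append(past + [mean(past)])
        targets.append(series[t])
    return rows, targets


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return mean((a - b) ** 2 for a, b in zip(y_true, y_pred))


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when one feature column is shuffled.

    `predict` is any callable mapping a feature row to a prediction, so the
    procedure is model-agnostic.
    """
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and y
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict(r) for r in X_perm]) - baseline)
        importances.append(mean(increases))
    return importances


# Synthetic demonstration with hypothetical feature names (not the paper's data).
feature_names = ["fossil_electricity", "renewable_electricity", "unrelated_noise"]
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 10) for _ in feature_names] for _ in range(50)]


def toy_predict(row):
    # Hand-written stand-in for a fitted model; it ignores the third feature.
    return 2.0 * row[0] - 1.0 * row[1]


y = [toy_predict(row) for row in X]
importances = permutation_importance(toy_predict, X, y)
for name, score in zip(feature_names, importances):
    print(f"{name}: {score:.3f}")
```

Because the stand-in predictor never reads the third feature, shuffling that column leaves the error unchanged and its importance is zero, while the feature with the larger coefficient receives the larger score. With a real fitted model, the same procedure would normally be run on held-out data.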

Keywords:

Carbon Dioxide Emissions Prediction; Machine Learning; Ensemble Learning Models; Permutation Importance

Cite: Ma, R., Li, Q., & Kovshov, A. (2025). Machine learning-based prediction of carbon dioxide emissions: A comparative analysis of ensemble models and feature importance evaluation. Glovento Journal of Integrated Studies (GJIS), 1, Article 37. http://doi.org/10.63665/gjis.v1.37