Agriculture and food sciences | Online ISSN: 3066-3407

Machine Learning-Driven Water Quality Index Prediction: Enhancing Accuracy with Gradient Boosting and Explainable AI for Sustainable Water Monitoring

Md. Jahidul Islam¹, Siraj Us Salekin², Asif Anzum³, Nafis Zaman¹, Abdullah Al Ahad Khan⁴, Dilip Sarkar⁵, Md. Liton Rabbani⁶, Md. Tarek Hossain⁶


Applied Agriculture Sciences 2(1) 1-14

Submitted: 12 August 2024  Revised: 06 October 2024  Published: 07 October 2024 


Background: Water is fundamental to the survival of all life forms, yet access to clean and safe water remains a critical challenge worldwide. Contaminated water is a significant contributor to waterborne diseases, highlighting the need for effective water quality monitoring. The Water Quality Index (WQI) is a standard tool for assessing water quality; however, traditional WQI methods are often constrained by inconsistencies, laboratory inaccuracies, and human error.

Methods: This study aimed to overcome these limitations by integrating advanced machine learning (ML) techniques into WQI prediction. Physicochemical parameters, including pH, chloride (Cl⁻), sulfate (SO₄²⁻), sodium (Na⁺), potassium (K⁺), calcium (Ca²⁺), magnesium (Mg²⁺), total hardness, and total dissolved solids, were collected from diverse water sources to form a robust dataset. ML algorithms such as Gradient Boosting, Random Forest, and XGBoost, augmented with explainable AI (XAI), were employed to enhance prediction accuracy. The dataset was split into training (70%), testing (15%), and validation (15%) subsets, and model performance was assessed using RMSE, MSE, MAE, and R² metrics.

Results: Gradient Boosting outperformed other models, achieving 96% accuracy on the test dataset after fine-tuning. It demonstrated superior predictive capabilities, as evidenced by its performance metrics. These results indicate the potential for ML techniques to address the limitations of traditional WQI methods.

Conclusion: This study demonstrates the effectiveness of ML-driven approaches in improving water quality assessments. The integration of Gradient Boosting and explainable AI provides a reliable framework for WQI prediction, enabling better decision-making in environmental health policies and water resource management. This approach offers a pathway to more efficient and accurate water quality monitoring systems.

Keywords: Water Quality Index (WQI), Water Quality Monitoring, Machine Learning Algorithms, Explainable AI (XAI), Predictive Modelling
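
The data split and evaluation metrics named in the Methods portion of the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names `split_70_15_15` and `regression_metrics`, the seed, and the shuffling strategy are all assumptions made for illustration, and the model-fitting step (Gradient Boosting, Random Forest, or XGBoost) is deliberately left out. The sketch uses only the Python standard library.

```python
import math
import random


def split_70_15_15(rows, seed=0):
    """Shuffle rows, then split into train/test/validation (70/15/15).

    Hypothetical helper: the study does not describe how its split was drawn.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = n * 70 // 100  # integer arithmetic avoids float rounding
    n_test = n * 15 // 100
    train = shuffled[:n_train]
    test = shuffled[n_train:n_train + n_test]
    validation = shuffled[n_train + n_test:]  # remainder, roughly 15%
    return train, test, validation


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R^2 for observed vs. predicted WQI values."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)           # residual sum of squares
    mse = ss_res / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot  # undefined when all y_true are identical
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}
```

In practice, a model would be fitted on the training subset, tuned against the validation subset, and `regression_metrics` would be called once on the held-out test subset to report the final scores.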



