Identifying malaria disease through red-blood microscopic image with XGBoost and random forest methods

Authors

  • Rohmatul Fajriyah, Universitas Islam Indonesia
  • Muhammad Muhajir, Universitas Islam Indonesia
  • Ahmad Hussain Abdullah, Algoritma Data Science
  • Devina Gilar Ayu, Astagraphia Information Technology
  • Iqbal Fathur Rahman, Delta Dunia Makmur

DOI:

https://doi.org/10.12928/bamme.v4i2.11740

Keywords:

malaria, microscopic blood image, random forest, RShiny, XGBoost

Abstract

Blood cells circulating in the human body carry information that can be used to diagnose disease. This information can be extracted from images of the cells using image processing techniques. Malaria is a deadly disease that can affect anyone. Patients with malaria develop anaemia because their red blood cells (erythrocytes) are infected with Plasmodium. This study offers an alternative approach to identifying malaria by classifying red blood cell images, combining image processing with two classification methods, XGBoost and random forest. The research was conducted in R (using RStudio) and Python. The accuracies of the XGBoost and random forest methods were 71.26% and 77.58%, respectively, so random forest gave the better classification model. This model was used to build an R-based web application with RShiny. In practice, health workers can use the application to classify patients from red blood cell images, making it easier for a health centre to manage its patients.
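The model comparison described in the abstract can be illustrated with a minimal Python sketch. It shows only how accuracy is computed from predicted and true labels and how the model with the higher accuracy is chosen. The labels, the predictions, and the helper names `accuracy` and `best_model` are hypothetical and are not taken from the study; the toy numbers do not reproduce the reported 71.26% and 77.58%.

```python
from typing import Dict, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Return the fraction of cells whose predicted label matches the true label."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def best_model(scores: Dict[str, float]) -> str:
    """Return the name of the model with the highest accuracy."""
    return max(scores, key=scores.get)


# Hypothetical test labels (assumption): 1 = infected cell, 0 = uninfected cell.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical predictions from two already-trained classifiers (assumption).
pred_xgboost = [1, 0, 0, 1, 0, 1, 1, 0]
pred_random_forest = [1, 0, 1, 1, 0, 1, 1, 0]

scores = {
    "xgboost": accuracy(y_true, pred_xgboost),
    "random_forest": accuracy(y_true, pred_random_forest),
}

for name, score in scores.items():
    print(f"{name}: {score:.2%}")
print("selected model:", best_model(scores))
```

Accuracy alone treats both kinds of error equally; in a screening setting, a missed infection and a false alarm may carry different costs, so this sketch should be read as the simplest possible selection rule rather than a complete evaluation.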

Published

2024-12-12

Section

Articles