Province clustering based on the percentage of communicable disease using the BCBimax biclustering algorithm

Submitted: 28 March 2023
Accepted: 9 August 2023
Published: 12 September 2023
Publisher's note
All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article or claim that may be made by its manufacturer is not guaranteed or endorsed by the publisher.


Indonesia needs to lower its high rate of infectious disease. This requires reliable data and tracking of how disease levels change over time across provinces. We investigated the benefits of surveying the epidemiological situation with the BCBimax biclustering algorithm, using secondary data on the main infectious diseases from a recent national survey, the National Basic Health Research (Riskesdas), covering 34 provinces in Indonesia. Hierarchical and k-means clustering group only one dimension of a data matrix at a time, whereas BCBimax biclustering clusters rows and columns simultaneously. Several experiments were run to determine the best row and column threshold values, a choice that is crucial for a useful result. The percentages of Indonesia’s seven most common infectious diseases (acute respiratory infection (ARI), pneumonia, diarrhoea, tuberculosis (TB), hepatitis, malaria, and filariasis) were ordered by province to form groups without considering geographical proximity, because clusters are usually far apart. ARI, pneumonia, and diarrhoea were each divided into toddler and adult infections, giving 10 target diseases instead of seven. The biclusters formed on the presence and level of these diseases covered 7 diseases at moderate to high levels, 5 diseases (two clusters), 3 diseases, and 2 diseases, with a final cluster that included only adult diarrhoea. Diarrhoea was the most prevalent infectious disease in Indonesia, appearing in 6 of 8 clusters, which makes its eradication a priority. Direct person-to-person infections such as ARI, pneumonia, TB, and diarrhoea were found in 4-6 of 8 clusters. These diseases are more common and spread faster than vector-borne diseases such as malaria and filariasis, making them more important.
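
The idea of clustering rows and columns together can be illustrated with a minimal sketch. It is not the authors' implementation, nor the divide-and-conquer search used by Bimax itself. It assumes a single binarization threshold and uses brute-force enumeration, which is only practical for a handful of columns. The toy matrix, the threshold of 10, and the names `binarize`, `maximal_biclusters`, `min_rows`, and `min_cols` are all hypothetical and chosen for illustration. Rows stand in for provinces and columns for diseases.

```python
from itertools import combinations

import numpy as np


def binarize(data, threshold):
    """Mark entries at or above the threshold as 1, all others as 0."""
    return (np.asarray(data) >= threshold).astype(int)


def maximal_biclusters(binary, min_rows=2, min_cols=2):
    """Brute-force search for inclusion-maximal all-ones submatrices.

    Assumption: min_rows and min_cols are illustrative size limits,
    not the threshold values tuned in the study.
    """
    n_cols = binary.shape[1]
    found = set()
    for size in range(min_cols, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            # Rows that contain a 1 in every selected column.
            rows = np.flatnonzero(binary[:, list(cols)].all(axis=1))
            if len(rows) < min_rows:
                continue
            # Extend the column set with every column that is all ones
            # on those rows, so the bicluster cannot be enlarged.
            closed = np.flatnonzero(binary[rows, :].all(axis=0))
            found.add((tuple(int(r) for r in rows),
                       tuple(int(c) for c in closed)))
    return sorted(found)


# Hypothetical toy percentages: 4 "provinces" (rows) x 3 "diseases" (columns).
toy = np.array([
    [30, 12, 2],
    [28, 15, 1],
    [5, 14, 11],
    [4, 3, 12],
])
binary = binarize(toy, threshold=10)
result = maximal_biclusters(binary)
print(result)
```

With these toy values the only bicluster of at least two rows and two columns groups the first two rows with the first two columns. Raising or lowering the hypothetical threshold or size limits changes which biclusters appear, which is why the choice of threshold values matters for a useful result.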






How to Cite

Aidi, M. N., Wulandari, C., Oktarina, S. D., Aditra, T. R., Ernawati, F., Efriwati, E., Nurjanah, N., Rachmawati, R., Julianti, E. D., Sundari, D., Retiaty, F., Arifin, A. Y., Dewi, R. M., Nazaruddin, N., Salimar, S., Fuada, N., Widodo, Y., Setyawati, B., Nurhidayati, N., Sudikno, S., Irawan, I. R., & Widoretno, W. (2023). Province clustering based on the percentage of communicable disease using the BCBimax biclustering algorithm. Geospatial Health, 18(2).