Institute of Industry and Academic Research Incorporated
International Student Research Review

ISSN 3027-9704 (Print) 3027-9712 (Online)

Enhancement of K-Means algorithm for analyzing earthquake occurrence pattern in the Philippines

Sean Marie Bayono, Ronanne Jcher Bulaon, Richard C. Regala, Vivien A. Agustin & Khatalyn E. Mata

This study enhances the K-Means clustering algorithm to improve the analysis of earthquake occurrence patterns in the Philippines. Traditional K-Means, while effective, is limited by random centroid initialization and slow convergence. To address these issues, we propose an improved K-Means algorithm with two modifications: it selects initial centroids from a distance-weighted probability distribution to improve accuracy, and it processes data in smaller batches to reduce computation time, which improves scalability and convergence speed. Using earthquake data from the Philippine Institute of Volcanology and Seismology (PHIVOLCS), we evaluate the enhanced algorithm with the Silhouette Score and time complexity. The results show that the proposed modifications significantly improve clustering accuracy, computational efficiency, and scalability, leading to more precise identification of high-risk seismic areas. By providing a more accurate and efficient framework for seismic data analysis, this research contributes to disaster preparedness, risk mitigation, and informed decision-making in urban planning and disaster management.
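The two modifications described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, which is not reproduced on this page. It assumes k-means++-style seeding for the distance-weighted initialization and a per-centroid running-mean update for the mini-batch step. The function names (`seed_centroids`, `mini_batch_kmeans`), the parameters (`batch_size`, `n_iter`, `seed`), and the synthetic coordinates are all hypothetical, and plain Euclidean distance on longitude and latitude is used only for simplicity.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seed_centroids(points, k, rng):
    """Distance-weighted seeding (k-means++ style, assumed here)."""
    centroids = [list(rng.choice(points))]
    # d2[i] is the squared distance from point i to its nearest chosen centroid.
    d2 = [sq_dist(p, centroids[0]) for p in points]
    while len(centroids) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen centroid; fall back to uniform.
            centroids.append(list(rng.choice(points)))
        else:
            # Draw the next centroid with probability proportional to d2.
            idx = rng.choices(range(len(points)), weights=d2, k=1)[0]
            centroids.append(list(points[idx]))
        d2 = [min(d, sq_dist(p, centroids[-1])) for d, p in zip(d2, points)]
    return centroids


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def mini_batch_kmeans(points, k, batch_size=32, n_iter=100, seed=0):
    """Mini-batch K-Means with per-centroid learning rates of 1 / count."""
    rng = random.Random(seed)
    centroids = seed_centroids(points, k, rng)
    counts = [0] * k
    for _ in range(n_iter):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch before moving any centroid.
        assigned = [nearest(p, centroids) for p in batch]
        for p, j in zip(batch, assigned):
            counts[j] += 1
            eta = 1.0 / counts[j]
            # Move centroid j a fraction eta of the way toward point p.
            centroids[j] = [(1 - eta) * c + eta * x for c, x in zip(centroids[j], p)]
    return centroids


# Synthetic (longitude, latitude) pairs for illustration only; not PHIVOLCS data.
data_rng = random.Random(42)
blob_centers = [(121.0, 14.6), (125.5, 7.1), (123.9, 10.3)]
demo_points = [
    (data_rng.gauss(cx, 0.3), data_rng.gauss(cy, 0.3))
    for cx, cy in blob_centers
    for _ in range(100)
]
centroids = mini_batch_kmeans(demo_points, k=3, seed=1)
for lon, lat in centroids:
    print(f"centroid at lon={lon:.2f}, lat={lat:.2f}")
```

Each iteration touches only `batch_size` points rather than the full dataset, which is where the reduction in computation time comes from. The distance-weighted draw makes it unlikely that two initial centroids start in the same dense region.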

K-Means algorithm, mini-batch processing, disaster preparedness, seismic data analysis

Sean Marie Bayono. Corresponding author. Undergraduate student. Department of Computer Science. College of Information Systems and Technology Management - Pamantasan ng Lungsod ng Maynila. Email: smbbayono2021@plm.edu.ph

Ronanne Jcher Bulaon. Undergraduate student. Department of Computer Science. College of Information Systems and Technology Management - Pamantasan ng Lungsod ng Maynila.

Richard C. Regala. Bachelor’s Degree in Information Communication Technology. Pamantasan ng Lungsod ng Maynila. Computer Laboratory Administrator.

Vivien A. Agustin. Master in Information Technology. College of Information Systems and Technology Management - Pamantasan ng Lungsod ng Maynila. Associate Dean/Assistant Professor III.

Khatalyn E. Mata. Doctor in Information Technology. Dean - College of Information Systems and Technology Management, Pamantasan ng Lungsod ng Maynila.

No potential conflict of interest was reported by the author(s).

This work was not supported by any funding.

The authors declare the use of Artificial Intelligence (AI) in writing this paper. In particular, the authors used ChatGPT to identify relevant literature and refine the content structure. The authors take full responsibility for ensuring that the research idea, analysis, and interpretations are original work.

This paper was presented at the 2nd International Student Research Congress (ISRC) 2025.

Cite this article:

Bayono, S.M., Bulaon, R.J., Regala, R.C., Agustin, V.A. & Mata, K.E. (2025). Enhancement of K-Means algorithm for analyzing earthquake occurrence pattern in the Philippines. International Student Research Review, 2(1), 1-18. https://doi.org/10.53378/isrr.163
