Enhancement of K-Means algorithm for analyzing earthquake occurrence pattern in the Philippines

Sean Marie Bayono, Ronanne Jcher Bulaon, Richard C. Regala, Vivien A. Agustin & Khatalyn E. Mata

Volume 2 Issue 1 May 2025

https://doi.org/10.53378/isrr.163

Abstract

This study enhances the K-Means clustering algorithm to improve the analysis of earthquake occurrence patterns in the Philippines. Traditional K-Means, while effective, is limited by its random centroid initialization and slow convergence. To address these issues, we propose an improved K-Means algorithm with two modifications. First, it selects initial centroids from a distance-weighted probability distribution to improve accuracy. Second, it processes the data in smaller batches to reduce computation time, which improves scalability and convergence speed. Using earthquake data from the Philippine Institute of Volcanology and Seismology (PHIVOLCS), we evaluate the enhanced algorithm with the Silhouette Score and time complexity. The results show that the proposed modifications significantly improve clustering accuracy, computational efficiency, and scalability, leading to more precise identification of high-risk seismic areas. By providing a more accurate and efficient framework for seismic data analysis, this research contributes to disaster preparedness, risk mitigation, and informed decision-making in urban planning and disaster management.
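The two modifications described in the abstract, distance-weighted centroid seeding and processing the data in small batches, can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names (`seed_centroids`, `mini_batch_kmeans`), the parameter defaults, and the synthetic (latitude, longitude) points are all assumptions made for illustration, and no PHIVOLCS data is used.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def seed_centroids(points, k, rng):
    """Pick k initial centroids with distance-weighted sampling (k-means++ style)."""
    centroids = [list(rng.choice(points))]
    while len(centroids) < k:
        # Weight each point by its squared distance to the nearest chosen centroid,
        # so far-away points are more likely to become the next centroid.
        weights = [min(sq_dist(p, c) for c in centroids) for p in points]
        if sum(weights) == 0:
            # Every point coincides with a centroid; fall back to a uniform pick.
            centroids.append(list(rng.choice(points)))
        else:
            centroids.append(list(rng.choices(points, weights=weights, k=1)[0]))
    return centroids


def nearest(point, centroids):
    """Index of the centroid closest to the given point."""
    return min(range(len(centroids)), key=lambda j: sq_dist(point, centroids[j]))


def mini_batch_kmeans(points, k, batch_size=32, n_iter=100, seed=0):
    """Refine seeded centroids using small random batches instead of the full data."""
    rng = random.Random(seed)
    centroids = seed_centroids(points, k, rng)
    # Assumption of this sketch: each seed counts as one observation already.
    counts = [1] * k
    for _ in range(n_iter):
        batch = rng.sample(points, min(batch_size, len(points)))
        # Assign the whole batch with centroids held fixed, then update.
        assignments = [nearest(p, centroids) for p in batch]
        for p, j in zip(batch, assignments):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-centroid learning rate shrinks over time
            centroids[j] = [(1 - eta) * c + eta * x for c, x in zip(centroids[j], p)]
    return centroids


# Hypothetical demo data: two synthetic (latitude, longitude) groups.
demo_rng = random.Random(42)
points = [(demo_rng.gauss(14.6, 0.1), demo_rng.gauss(121.0, 0.1)) for _ in range(50)]
points += [(demo_rng.gauss(7.1, 0.1), demo_rng.gauss(125.6, 0.1)) for _ in range(50)]
centroids = sorted(mini_batch_kmeans(points, k=2, batch_size=20, n_iter=50))
print(centroids)
```

Each iteration touches only `batch_size` points rather than the whole dataset, which is where the reduction in computation time comes from. The seeding step makes it unlikely that two initial centroids start inside the same dense group of events.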

Author information & Contribution

Sean Marie Bayono. Corresponding author. Undergraduate student. Department of Computer Science. College of Information Systems and Technology Management - Pamantasan ng Lungsod ng Maynila. Email: smbbayono2021@plm.edu.ph

Ronanne Jcher Bulaon. Undergraduate student. Department of Computer Science. College of Information Systems and Technology Management - Pamantasan ng Lungsod ng Maynila

Richard C. Regala. Bachelor’s Degree in Information Communication Technology. Pamantasan ng Lungsod ng Maynila. Computer Laboratory Administrator

Vivien A. Agustin. Master in Information Technology. College of Information Systems and Technology Management - Pamantasan ng Lungsod ng Maynila. Associate Dean/Assistant Professor III

Khatalyn E. Mata. Doctor in Information Technology. Dean - College of Information Systems and Technology Management, Pamantasan ng Lungsod ng Maynila.

"Author 1 primarily handled the implementation and development of the system and contributed to writing and editing the manuscript. Author 2 was responsible for data acquisition and contributed to drafting and revising the manuscript. Authors 3 and 4, as thesis advisers, provided critical feedback on the study’s validity, structure, and overall quality, including thorough review of formatting and content. Author 5, as the thesis coordinator, supervised the alignment of the manuscript with institutional requirements and provided guidance throughout the writing process. All authors reviewed and approved the final version of the manuscript and agreed to be accountable for all aspects of the work."

Cite this article:

Bayono, S.M., Bulaon, R.J., Regala, R.C., Agustin, V.A. & Mata, K.E. (2025). Enhancement of K-Means algorithm for analyzing earthquake occurrence pattern in the Philippines. International Student Research Review, 2(1), 1-18. https://doi.org/10.53378/isrr.163

License:

This work is licensed under a Creative Commons Attribution (CC BY 4.0) International License.

Institute of Industry and Academic Research Incorporated
