Geographic-Origin Music Classification from Numerical Audio Features: Integrating Unsupervised Clustering with Supervised Models
DOI: https://doi.org/10.12928/biste.v7i4.13400
Keywords: Geographical Music, Music Information Retrieval, K-means Clustering, Cluster-Supervised Learning, Support Vector Machine, Convolutional Neural Network, Classification
Abstract
Classifying the geographic origin of music is a relevant task in music information retrieval, yet most studies have focused on genre or style recognition rather than regional origin. This study evaluates Support Vector Machine (SVM) and Convolutional Neural Network (CNN) models on the UCI Geographical Origin of Music dataset (1,059 tracks from 33 non-Western regions) using numerical audio features. To incorporate latent structure, we first applied K-means clustering, with the optimal number of clusters (k=2) determined by the Elbow and Silhouette methods. The cluster assignments were used as auxiliary signals for training, while evaluation relied on the true region labels. Classification performance was assessed with Accuracy, Precision, Recall, and F1-score. SVM achieved 99.53% accuracy (95% CI: 97.38–99.92%), while CNN reached 98.58% accuracy (95% CI: 95.92–99.52%); Precision, Recall, and F1 mirrored these values. SVM therefore scored higher on this dataset, although the two confidence intervals overlap, and the near-perfect scores suggest strong separability in the feature space and a potential risk of overfitting. Learning-curve analysis indicated stable training, and cluster supervision provided small but consistent benefits. Overall, SVM remains a reliable baseline for tabular music features, while CNNs may require spectro-temporal representations to reach their full potential. Future work should validate these findings across multiple datasets, apply cross-validation with statistical significance testing, and explore hybrid deep models for broader generalization.
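The reported confidence intervals can be checked with a short calculation. The sketch below is illustrative only: the abstract states neither the test-set size nor the interval method, so the 212-track hold-out (about 20% of 1,059), the correct-prediction counts of 211 and 209, and the use of the Wilson score interval are all assumptions, chosen because they are consistent with the reported percentages. The function name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (here, test accuracy)."""
    if total <= 0 or not 0 <= correct <= total:
        raise ValueError("need 0 <= correct <= total and total > 0")
    p = correct / total
    denom = 1 + z**2 / total
    center = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return center - half_width, center + half_width


# ASSUMPTION: 212 test tracks; this figure is not given in the abstract.
N_TEST = 212

# ASSUMPTION: correct-prediction counts inferred from 99.53% and 98.58%.
for name, correct in (("SVM", 211), ("CNN", 209)):
    low, high = wilson_interval(correct, N_TEST)
    print(f"{name}: accuracy={correct / N_TEST:.2%}, 95% CI=({low:.2%}, {high:.2%})")
```

Under these assumptions the two intervals come out at roughly 97.38–99.92% and 95.92–99.52%, matching the abstract to two decimals. The gap between the models would then amount to two test tracks, and the intervals overlap, which is one reason cross-validation with significance testing is a sensible next step.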
License
Copyright (c) 2025 Andri Pranolo, Sularso Sularso, Nuril Anwar, Agung Bella Utama Putra, Aji Prasetya Wibawa, Shoffan Saifullah, Rafał Dreżewski, Zalik Nuryana, Tri Andi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

