Sergio Santiago Rentería, Jesus Leopoldo Llano and Francisco Javier Cantú-Ortiz, Tecnológico de Monterrey, México
In the wake of the digital revolution, it is no surprise that data science has taken an interest in music. The sheer amount of available content opens a plethora of possibilities for studying music and its social impact from a data-analytic perspective. This paper studies the relationship between song features and their corresponding genre, with the aim of providing data-mining tools for music recommendation and sub-genre identification. For the first task, we compared different classification models, including Random Forests, fully-connected neural networks and logistic regression. For the second, we carried out cluster analysis and dimensionality reduction for data visualisation. Overall, Random Forest models performed better in genre classification than fully-connected networks, but they suffered from overfitting. Moreover, the highest accuracy obtained (64%) was too low to be of use in genre recognition applications. Nevertheless, we think our results show the limitations of hand-crafted features and point towards more sophisticated deep learning techniques.
Music Information Retrieval, Data Mining, Automated Music Recommendation, Classification.