Content-based auto-tagging of audios using deep learning

Rashmeet Nayyar, Sushmita Nair, Omkar Patil, Rasika Pawar, Amruta Lolage

April 2018

Abstract

In recent years, deep learning and feature learning have drawn significant attention in Music Information Retrieval (MIR) research, inspired by strong results in speech recognition and computer vision. Here, we tackle the problem of content-based automatic tagging of audio, which is a multi-label classification task. Deep neural network architectures, namely the Convolutional Neural Network and the Convolutional Recurrent Neural Network, are used to learn hierarchical features from musical audio signals, and the experiments are performed on the MagnaTagATune (MTT) dataset. We aim to achieve state-of-the-art performance with Mel-spectrogram input. Tags such as genre, instrument, and emotion can be predicted automatically for new tracks, with a focus on accurate classification of clips. These tags convey high-level information from a listener’s perspective and can thus be used for organizing music libraries, efficient music browsing, creating personalized recommendations, playlist generation, and other applications.
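To illustrate what "multi-label" means here, the following is a minimal sketch and not the paper's implementation. It assumes the network emits one raw score (logit) per tag, that each score is passed through its own sigmoid, and that a tag is assigned when its probability reaches a threshold. The tag list, the 0.5 threshold, and the function names are hypothetical choices made for illustration only.

```python
import math

# Hypothetical tag vocabulary for illustration; not the actual MTT tag set.
TAGS = ["guitar", "classical", "vocal", "techno"]


def sigmoid(x):
    """Map a raw score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def predict_tags(logits, threshold=0.5):
    """Return every tag whose independent probability meets the threshold.

    Unlike softmax classification, several tags (or none) can be selected,
    because each tag is scored on its own.
    """
    probs = [sigmoid(z) for z in logits]
    return [tag for tag, p in zip(TAGS, probs) if p >= threshold]


def binary_cross_entropy(logits, targets):
    """Mean per-tag binary cross-entropy, a common multi-label training loss."""
    eps = 1e-12  # guards against log(0)
    total = 0.0
    for z, y in zip(logits, targets):
        p = sigmoid(z)
        total += -(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps))
    return total / len(logits)


# Assumed example scores for one clip, one score per tag.
example_logits = [2.0, -1.5, 0.3, -3.0]
predicted = predict_tags(example_logits)
```

With these assumed scores, two of the four tags clear the threshold at once, which is the behaviour that separates multi-label tagging from single-label classification.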

Type

Conference paper

Publication

In International Conference on Big Data, IoT and Data Science, 2017