General-purpose audio tagging from noisy labels using convolutional neural networks

Turab Iqbal; Qiuqiang Kong; Mark D Plumbley; Wenwu Wang

Conference presentation

Open access

Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), pp. 212-216

Tampere University of Technology

DCASE2018 Workshop on Detection and Classification of Acoustic Scenes and Events (Surrey, UK, 19/11/2018 - 20/11/2018)

11/2018

Keywords: Audio classification; convolutional network; recurrent network; deep learning; data augmentation; label noise

Abstract

General-purpose audio tagging refers to classifying sounds of a diverse nature, and is relevant in many applications where domain-specific information cannot be exploited. The DCASE 2018 challenge introduces Task 2 for this very problem. In this task, there is a large number of classes, the audio clips vary in duration, and a subset of the labels is noisy. In this paper, we propose a system to address these challenges. The basis of our system is an ensemble of convolutional neural networks trained on log-scaled mel spectrograms. We use preprocessing and data augmentation methods to further improve performance. To reduce the effects of label noise, two techniques are proposed: loss function weighting and pseudo-labeling. Experiments on the private test set of this task show that our system achieves state-of-the-art performance with a mean average precision score of 0.951.
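The abstract names loss function weighting as one way to reduce the effect of label noise but does not give its form. The following is a minimal illustrative sketch of the general idea, not the authors' implementation: examples whose labels are not trusted contribute less to a cross-entropy loss. The function name, the `verified` flag, and the `noisy_weight` value of 0.5 are all assumptions made for this example.

```python
import math


def weighted_cross_entropy(probs, labels, verified, noisy_weight=0.5):
    """Weighted mean cross-entropy that down-weights possibly noisy labels.

    probs:        one list of predicted class probabilities per example
    labels:       integer class index per example
    verified:     True where the label is trusted, False where it may be noisy
    noisy_weight: hypothetical weight applied to unverified examples
    """
    total = 0.0
    weight_sum = 0.0
    for p, y, ok in zip(probs, labels, verified):
        w = 1.0 if ok else noisy_weight
        # Clamp the probability to avoid log(0).
        total += -w * math.log(max(p[y], 1e-12))
        weight_sum += w
    # Normalise by the total weight so the scale stays comparable.
    return total / weight_sum


# Three clips, two classes; the third label is treated as unverified.
probs = [[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]]
labels = [0, 1, 1]
verified = [True, True, False]
loss = weighted_cross_entropy(probs, labels, verified)
```

Setting `noisy_weight` to 1.0 recovers the ordinary mean cross-entropy, while 0.0 ignores unverified examples entirely; how the weight should be chosen is not stated in the abstract.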

Files and links (3)

pdf

DCASE2018Workshop_Iqbal_151 (207.93 kB)

Text Open Access

url

https://tutcris.tut.fi/portal/en/publications/proceedings-of-the-detection-and-classification-of-acoustic-scenes-and-events-2018-workshop-dcase2018(42d6b7f2-d3ab-4bb6-84d5-f53bb0e94eaa).html

Published (Version of record)

url

http://dcase.community/workshop2018/index

Details

Title: General-purpose audio tagging from noisy labels using convolutional neural networks
Creators: Turab Iqbal; Qiuqiang Kong; Mark D Plumbley; Wenwu Wang
Contributors: Mark D Plumbley (Editor); Christian Kroos (Editor); JP Bello (Editor); G Richard (Editor); DPW Ellis (Editor); A Mesaros (Editor)
Publication Details: Proceedings of the Detection and Classification of Acoustic Scenes and Events 2018 Workshop (DCASE2018), pp. 212-216
Conference: DCASE2018 Workshop on Detection and Classification of Acoustic Scenes and Events (Surrey, UK, 19/11/2018 - 20/11/2018)
Publisher: Tampere University of Technology
Date published: 11/2018
Date submitted: 23/11/2018
Grant note: Funder: EPSRC | Grant ID: EP/N014111/1
Identifiers: 99514815102346
Academic Unit: School of Computer Science and Electronic Engineering
Resource Type: Conference presentation
