WavCaps: A ChatGPT-Assisted Weakly-Labelled Audio Captioning Dataset for Audio-Language Multimodal Research
Journal article   Open access   Peer reviewed

Xinhao Mei, Haohe Liu, Qiuqiang Kong, Tom Ko, Mark D. Plumbley, Yuexian Zou and Wenwu Wang
IEEE/ACM Transactions on Audio, Speech, and Language Processing, Vol. 32, pp. 3339-3354
26/06/2024

Keywords: audio captioning; audio-language dataset; multimodal learning; ChatGPT; deep learning; acoustics
PDF: WavCaps_A_ChatGPT-Assisted_Weakly-Labelled_Audio_Captioning_Dataset_for_Audio-Language_Multimodal_Research (5.14 MB)
Author's Accepted Manuscript Open Access

