Sound to Text: Automated Audio Captioning using Deep Learning
Doctoral Thesis   Open access

Xinhao Mei
University of Surrey
Doctor of Philosophy (PhD), University of Surrey
DOI: https://doi.org/10.15126/thesis.901197

Abstract

Keywords: Audio Understanding; Language Generation; Audio Captioning; Multimodal Learning
PDF: Xinhao_Mei_PhD_Thesis (11.72 MB), CC BY-NC-SA V4.0, Open Access
WavCaps Paper for Chapter 4: https://ieeexplore.ieee.org/document/10572302
Paper for Chapter 5: https://ieeexplore.ieee.org/abstract/document/10568388
Paper for Chapter 3: https://dcase.community/documents/workshop2021/proceedings/DCASE2021Workshop_Mei_68.pdf
Paper for Chapter 2: https://link.springer.com/article/10.1186/s13636-022-00259-2
