Conference Paper (published)

A Data Driven Approach to Audiovisual Speech Mapping

Details

Citation

Abel A, Marxer R, Hussain A, Barker J, Watt R, Whitmer B & Derleth P (2016) A Data Driven Approach to Audiovisual Speech Mapping. In: Liu C, Hussain A, Luo B, Tan K, Zeng Y & Zhang Z (eds.) Advances in Brain Inspired Cognitive Systems. Lecture Notes in Computer Science, 10023. BICS 2016: International Conference on Brain Inspired Cognitive Systems, Beijing, China, 28.11.2016-30.11.2016. Cham, Switzerland: Springer, pp. 331-342. https://doi.org/10.1007/978-3-319-49685-6_30

Abstract
The use of visual information in audio speech processing has attracted significant recent interest. This paper presents a data-driven approach to estimating audio speech acoustics from temporal visual information alone, without relying on linguistic features such as phonemes and visemes. Audio (log filterbank) and visual (2D-DCT) features are extracted, and various MLP configurations and datasets are used to identify optimal results, showing that, given a sequence of prior visual frames, a reasonably accurate estimate of the corresponding audio frame can be produced.
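To illustrate the mapping described in the abstract, the following is a minimal Python sketch of the general pipeline: low-frequency 2D-DCT coefficients of the visual frames are gathered over a short context window of prior frames and regressed onto a log-filterbank audio frame with an MLP. The patch size, number of DCT coefficients, context length, filterbank dimension, MLP configuration and the synthetic stand-in arrays are all illustrative assumptions, not the settings or data used in the paper.

# Minimal sketch of the visual-to-audio mapping idea (illustrative assumptions only).
import numpy as np
from scipy.fft import dctn
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)

def visual_features(frame, k=6):
    """2D-DCT of an image frame; keep the low-frequency k x k block, flattened."""
    coeffs = dctn(frame, norm="ortho")
    return coeffs[:k, :k].ravel()

# Synthetic stand-ins: 500 video frames of a 32x32 mouth region and matching
# 23-bin log-filterbank audio frames (random placeholders, for shape only).
n_frames, context = 500, 5
video = rng.normal(size=(n_frames, 32, 32))
log_fbank = rng.normal(size=(n_frames, 23))

vis = np.stack([visual_features(f) for f in video])          # (n_frames, 36)

# Build (prior visual frames) -> (current audio frame) training pairs.
X = np.stack([vis[i - context:i].ravel() for i in range(context, n_frames)])
y = log_fbank[context:]

# Small MLP regressor mapping the visual context to a log-filterbank frame.
mlp = MLPRegressor(hidden_layer_sizes=(100,), max_iter=500, random_state=0)
mlp.fit(X, y)
print("training R^2:", mlp.score(X, y))

In practice the synthetic arrays would be replaced by mouth-region image patches and log filterbank frames computed from a real audiovisual corpus.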

Keywords
Audiovisual; Speech processing; Speech mapping; ANNs

Status: Published
Funders: Engineering and Physical Sciences Research Council
Title of series: Lecture Notes in Computer Science
Number in series: 10023
Publication date: 31/12/2016
Publication date online: 30/11/2016
URL: http://hdl.handle.net/1893/24710
Publisher: Springer
Place of publication: Cham, Switzerland
ISSN of series: 0302-9743
ISBN: 978-3-319-49685-6
Conference: BICS 2016: International Conference on Brain Inspired Cognitive Systems
Conference location: Beijing, China
Dates: 28.11.2016-30.11.2016

People (1)

Professor Roger Watt

Emeritus Professor, Psychology

Files (1)