Conference Paper (published)
Details
Citation
Abel A, Marxer R, Hussain A, Barker J, Watt R, Whitmer B & Derleth P (2016) A Data Driven Approach to Audiovisual Speech Mapping. In: Liu C, Hussain A, Luo B, Tan K, Zeng Y & Zhang Z (eds.) Advances in Brain Inspired Cognitive Systems. Lecture Notes in Computer Science, 10023. BICS 2016: International Conference on Brain Inspired Cognitive Systems, Beijing, China, 28.11.2016-30.11.2016. Cham, Switzerland: Springer, pp. 331-342. https://doi.org/10.1007/978-3-319-49685-6_30
Abstract
The use of visual information as part of audio speech processing has attracted significant recent interest. This paper presents a data-driven approach that estimates audio speech acoustics using only temporal visual information, without considering linguistic features such as phonemes and visemes. Audio (log filterbank) and visual (2D-DCT) features are extracted, and various MLP configurations and datasets are compared to identify the best-performing setup. The results show that, given a sequence of prior visual frames, a reasonably accurate estimate of the corresponding audio frame can be produced.
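As a rough illustration of the visual side of the pipeline described in the abstract, the sketch below computes 2D-DCT coefficients for a small image block and stacks a window of consecutive frame vectors into one input vector of the kind an MLP could map to a log filterbank audio frame. It is not the authors' implementation. The function names, the unnormalised DCT-II, the `keep` and `context` parameters, and the choice to end each window at the current frame are all assumptions made for this example.

```python
import math


def dct2(block):
    """Unnormalised 2D DCT-II of a rectangular list-of-lists image block."""
    rows, cols = len(block), len(block[0])
    out = [[0.0] * cols for _ in range(rows)]
    for u in range(rows):
        for v in range(cols):
            total = 0.0
            for x in range(rows):
                for y in range(cols):
                    total += (
                        block[x][y]
                        * math.cos(math.pi * (2 * x + 1) * u / (2 * rows))
                        * math.cos(math.pi * (2 * y + 1) * v / (2 * cols))
                    )
            out[u][v] = total
    return out


def visual_features(block, keep=2):
    """Flatten the top-left keep x keep (low-frequency) DCT coefficients.

    `keep` is a hypothetical parameter; the abstract does not state how
    many coefficients are retained.
    """
    coeffs = dct2(block)
    return [coeffs[u][v] for u in range(keep) for v in range(keep)]


def stack_prior_frames(features, context):
    """Concatenate the `context` frame vectors ending at each time step.

    Assumption: the window includes the current frame. The first
    context - 1 time steps are skipped because they lack enough history.
    """
    return [
        [value for frame in features[t - context + 1 : t + 1] for value in frame]
        for t in range(context - 1, len(features))
    ]


if __name__ == "__main__":
    # Three toy 4x4 "mouth region" frames with increasing brightness.
    frames = [[[float(level)] * 4 for _ in range(4)] for level in (1, 2, 3)]
    per_frame = [visual_features(frame) for frame in frames]
    mlp_inputs = stack_prior_frames(per_frame, context=2)
    print(len(mlp_inputs), "input vectors of length", len(mlp_inputs[0]))
```

Each stacked vector would be paired with the audio feature vector of the same time step as the regression target. The audio feature extraction and the MLP itself are omitted here.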
Keywords
Audiovisual; Speech processing; Speech mapping; ANNs
Status | Published |
---|---|
Funders | Engineering and Physical Sciences Research Council |
Title of series | Lecture Notes in Computer Science |
Number in series | 10023 |
Publication date | 31/12/2016 |
Publication date online | 30/11/2016 |
URL | http://hdl.handle.net/1893/24710 |
Publisher | Springer |
Place of publication | Cham, Switzerland |
ISSN of series | 0302-9743 |
ISBN | 978-3-319-49685-6 |
Conference | BICS 2016: International Conference on Brain Inspired Cognitive Systems |
Conference location | Beijing, China |
Dates | 28/11/2016 to 30/11/2016 |