Project

Multi-microphone Speech Enhancement: An Integration of A Priori and Data-dependent Spatial Information

A speech signal captured by multiple microphones often suffers from reduced intelligibility and quality due to noise and room acoustic interference. Multi-microphone speech enhancement systems therefore aim to suppress or cancel such undesired signals without substantially distorting the speech signal. A fundamental aspect of the design of several multi-microphone speech enhancement systems is the spatial information that relates each microphone signal to the desired speech source. This spatial information is unknown in practice and has to be estimated. Under certain conditions, however, the estimated spatial information can be inaccurate, which in turn degrades the performance of the speech enhancement system.

This doctoral dissertation focuses on the development and evaluation of acoustic signal processing algorithms that address this issue. Specifically, as opposed to conventional means of estimating spatial information using only a priori knowledge or only observable microphone data, an integrated approach is pursued in which both a priori and data-dependent spatial information are explicitly used. Such an approach is first investigated for the case of a microphone array from a confidence-based perspective, where a confidence metric is used to optimally combine a priori and data-dependent spatial information. The remainder of the dissertation is then dedicated to the study of a microphone array that has access to one or more external microphones. For this microphone configuration, a geometrically-based integration is investigated for the tasks of noise reduction, binaural speech enhancement, and speech dereverberation, where a priori spatial information is used for the microphone array(s) and data-dependent spatial information, estimated from the observable microphone data, is used for the external microphone(s). A final form of the integrated approach is then explored for this microphone configuration by merging the confidence-based and geometrically-based integration techniques.
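
To make the confidence-based idea concrete, the following is a minimal sketch and not the dissertation's algorithm. It assumes that the spatial information for one frequency bin is a vector of relative transfer functions, that the confidence metric is a scalar in [0, 1], and that the two vectors are merged by a simple convex combination. The combined vector then drives a plain matched-filter beamformer. The function names, the combination rule, and the example values are all hypothetical; the confidence metric and the optimal combination used in the dissertation are not specified in this summary.

```python
def combine_spatial_info(h_prior, h_data, confidence):
    """Convex combination of two spatial vectors (assumed rule, for illustration).

    confidence = 0 falls back entirely on the a priori vector,
    confidence = 1 trusts the data-dependent estimate entirely.
    """
    if len(h_prior) != len(h_data):
        raise ValueError("spatial vectors must have the same length")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must lie in [0, 1]")
    return [(1.0 - confidence) * p + confidence * d
            for p, d in zip(h_prior, h_data)]


def matched_filter_weights(h):
    """Beamformer weights w = h / (h^H h), which satisfy w^H h = 1."""
    norm_sq = sum(abs(x) ** 2 for x in h)
    if norm_sq == 0.0:
        raise ValueError("spatial vector must be non-zero")
    return [x / norm_sq for x in h]


def apply_beamformer(w, y):
    """Beamformer output w^H y for one time-frequency bin."""
    return sum(wi.conjugate() * yi for wi, yi in zip(w, y))


# Hypothetical three-microphone example at a single frequency bin.
h_prior = [1 + 0j, 0.8 - 0.3j, 0.6 - 0.5j]   # e.g. from an assumed source direction
h_data = [1 + 0j, 0.7 - 0.4j, 0.5 - 0.6j]    # e.g. estimated from the microphone signals
h = combine_spatial_info(h_prior, h_data, confidence=0.25)
w = matched_filter_weights(h)
y = [0.9 + 0.1j, 0.7 - 0.2j, 0.5 - 0.4j]     # made-up microphone observations
enhanced = apply_beamformer(w, y)
```

With a low confidence value the combined vector stays close to the a priori vector, which mirrors the contingency role that a priori spatial information plays when the data-dependent estimate is judged unreliable.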

The mathematical framework for the integrated approach as applied to the different microphone configurations is presented, along with an experimental evaluation using recorded audio data from various acoustic environments. The results show that an integrated approach yields speech enhancement algorithms that are more spatially robust than those relying solely on a priori spatial information or solely on data-dependent spatial information. Furthermore, the advantage of using a priori spatial knowledge was demonstrated, as it provided contingency spatial information in cases where the data-dependent spatial information was deemed inaccurate. A number of experiments involving an assistive hearing device linked with external microphones have also shown that the proposed speech enhancement algorithms can improve speech intelligibility in comparison to using only the assistive hearing device or listening only to an external microphone signal.

Date: 11 Aug 2016 → 9 Nov 2020
Keywords: Noise Reduction, Audio Signal Processing
Disciplines: Applied mathematics in specific fields, Computer architecture and networks, Distributed computing, Information sciences, Information systems, Programming languages, Scientific computing, Theoretical computer science, Visual computing, Other information and computing sciences, Modelling, Biological system engineering, Signal processing, Control systems, robotics and automation, Design theories and methods, Mechatronics and robotics, Computer theory
Project type: PhD project