Realistic Visual Speech Synthesis Based on AAM Features and an Articulatory DBN Model with Constrained Asynchrony
Vrije Universiteit Brussel
This paper presents a photo-realistic visual speech synthesis method based on an audio-visual articulatory dynamic Bayesian network model (AF_AVDBN) in which the maximum asynchronies between the articulatory features, such as lips, tongue and glottis/velum, can be controlled. Perceptual linear prediction (PLP) features extracted from the audio speech and active appearance model (AAM) features extracted from mouth images of the visual speech are used to train the ...