Abstract:
Many existing phase-aware speech enhancement algorithms consider the phase at all spectral frequencies to be equally important to perceptual quality and intelligibility. Although improvements are observed according to both objective and subjective measures, as compared to phase-insensitive approaches, it is not clear whether phase information is equally important across the frequency spectrum. In this paper, we investigate the importance of estimating phase across spectral regions, by conducting a pairwise listening study to determine if phase enhancement can be limited to certain frequency bands. Our experimental results suggest that estimating phase at lower-frequency bands is mostly important for speech quality in normal-hearing (NH) listeners. We further propose a hybrid deep-learning framework that adopts two sub-networks for handling phase differently across the spectrum. The proposed hybrid-net significantly improves the model compatibility with low-resource platforms while achieving superior performance to the original phase-aware speech enhancement approaches.
Publication(s):
Authors:
Zhuohuang Zhang, Donald Williamson, Yi Shen