|Improved expectation-maximization framework for speech enhancement based on a novel iterative noise estimator and speech presence probability estimation
|Lun, Daniel P. K. (EIE)
|Speech processing systems -- Mathematical models.
Signal processing -- Mathematics.
Hong Kong Polytechnic University -- Dissertations
|Faculty of Engineering
|50 leaves : color illustrations ; 30 cm
|Speech enhancement reduces the background noise in speech corrupted by various types of environmental noise. It improves the perceived quality of noisy speech and benefits applications in speech recognition, speech communications, and information forensics. Because of these widespread applications, much work on speech enhancement has been reported in the last two decades. Recently, a novel speech enhancement algorithm that adopts the Expectation-Maximization (EM) framework was developed. It estimates the major parameters of the speech in the E-step and refines them in the M-step, so the noise energy is reduced iteratively. It performs better than many existing algorithms, especially in non-stationary noise environments. Like other speech enhancement algorithms, however, it must estimate the a priori SNR at the start of the algorithm, which means that estimating the noise power spectral density (PSD) cannot be avoided. The original EM framework uses a traditional method to estimate the noise PSD: sampling it at speech-absent frames. This method lacks accuracy. Because of the estimation errors, the enhanced speech cannot converge to the true clean speech, so additional iterations may not improve the speech and may even introduce harmonic distortion. The original EM framework therefore sets a fixed maximum number of iterations. However, different frames have different optimal numbers of iterations, so a stopping criterion is needed to control the iteration count. With these considerations, this thesis proposes an improved EM framework for speech enhancement. First, the proposed algorithm embeds an iterative noise estimator into the E-step of the original EM framework. Second, the log minimum mean square error (log-MMSE) filter is modified to provide better noise reduction. Third, a novel iteration termination criterion is designed. Simulation results show that the new algorithm gives a significant improvement over the original EM framework and other traditional speech enhancement algorithms.
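The traditional noise estimate mentioned in the abstract (sampling the noise PSD at speech-absent frames) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function names, the toy power values, and the choice of a maximum-likelihood style a priori SNR estimate are assumptions made only for illustration, and the speech-absent flags are assumed to come from some external detector.

```python
def estimate_noise_psd(frame_power, speech_absent):
    """Average the power spectra of frames flagged as speech-absent.

    frame_power: list of frames, each a list of per-bin power values |Y|^2.
    speech_absent: list of booleans, True where a frame is assumed noise-only.
    """
    noise_frames = [p for p, absent in zip(frame_power, speech_absent) if absent]
    if not noise_frames:
        raise ValueError("need at least one speech-absent frame")
    n_bins = len(noise_frames[0])
    return [sum(f[k] for f in noise_frames) / len(noise_frames)
            for k in range(n_bins)]


def a_priori_snr_ml(power, noise_psd, floor=1e-10):
    """Assumed estimator: xi = max(|Y|^2 / noise_psd - 1, 0) per frequency bin."""
    return [max(p / max(n, floor) - 1.0, 0.0) for p, n in zip(power, noise_psd)]


# Hypothetical toy data: two noise-only frames followed by one speech frame.
frame_power = [[1.0, 2.0], [3.0, 2.0], [10.0, 2.0]]
speech_absent = [True, True, False]

noise_psd = estimate_noise_psd(frame_power, speech_absent)
xi = a_priori_snr_ml(frame_power[2], noise_psd)
```

Because the noise PSD is only updated in frames flagged as speech-absent, any change in the noise during speech goes untracked, which is the inaccuracy the abstract attributes to this method.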
|All rights reserved