Speech Enhancement Algorithm Based on Microphone Array and Multi-Channel Parallel GRU-CNN Network
Ji Xi, Zhe Xu, Weiqi Zhang, Yue Xie, Li Zhao

This paper presents an improved speech enhancement algorithm based on microphone arrays, designed to improve enhancement performance in complex acoustic settings. The model consists of two key components: a feature extraction module and a speech enhancement module. The feature extraction module processes speech amplitude-spectrum features derived from the short-time Fourier transform (STFT). It employs parallel GRU-CNN (Gated Recurrent Unit and Convolutional Neural Network) structures to capture channel-specific information, and skip connections are used to speed up the model's convergence. The speech enhancement module focuses on capturing cross-channel spatial information: an attention mechanism with a global hybrid pooling strategy reduces feature loss by dynamically assigning a weight to each channel, emphasizing the features most beneficial for restoring the speech signal. Experimental results on the CHiME-3 dataset show that the proposed model effectively suppresses diverse types of noise and outperforms comparison algorithms in improving speech quality and intelligibility.
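The channel-weighting idea described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a common attention design in which global average pooling and global max pooling are combined ("hybrid pooling") per channel and a small two-layer scoring network produces a sigmoid weight for each channel. All names, shapes, and the bottleneck size here are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def hybrid_pool_attention(features, w1, w2):
    """Channel attention via global hybrid (average + max) pooling.

    features: array of shape (channels, time, freq) -- per-channel
        amplitude-spectrum feature maps (hypothetical layout).
    w1, w2: weights of a small two-layer scoring network (hypothetical).
    Returns the reweighted features and the per-channel weights.
    """
    avg = features.mean(axis=(1, 2))          # global average pooling -> (C,)
    mx = features.max(axis=(1, 2))            # global max pooling -> (C,)
    pooled = avg + mx                         # hybrid pooling: combine both views
    hidden = np.maximum(0.0, w1 @ pooled)     # ReLU bottleneck
    scores = w2 @ hidden
    weights = 1.0 / (1.0 + np.exp(-scores))   # sigmoid -> weight in (0, 1) per channel
    return features * weights[:, None, None], weights

# Toy example: 6 microphone channels, small time-frequency maps.
C, T, F = 6, 50, 129
feats = rng.standard_normal((C, T, F))
w1 = rng.standard_normal((3, C))              # bottleneck half the channel count
w2 = rng.standard_normal((C, 3))
reweighted, w = hybrid_pool_attention(feats, w1, w2)
print(reweighted.shape, w.shape)
```

Because each weight lies strictly in (0, 1), channels judged less useful for restoring the speech signal are attenuated rather than discarded, which matches the stated goal of reducing feature loss.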