Speech emotion recognition based on full convolution recurrent neural network

ZHU Min; JIANG Pengxu; ZHAO Li

doi:10.16300/j.cnki.1000-3630.2021.05.009

ZHU Min, JIANG Pengxu, ZHAO Li. Speech emotion recognition based on full convolution recurrent neural network[J]. Technical Acoustics, 2021, 40(5): 645-651. DOI: 10.16300/j.cnki.1000-3630.2021.05.009

Abstract

Speech emotion recognition is an active research area in human-computer interaction. However, limited study of the time-frequency information in speech has left its emotional content insufficiently explored. To better exploit this time-frequency information, a novel fully convolutional recurrent neural network model is proposed, in which a multi-input parallel architecture combines two modules that extract features with different functions. A fully convolutional network (FCN) learns the time-frequency information in speech spectrogram features, and a long short-term memory (LSTM) network learns frame-level speech features to supply the time-dependent information that the FCN misses. Finally, the features from the two modules are fused and passed to a classifier. Experiments on two public emotional speech data sets show the superiority of the proposed algorithm.
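The two-branch structure described in the abstract can be illustrated with a minimal, untrained sketch in plain Python. It is not the authors' implementation: the abstract gives no layer counts, kernel sizes, or feature dimensions, so every size below, the random weights, the use of global average pooling in the convolutional branch, and all function names (`fcn_branch`, `lstm_branch`, `classify`) are assumptions made for illustration only.

```python
import math
import random

def conv2d_relu(image, kernel):
    """Valid 2-D cross-correlation followed by ReLU."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + a][j + b] * kernel[a][b]
                    for a in range(kh) for b in range(kw))
            row.append(max(0.0, s))
        out.append(row)
    return out

def fcn_branch(spectrogram, kernels):
    """Convolution-only branch: one feature map per kernel, then global
    average pooling (assumed), so any spectrogram size is accepted."""
    feats = []
    for kernel in kernels:
        fmap = conv2d_relu(spectrogram, kernel)
        feats.append(sum(map(sum, fmap)) / (len(fmap) * len(fmap[0])))
    return feats

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_branch(frames, weights, hidden):
    """Single-layer LSTM over frame-level features; returns the last hidden state.
    `weights` has 4 * hidden rows (input, forget, cell, output gates), each of
    length frame_dim + hidden + 1, where the final column acts as the bias."""
    h = [0.0] * hidden
    c = [0.0] * hidden
    for x in frames:
        z = list(x) + h + [1.0]
        pre = [sum(w * v for w, v in zip(row, z)) for row in weights]
        i_gate = [sigmoid(p) for p in pre[:hidden]]
        f_gate = [sigmoid(p) for p in pre[hidden:2 * hidden]]
        g_cell = [math.tanh(p) for p in pre[2 * hidden:3 * hidden]]
        o_gate = [sigmoid(p) for p in pre[3 * hidden:]]
        c = [f * cp + i * g for f, cp, i, g in zip(f_gate, c, i_gate, g_cell)]
        h = [o * math.tanh(cn) for o, cn in zip(o_gate, c)]
    return h

def softmax(logits):
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(spectrogram, frames, params):
    """Fuse both branches by concatenation, then apply a linear softmax classifier."""
    fused = (fcn_branch(spectrogram, params["kernels"])
             + lstm_branch(frames, params["lstm"], params["hidden"]))
    logits = [sum(w * v for w, v in zip(row, fused)) + b
              for row, b in zip(params["W_out"], params["b_out"])]
    return softmax(logits)

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical sizes; none of these come from the paper.
N_KERNELS, HIDDEN, FRAME_DIM, N_CLASSES = 3, 4, 5, 4
rng = random.Random(0)
params = {
    "kernels": [random_matrix(rng, 3, 3) for _ in range(N_KERNELS)],
    "lstm": random_matrix(rng, 4 * HIDDEN, FRAME_DIM + HIDDEN + 1),
    "hidden": HIDDEN,
    "W_out": random_matrix(rng, N_CLASSES, N_KERNELS + HIDDEN),
    "b_out": [0.0] * N_CLASSES,
}
spectrogram = random_matrix(rng, 8, 10)     # 8 frequency bins x 10 frames
frames = random_matrix(rng, 10, FRAME_DIM)  # 10 frames of frame-level features
probs = classify(spectrogram, frames, params)
print(probs)  # one probability per (hypothetical) emotion class
```

Because the convolutional branch ends in pooling rather than a fixed-size dense layer, the same parameters accept spectrograms of different durations. Concatenation is used here as the simplest possible fusion; the abstract does not say which fusion method or classifier the authors use.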
