Speech emotion classification based on multi-feature extraction

  • Abstract: Emotion recognition is the simulation by a computer of the human process of perceiving emotion, and it has significant research and application value. Traditional speech recognition systems usually rely on a single feature extraction method, which can lose important information in speech emotion signals and lead to recognition errors. This paper therefore proposes a combined multi-feature extraction method, based on the improved complete ensemble empirical mode decomposition with adaptive noise (ICEEMDAN), for classifying semantically independent emotional speech signals. First, ICEEMDAN decomposes the one-dimensional emotional speech signal into multiple intrinsic modes. Next, the mean, variance, kurtosis, skewness, energy, center frequency, peak amplitude, and permutation entropy of each intrinsic mode are extracted as features. Finally, these features are used to classify four emotions: anger, happiness, sadness, and no emotion. The results show that, after training a support vector machine (SVM) with an 8:2 data split, the proposed method achieves an average recognition rate of 91.44%, which can provide an important reference for the recognition of emotional speech signals.
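
The per-mode feature extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the intrinsic modes have already been obtained (the ICEEMDAN decomposition itself is not shown), and the function names `permutation_entropy`, `mode_features`, and `feature_vector` are hypothetical. Several definitions are assumptions as well, since the abstract does not specify them: center frequency is taken as the power-weighted spectral centroid, kurtosis is the non-excess (Pearson) form, and permutation entropy uses order 3 and delay 1.

```python
import math
import numpy as np


def permutation_entropy(x, order=3, delay=1):
    """Normalized permutation entropy in [0, 1] (order/delay are assumed values)."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    # Each row of idx selects one embedded vector of length `order`.
    idx = np.arange(order) * delay + np.arange(n)[:, None]
    patterns = np.argsort(x[idx], axis=1, kind="stable")
    # Relative frequency of each ordinal pattern that occurs.
    _, counts = np.unique(patterns, axis=0, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log(p)).sum() / math.log(math.factorial(order)))


def mode_features(mode, fs):
    """Eight features of one intrinsic mode, in the order listed in the abstract."""
    mode = np.asarray(mode, dtype=float)
    mean = float(mode.mean())
    var = float(mode.var())
    std = math.sqrt(var)
    centered = mode - mean
    # Guard against a constant mode, whose standardized moments are undefined.
    kurt = float((centered ** 4).mean() / std ** 4) if std > 0 else 0.0
    skew = float((centered ** 3).mean() / std ** 3) if std > 0 else 0.0
    energy = float(np.sum(mode ** 2))
    # Assumption: center frequency = power-weighted spectral centroid.
    power = np.abs(np.fft.rfft(mode)) ** 2
    freqs = np.fft.rfftfreq(len(mode), d=1.0 / fs)
    total = power.sum()
    center_freq = float((freqs * power).sum() / total) if total > 0 else 0.0
    peak = float(np.max(np.abs(mode)))
    pe = permutation_entropy(mode)
    return np.array([mean, var, kurt, skew, energy, center_freq, peak, pe])


def feature_vector(modes, fs):
    """Concatenate the eight features of every mode into one vector."""
    return np.concatenate([mode_features(m, fs) for m in modes])


if __name__ == "__main__":
    fs = 16000  # assumed sampling rate in Hz
    t = np.arange(fs) / fs
    # Stand-in "modes": two sinusoids, not real ICEEMDAN output.
    modes = np.stack([np.sin(2 * np.pi * 440 * t), 0.5 * np.sin(2 * np.pi * 50 * t)])
    print(feature_vector(modes, fs).shape)
```

In this sketch each utterance yields one vector of 8 features per mode. Such vectors, one per utterance, are the kind of input a classifier like an SVM would be trained on.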

     
