Time series optimization neural network model for identifying abnormal lung sounds in children

Zhang Longji; Wei Yunlong; Zheng Xiaoming; Yu Yingjian; Xiong Lijun

doi:10.16300/j.cnki.1000-3630.24111101

Time series optimization neural network model for identifying abnormal lung sounds in children

Abstract

Auscultation-based recognition of abnormal lung sounds is an important tool for diagnosing bronchopulmonary diseases in children. Spectrogram image recognition, the approach commonly used to classify abnormal lung sounds in children, demands substantial computational resources and achieves low recognition rates. To address these issues, a hybrid model is proposed that combines Mel frequency cepstral coefficient (MFCC) features, a convolutional neural network (CNN), and a bidirectional long short-term memory (BiLSTM) network. The method uses the CNN to extract spatial features from the MFCCs and the BiLSTM to capture the temporal characteristics of the MFCC audio features, thereby establishing the BCNnet (BiLSTM-CNN network) model. A dataset of children's lung sounds was collected and established for this study. On this dataset, the proposed method achieves an average accuracy of 75.3%, a 3.7 percentage point improvement over the CNN (parallel-pooling) model that takes spectrograms as input. The proposed model also shows improvements in both model size and recognition speed.
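As a rough illustration of the MFCC features that the abstract names as the model input, the following NumPy sketch computes MFCCs from a one-dimensional signal using the textbook pipeline (framing, Hamming window, power spectrum, mel filterbank, logarithm, DCT-II). It is not the authors' implementation: the function names, the 4000 Hz sample rate, the frame length, hop size, filter count, and coefficient count are all assumptions chosen for the example, since the abstract does not state them.

```python
import numpy as np


def hz_to_mel(f):
    # Common mel-scale formula (assumed variant; others exist).
    return 2595.0 * np.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters evenly spaced on the mel scale.

    Returns an array of shape (n_filters, n_fft // 2 + 1).
    """
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):      # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank


def mfcc(signal, sample_rate=4000, frame_len=256, hop=128, n_filters=26, n_coeffs=13):
    """Return MFCCs of shape (n_frames, n_coeffs). All defaults are illustrative."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    # Index matrix that slices the signal into overlapping frames.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n=frame_len, axis=1)) ** 2 / frame_len
    # Mel-band energies, then log compression (epsilon avoids log(0)).
    energies = power @ mel_filterbank(n_filters, frame_len, sample_rate).T
    log_energies = np.log(energies + 1e-10)
    # Unnormalized DCT-II across the mel bands, keeping the first n_coeffs terms.
    n = np.arange(n_filters)
    basis = np.cos(np.pi * np.arange(n_coeffs)[:, None] * (2 * n[None, :] + 1) / (2 * n_filters))
    return log_energies @ basis.T


# Usage with a synthetic 200 Hz tone standing in for a recording (not real lung-sound data).
t = np.arange(4000) / 4000.0
features = mfcc(np.sin(2 * np.pi * 200.0 * t))
print(features.shape)  # (30, 13): 30 frames, 13 coefficients per frame
```

The resulting frames-by-coefficients matrix is the kind of compact two-dimensional input that a convolutional branch can treat as a small image and a recurrent branch can read frame by frame, which is far smaller than a full spectrogram image.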
