Sound Event Detection Method for Home Scenarios Based on Self-Knowledge Distillation

  • Abstract: This study addresses sound event detection (SED) in home environments, aiming to improve detection performance while keeping the model lightweight enough for deployment on resource-constrained devices. It proposes SKD-CRNN, a lightweight acoustic modeling framework based on self-knowledge distillation (SKD). The method builds a dual-branch convolutional recurrent neural network (CRNN) and introduces a self-distillation loss that guides the student branch to learn from the teacher branch, so knowledge is transferred without an external teacher model. On the DCASE2023 Task4 dataset, SKD-CRNN reduces the parameter count to 80.7% of the baseline system while generally improving detection across eight household sound event categories. The gains are most pronounced for non-instantaneous events such as Electric shaver toothbrush and Dishes, where onset and offset identification improves markedly. Event-based, intersection-based, and segment-based F1 scores increase by 7.8%, 7.1%, and 6.2%, respectively, and PSDS1 and PSDS2 improve by 2.5% and 3.2%. These results indicate that SKD-CRNN improves SED performance while reducing model complexity, making it suitable for edge deployment in home environments where computational resources are limited.
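The abstract does not give the form of the self-distillation loss, so the following is only a minimal sketch of one common way such a loss can be assembled, not the authors' formulation. It assumes frame-level class probabilities from each branch, a binary cross-entropy supervised term on both branches, and a mean-squared-error consistency term pulling the student toward the teacher. The names `bce`, `mse`, `skd_loss`, and the weight `alpha` are hypothetical.

```python
import math


def bce(pred, target, eps=1e-7):
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    total = 0.0
    for p, t in zip(pred, target):
        p = min(max(p, eps), 1.0 - eps)  # clamp to avoid log(0)
        total += -(t * math.log(p) + (1.0 - t) * math.log(1.0 - p))
    return total / len(pred)


def mse(a, b):
    """Mean squared error between two equally sized sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def skd_loss(student_probs, teacher_probs, labels, alpha=0.5):
    """Hypothetical self-distillation objective (an assumption, not the paper's loss).

    Both branches are supervised by the labels; the student is additionally
    pulled toward the teacher's predictions. In an autodiff framework the
    teacher predictions would be detached inside the distillation term so
    that gradients from it reach only the student branch.
    """
    supervised = bce(student_probs, labels) + bce(teacher_probs, labels)
    distill = mse(student_probs, teacher_probs)
    return supervised + alpha * distill
```

When the two branches agree exactly, the distillation term vanishes and only the supervised terms remain; a larger `alpha` penalizes disagreement between the branches more strongly.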

     
