Frontiers of Data and Computing ›› 2021, Vol. 3 ›› Issue (5): 130-140.

doi: 10.11871/jfdc.issn.2096-742X.2021.05.010

• Technology and Application •

Bird Audio Data Preprocessing Method

ZHANG Meng1,2,*, LI Jian1

  1. Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
  2. University of Chinese Academy of Sciences, Beijing 100049, China
  • Received: 2021-03-08 Online: 2021-10-20 Published: 2021-11-24
  • Contact: ZHANG Meng
  • About the authors: ZHANG Meng is a master's student at the Computer Network Information Center, Chinese Academy of Sciences (CNIC). His research interests include machine learning and big data analysis.
    In this paper, he is mainly responsible for the experiments and testing of the spectrogram filtering algorithm and the classification model.
    E-mail: zhangmeng@cnic.cn
    LI Jian, Ph.D., is a senior engineer and master's supervisor at the Computer Network Information Center, Chinese Academy of Sciences, and a member of the Youth Innovation Promotion Association of the Chinese Academy of Sciences. He has published more than 20 papers in core journals and conferences in China and abroad. His main research interests are scientific research application integration, data mining and machine learning, and e-Science applications.
    In this paper, he is mainly responsible for the overall design of the audio spectrogram filtering algorithm based on convolutional neural networks and clustering.
    E-mail: lijian@cnic.cn
  • Supported by:
    National Key Research and Development Program of China (2019YFC0507405)

Abstract:

[Objective] Automatically filtering noise spectrograms out of the original bird audio spectrogram sample set can improve the accuracy of bird species classification. [Methods] This paper uses a convolutional neural network to extract a feature vector from each spectrogram, computes the distance matrix of the feature vectors with the Faiss library, and then applies the DBSCAN (Density-Based Spatial Clustering of Applications with Noise) clustering algorithm to identify the noise spectrograms. Finally, the filtered spectrogram sample set is fed into the classification model for bird species classification. [Results] The method removes a large number of noise spectrograms from the sample set, which improves the accuracy of the subsequent bird species classification. [Limitations] Because the clustering result of DBSCAN depends strongly on the neighborhood threshold (ε) and the density threshold (MinPts), adaptive methods for choosing these parameter values should be explored in future work. [Conclusions] This paper combines a convolutional neural network with a density-based clustering algorithm from data mining and proposes a bird audio data preprocessing method that automatically filters out noise spectrograms, providing a high-quality spectrogram sample set for subsequent bird species identification.

Key words: bird audio, spectrogram, data filtering, convolutional neural network, clustering
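
The filtering step described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation, and several things are assumptions made for illustration: the feature vectors are hand-written 4-D toy values standing in for CNN features, a brute-force Euclidean distance matrix stands in for the Faiss computation, and the samples that DBSCAN labels as noise are taken to be the noise spectrograms (the abstract does not spell out this mapping). The names `toy_features`, `distance_matrix`, `dbscan`, and `filter_noise` are hypothetical.

```python
import math

NOISE = -1  # label for points that belong to no dense cluster


def distance_matrix(vectors):
    """Brute-force pairwise Euclidean distances (assumed stand-in for Faiss)."""
    n = len(vectors)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d = math.dist(vectors[i], vectors[j])
            dist[i][j] = dist[j][i] = d
    return dist


def dbscan(dist, eps, min_pts):
    """Textbook DBSCAN on a precomputed distance matrix.

    Returns one label per point: 0, 1, ... for clusters, NOISE for outliers.
    """
    n = len(dist)
    # eps-neighborhood of every point; each point counts as its own neighbor
    neighbors = [[j for j in range(n) if dist[i][j] <= eps] for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may still be claimed later as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point of the current cluster
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:  # core point: keep expanding
                queue.extend(neighbors[j])
    return labels


def filter_noise(samples, features, eps, min_pts):
    """Split samples into (kept, dropped) using DBSCAN noise labels."""
    labels = dbscan(distance_matrix(features), eps, min_pts)
    kept = [s for s, lab in zip(samples, labels) if lab != NOISE]
    dropped = [s for s, lab in zip(samples, labels) if lab == NOISE]
    return kept, dropped


def toy_features():
    """Hypothetical features: 8 similar 'call' vectors and 3 isolated ones."""
    calls = [[0.1 * k, 0.0, 1.0, 0.0] for k in range(8)]
    noise = [[5.0, 5.0, 0.0, 0.0], [-4.0, 0.0, -3.0, 2.0], [0.0, 9.0, 0.0, -6.0]]
    return calls + noise


if __name__ == "__main__":
    names = [f"call_{k}.png" for k in range(8)]
    names += ["noise_a.png", "noise_b.png", "noise_c.png"]
    kept, dropped = filter_noise(names, toy_features(), eps=0.25, min_pts=3)
    print("kept:", kept)
    print("dropped:", dropped)
```

The sketch also shows the parameter sensitivity noted under Limitations: if `eps` is larger than every pairwise distance, all samples fall into a single cluster and nothing is removed, while a very small `eps` marks every sample as noise.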