Abstract
Pornographic audio detection is an important part of Internet information security. To provide data support for researchers working on pornographic audio detection, this paper introduces and publicly releases a dataset of 1-minute or 30-second audio clips labeled as pornographic or non-pornographic. It also proposes a detection method that combines speech recognition with text classification to automatically filter or flag pornographic audio. In experiments on a real audio dataset, the proposed method achieves a classification accuracy of up to 97.3%.
Author
司朋举
SI Pengju (School of Computer Science and Technology, China University of Petroleum (East China), Qingdao 266580)
Source
《计算机与数字工程》
2023, No. 4, pp. 877-880, 958 (5 pages)
Computer & Digital Engineering
Keywords
pornographic audio detection
speech recognition
text classification
audio classification