
Semi-automatic Labeling Method for Audio and Video Data (cited by: 3)
Abstract: The mainstream approach to building sample datasets relies on manual annotation, which is time-consuming and labor-intensive and cannot produce large-scale standard datasets. To improve labeling efficiency while exploiting the accuracy of deep learning models, a semi-automatic data labeling method is constructed. A small amount of data is labeled manually and used to train a model; the newly trained model then performs detection and recognition on the large dataset; the results with low confidence are selected, reviewed manually, and added to the training set. Repeating this loop gradually builds up a large-scale standard dataset. Experimental results show that the proposed semi-automatic labeling method greatly shortens manual labeling time, and that each iteration improves the model's detection and recognition accuracy to varying degrees.
Authors: BAI Xuebing; HAN Zhifeng; JIANG Longquan; HUANG Yungang; FENG Rui (Academy for Engineering & Technology, Fudan University, Shanghai 200243, China; Software School, Fudan University, Shanghai 200243, China; School of Computer Science, Fudan University, Shanghai 200243, China; Shanghai Haichao Institute for New Technologies, Shanghai 200070, China)
Source: Microcomputer Applications (《微型电脑应用》), 2021, No. 8, pp. 9-13, 17 (6 pages)
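Funding: Science and Technology Commission of Shanghai Municipality one-off project (202068400859-80001); Major Project (AWS15J005); Science and Technology Commission of Shanghai Municipality projects (20511101502, 20DZ1100205).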
Keywords: semi-automatic labeling; standard data set; deep learning; audio and video data
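
The iterative loop described in the abstract (train on a small labeled seed, run the model over the unlabeled pool, send low-confidence results to a human reviewer, retrain) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy nearest-centroid "model" on a 1-D feature stands in for the deep learning detector, the margin-based confidence score and the 0.8 threshold are arbitrary, the `oracle` callback stands in for manual review, and accepting high-confidence predictions as labels is one possible reading of how the final dataset is formed.

```python
def train(labeled):
    """Toy stand-in for model training: per-class mean of a 1-D feature."""
    sums, counts = {}, {}
    for x, y in labeled:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}


def predict(model, x):
    """Return (label, confidence). Assumes at least two classes.

    Confidence is a hypothetical margin score in [0.5, 1.0]:
    0.5 means the two nearest centroids are equally close.
    """
    dists = sorted((abs(x - c), y) for y, c in model.items())
    (d0, y0), (d1, _) = dists[0], dists[1]
    conf = 1.0 if d0 + d1 == 0 else d1 / (d0 + d1)
    return y0, conf


def semi_automatic_labeling(seed, unlabeled, oracle, threshold=0.8, max_rounds=10):
    """Iteratively grow a labeled set, asking `oracle` only about uncertain items.

    seed      : list of (feature, label) pairs labeled by hand up front
    unlabeled : list of features
    oracle    : callable standing in for the human reviewer (assumption)
    Returns (dataset, number_of_items_sent_to_review).
    """
    labeled = list(seed)
    pool = list(unlabeled)
    reviewed = 0
    for _ in range(max_rounds):
        model = train(labeled)
        uncertain, confident = [], []
        for x in pool:
            _, conf = predict(model, x)
            (uncertain if conf < threshold else confident).append(x)
        if not uncertain:
            break  # every remaining item is predicted with high confidence
        # Low-confidence items go through manual review, then join the training set.
        labeled += [(x, oracle(x)) for x in uncertain]
        reviewed += len(uncertain)
        pool = confident
    # Assumption: remaining high-confidence predictions are accepted as labels.
    model = train(labeled)
    auto = [(x, predict(model, x)[0]) for x in pool]
    return labeled + auto, reviewed
```

Called with a two-item seed such as `[(0.0, "a"), (10.0, "b")]` and a handful of unlabeled values, the function sends only the items near the class boundary to the reviewer. In a real pipeline `train` and `predict` would wrap the actual detection or recognition network, and the threshold would need tuning against the cost of review.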