Abstract
This paper proposes tiny-depth-YOLO, a three-task neural network based on tiny-YOLOv3 that performs object detection, monocular depth estimation, and semantic segmentation, in order to protect the privacy of background people in real-time video communication. The network uses an encoder-decoder structure. Pixel-wise dense depth estimates are converted into depth labels and trained together with YOLO's bounding boxes, confidence scores, and class labels, so that at inference time the network directly regresses object detections that carry depth information. MobileNet's depthwise separable convolutions are used to optimize the convolution operations in the system, reducing the amount of computation at inference. Experiments show that the system can perform instance segmentation of people in video images and can mask or blur background people according to relative depth information. It achieves a good balance between accuracy and real-time performance and can be used for privacy protection in real-time video communication.
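The computational saving attributed to depthwise separable convolutions can be illustrated with a short calculation. The sketch below is not taken from the paper: it uses the standard multiply-accumulate (MAC) counts for a regular convolution and for a depthwise-plus-pointwise pair, and the layer shape (3x3 kernel, 128 input channels, 256 output channels, 26x26 output map) is a hypothetical example, not a layer reported by the authors.

```python
def standard_conv_macs(k, c_in, c_out, h_out, w_out):
    """MACs for a regular k x k convolution producing an h_out x w_out map."""
    return k * k * c_in * c_out * h_out * w_out


def separable_conv_macs(k, c_in, c_out, h_out, w_out):
    """MACs for a depthwise k x k convolution followed by a 1 x 1 pointwise one."""
    depthwise = k * k * c_in * h_out * w_out   # one k x k filter per input channel
    pointwise = c_in * c_out * h_out * w_out   # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer shape, chosen only for illustration.
    k, c_in, c_out, h, w = 3, 128, 256, 26, 26
    std = standard_conv_macs(k, c_in, c_out, h, w)
    sep = separable_conv_macs(k, c_in, c_out, h, w)
    print(f"standard:  {std} MACs")
    print(f"separable: {sep} MACs")
    print(f"ratio:     {sep / std:.4f}")
```

For any layer shape the ratio equals 1/c_out + 1/k^2, so with 3x3 kernels and a large number of output channels the separable form needs roughly one eighth to one ninth of the multiply-accumulates of the regular convolution.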
Authors
陈晨
刘世军
沈恂
CHEN Chen; LIU Shi-jun; SHEN Xun (Bengbu University, Bengbu 233030, China)
Source
Journal of Tonghua Normal University (《通化师范学院学报》)
2022, Issue 4, pp. 85-92 (8 pages)
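Funding
2020 Bengbu University Applied Research Project (2020YYX04)
Bengbu University Institutional Research Project (2015ZR07)
Bengbu University Engineering Center Research Project (BBXYGC2014B04)
National Undergraduate Innovation and Entrepreneurship Program Project (201511305008)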