Driver Drowsiness Detection Method Combining Perspective Correction and Improved ViViT

Abstract: To address the restricted camera viewing angle in traditional detection methods, this paper proposes a multi-view facial fatigue detection method that combines facial pose correction with an improved ViViT. Mediapipe Face Mesh is used to locate 3D facial landmarks and correct them to a frontal view, and the proposed FGR-ViViT model then captures changes across the sequence of corrected line-image frames of the eyes, eyebrows, and mouth. FGR-ViViT adds a part selection module to the Temporal Transformer Encoder of ViViT to capture subtle feature differences along the time dimension. It also combines double dropout with an improved contrastive loss function to adjust the similarity of samples, which reduces the risk of overfitting and improves generalization. Experimental results show that, on the test sets of corrected line-image frames from YawDD and DROZY, the proposed method achieves F1-scores of 94.5% and 97.6%, which are 3.2% and 10.4% higher than those obtained with the original face image frames. FGR-ViViT is 6.1% and 0.7% higher than the original ViViT on the two datasets, respectively. The method is applicable to scenarios in which the camera can be placed flexibly and contributes to solving multi-view facial drowsiness detection.
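The pose-correction step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe how the correction is computed, so the following is only one plausible approach under stated assumptions: the 3D landmarks are already available as an array (for example, exported from a face-mesh detector), and a rigid rotation built from three reference points is enough to bring the face to a frontal view. The function name `frontalize` and the index arguments `left_eye`, `right_eye`, and `chin` are hypothetical and do not come from the paper or from Mediapipe.

```python
import numpy as np

def frontalize(landmarks, left_eye, right_eye, chin):
    """Rotate 3D landmarks so the face plane is parallel to the image plane.

    landmarks: (N, 3) array of x, y, z points in camera coordinates.
    left_eye, right_eye, chin: hypothetical row indices of three
    reference points (assumptions, not indices defined by the paper).
    """
    pts = np.asarray(landmarks, dtype=float)
    eye_mid = (pts[left_eye] + pts[right_eye]) / 2.0

    # Face x-axis: from one eye to the other.
    x_axis = pts[left_eye] - pts[right_eye]
    x_axis /= np.linalg.norm(x_axis)

    # Provisional "down" direction: from the eye midpoint to the chin.
    down = pts[chin] - eye_mid

    # Face normal (z-axis) is perpendicular to both directions.
    z_axis = np.cross(x_axis, down)
    z_axis /= np.linalg.norm(z_axis)

    # Recompute y so that the three axes are exactly orthonormal.
    y_axis = np.cross(z_axis, x_axis)

    # Rows of R are the face axes in camera coordinates, so R maps
    # camera coordinates into the face's own (frontal) frame.
    R = np.stack([x_axis, y_axis, z_axis])
    return (pts - eye_mid) @ R.T
```

After this rotation, the x and y columns of the result could be used to draw eye, eyebrow, and mouth contours as line images. How the paper actually renders those frames is not stated in the abstract.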

Authors: FU Youjia; MENG Xueying (College of Computer Science and Engineering, Chongqing University of Technology, Chongqing 400054, China)

Affiliation: College of Computer Science and Engineering, Chongqing University of Technology

Source: Journal of Chongqing University of Technology (Natural Science), 2024, No. 6, pp. 172-179 (8 pages). Indexed in CAS and the Peking University Core Journals list.

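Funding: Chongqing Basic Research and Frontier Exploration Project (Natural Science Foundation of Chongqing) (CSTB2022NSCQ-MSX0786); Humanities and Social Sciences Project of the Chongqing Municipal Education Commission (23SKGH252).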

Keywords: fatigue detection; multi-view; Video Vision Transformer; part selection module

Classification: TP391.41 [Automation and Computer Technology: Computer Application Technology]
