Fourier Locally Linear Soft Constrained MACE for facial landmark localization (Cited by: 1)
1
Authors: Wenming Yang, Xiang Sun, Weihong Deng, Chi Zhang, Qingmin Liao. 《CAAI Transactions on Intelligence Technology》, 2016, No. 3, pp. 241-248 (8 pages)
This paper proposes a novel nonlinear correlation filter for facial landmark localization. Firstly, we prove that an SVM, as a classifier, can also be used for localization. Then, a soft constrained Minimum Average Correlation Energy filter (soft constrained MACE) is proposed, which is more resistant to overfitting the training set than other variants of the correlation filter. To improve performance on multimodal targets, a locally linear framework is introduced into our model, which results in the Fourier Locally Linear Soft Constrained MACE (FL^2 SC-MACE). Furthermore, we formulate a fast implementation and show that the time consumed in testing is independent of the number of training samples. The merits of our method include accurate localization, good generalization to object variation, fast testing speed, and insensitivity to parameter settings. We conduct cross-set eye localization experiments on the challenging FRGC, FERET and BioID datasets. Our method surpasses the state of the art, especially in pixelwise accuracy.
Keywords: facial landmark localization; overfitting; multimode; FL^2 SC-MACE
Facial landmark disentangled network with variational autoencoder
2
Authors: LIANG Sen, ZHOU Zhi-ze, GUO Yu-dong, GAO Xuan, ZHANG Ju-yong, BAO Hu-jun. 《Applied Mathematics(A Journal of Chinese Universities)》 SCIE CSCD, 2022, No. 2, pp. 290-305 (16 pages)
Learning disentangled representations of data is a key problem in deep learning. Specifically, disentangling 2D facial landmarks into different factors (e.g., identity and expression) is widely used in applications such as face reconstruction, face reenactment, and talking heads. However, due to the sparsity of landmarks and the lack of accurate labels for the factors, it is hard to learn a disentangled representation of landmarks. To address this problem, we propose a simple and effective model named FLD-VAE, based on a Variational Autoencoder framework, to disentangle arbitrary facial landmarks into identity and expression latent representations. In addition, we propose three invariant loss functions at both the latent and data levels to constrain the invariance of the representations during training. Moreover, we implement an identity preservation loss to further enhance the representation ability of the identity factor. To the best of our knowledge, this is the first work to disentangle identity and expression factors end-to-end and simultaneously from a single facial landmark set.
Keywords: disentanglement representation; deep learning; facial landmarks; variational autoencoder
Facial Landmark Localization by Gibbs Sampling
3
Authors: Bofei Wang, Diankai Zhang, Chi Zhang, Jiani Hu, Weihong Deng. 《ZTE Communications》, 2014, No. 4, pp. 23-29 (7 pages)
In this paper, we introduce a novel method for facial landmark detection. We localize facial landmarks according to the MAP criterion. Conventional gradient ascent algorithms get stuck at locally optimal solutions. Gibbs sampling is a Markov Chain Monte Carlo (MCMC) algorithm; we choose it for optimization because it is easy to implement and guarantees global convergence. The posterior distribution is obtained by learning the prior distribution and the likelihood function. The prior distribution is assumed to be Gaussian. We use Principal Component Analysis (PCA) to reduce the dimensionality and learn the prior distribution. A Local Linear Support Vector Machine (LLSVM) is used to obtain the likelihood function of every key point. In our experiments, we compare our detector with several other well-known methods. The results show that the proposed method is simple and efficient and avoids becoming trapped in locally optimal solutions.
Keywords: facial landmarks; MAP; Gibbs sampling; MCMC; LL-SVM
Landmarks-Driven Triplet Representation for Facial Expression Similarity
4
Authors: 周逸润, 冯向阳, 朱明. 《Journal of Donghua University(English Edition)》 CAS, 2023, No. 1, pp. 34-44 (11 pages)
Facial landmarks can provide valuable information for expression-related tasks. However, most approaches use landmarks only for segmentation preprocessing or feed them directly into fully connected layers of the neural network. Such a simple combination not only fails to pass spatial information to the network but also increases the amount of computation. The method proposed in this paper integrates a facial landmarks-driven representation into the triplet network. The spatial information provided by the landmarks is introduced into the feature extraction process so that the model can better capture positional relationships. In addition, coordinate information is integrated into the triplet loss calculation to further enhance similarity prediction. Specifically, for each image, the coordinates of 68 landmarks are detected, and a region attention map based on these landmarks is generated. The feature map output by the shallow convolutional layer is multiplied by the attention map to correct the feature activation, strengthening key regions and weakening unimportant ones. Finally, the optimized embedding output can be used for downstream tasks. The three embeddings that the network outputs for three images can be regarded as a triplet representation for similarity computation. The effectiveness of this optimized feature extraction is verified on the CK+ dataset, after which it is applied to facial expression similarity tasks. The results on the facial expression comparison (FEC) dataset show that accuracy improves significantly once the landmark information is introduced.
Keywords: facial expression similarity; facial landmark; triplet network; attention mechanism; feature optimization
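As a reading aid for entry 4, the sketch below shows the standard margin-based triplet loss on Euclidean distances. It is only an illustration under stated assumptions: the abstract says that coordinate information is also integrated into the loss, which is not reproduced here, and the function name `triplet_loss` and the margin value are hypothetical.

```python
import math


def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss: max(0, d(a, p) - d(a, n) + margin).

    Embeddings are plain sequences of floats. The margin of 0.2 is an
    assumed value, not one taken from the paper.
    """
    d_pos = math.dist(anchor, positive)  # distance to the similar sample
    d_neg = math.dist(anchor, negative)  # distance to the dissimilar sample
    return max(0.0, d_pos - d_neg + margin)


# Positive is closer than the negative by more than the margin: no penalty.
loss_easy = triplet_loss((0.0, 0.0), (0.0, 1.0), (3.0, 4.0))
# Positive is farther than the negative: the loss is positive.
loss_hard = triplet_loss((0.0, 0.0), (3.0, 4.0), (0.0, 1.0))
```

The loss is zero once the positive sample is closer to the anchor than the negative sample by at least the margin, which is what drives similar expressions together in the embedding space.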
A Robust Method of Bipolar Mental Illness Detection from Facial Micro Expressions Using Machine Learning Methods
5
Authors: Ghulam Gilanie, Sana Cheema, Akkasha Latif, Anum Saher, Muhammad Ahsan, Hafeez Ullah, Diya Oommen. 《Intelligent Automation & Soft Computing》, 2024, No. 1, pp. 57-71 (15 pages)
Bipolar disorder is a serious mental condition that may be caused by any kind of stress or emotional upset experienced by the patient. It affects a large percentage of people globally, who fluctuate between depression and mania, or vice versa. A pleasant or unpleasant mood is more than a reflection of a state of mind. Normally, it is difficult to analyze through physical examination because of the large patient-psychiatrist ratio, so automated procedures are the best option for diagnosing bipolar disorder and verifying its severity. In this research work, facial micro-expressions are used for bipolar detection with the proposed Convolutional Neural Network (CNN)-based model. The Facial Action Coding System (FACS) is used to extract micro-expressions, called Action Units (AUs), connected with sad, happy, and angry emotions. Experiments were conducted on a dataset collected from Bahawal Victoria Hospital, Bahawalpur, Pakistan, using the Patient Health Questionnaire-15 (PHQ-15) to infer a patient's mental state. The experimental results showed a validation accuracy of 98.99% for the proposed CNN model, while classification of the extracted features using Support Vector Machines (SVM), K-Nearest Neighbour (KNN), and Decision Tree (DT) obtained 99.9%, 98.7%, and 98.9% accuracy, respectively. Overall, the outcomes demonstrated the stated method's superiority over current best practices.
Keywords: bipolar mental illness detection; facial micro-expressions; facial landmarked images
Joint head pose and facial landmark regression from depth images (Cited by: 2)
6
Authors: Jie Wang, Juyong Zhang, Changwei Luo, Falai Chen. 《Computational Visual Media》 CSCD, 2017, No. 3, pp. 229-241 (13 pages)
This paper presents a joint head pose and facial landmark regression method that takes depth images as input for real-time applications. Our main contributions are as follows. Firstly, we propose a joint optimization method to estimate head pose and facial landmarks: the pose regression result provides supervised initialization for cascaded facial landmark regression, while the regression result for the facial landmarks helps to further refine the head pose at each stage. Secondly, we classify the head pose space into 9 sub-spaces and then use a cascaded random forest with a global shape constraint to train facial landmarks in each specific space. This classification-guided method can effectively handle large pose changes and occlusion. Lastly, we have built a 3D face database containing 73 subjects, each with 14 expressions in various head poses. Experiments on challenging databases show that our method achieves state-of-the-art performance on both head pose estimation and facial landmark regression.
Keywords: head pose; facial landmarks; depth images
The deep spatiotemporal network with dual-flow fusion for video-oriented facial expression recognition
7
Authors: Chenquan Gan, Jinhui Yao, Shuaiying Ma, Zufan Zhang, Lianxiang Zhu. 《Digital Communications and Networks》 SCIE CSCD, 2023, No. 6, pp. 1441-1447 (7 pages)
Video-oriented facial expression recognition has always been an important issue in emotion perception. At present, the key challenge for most existing methods is how to effectively extract robust features that characterize the changes in facial appearance and geometry caused by facial motion. On this basis, the video in this paper is divided into multiple segments, each of which is described simultaneously by optical flow and facial landmark trajectories. To mine the emotional information in these two representations more deeply, we propose a Deep Spatiotemporal Network with Dual-flow Fusion (DSN-DF), which highlights the region and strength of expressions through spatiotemporal appearance features and the speed of change through spatiotemporal geometry features. Finally, experiments on the CK+ and MMI datasets demonstrate the superiority of the proposed method.
Keywords: facial expression recognition; deep spatiotemporal network; optical flow; facial landmark trajectory; dual-flow fusion
Robust facial landmark detection and tracking across poses and expressions for in-the-wild monocular video (Cited by: 2)
8
Authors: Shuang Liu, Yongqiang Zhang, Xiaosong Yang, Daming Shi, Jian J. Zhang. 《Computational Visual Media》 CSCD, 2017, No. 1, pp. 33-47 (15 pages)
We present a novel approach for automatically detecting and tracking facial landmarks across poses and expressions from in-the-wild monocular video data, e.g., YouTube videos and smartphone recordings. Our method does not require any calibration or manual adjustment for new input videos or actors. Firstly, we propose a method of robust 2D facial landmark detection across poses that combines shape-face canonical-correlation analysis with a global supervised descent method. Since 2D regression-based methods are sensitive to unstable initialization and ignore the temporal and spatial coherence of videos, we use a coarse-to-dense 3D facial expression reconstruction method to refine the 2D landmarks. On the one hand, we employ an in-the-wild method to extract the coarse reconstruction result and its corresponding texture using the detected sparse facial landmarks, followed by robust pose, expression, and identity estimation. On the other hand, to obtain dense reconstruction results, we present a face tracking flow method that corrects the coarse reconstruction results and tracks weakly textured areas; this is used to iteratively update the coarse face model. Finally, a dense reconstruction result is estimated after convergence. Extensive experiments on a variety of video sequences, recorded by ourselves or downloaded from YouTube, show the results of facial landmark detection and tracking under various lighting conditions, head poses, and facial expressions. The overall performance and a comparison with state-of-the-art methods demonstrate the robustness and effectiveness of our method.
Keywords: face tracking; facial reconstruction; landmark detection
Customized Convolutional Neural Network for Accurate Detection of Deep Fake Images in Video Collections
9
Authors: Dmitry Gura, Bo Dong, Duaa Mehiar, Nidal Al Said. 《Computers, Materials & Continua》 SCIE EI, 2024, No. 5, pp. 1995-2014 (20 pages)
The motivation for this study is that the quality of deep fakes is constantly improving, which creates the need for new detection methods. The proposed Customized Convolutional Neural Network method extracts structured data from video frames using facial landmark detection, which is then used as input to the CNN. The customized Convolutional Neural Network method is a data augmentation-based CNN model to generate 'fake data' or 'fake images'. This study was carried out using Python and its libraries. We used 242 films from the dataset gathered by the Deep Fake Detection Challenge, of which 199 were fake and the remaining 53 were real. Ten seconds were allotted for each video. There were 318 videos used in all, 199 of which were fake and 119 of which were real. Our proposed method achieved a testing accuracy of 91.47%, a loss of 0.342, and an AUC score of 0.92, outperforming two alternative approaches, CNN and MLP-CNN. Furthermore, our method achieved greater accuracy than contemporary models such as XceptionNet, Meso-4, EfficientNet-B0, MesoInception-4, VGG-16, and DST-Net. The novelty of this investigation is the development of a new Convolutional Neural Network (CNN) learning model that can accurately detect deep fake face photos.
Keywords: deep fake detection; video analysis; convolutional neural network; machine learning; video dataset collection; facial landmark prediction; accuracy models
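Facial Landmark Detection Based on a Multi-level Self-attention Network (Cited by: 1)
10
Authors: 徐浩宸, 刘满华. 《计算机工程》 CAS CSCD PKU Core, 2024, No. 2, pp. 239-246 (8 pages)
Facial landmark detection is one of the key steps in face image processing. The commonly used detection approach is coordinate regression based on deep neural networks, which has the advantage of fast processing; however, the high-level network features used for regression lose spatial structure information and lack fine-grained representation ability, which reduces detection accuracy. To address this problem, a facial landmark detection algorithm based on a multi-level self-attention network is proposed. To extract image semantic features with stronger fine-grained representation ability, a multi-level feature fusion module based on the self-attention mechanism is constructed, which fuses high-level features rich in semantic information with low-level features rich in spatial information across levels. On this basis, a multi-task training scheme that jointly learns facial landmark localization and face pose angle estimation is designed; it optimizes the network's estimate of the overall face orientation and thereby improves landmark detection accuracy. Experimental results on the mainstream facial landmark datasets 300W and WFLW show that, compared with methods such as SAAT and AnchorFace, the proposed method effectively improves detection accuracy: the normalized mean error is 3.23% and 4.55%, respectively, which is 0.37 and 0.59 percentage points lower than the baseline model, and the failure rate on the WFLW dataset is 3.56%, which is 2.86 percentage points lower than the baseline model. The method can extract more robust and fine-grained features.
Keywords: facial landmark detection; convolutional neural network; self-attention mechanism; feature fusion; multi-task learning; deep learning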
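Several entries in this listing report a normalized mean error (NME) for landmark detection. The sketch below shows one common way such a metric is computed: the mean Euclidean error over landmarks, divided by a normalizing distance. The choice of normalizer (inter-ocular distance here) and the function name `normalized_mean_error` are assumptions for illustration; the listed papers may normalize differently.

```python
import math


def normalized_mean_error(pred, truth, norm_dist):
    """Mean Euclidean landmark error divided by a normalizing distance.

    pred and truth are equally long sequences of (x, y) points; norm_dist
    is an assumed normalizer such as the inter-ocular distance.
    """
    if len(pred) != len(truth) or not pred:
        raise ValueError("pred and truth must be non-empty and equally long")
    if norm_dist <= 0:
        raise ValueError("norm_dist must be positive")
    total = sum(math.dist(p, t) for p, t in zip(pred, truth))
    return total / (len(pred) * norm_dist)


# Toy example: three landmarks, one of them off by 5 pixels.
truth = [(0.0, 0.0), (10.0, 0.0), (5.0, 5.0)]
pred = [(3.0, 4.0), (10.0, 0.0), (5.0, 5.0)]
inter_ocular = math.dist(truth[0], truth[1])  # treats points 0 and 1 as the eyes
nme = normalized_mean_error(pred, truth, inter_ocular)
```

Multiplying the result by 100 gives the percentage form in which the abstracts quote the metric.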
An Automated and Real-time Approach of Depression Detection from Facial Micro-expressions (Cited by: 2)
11
Authors: Ghulam Gilanie, Mahmood ul Hassan, Mutyyba Asghar, Ali Mustafa Qamar, Hafeez Ullah, Rehan Ullah Khan, Nida Aslam, Irfan Ullah Khan. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 11, pp. 2513-2528 (16 pages)
Depression is a mental disorder that may cause a physical disorder or lead to death. It strongly affects a person's socio-economic life; therefore, its effective and timely detection is needed. Besides speech and gait, facial expressions carry valuable clues to depression. This study proposes a depression detection system based on facial expression analysis. Facial features have been used for depression detection with a Support Vector Machine (SVM) and a Convolutional Neural Network (CNN). We extracted micro-expressions using the Facial Action Coding System (FACS) as Action Units (AUs) correlated with the sad, disgust, and contempt features for depression detection. A CNN-based model is also proposed to automatically classify depressed subjects from images or videos in real time. Experiments were performed on a dataset obtained from Bahawal Victoria Hospital, Bahawalpur, Pakistan, using the Patient Health Questionnaire depression scale (PHQ-8) to infer the mental condition of a patient. The experiments showed 99.9% validation accuracy for the proposed CNN model, while the extracted features obtained 100% accuracy with the SVM. Moreover, the results demonstrated the superiority of the reported approach over state-of-the-art methods.
Keywords: depression detection; facial micro-expressions; facial landmarked images
A novel facial emotion recognition scheme based on graph mining (Cited by: 1)
12
Authors: Alia K. Hassan, Suhaila N. Mohammed. 《Defence Technology(防务技术)》 SCIE EI CAS CSCD, 2020, No. 5, pp. 1062-1072 (11 pages)
Recent years have seen an explosion in graph data from a variety of scientific, social and technological fields. Among these fields, emotion recognition is an interesting research area because it has many real-life applications, such as effective social robotics that increases the robot's interactivity with humans, driver safety, and pain monitoring during surgery. This paper proposes a novel facial emotion recognition scheme based on graph mining that changes the way the face region is represented: the face region is represented as a graph of nodes and edges, and the gSpan frequent sub-graph mining algorithm is used to find the frequent sub-structures in the graph database of each emotion. An overlap ratio metric is used to reduce the number of generated sub-graphs. After the final selected sub-graphs are encoded, binary classification is applied to classify the emotion of the queried input facial image using six levels of classification. Binary cat swarm intelligence is applied within each level of classification to select the sub-graphs that give the highest accuracy at that level. Experiments were conducted using the Surrey Audio-Visual Expressed Emotion (SAVEE) database, and the final system accuracy was 90.00%. The results show a significant accuracy improvement (about 2%) for the proposed system over currently published work on the SAVEE database.
Keywords: emotion recognition; facial landmarks; graph mining; gSpan algorithm; binary cat swarm optimization (BCSO); neural network
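A Survey of Facial Landmark Detection Research
13
Authors: 张晓行, 田启川, 廉露, 谭润. 《计算机工程与应用》 CSCD PKU Core, 2024, No. 12, pp. 48-60 (13 pages)
With the rapid development of computer vision and related technologies, fields such as human-computer interaction, medical assistance, and security surveillance have risen rapidly. As an important task in these fields, facial landmark detection has attracted considerable attention; it locates and detects facial landmarks in images or videos and has high practical value. Based on a review and analysis of the current state of research, facial landmark detection methods are divided into traditional methods and deep-learning-based methods. The principles, advantages, and disadvantages of each class of methods are compared and analyzed; commonly used datasets and evaluation metrics are introduced; and the performance of key methods on different datasets is comprehensively evaluated. Finally, the application areas of facial landmark detection are summarized and future directions are discussed.
Keywords: facial landmark detection; deep learning; traditional facial landmark detection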
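A BMI Estimation Method Fusing Depth and Appearance Features of Face Images
14
Authors: 向成豪, 郑秀娟, 庄嘉良, 张畅. 《传感器与微系统》 CSCD PKU Core, 2024, No. 1, pp. 135-138, 144 (5 pages)
Body mass index (BMI) is an important indicator of human health. 3D face information is estimated from 2D frontal face images, and an end-to-end BMI estimation framework is proposed to further improve BMI estimation performance. First, 468 3D facial landmarks are computed, and a depth face map is drawn according to the depth of each landmark relative to the head centroid. Second, the histogram of oriented gradients (HOG) of the face image is extracted and visualized to represent appearance features. Finally, the convolutional neural networks (CNNs) VGGNet and ResNet extract features from the depth face map and the HOG, respectively, and the features of the two backbone networks are fused with the Hadamard product to estimate BMI. In comparison experiments with existing methods, the overall mean absolute error (MAE) of the proposed method on two public datasets is 0.38 and 1 lower, respectively, than the best existing results. These results demonstrate the effectiveness of the proposed BMI estimation method, which fuses 3D face image depth and appearance features.
Keywords: body mass index estimation; 3D facial landmarks; face mesh model; histogram of oriented gradients; deep convolutional neural network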
Research on Facial Expression Capture Based on Two-Stage Neural Network
15
Authors: Zhenzhou Wang, Shao Cui, Xiang Wang, JiaFeng Tian. 《Computers, Materials & Continua》 SCIE EI, 2022, No. 9, pp. 4709-4725 (17 pages)
To generate realistic three-dimensional animation of a virtual character, capturing real facial expressions is the primary task. Because facial expressions are diverse and backgrounds are complex, the facial landmarks recognized by existing strategies suffer from deviations and low accuracy. Therefore, this paper proposes a facial expression capture method based on a two-stage neural network, which takes advantage of improved multi-task cascaded convolutional networks (MTCNN) and a high-resolution network. Firstly, the convolution operation of the traditional MTCNN is improved. The face information in the input image is quickly filtered by feature fusion in the first stage, and Octave Convolution replaces the original convolutions in the second stage to enhance the feature extraction ability of the network, which further rejects a large number of false candidates. The model locates the faces and outputs more accurate facial candidate windows for better landmark recognition. Then the images cropped after face detection are fed into the high-resolution network. Multi-scale feature fusion is realized by connecting multi-resolution streams in parallel, and rich high-resolution heatmaps of the facial landmarks are obtained. Finally, changes in the recognized facial landmarks are tracked in real time. The expression parameters are extracted and transmitted to the Unity3D engine to drive the virtual character's face, which realizes synchronous facial expression animation. Extensive experimental results on the WFLW database demonstrate the superiority of the proposed method in terms of accuracy and robustness, especially for diverse expressions and complex backgrounds. The method can accurately capture facial expressions and generate three-dimensional animation effects, making online entertainment and social interaction more immersive in shared virtual spaces.
Keywords: facial expression capture; facial landmarks; multi-task cascaded convolutional networks; high-resolution network; animation generation
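Research on Student Face and Facial Landmark Detection Based on an Improved SSD Algorithm
16
Authors: 梁岱立, 殷文雪, 肖艳辉, 李嘉欣, 李鑫. 《工业控制计算机》, 2024, No. 2, pp. 133-134, 162 (3 pages)
Classroom lecturing is the main form of teaching, and students' learning state in class is reflected in their facial information. To obtain students' facial information accurately, a face and facial landmark detection model better suited to students in classrooms is proposed. Because students seated in the back rows of a classroom are small targets, the three feature maps at the front end of the network are fused so that these students can be detected accurately, which improves the network's ability to detect small targets. The aspect ratios used to generate candidate prediction boxes are also improved according to the physiological structure of students' faces. To verify the algorithm, comparative experiments were designed; the proposed model performs best on datasets of all three difficulty levels.
Keywords: SSD model; facial landmark detection; feature fusion; student face detection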
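An Ultra-lightweight Facial Landmark Detection Algorithm (Cited by: 4)
17
Authors: 朱望纯, 张博. 《电子测量技术》 PKU Core, 2023, No. 5, pp. 98-104 (7 pages)
As research on deep learning networks deepens and model accuracy improves, the number of layers and the depth of networks gradually increase, which increases the amount of computation. At the same time, the need to deploy deep-learning-based facial landmark detection on embedded devices makes lightweight, efficient, and accurate network models a key research topic. This paper therefore designs an ultra-lightweight facial landmark detection algorithm based on the Ghost Module block and the Ghost Bottleneck architecture, which reduces the model size and the amount of computation as far as possible while maintaining accuracy. With a network width factor of 1X, compared with PFLD 1X, the best-performing existing lightweight model, the normalized mean error is reduced by 7% and the number of parameters by 36%. With a width factor of 0.25X, the proposed model is only 420 KB in size, the normalized mean error is reduced by 6.6%, and the number of parameters by 25%.
Keywords: ultra-lightweight; deep learning; facial landmarks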
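Stable Facial Landmark Detection Based on an Improved SBR Algorithm (Cited by: 1)
18
Authors: 王宇, 胡哲昊, 涂晓光, 刘建华, 蒋涛, 许将军, 原子昊, 杜金花. 《电讯技术》 PKU Core, 2023, No. 5, pp. 719-724 (6 pages)
Image-based landmark detectors achieve excellent performance on static images, but their accuracy and stability drop significantly when they are applied to videos or image sequences. The Supervision-by-Registration (SBR) algorithm uses Lucas-Kanade (LK) optical flow tracking to train video landmark detectors from unlabeled videos and has achieved good results. However, the LK algorithm still has limitations, so the detected landmark sequences lack strong spatio-temporal coherence. To obtain accurate, stable, and coherent facial landmark sequences, a smoothness consistency loss function and a weight mask function are proposed to improve the conventional SBR network model. A Long Short-Term Memory (LSTM) network is added to improve the robustness of training, and the smoothness consistency loss provides a stability constraint during training, yielding an accurate and stable landmark detector for face videos. Validation on the 300VW and Youtube Celebrities datasets shows that the improved SBR model reduces the Normalized Mean Error (NME) of landmark detection in face videos from 4.74 to 4.56 and visibly reduces landmark jitter.
Keywords: facial landmark detection; Supervision-by-Registration (SBR) algorithm; Long Short-Term Memory (LSTM) network; LK optical flow algorithm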
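Research on a Transformer-based Portrait Landmark Detection Network (Cited by: 3)
19
Authors: 陈凯, 林珊玲, 林坚普, 林志贤, 缪志辉, 郭太良. 《计算机应用研究》 CSCD PKU Core, 2023, No. 6, pp. 1870-1875, 1881 (7 pages)
To address the inability of current convolution-based landmark detection models to capture relationships between distant landmarks, a portrait landmark detection network with parallel Transformer and CNN (convolutional network) branches is proposed, called MCTN (multi-branch convolution-Transformer network). It uses the dynamic attention mechanism of the Transformer to model long-range relationships between landmarks, and its multi-branch parallel design gives MCTN properties such as shared weights and global information fusion. In addition, a new Transformer structure called Deformer is proposed, which concentrates attention weights on sparse, meaningful positions more quickly and thus addresses the slow convergence of the Transformer. In portrait landmark detection experiments on the WFLW, 300W, and COFW datasets, the normalized mean error reaches 4.33%, 3.12%, and 3.15%, respectively. The results show that, with its parallel multi-branch Transformer-CNN structure and the Deformer structure, MCTN substantially outperforms convolution-based landmark detection algorithms.
Keywords: computer vision; deep learning; facial landmark detection; self-attention; Transformer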
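Driver Fatigue Detection Based on a Facial Inverted Pendulum Model and Information Entropy (Cited by: 1)
20
Authors: 李泰国, 张天策, 李超, 周星宏. 《交通运输系统工程与信息》 EI CSCD PKU Core, 2023, No. 5, pp. 24-32 (9 pages)
Research on driver fatigue detection helps reduce traffic accidents. This paper proposes a fatigue detection method based on a facial inverted pendulum model and information entropy. First, the PFLD (Practical Facial Landmark Detector) model detects the coordinates of the driver's facial landmarks and estimates the pitch, yaw, and roll angles that represent head pose. Then, a facial inverted pendulum model is built with the landmark coordinates as input, and the kinetic and potential energy of the model's linkage system during driving are computed. Next, the kinetic energy, potential energy, and head pose data of the inverted pendulum model are used as features indicating changes in the driver's fatigue state, and the information entropy of each fatigue feature is computed over a sliding window. A CNN (Convolutional Neural Network) processes the entropy values of the fatigue features to establish the link between information entropy and the driver's fatigue state. Finally, the CNN output at each time step is used as the input feature of an LSTM (Long Short-Term Memory) network, and the CNN-LSTM model classifies and predicts the information entropy of the fatigue features. Experimental results show that the prediction result of the proposed model reaches 95.04%, which verifies the effectiveness of the method.
Keywords: intelligent transportation; fatigue detection; inverted pendulum model; facial landmarks; head pose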
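Entry 20 computes the information entropy of each fatigue feature over a sliding window. The sketch below illustrates that step in its simplest form: Shannon entropy of a histogram, with a window advanced one sample at a time. The bin count, the value range, the stride, and both function names are assumptions for illustration; the abstract does not specify them.

```python
import math
from collections import Counter


def shannon_entropy(values, bins, lo, hi):
    """Shannon entropy (in bits) of values histogrammed into equal-width bins.

    Values are assumed to lie in [lo, hi]; hi itself falls in the last bin.
    """
    width = (hi - lo) / bins
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts.values())


def sliding_entropy(signal, window, bins, lo, hi):
    """Entropy of each length-`window` slice, advancing one sample at a time."""
    return [
        shannon_entropy(signal[i:i + window], bins, lo, hi)
        for i in range(len(signal) - window + 1)
    ]


# Toy feature trace: constant, then switching to another level.
trace = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
entropies = sliding_entropy(trace, window=4, bins=2, lo=0.0, hi=1.0)
```

In this toy trace the entropy is zero while the feature is constant and peaks at one bit in the window that straddles the change, which is the kind of variation signal a downstream classifier could consume.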