Pulse rate is one of the important characteristics of traditional Chinese medicine pulse diagnosis, and it is of great significance for determining the cold or heat nature of diseases. Predicting pulse rate from facial video is an exciting research field because it obtains palpation information through observation diagnosis. However, most studies focus on optimizing the algorithm on a small sample of participants without systematically investigating multiple influencing factors. A total of 209 participants and 2,435 facial videos, drawn from our self-constructed Multi-Scene Sign Dataset and public datasets, were used to perform a multi-level, multi-factor comprehensive comparison. The effects of different datasets, blood volume pulse signal extraction algorithms, regions of interest, time windows, color spaces, pulse rate calculation methods, and video recording scenes were analyzed. Furthermore, we proposed a blood volume pulse signal quality optimization strategy based on the inverse Fourier transform and an improvement strategy for pulse rate estimation based on signal-to-noise ratio threshold sliding. We found that video-based pulse rate estimation performed better on the Multi-Scene Sign Dataset and the Pulse Rate Detection Dataset than on other datasets. Compared with the fast independent component analysis and Single Channel algorithms, the chrominance-based and plane-orthogonal-to-skin algorithms have stronger anti-interference ability and higher robustness. The five-organs fusion area and the full-face area performed better than single sub-regions, and fewer motion artifacts and better lighting improved the precision of pulse rate estimation.
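As a rough illustration of one frequency-domain pulse rate calculation of the kind the abstract compares, the sketch below takes an already extracted blood volume pulse signal, finds the strongest spectral peak in a plausible heart-rate band, and reports a simple signal-to-noise ratio. It is not the authors' implementation: the function names, the 0.7 to 4.0 Hz band, and the SNR definition (peak power against the remaining in-band power) are all assumptions made for this example.

```python
import cmath
import math


def power_spectrum(signal, fs):
    """Return (frequencies in Hz, power) for the non-negative DFT bins."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]  # remove the DC offset
    freqs, power = [], []
    for k in range(n // 2 + 1):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        freqs.append(k * fs / n)
        power.append(abs(coeff) ** 2)
    return freqs, power


def estimate_pulse_rate(signal, fs, low_hz=0.7, high_hz=4.0):
    """Estimate pulse rate (beats per minute) and an in-band SNR (dB).

    The band limits and the SNR definition are assumptions of this sketch.
    """
    freqs, power = power_spectrum(signal, fs)
    band = [(p, f) for f, p in zip(freqs, power) if low_hz <= f <= high_hz]
    peak_power, peak_freq = max(band)  # strongest in-band bin
    noise_power = sum(p for p, _ in band) - peak_power
    if noise_power > 0:
        snr_db = 10 * math.log10(peak_power / noise_power)
    else:
        snr_db = float("inf")
    return peak_freq * 60.0, snr_db


# Synthetic check: a 1.2 Hz pulse plus slow 0.3 Hz drift, 10 s at 30 frames/s.
fs = 30
samples = [
    math.sin(2 * math.pi * 1.2 * i / fs) + 0.3 * math.sin(2 * math.pi * 0.3 * i / fs)
    for i in range(10 * fs)
]
bpm, snr_db = estimate_pulse_rate(samples, fs)
```

A threshold on `snr_db` could then be used to accept or reject the estimate for a given time window, which is one plausible reading of an SNR-threshold strategy; the abstract does not give the details of the authors' sliding scheme.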
Deepfake technology can be used to replace people’s faces in videos or pictures to show them saying or doing things they never said or did. Deepfake media are often used to extort, defame, and manipulate public opinion. However, despite deepfake technology’s risks, current deepfake detection methods lack generalization and are inconsistent when applied to unknown videos, i.e., videos on which they have not been trained. The purpose of this study is to develop a generalizable deepfake detection model by training convolutional neural networks (CNNs) to classify human facial features in videos. The study formulated the research question: “How effectively does the developed model provide reliable generalizations?” A CNN model was trained to distinguish between real and fake videos using the facial features of human subjects in videos. The model was trained, validated, and tested using the FaceForensics++ dataset, which contains more than 500,000 frames, and subsets of the DFDC dataset, totaling more than 22,000 videos. The study demonstrated high generalizability, as accuracy on the unknown dataset was only marginally (about 1%) lower than on the known dataset. The findings of this study indicate that detection systems can be more generalizable, lighter, and faster by focusing on just a small region (the human face) of an entire video.
Image photoplethysmography enables low-cost, easy-to-operate non-contact heart rate detection from facial video and effectively overcomes the limitations of traditional contact methods in daily vital sign monitoring. However, it is hard to obtain accurate heart rate values when the subject’s face moves, the ambient light is weak, or the detection distance is long. In this article, a non-contact heart rate detection method based on face tracking is proposed, which can effectively improve the accuracy of non-contact heart rate detection in practical applications. A corner tracker algorithm is used to track the human face, reducing the motion artifacts caused by movement of the subject’s face and increasing the usable value of the signal. The maximum ratio combining algorithm is then used to weight the pixel-space pulse wave signals in the facial region of interest to improve pulse wave extraction accuracy. We analyzed facial images collected at different experimental distances and in different action states. The proposed method significantly reduces the error rate compared with the independent component analysis method. Theoretical analysis and experimental verification show that the method effectively reduces the error rate under different experimental variables and agrees well with the heart rate values collected by a medical physiological vest. This method will help to improve the accuracy of non-contact heart rate detection in complex environments.
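The weighting idea behind maximum ratio combining can be sketched as follows: each sub-region's pulse signal is weighted in proportion to its estimated pulse amplitude divided by its noise variance, so cleaner regions contribute more to the combined signal. This is a generic textbook form, not the article's implementation. The function names are hypothetical, the per-region amplitudes and noise variances are assumed to be given (in practice they would have to be estimated, for example from each region's spectrum), and normalizing the weights to sum to one is a scaling choice made for this example.

```python
def mrc_weights(amplitudes, noise_variances):
    """Weights proportional to amplitude / noise variance, normalized to sum to 1."""
    raw = [a / v for a, v in zip(amplitudes, noise_variances)]
    total = sum(raw)
    return [w / total for w in raw]


def mrc_combine(signals, amplitudes, noise_variances):
    """Combine equal-length region signals into one weighted pulse signal."""
    weights = mrc_weights(amplitudes, noise_variances)
    length = len(signals[0])
    return [
        sum(w * s[i] for w, s in zip(weights, signals))
        for i in range(length)
    ]


# Two hypothetical regions with equal pulse amplitude; the second is noisier.
weights = mrc_weights([1.0, 1.0], [0.1, 0.4])
combined = mrc_combine([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0], [0.1, 0.4])
```

With these inputs the first region receives four times the weight of the second, because its assumed noise variance is four times smaller.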
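Objective: To establish a localized Chinese Facial Expression Video System (CFEVS) in order to broaden the range of material available for emotion research. Methods: Video clips were recorded of facial expressions of joy, sadness, surprise, fear, anger, and disgust at three intensity levels, together with neutral clips (two types: expressionless and chewing). After two rounds of rough screening, 50 Chinese university students gave self-report ratings of the expression type, pleasantness, arousal, and the performer's appearance for each remaining clip. Clips with high agreement on expression type, pleasantness, and arousal, and whose expression type was consistent with their pleasantness rating, were included in the CFEVS. Their distribution was analyzed, along with the effects of rater gender and performer appearance on pleasantness and arousal scores. Results: The CFEVS included 61 joy clips (18 male, 43 female), 51 sadness clips (23 male, 28 female), 31 expressionless neutral clips (13 male, 17 female), and 24 chewing neutral clips (7 male, 17 female). Scatter plots showed that the CFEVS clips were widely distributed in pleasantness and arousal. Analysis of variance indicated that the effects of rater gender and performer appearance on the pleasantness and arousal of the clips depended on expression type. Conclusion: This study preliminarily established a CFEVS containing joy, sadness, and neutral expressions, and found that rater gender and performer appearance can influence experimental results.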
Funding: Supported by the Key Research Program of the Chinese Academy of Sciences (grant number ZDRW-ZS-2021-1-2).