Deepfake technology can be used to replace people’s faces in videos or pictures to show them saying or doing things they never said or did. Deepfake media are often used to extort, defame, and manipulate public opinion. However, despite these risks, current deepfake detection methods generalize poorly and are inconsistent when applied to unknown videos, i.e., videos on which they have not been trained. The purpose of this study is to develop a generalizable deepfake detection model by training convolutional neural networks (CNNs) to classify human facial features in videos. The study posed the research question: “How effectively does the developed model provide reliable generalizations?” A CNN model was trained to distinguish between real and fake videos using the facial features of human subjects in videos. The model was trained, validated, and tested using the FaceForensics++ dataset, which contains more than 500,000 frames, and subsets of the DFDC dataset, totaling more than 22,000 videos. The study demonstrated high generalizability, as the accuracy on the unknown dataset was only marginally (about 1%) lower than that on the known dataset. The findings indicate that detection systems can be more generalizable, lighter, and faster by focusing on just a small region (the human face) of an entire video.
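Objective: To investigate the feasibility of a deep learning (DL) based ResNet+VST model for automatic key-frame detection in echocardiography. Methods: A total of 663 echocardiographic video clips covering three views commonly used in clinical examinations, apical two-chamber (A2C), apical three-chamber (A3C), and apical four-chamber (A4C), were collected by the Department of Ultrasound Medicine, Nanjing Drum Tower Hospital, Affiliated Hospital of Medical School, Nanjing University, and 280 A4C clips were taken from the public EchoNet-Dynamic dataset. These formed the Nanjing Drum Tower Hospital dataset and the EchoNet-Dynamic-Tiny dataset, respectively. The clips in each class were split 4:1 into training and test sets. The ResNet+VST model was trained and its performance was compared with that of several other key-frame detection models. Results: The ResNet+VST model detected end-diastole (ED) and end-systole (ES) frames more accurately. On the Nanjing Drum Tower Hospital dataset, the ED prediction frame differences for the A2C, A3C, and A4C views were 1.52±1.09, 1.62±1.43, and 1.27±1.17, respectively, and the ES prediction frame differences were 1.56±1.16, 1.62±1.43, and 1.45±1.38. On the EchoNet-Dynamic-Tiny dataset, the ED and ES prediction frame differences for the A4C view were 1.62±1.26 and 1.71±1.18, respectively, better than existing related studies. The ResNet+VST model also showed good real-time performance: on an RTX 3090 Ti GPU, the average inference time for a 16-frame ultrasound sequence clip was 21 ms on the Nanjing Drum Tower Hospital dataset and 10 ms on the EchoNet-Dynamic-Tiny dataset, faster than related studies that use long short-term memory (LSTM) units for temporal modeling and largely sufficient for immediate clinical processing. Conclusion: The proposed ResNet+VST model outperforms existing studies in both the accuracy and the real-time performance of echocardiographic key-frame detection. In principle the model can be extended to any ultrasound view, and it has the potential to help sonographers improve diagnostic efficiency.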
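The frame differences reported above are given as mean ± standard deviation. The following is a minimal sketch of how such a figure could be computed, assuming the metric is the absolute difference between the predicted and annotated frame index and that the spread is a population standard deviation; the abstract states neither detail. The function name `frame_difference_stats` and the sample frame indices are hypothetical.

```python
from statistics import mean, pstdev


def frame_difference_stats(predicted, annotated):
    """Return (mean, std) of |predicted - annotated| over all clips.

    Assumption: the "frame difference" is an absolute index difference and
    the spread is a population standard deviation.
    """
    if not predicted or len(predicted) != len(annotated):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [abs(p - a) for p, a in zip(predicted, annotated)]
    return mean(diffs), pstdev(diffs)


# Hypothetical ED frame indices for four clips (not data from the study).
predicted_ed = [12, 40, 71, 103]
annotated_ed = [11, 42, 71, 104]

avg, std = frame_difference_stats(predicted_ed, annotated_ed)
print(f"ED frame difference: {avg:.2f} ± {std:.2f}")
```

A value near 1.5 frames, as in the abstract, would mean the predicted key frame typically lands within one or two frames of the annotated one.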
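To address the haze and blur in videos and images captured in hazy weather, a video and image dehazing and enhancement algorithm is proposed that combines the atmospheric scattering model with associated-frame compensation. First, a multi-dimensional spatial weight attention module is designed to extract spatial information and shift the weights of different feature information, improving its utilization. Second, a parameter estimation sub-network is constructed to extract the atmospheric light and the transmission map, and the clear image is obtained through the atmospheric scattering model. Third, an associated-frame compensation mechanism is proposed that exploits the correlation between video frames to improve the accuracy of parameter estimation and reduce the learning difficulty of the network. Finally, a polynomial loss function is designed to further improve output quality. Experimental results on multiple datasets show that the structural similarity index measure (SSIM) and peak signal-to-noise ratio (PSNR) after processing reach 0.91 and 27.13 dB, respectively, both better than the classic and recent algorithms compared. The algorithm removes haze effectively while enhancing features such as texture detail, meets the requirements of real-time video and image dehazing, and provides a good basis for subsequent artificial-intelligence-based vision tasks.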
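The atmospheric scattering model referred to above is commonly written as I(x) = J(x)·t(x) + A·(1 − t(x)), where I is the hazy image, J the clear scene, t the transmission map, and A the atmospheric light. The sketch below illustrates only that inversion step and the PSNR metric on a tiny grayscale image. It is not the paper's network: here t and A are simply given rather than estimated, and the function names, the `t_min` clamp, and the sample values are assumptions.

```python
import math


def apply_haze(clear, transmission, airlight):
    """Forward model: I = J * t + A * (1 - t), per pixel."""
    return [[j * t + airlight * (1.0 - t) for j, t in zip(row_j, row_t)]
            for row_j, row_t in zip(clear, transmission)]


def recover_scene(hazy, transmission, airlight, t_min=0.1):
    """Inverse model: J = (I - A) / max(t, t_min) + A, per pixel.

    t_min (an assumed safeguard) avoids dividing by near-zero transmission.
    """
    return [[(i - airlight) / max(t, t_min) + airlight
             for i, t in zip(row_i, row_t)]
            for row_i, row_t in zip(hazy, transmission)]


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(MAX^2 / MSE)."""
    errors = [(r - e) ** 2
              for row_r, row_e in zip(reference, estimate)
              for r, e in zip(row_r, row_e)]
    mse = sum(errors) / len(errors)
    return math.inf if mse == 0 else 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 2x2 grayscale scene with intensities in [0, 1].
clear = [[0.2, 0.8], [0.5, 0.1]]
transmission = [[0.9, 0.6], [0.4, 0.7]]
airlight = 0.95

hazy = apply_haze(clear, transmission, airlight)
restored = recover_scene(hazy, transmission, airlight)
print("PSNR of hazy input:", round(psnr(clear, hazy), 2), "dB")
```

With the true t and A the scene is recovered almost exactly, which is why the accuracy of the estimated parameters matters so much for the final image.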
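To address the poor reliability and low accuracy of installing steel pipe pile foundations for deep-sea wind turbines at a water depth of 45.000 m, the structure of a new guide frame for installing such foundations is optimized. Combining the finite element method (FEM) with testing, the guide frame is studied comprehensively in terms of environmental parameters and applied loads, structural form, operating conditions, and structural strength and stability. Sea trials verified that the piling accuracy and piling efficiency of the optimized guide frame both meet the technical specifications, and that the frame can substantially improve the speed and quality of steel pipe pile foundation installation for deep-sea wind turbines.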