Deep learning-based foveated rendering in 3D space: a review

Original title: 三维场景注视点渲染深度学习方法综述

Abstract: Achieving real-time, photorealistic rendering on large high-resolution displays and head-mounted displays remains one of the main challenges in computer graphics. Foveated rendering exploits the limitations of the human visual system and adjusts image rendering quality according to the gaze point, greatly increasing rendering speed without reducing the quality the user perceives. As deep learning has become widely used in rendering, a large number of new deep learning-based foveated rendering methods have emerged. This paper reviews the latest methods in foveated rendering from the perspective of deep learning. First, it outlines the background of human visual perception. Next, it briefly introduces the most representative non-deep-learning methods in foveated rendering, including adaptive resolution, geometric simplification, shading simplification, and hardware implementation, and summarizes their advantages and disadvantages. It then describes the criteria used to evaluate the deep learning methods, including common evaluation metrics for foveated images and for gaze prediction. After that, it subdivides the deep learning methods in foveated rendering into super-resolution, denoising, image completion, image synthesis, gaze prediction, and image application, and gives a detailed overview and summary of each. Finally, it presents the problems and challenges that deep learning methods currently face. This discussion shows in more detail the research prospects and development directions of deep learning in foveated rendering, and offers a reference for later researchers in choosing research directions and designing network architectures.

Extended abstract: The widespread adoption of virtual reality (VR) and augmented reality technologies across sectors including healthcare, education, the military, and entertainment has pushed head-mounted displays with high resolution and wide fields of view to the forefront of display devices. However, attaining a satisfactory level of immersion and interactivity is a primary challenge in VR, and latency can cause user discomfort in the form of dizziness and nausea. Multiple studies have underscored that a highly realistic VR experience that remains comfortable for the user requires raising the screen's image refresh rate to 1800 Hz and keeping latency below 3 to 40 ms. Achieving real-time, photorealistic rendering at high resolution and low latency is a formidable objective. Foveated rendering is an effective approach to these issues: it adjusts the rendering quality across the image based on gaze position, maintaining high quality in the foveal area while reducing quality in the periphery. This technique yields substantial computational savings and faster rendering without a perceptible loss in visual quality. Previous reviews have examined technical approaches to foveated rendering, but they focused on categorizing the implementation techniques, and a comprehensive review from the machine learning perspective is still missing. With the ongoing advances of machine learning in the rendering field, combining machine learning with foveated rendering is considered a promising research area, especially in postprocessing, where machine learning methods have great potential. Non-machine-learning methods inevitably introduce artifacts. By contrast, machine learning methods have a wide range of applications in rendering postprocessing, where they can optimize and improve foveated rendering results and enhance the realism and immersion of foveated images in ways that non-machine-learning approaches cannot. This work therefore presents a comprehensive overview of foveated rendering from a machine learning perspective. We first provide an overview of the background knowledge of human visual perception, including the human visual system, contrast sensitivity functions, visual acuity models, and visual crowding. We then briefly describe the most representative non-machine-learning methods for foveated rendering, including adaptive resolution, geometric simplification, shading simplification, and hardware implementation, and summarize their features, advantages, and disadvantages. We also describe the criteria used for method evaluation in this review, including evaluation metrics for foveated images and for gaze prediction. Next, we subdivide machine learning methods into super-resolution, denoising, image reconstruction, image synthesis, gaze prediction, and image application, and summarize them in detail in terms of four aspects: result quality, network speed, user experience, and the ability to handle objects. Among them, super-resolution methods commonly use more neural blocks in the foveal region and fewer in the peripheral region, resulting in super-resolution quality that varies by region. Similarly, foveated denoising usually performs fine denoising in the fovea and coarse denoising in the periphery, but denoising has yet to receive extensive attention. The initial attempt to integrate image reconstruction with gaze used generative adversarial networks (GANs) and yielded promising outcomes. Some researchers then combined direct prediction and kernel prediction for image reconstruction, which is currently the state of the art in this field. Gaze prediction is a key development direction for future VR rendering and is mostly combined with saliency detection to predict the gaze location. Substantial work remains in the field, and unfortunately only a small portion of existing methods can run in real time. Finally, we present the current problems and challenges that machine learning methods face. Our review of machine learning approaches in foveated rendering not only elucidates the research prospects and development direction but also provides insights for future researchers in choosing research directions and designing network architectures.
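The core idea stated in the abstract, keeping high quality near the gaze position and lowering it toward the periphery, can be illustrated with a minimal sketch. This is not a method from the reviewed paper: the function names, the tile size, the 5 and 15 degree band thresholds, and the flat-screen approximation (pixel distance divided by a constant `pixels_per_degree`) are all assumptions chosen for illustration.

```python
import math


def eccentricity_deg(px, py, gaze_x, gaze_y, pixels_per_degree):
    """Approximate angular distance (degrees) from the gaze point.

    Assumption: a flat-screen approximation with a constant
    pixels-per-degree factor, which is only reasonable near the gaze.
    """
    return math.hypot(px - gaze_x, py - gaze_y) / pixels_per_degree


def quality_level(ecc_deg, fovea_deg=5.0, mid_deg=15.0):
    """Map eccentricity to a quality band.

    0 = full quality (foveal), 1 = reduced, 2 = coarsest (periphery).
    The thresholds are hypothetical, not taken from the paper.
    """
    if ecc_deg <= fovea_deg:
        return 0
    if ecc_deg <= mid_deg:
        return 1
    return 2


def quality_map(width, height, gaze, pixels_per_degree, tile=16):
    """Quality band for every tile, evaluated at the tile centre."""
    gaze_x, gaze_y = gaze
    rows = []
    for tile_y in range(0, height, tile):
        row = []
        for tile_x in range(0, width, tile):
            centre_x = tile_x + tile / 2
            centre_y = tile_y + tile / 2
            ecc = eccentricity_deg(centre_x, centre_y, gaze_x, gaze_y,
                                   pixels_per_degree)
            row.append(quality_level(ecc))
        rows.append(row)
    return rows


if __name__ == "__main__":
    # A 64x64 image, gaze at the centre of the top-left tile.
    for row in quality_map(64, 64, gaze=(8, 8), pixels_per_degree=2.0):
        print(row)
```

A renderer or a region-aware network could consume such a map to decide, per tile, how many samples, shading passes, or neural blocks to spend; the three-band split is only one possible discretization of a continuous acuity falloff.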

Authors: Li Yingqun (李英群); Hu Xiao (胡啸); Xu Xiang (徐翔); Xu Yanning (徐延宁); Wang Lu (王璐) (School of Software, Shandong University, Jinan 250101, China; Shandong Key Laboratory of Blockchain Finance, Shandong University of Finance and Economics, Jinan 250014, China)

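Affiliations: School of Software, Shandong University; Shandong Key Laboratory of Blockchain Finance, Shandong University of Finance and Economics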

Source: Journal of Image and Graphics (《中国图象图形学报》), indexed in CSCD and the Peking University Core Journals list (北大核心), 2024, Issue 10, pp. 2955-2978 (24 pages)

Funding: National Key Research and Development Program of China (2022YFB3303203); National Natural Science Foundation of China (62272275).

Keywords: foveated rendering; deep learning; real-time rendering; eye fixations prediction; image reconstruction; super-resolution; ray tracing denoising

Classification: TP391 [Automation and Computer Technology: Computer Application Technology]
