Funding: Supported by the National Natural Science Foundation of China (61872024) and the National Key R&D Program of China (Grant 2018YFB2100603).
Abstract: Background In this study, we propose a novel 3D scene graph prediction approach for scene understanding from point clouds. Methods The approach automatically organizes the entities of a scene in a graph, where objects are nodes and their relationships are modeled as edges. More specifically, we employ DGCNN to capture the features of objects and their relationships in the scene. A Graph Attention Network (GAT) is introduced to exploit latent features obtained from the initial estimation to further refine the object arrangement in the graph structure. A loss function modified from cross entropy with a variable weight is proposed to solve the multi-category problem in the prediction of objects and predicates. Results Experiments reveal that the proposed approach performs favorably against state-of-the-art methods in terms of predicate classification and relationship prediction and achieves comparable performance on object classification. Conclusions The 3D scene graph prediction approach can form an abstract description of the scene space from point clouds.
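The abstract above mentions a cross-entropy loss with a variable weight but does not give its form. The following is only a minimal sketch of one common variant, a class-weighted cross entropy; the inverse-frequency weighting and the function names `weighted_cross_entropy` and `inverse_frequency_weights` are assumptions for illustration, not the authors' formulation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def inverse_frequency_weights(labels, num_classes):
    """Hypothetical weighting scheme: rarer classes get larger weights."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    # Classes that never occur get weight 0 to avoid division by zero.
    return [n / (num_classes * c) if c > 0 else 0.0 for c in counts]

def weighted_cross_entropy(logits, target, class_weights):
    """Cross entropy for one sample, scaled by the weight of its true class."""
    probs = softmax(logits)
    return -class_weights[target] * math.log(probs[target])

# Toy usage: class 1 is rare, so its errors are penalized more heavily.
weights = inverse_frequency_weights([0, 0, 0, 1], num_classes=2)
loss = weighted_cross_entropy([0.0, 0.0], target=1, class_weights=weights)
```

With such a weighting, frequent object or predicate classes contribute less per sample, which is one way to counter class imbalance.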
Funding: National Natural Science Foundation of China (Grant Number 42301478); Natural Science Foundation of Anhui Province (No. 2208085QD108); Major Project of Natural Science Research of Anhui Provincial Department of Education (Grant Number KJ2021ZD0130).
Abstract: Color, as a significant element of village landscapes, serves various functions such as enhancing aesthetic appeal and attractiveness and conveying emotions and cultural values. To explore the three-dimensional spatial characteristics of color landscapes in beautiful villages, this study conducted a comparative experiment involving eight provincial-level beautiful villages and eight ordinary villages in Jinzhai County. Landscape pattern indices were used to analyze the color landscape patterns on the facades of these villages, complemented by a quantitative analysis of color attributes using the Munsell color system. The results indicate that (1) Natural landscape colors in beautiful villages are primarily concentrated in the yellow-red to green-yellow interval, while those in ordinary villages are widely distributed in the red to blue-green interval. Artificial landscape colors in beautiful villages are mainly characterized by medium value, with chroma concentrated in the low chroma range. (2) The proportion of color areas for forests, grasslands, and building walls in beautiful villages is higher by 14.76%, 2.17%, and 5.16%, respectively, compared to ordinary villages. However, the proportion of yellow exposed areas in ordinary villages is more than twice that of beautiful villages. (3) The Landscape Shape Index for forests, grasslands, and buildings in beautiful villages is 5.23, 8.01, and 8.19, respectively, indicating a higher irregularity in color patches. (4) Ordinary villages exhibit a higher Shannon's diversity index, indicating a more complex distribution of colors, whereas beautiful villages demonstrate a higher number of connected dominant patches. This study can provide a scientific basis for village color planning and layout.
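For readers unfamiliar with Shannon's diversity index mentioned above, a minimal sketch follows. It uses the standard definition H = -Σ p_i ln p_i over the area proportions p_i of each color class; the function name and the sample areas are hypothetical and are not taken from the study.

```python
import math

def shannon_diversity(areas):
    """Shannon's diversity index H = -sum(p_i * ln p_i) over class area shares."""
    total = sum(areas)
    if total <= 0:
        raise ValueError("total area must be positive")
    h = 0.0
    for a in areas:
        if a > 0:  # classes with zero area contribute nothing
            p = a / total
            h -= p * math.log(p)
    return h

# Hypothetical facade areas per color class (arbitrary units).
dominated = shannon_diversity([80.0, 10.0, 5.0, 5.0])   # one dominant color
balanced = shannon_diversity([25.0, 25.0, 25.0, 25.0])  # evenly mixed colors
```

A landscape dominated by one color class yields a lower index than an evenly mixed one, which matches the interpretation in the abstract that a higher index indicates a more complex color distribution.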
Funding: National Natural Science Foundation of China (U22B2034); Fundamental Research Funds for the Central Universities (226-2022-00064).
Abstract: With the support of edge computing, the synergy and collaboration among central cloud, edge cloud, and terminal devices form an integrated computing ecosystem known as the cloud-edge-client architecture. This integration unlocks the value of data and computational power, presenting significant opportunities for large-scale 3D scene modeling and XR presentation. In this paper, we explore the perspectives and highlight new challenges in point-cloud-based 3D scene modeling and XR presentation within the cloud-edge-client integrated architecture. We also propose a novel cloud-edge-client integrated technology framework and demonstrate a municipal governance application to address these challenges.
Abstract: In this paper, a novel data- and rule-driven system for 3D scene description and segmentation in an unknown environment is presented. This system generates hierarchies of features that correspond to structural elements such as boundaries and shape classes of individual objects as well as relationships between objects. It is implemented as an added high-level component to an existing low-level binocular vision system [1]. Based on a pair of matched stereo images produced by that system, 3D segmentation is first performed to group object boundary data into several edge-sets, each of which is believed to belong to a particular object. Then gross features of each object are extracted and stored in an object record. The final structural description of the scene is accomplished with information in the object record, a set of rules, and a rule implementor. The system is designed to handle partially occluded objects of different shapes and sizes on the 2D imager. Experimental results have shown its success in computing both object-level and structural-level descriptions of common man-made objects.
Funding: Supported by the National Natural Science Foundation of China (No. 61976023).
Abstract: In this paper, we propose a Structure-Aware Fusion Network (SAFNet) for 3D scene understanding. As 2D images present more detailed information while 3D point clouds convey more geometric information, fusing these two complementary kinds of data can improve the discriminative ability of the model. Fusion is a very challenging task since 2D and 3D data are essentially different and come in different formats. Existing methods first extract 2D multi-view image features and then aggregate them into sparse 3D point clouds, achieving superior performance. However, these methods ignore the structural relations between pixels and point clouds and directly fuse the two modalities without adaptation. To address this, we propose a structural deep metric learning method on pixels and points to explore these relations and further utilize them to adaptively map the images and point clouds into a common canonical space for prediction. Extensive experiments on the widely used ScanNetV2 and S3DIS datasets verify the performance of the proposed SAFNet.
Funding: Supported by the National Natural Science Foundation of China (grant numbers U2034202, 41871289, 42171397) and the Sichuan Science and Technology Program (grant number 2020JDTD0003).
Abstract: As an important technology of digital construction, real 3D models can improve the immersion and realism of virtual reality (VR) scenes. The large amount of data in real 3D scenes requires more effective rendering methods, but current rendering optimization methods have some defects and cannot render real 3D scenes in virtual reality. In this study, the location of the viewing frustum is predicted by a Kalman filter, and eye-tracking equipment is used to recognize the region of interest (ROI) in the scene. Finally, the real 3D model of interest in the predicted frustum is rendered first. The experimental results show that the method can predict the frustum location approximately 200 ms in advance, the prediction accuracy is approximately 87%, the scene rendering efficiency is improved by 8.3%, and motion sickness is reduced by approximately 54.5%. These results help promote the use of real 3D models and ROI recognition methods in virtual reality. In future work, we will further improve the prediction accuracy of viewing frustums in virtual reality and the application of eye tracking in virtual geographic scenes.
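The abstract above does not specify the state model of its Kalman filter. As a rough illustration only, the sketch below tracks a single coordinate of the viewpoint with a constant-velocity model and extrapolates it 200 ms ahead; the class name, the noise variances, and the diagonal process noise are all assumptions, and a real frustum predictor would track the full head pose.

```python
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and position-only measurements."""

    def __init__(self, process_var=1e-3, meas_var=1e-2):
        self.pos, self.vel = 0.0, 0.0
        # Covariance matrix P = [[p00, p01], [p10, p11]], initially identity.
        self.p00, self.p01, self.p10, self.p11 = 1.0, 0.0, 0.0, 1.0
        self.q = process_var  # assumed process noise, added to the diagonal of P
        self.r = meas_var     # assumed measurement noise variance

    def predict(self, dt):
        """Propagate the state and covariance: x = F x, P = F P F^T + Q."""
        self.pos += self.vel * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        self.p00, self.p01, self.p10 = p00 + self.q, p01, p10
        self.p11 += self.q

    def update(self, z):
        """Correct the state with a position measurement z (H = [1, 0])."""
        s = self.p00 + self.r                    # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s      # Kalman gain
        residual = z - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P = (I - K H) P
        p00, p01 = (1.0 - k0) * self.p00, (1.0 - k0) * self.p01
        p10, p11 = self.p10 - k1 * self.p00, self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def look_ahead(self, horizon):
        """Extrapolate the position `horizon` seconds ahead without changing state."""
        return self.pos + self.vel * horizon

# Toy usage: a viewpoint coordinate moving at 1 unit/s, sampled at 50 Hz.
kf = ConstantVelocityKalman1D()
dt = 0.02
for step in range(1, 201):
    kf.predict(dt)
    kf.update(1.0 * step * dt)
predicted = kf.look_ahead(0.2)  # estimate of the coordinate 200 ms from now
```

The look-ahead value is what a renderer could use to start loading the models inside the predicted frustum before the user actually looks there.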
Funding: National Natural Science Foundation of China (Nos. 61073086 and 70871078); National High Technology Research and Development Program (863) of China (No. 2008AA04Z126).
Abstract: The increasing scale and complexity of 3D scene design work call for an efficient way to understand a design within a multi-disciplinary team and to exploit the experience and underlying knowledge in previous works for reuse. However, previous research pays little attention to relationship maintenance and design reuse at the knowledge level. We propose a novel semantic-driven design reuse system, including a property computation algorithm that enables the system to compute properties during the modeling process to maintain semantic consistency, and a vertex statics based algorithm that enables the system to recognize a scene design pattern as a universal semantic model for scenes of the same type. With the universal semantic model, the system guides the modeling process of future design works through suggestions and constraints on operations. The proposed framework enables the reuse of 3D scene designs at both the model level and the knowledge level.
Funding: Supported by the Key Program of the National Natural Science Foundation of China (grant no. 41930104).
Abstract: Three-dimensional (3D) high-fidelity surface models play an important role in urban scene construction. However, the data quantity of such models is large and places a tremendous burden on rendering. Many applications must balance the visual quality of the models with the rendering efficiency. This study provides a practical texture baking pipeline for generating 3D models that reduces model complexity while preserving visually pleasing details. Concretely, we apply mesh simplification to the original model and use texture baking to create three types of baked textures, namely, a diffuse map, a normal map, and a displacement map. The simplified model with the baked textures has a pleasing visualization effect in a rendering engine. Furthermore, we discuss the influence of various factors in the process on the results, as well as the functional principles and characteristics of the baked textures. The proposed approach is very useful for real-time rendering with limited rendering hardware, as no additional memory or computing capacity is required to properly preserve the relief details of the model. Each step in the pipeline is described in detail to facilitate implementation.
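Abstract: For scene flow estimation, extracting rich global correlation is important for obtaining accurate feature matching. This paper proposes a multi-scale 3D point cloud scene flow estimation network based on global correlation (MGCSF). The network introduces a channel affinity attention (CAA) module and a point-wise attention module (PAM), and fuses feature information from point clouds at different levels to capture global motion trends and changes. This reduces the loss of point cloud feature information to some extent and thus allows point cloud scene flow to be computed more accurately. Performance improves on both key datasets, FlyingThings3D and KITTI. Compared with the baseline, on the FlyingThings3D dataset the 3D end-point error (EPE3D) decreases by 13%, the strict 3D accuracy (ACC3DS) increases by 11%, the relaxed 3D accuracy (ACC3DR) increases by 4.7%, and the 3D outlier ratio (Outliers3D) decreases by 10.8%; on the KITTI dataset, the EPE3D over all points (EPE3Dfull) decreases by 10.7%, ACC3DS increases by 2.1%, ACC3DR increases by 1.7%, and Outliers3D decreases by 5.5%.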
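The four metrics named above can be sketched as follows. The abstract does not state its thresholds, so the values used here (0.05 m or 5% for strict accuracy, 0.1 m or 10% for relaxed accuracy, 0.3 m or 10% for outliers) follow the convention widely used in scene flow benchmarks and are an assumption about this paper; the function name `scene_flow_metrics` is hypothetical. Requires Python 3.8 or later for `math.dist` and multi-argument `math.hypot`.

```python
import math

def scene_flow_metrics(pred, gt):
    """Compute EPE3D, ACC3DS, ACC3DR and Outliers3D for lists of 3D flow vectors."""
    if len(pred) != len(gt) or not gt:
        raise ValueError("pred and gt must be non-empty and of equal length")
    n = len(gt)
    epe_sum = 0.0
    strict = relaxed = outliers = 0
    for p, g in zip(pred, gt):
        err = math.dist(p, g)                    # per-point end-point error
        rel = err / (math.hypot(*g) + 1e-4)      # error relative to flow magnitude
        epe_sum += err
        strict += (err < 0.05) or (rel < 0.05)   # assumed strict thresholds
        relaxed += (err < 0.1) or (rel < 0.1)    # assumed relaxed thresholds
        outliers += (err > 0.3) or (rel > 0.1)   # assumed outlier thresholds
    return {
        "EPE3D": epe_sum / n,
        "ACC3DS": strict / n,
        "ACC3DR": relaxed / n,
        "Outliers3D": outliers / n,
    }

# Toy usage with two points: one accurate prediction and one large error.
metrics = scene_flow_metrics(
    pred=[(1.02, 0.0, 0.0), (0.0, 1.0, 0.0)],
    gt=[(1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
)
```

Lower EPE3D and Outliers3D and higher ACC3DS and ACC3DR indicate better estimates, which is why the abstract reports decreases for the former pair and increases for the latter.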
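Abstract: Depth ambiguity is a major challenge for multi-person 3D pose estimation from a single image, and extracting image context has great potential for alleviating it. Top-down methods mostly model keypoint relations on the basis of human detection; human bounding boxes are coarse-grained and contain a large proportion of background noise, which easily leads to keypoint drift or mismatching and also undermines the reliability of absolute depth estimation based on the human scale factor. Bottom-up methods directly detect the human keypoints in the image and then recover the 3D human poses one by one. Although they can explicitly capture scene context, they are at a disadvantage in relative depth estimation. A new two-branch network is proposed: the top-down branch extracts human context based on keypoint region proposals, and the bottom-up branch extracts scene context based on 3D space. A human context extraction method with noise suppression is proposed, which describes human targets by modeling "keypoint region proposals" and models pose-associated dynamic sparse keypoint relations to prune weak connections and reduce noise propagation. A method for extracting scene context from the bird's-eye view is also proposed, which models image depth features and maps them onto the bird's-eye-view plane to obtain the layout of human positions in 3D space; a human and scene context fusion network is designed to predict the absolute depth of each person. Experimental results on the public datasets MuPoTS-3D and Human3.6M show that, compared with comparable state-of-the-art models, the proposed model HSC-Pose improves relative and absolute 3D keypoint position accuracy by at least 2.2% and 0.5%, respectively, and reduces the mean root keypoint position error by at least 4.2 mm.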