Funding: This work was supported in part by the National Natural Science Foundation of China under Grants U20A20225 and 61833013, and in part by the Shaanxi Provincial Key Research and Development Program under Grant 2022-GY111.
Abstract: This paper proposes an improved high-precision 3D semantic mapping method for indoor scenes using RGB-D images. Current semantic mapping algorithms suffer from low semantic annotation accuracy and insufficient real-time performance. To address these issues, we first adopt the ElasticFusion algorithm to select key frames from indoor image sequences captured by a Kinect sensor and to construct a spatial model of the indoor environment. Then, we propose a semantic segmentation network for indoor RGB-D images that uses multi-scale feature fusion to quickly and accurately obtain pixel-level object labels for the spatial point cloud model. Finally, Bayesian updating is used to incrementally fuse semantic labels into the established point cloud model, and a dense conditional random field (CRF) is employed to optimize the 3D semantic map, resulting in a high-precision spatial semantic map of indoor scenes. Experimental results show that the proposed system can process image sequences collected by RGB-D sensors in real time, output accurate semantic segmentation results for indoor scene images together with the current local spatial semantic map, and ultimately construct a globally consistent, high-precision 3D semantic map of indoor scenes.
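The incremental label fusion step can be illustrated with a minimal sketch. It assumes the common recursive Bayes formulation, in which each map point stores a class distribution that is multiplied by the segmentation network's per-frame class probabilities and then renormalized; the abstract does not give the exact update rule, and the function name `fuse_labels`, the class names, and the probabilities below are hypothetical.

```python
def fuse_labels(prior, likelihood):
    """Recursive Bayesian update of one map point's class distribution.

    prior      -- current class probabilities stored on the map point
    likelihood -- class probabilities predicted for that point in a new frame
    (Assumption: observations are treated as independent across frames.)
    """
    if len(prior) != len(likelihood):
        raise ValueError("prior and likelihood must cover the same classes")
    unnormalized = [p * q for p, q in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total == 0.0:
        # Degenerate observation: keep the previous belief unchanged.
        return list(prior)
    return [u / total for u in unnormalized]


# Hypothetical classes and per-frame network outputs for a single map point.
classes = ["wall", "floor", "chair"]
belief = [1.0 / 3.0] * 3  # uniform prior before any observation
frames = [
    [0.2, 0.2, 0.6],
    [0.3, 0.1, 0.6],
    [0.1, 0.3, 0.6],
]
for observation in frames:
    belief = fuse_labels(belief, observation)

best = classes[max(range(len(belief)), key=belief.__getitem__)]
print(best, [round(b, 3) for b in belief])
```

Repeated, mutually consistent observations sharpen the belief toward one class, which is why fusing labels over many frames tends to be more reliable than any single-frame prediction.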
Funding: Supported by the National Natural Science Foundation of China (Nos. 61305042, 61202098); Projects of the Center for Remote Sensing Mission Study of China National Space Administration (No. 2012A03A0939); Science and Technological Research of Key Projects of the Education Department of Henan Province of China (No. 13A520071).
Abstract: Cross-modal semantic mapping and cross-media retrieval are key problems for multimedia search engines. This study analyzes the hierarchy, functionality, and structure of visual and auditory sensation in the cognitive system, and establishes a brain-like cross-modal semantic mapping framework based on cognitive computing of visual and auditory sensations. The framework considers the mechanisms of visual-auditory multisensory integration, selective attention in the thalamo-cortical system, emotional control in the limbic system, and memory enhancement in the hippocampus. Algorithms for cross-modal semantic mapping are then given. Experimental results show that the framework can be effectively applied to cross-modal semantic mapping and that it is also significant for brain-like computing on non-von Neumann architectures.
Funding: National Natural Science Foundation of China (No. 41830110); National Key Research Development Program of China (No. 2018YFC1503603); Key Laboratory of Land Satellite Remote Sensing Application, Ministry of Natural Resources of the People's Republic of China (No. KLSMNR-202106); Water Conservancy Science and Technology Project of Jiangsu Province, China (No. 2020061); Natural Science Foundation of Jiangsu Province, China (No. 20180779).
Abstract: As a widely consumed and influential natural plant beverage, tea is planted in subtropical and tropical areas all over the world. Because of (sub)tropical climate characteristics, the underlying surface of tea-growing areas is extremely complex, with a variety of vegetation types. In addition, tea plantations are scattered and fragmented in most of China. It is therefore difficult to obtain accurate tea information from coarse-resolution remote sensing data and existing feature extraction methods. This study proposes a boundary-enhanced, object-oriented random forest method based on high-resolution GF-2 and multi-temporal Sentinel-2 data. The method uses multispectral indices, textures, vegetation indices, and the variation characteristics of time-series NDVI from multi-temporal Sentinel-2 imagery to obtain abundant features related to the growth of tea plantations. To reduce feature redundancy and computation time, a feature elimination algorithm based on Mean Decrease Accuracy (MDA) was used to generate the optimal feature set. To address the serious boundary inconsistency caused by complex and fragmented land cover types, the high-resolution GF-2 image was segmented with the MultiResolution Segmentation (MRS) algorithm to assist the segmentation of Sentinel-2, which helps delineate meaningful objects and improves the reliability of tea plantation boundaries. Finally, the object-oriented random forest method was used to extract tea information based on the optimal feature combination in Jingmai Mountain, Yunnan Province. The resulting tea plantation map had high accuracy, with an overall accuracy of 95.38% and a kappa coefficient of 0.91. We conclude that the proposed method is effective for mapping tea plantations in highly heterogeneous mountainous areas and has potential for mapping tea plantations over large areas.
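For readers unfamiliar with the two accuracy figures quoted above, the following minimal sketch shows how overall accuracy and Cohen's kappa are derived from a confusion matrix. The 2x2 matrix (tea versus non-tea) contains made-up counts for illustration only; it is not the study's data, and the function name `accuracy_metrics` is hypothetical.

```python
def accuracy_metrics(confusion):
    """Return (overall accuracy, Cohen's kappa) for a square confusion matrix.

    confusion[i][j] is the number of samples with reference class i
    that were mapped as class j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n_classes)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes)) for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Illustrative counts only (rows: reference tea / non-tea; columns: mapped).
confusion = [
    [45, 5],
    [3, 47],
]
overall_accuracy, kappa = accuracy_metrics(confusion)
print(f"OA = {overall_accuracy:.2%}, kappa = {kappa:.2f}")
```

Kappa discounts the agreement expected by chance, so it is lower than overall accuracy whenever the class marginals alone would produce some agreement.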
Abstract: The complexity of multi-domain access control policy integration makes policy conflict information difficult to understand and manage. Policy information visualization can express the logical relations in complex information intuitively, which can effectively improve the management of multi-domain policy integration. Based on the role-based access control model, this paper proposes two policy analysis methods: one for the per-domain statistics of multi-domain policy integration conflicts, and one for inter-domain policy element levels and cross-domain element mapping. A corresponding visualization tool is also developed. We use the tree-maps algorithm to statistically analyze the quantity and type of policy integration conflicts. On that basis, the semantic substrates algorithm is applied to analyze in detail the inter-domain policy element levels and the cross-domain role and permission mappings. Experimental results show that tree-maps and semantic substrates can effectively analyze the conflicts of multi-domain policy integration and have good application value.
Abstract: Low-dimensional representation is a convenient method of obtaining a synthetic view of complex datasets and has been used in various domains for a long time. When the representation is related to words in a document, it is also called a semantic map. The two most popular methods are self-organizing maps and generative topographic mapping. The second approach is statistically well-founded but far less computationally efficient than the first. On the other hand, a drawback of self-organizing maps is that they do not project all points, but only map nodes. This paper presents a method of obtaining the projections for all data points complementary to the self-organizing map nodes. The idea is to project points so that their initial distances to some cluster centers are conserved as far as possible. The method is tested on an oil flow dataset and then applied to a large protein sequence dataset described by keywords. It has been integrated into an interactive data browser for biological databases.
Funding: Projects (61203330, 61104009, 61075092) supported by the National Natural Science Foundation of China; Project (2013M540546) supported by China Postdoctoral Science Foundation; Projects (ZR2012FM031, ZR2011FM011, ZR2010FM007) supported by Shandong Provincial Nature Science Foundation, China; Projects (2011JC017, 2012TS078) supported by Independent Innovation Foundation of Shandong University, China; Project (201203058) supported by Shandong Provincial Postdoctoral Innovation Foundation, China.
Abstract: Artificial labels based on quick response (QR) codes are used to provide the semantic concepts and relations of the surroundings, which helps overcome the complexity and limitations of semantic recognition and scene understanding that relies on the robot's vision alone. By imitating the human mechanism of spatial cognition, the robot continuously receives information from artificial labels at cognitive-guide points across a wide, structured environment to perceive the environment and navigate. An immune network algorithm was used to form an environmental awareness mechanism with "distributed representation". Color recognition and SIFT feature matching were fused to achieve the memory and cognition of scenario tags. A cognizing semantic map based on cognition-guide-action was then built. As the map was continuously enriched, the robot no longer needed to rely on the artificial labels and could plan paths and navigate freely. Experimental results show that the artificial label designed in this work can improve the cognitive ability of the robot, guide the robot in a semi-unknown environment, and support building the cognizing semantic map.
Abstract: Semantic conflict is the conflict caused by heterogeneous systems expressing the same real-world entity in different ways. It prevents information integration from achieving semantic coherence. Since ontology helps to solve semantic problems, this area has become a hot topic in information integration. In this paper, we introduce semantic conflict into the information integration of heterogeneous applications. We discuss the origins and categories of the conflict, and present an ontology-based schema mapping approach to eliminate semantic conflicts.
Key words: ontology; CCSOL; semantic conflict; schema mapping
CLC number: TP 301
Biography: LU Han (1980-), male, Master candidate, research direction: ontology and information integration.
Abstract: Efficient perception of the real world is a long-standing effort of computer vision. Modern visual computing techniques have succeeded in attaching semantic labels to thousands of daily objects and reconstructing dense depth maps of complex scenes. However, simultaneous semantic and spatial joint perception, so-called dense 3D semantic mapping, which estimates the 3D geometry of a scene and attaches semantic labels to the geometry, remains a challenging problem that, if solved, would make structured vision understanding and editing more widely accessible. Concurrently, progress in computer vision and machine learning has motivated us to pursue the capability of understanding and digitally reconstructing the surrounding world. Neural metric-semantic understanding is a new and rapidly emerging field that combines differentiable machine learning techniques with physical knowledge from computer vision, e.g., the integration of visual-inertial simultaneous localization and mapping (SLAM), mesh reconstruction, and semantic understanding. In this paper, we attempt to summarize the recent trends and applications of neural metric-semantic understanding. Starting with an overview of the underlying computer vision and machine learning concepts, we discuss critical aspects of such perception approaches. Specifically, our emphasis is on fully leveraging the joint semantic and 3D information. Later on, many important applications of the perception capability, such as novel view synthesis and semantic augmented reality (AR) content manipulation, are also presented. Finally, we conclude with a discussion of the technical implications of the technology under a 5G edge computing scenario.
Funding: Supported by the National Key Research and Development Program of China under grant number 2022YFB3903404; the National Natural Science Foundation of China under grant numbers 42325105 and 42071350; and LIESMARS Special Research Funding.
Abstract: High-resolution satellite images are becoming increasingly available for urban multi-temporal semantic understanding. However, few datasets can be used for land-use/land-cover (LULC) classification, binary change detection (BCD), and semantic change detection (SCD) simultaneously, because classification datasets always have one time phase and BCD datasets focus only on the changed location, ignoring the changed classes. Public SCD datasets are rare but much needed. To solve these problems, this study built a tri-temporal SCD dataset from Gaofen-2 (GF-2) remote sensing imagery, with 11 LULC classes and 60 change directions, namely the Wuhan Urban Semantic Understanding (WUSU) dataset. Popular deep learning based methods for LULC classification, BCD, and SCD are tested to verify the reliability of WUSU. A Siamese-based multi-task joint framework with a multi-task joint loss (MJ loss), named ChangeMJ, is proposed to restore object boundaries; it obtains the best results in LULC classification, BCD, and SCD compared with state-of-the-art (SOTA) methods. Finally, large spatial-scale mapping of the Wuhan central urban area is carried out to verify that the WUSU dataset and the ChangeMJ framework have good application value.
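The relationship between the three tasks can be sketched in a few lines: given LULC label maps for two time phases, the binary change mask marks where labels differ, and the semantic change map additionally records the "from-to" direction. This is only an illustration of the task definitions, not of the ChangeMJ framework; the function name `change_maps` and the class names are hypothetical, since the abstract does not list the 11 classes.

```python
def change_maps(labels_t1, labels_t2):
    """Derive binary and semantic change maps from two LULC label maps.

    Both inputs are 2D lists of class names with identical shape.
    Returns (binary, directions): binary[r][c] is 1 where the class changed,
    and directions[r][c] is a (from_class, to_class) tuple or None.
    """
    if len(labels_t1) != len(labels_t2):
        raise ValueError("label maps must have the same number of rows")
    binary, directions = [], []
    for row1, row2 in zip(labels_t1, labels_t2):
        if len(row1) != len(row2):
            raise ValueError("label maps must have the same number of columns")
        binary.append([int(a != b) for a, b in zip(row1, row2)])
        directions.append([(a, b) if a != b else None for a, b in zip(row1, row2)])
    return binary, directions


# Hypothetical 2x2 label maps for two time phases.
t1 = [["farmland", "farmland"], ["water", "road"]]
t2 = [["building", "farmland"], ["water", "building"]]
binary, directions = change_maps(t1, t2)

# Distinct change directions observed in this toy example.
observed = {d for row in directions for d in row if d is not None}
print(binary)
print(sorted(observed))
```

A BCD dataset keeps only the `binary` map, which is why it cannot by itself support semantic change detection.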
Funding: Supported by the National Key R&D Program of China under Grant 2022YFC3800802; the National Natural Science Foundation of China under Grants 42271472 and 42201338; the Program A for Outstanding PhD Candidates of Nanjing University under Grant 202201A010; and the Research Project of Nanjing Research Institute of Surveying, Mapping and Geotechnical Investigation, Co. Ltd under Grant 2021RD02.
Abstract: Accurate and timely information on urban vegetation (UV) can be used as an important indicator of the health of cities. Owing to the low cost of RGB cameras, true color imagery (TCI) has been widely used for high spatial resolution UV mapping. However, current index-based UV mapping approaches have poor ability to distinguish UV accurately, and classifier-based approaches rely heavily on massive annotated samples. To address these issues, an index-guided semantic segmentation (IGSS) framework is proposed in this paper. First, a novel cross-scale vegetation index (CSVI) is calculated by combining TCI and Sentinel-2 images, and the index value is used to provide an initial UV map. Second, reliable UV and non-UV samples are automatically generated for training the semantic segmentation model, which then produces the refined UV map. The experimental results show that the proposed CSVI outperformed five existing RGB vegetation indices in highlighting UV cover and suppressing complex backgrounds, and the proposed IGSS workflow achieved satisfactory results, with an OA of 87.72% to 88.16% and an F1 score of 87.73% to 88.37%, comparable with the fully supervised method.
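The idea of turning an index map into automatically generated training samples can be sketched as follows. Because the abstract does not define CSVI, this sketch substitutes the well-known Excess Green index (ExG, computed on chromatic coordinates) as a stand-in; the thresholds `low` and `high`, the function names, and the pixel values are all hypothetical.

```python
def excess_green(r, g, b):
    """Excess Green index on chromatic coordinates: (2G - R - B) / (R + G + B)."""
    total = r + g + b
    if total == 0:
        return 0.0  # black pixel: no color information
    return (2 * g - r - b) / total


def pseudo_label(pixel, low=0.0, high=0.2):
    """Assign a training label only where the index is confident.

    Returns 1 (vegetation), 0 (non-vegetation), or None (ambiguous, left
    unlabeled so it does not contaminate the training set).
    The thresholds are illustrative assumptions, not values from the paper.
    """
    index = excess_green(*pixel)
    if index >= high:
        return 1
    if index <= low:
        return 0
    return None


# Hypothetical RGB pixels: green canopy, gray pavement, dull mixed surface.
pixels = [(40, 160, 50), (120, 120, 120), (90, 110, 95)]
labels = [pseudo_label(p) for p in pixels]
print(labels)
```

Leaving ambiguous pixels unlabeled is one simple way to keep the automatically generated samples reliable; the segmentation model is then expected to resolve those pixels itself.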