Funding: Supported by the National Natural Science Foundation of China (No. 60502013) and by the National High-Tech Research and Development (863) Program of China (No. 2006AA01Z115).
Abstract: The availability of a good viewpoint space partition is crucial in three-dimensional (3-D) object recognition based on the aspect graph approach. The aspect graph approach describes two important kinds of events: edge-edge-edge (EEE) events and edge-vertex (EV) events. This paper presents an algorithm that computes EEE events by characteristic analysis based on conicoid theory, in contrast to current algorithms, which concentrate on EV events and often overlook the importance of EEE events. The paper also provides a standard flowchart for viewpoint space partitioning based on aspect graph theory that makes it suitable for perspective models. The partitioning results demonstrate the algorithm's efficiency: more valuable viewpoints are found with the help of EEE events, which can help achieve a high recognition rate in 3-D object recognition.
Abstract: To find better simplicity measurements for 3D object recognition, a new set of local regularities is developed and tested in a stepwise 3D reconstruction method. The set comprises localized minimizing standard deviation of angles (L-MSDA), localized minimizing standard deviation of segment magnitudes (L-MSDSM), localized minimum standard deviation of areas of child faces (L-MSDAF), localized minimum sum of segment magnitudes of common edges (L-MSSM), and localized minimum sum of areas of child faces (L-MSAF). Measured by form and size distortions, two of these local regularities, L-MSDA and L-MSDSM, perform better when combined. The best weightings for the combination are identified as 10% for L-MSDSM and 90% for L-MSDA. The test results show that the combined use of L-MSDA and L-MSDSM with the identified weightings has the potential to be applied in other optimization-based 3D recognition methods to improve their efficacy and robustness.
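The 90%/10% weighting can be illustrated with a minimal sketch. It assumes that the L-MSDA term is the standard deviation of the angles between edges meeting at each vertex of a local patch, and that the L-MSDSM term is the standard deviation of the edge lengths. The abstract does not give the exact definitions or any normalization, so the function name `combined_regularity` and the way the two terms are summed are illustrative assumptions, not the authors' formulation.

```python
import math
from statistics import pstdev


def _angle(u, v):
    """Angle in radians between two 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.hypot(*u) * math.hypot(*v)
    return math.acos(max(-1.0, min(1.0, dot / norm)))


def combined_regularity(vertices, edges, w_angle=0.9, w_length=0.1):
    """Weighted sum of the angle spread (assumed L-MSDA-like term) and the
    edge-length spread (assumed L-MSDSM-like term) for one local patch.

    vertices: list of (x, y, z) tuples; edges: list of (i, j) index pairs.
    The terms are not normalized here, which is a simplification.
    """
    # Collect, for every vertex, the vertices it shares an edge with.
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)

    # Angles between every pair of edges that meet at the same vertex.
    angles = []
    for v, nbrs in neighbours.items():
        for i in range(len(nbrs)):
            for j in range(i + 1, len(nbrs)):
                u = [p - q for p, q in zip(vertices[nbrs[i]], vertices[v])]
                w = [p - q for p, q in zip(vertices[nbrs[j]], vertices[v])]
                angles.append(_angle(u, w))

    lengths = [math.dist(vertices[a], vertices[b]) for a, b in edges]
    sd_angles = pstdev(angles) if angles else 0.0
    sd_lengths = pstdev(lengths) if lengths else 0.0
    return w_angle * sd_angles + w_length * sd_lengths
```

Under these assumptions a perfectly regular patch such as a unit square scores zero, and an optimization-based reconstruction would prefer candidate depth assignments with a lower score.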
Abstract: In this paper, a classification method based on neural networks is presented for the recognition of 3D objects. The objective is to classify a query object against the objects in a database, which leads to recognition of the query. Each 3D object in this database is obtained from another object by applying one element of the overall set of transformations. The set of transformations considered in this work is the general affine group.
Abstract: The study of ionizing-radiation-induced foci is an important method in DNA damage repair research. Although foci visualization technology is mature, traditional foci recognition and analysis techniques have many shortcomings because of the spatial overlap of foci.
Abstract: This paper discusses strategies for trinocular image rectification and matching for linear object tracking. It is well known that a pair of stereo images generates two epipolar images. Three overlapping images can yield six epipolar images when every pair has to be rectified for image matching. In that case, the search for feature correspondences is computationally intensive and matching complexity increases. A special epipolar image rectification for three stereo images, which simplifies the image matching process, is therefore proposed. This method generates only three rectified images, so the search for matching features becomes more straightforward. With the three rectified images, a particular line-segment-based correspondence strategy is suggested. The primary characteristics of the feature correspondence strategy include the application of specific epipolar geometric constraints and reference to three-ray triangulation residuals in object space.
Funding: Supported by the Henan Province Science and Technology Project under Grant No. 182102210065.
Abstract: Object recognition and localization has long been a research hotspot in machine vision. It is of great value to the development and application of service robots, industrial automation, unmanned driving and other fields. To achieve real-time recognition and localization of objects in indoor scenes, this article proposes an improved YOLOv3 neural network model that combines densely connected networks and residual networks to construct a new YOLOv3 backbone network, which is applied to the detection and recognition of objects in indoor scenes. A RealSense D415 RGB-D camera is used to obtain the RGB map and the depth map, and the actual distance value is calculated after each pixel in the scene image is mapped to the real scene. Experimental results show that the detection and recognition accuracy and the real-time performance of the new network are clearly improved compared with the original YOLOv3 neural network model in the same scene. The improved network detects objects that the original YOLOv3 network could not detect. The running time of object detection and recognition is reduced to less than half of the original. The improved network has reference value for practical engineering applications.
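The step of mapping a pixel to the real scene and reading off a distance can be sketched with the standard pinhole camera model. This is an assumption about how such a mapping is commonly done, not the article's stated procedure: the function names are hypothetical, lens distortion is ignored, and the intrinsics (`fx`, `fy`, `cx`, `cy`) must come from the actual camera calibration.

```python
import math


def pixel_to_point(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with depth in metres to camera coordinates
    using the pinhole model (no lens distortion)."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def distance_to_pixel(u, v, depth_m, fx, fy, cx, cy):
    """Straight-line distance from the camera centre to the 3D point."""
    return math.hypot(*pixel_to_point(u, v, depth_m, fx, fy, cx, cy))


def box_center_distance(box, depth_map, fx, fy, cx, cy):
    """Distance to the centre pixel of a detection box.

    box: (left, top, right, bottom) in pixels.
    depth_map: depth_map[v][u] holds the depth in metres, aligned to the RGB image.
    """
    u = (box[0] + box[2]) // 2
    v = (box[1] + box[3]) // 2
    return distance_to_pixel(u, v, depth_map[v][u], fx, fy, cx, cy)
```

In practice a detector's bounding box would supply `box`, and a median over several pixels inside the box is usually more robust than a single centre pixel, because depth maps contain holes and noise.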
Abstract: Two new recognition methods for spatial planar polygons using perspective invariants are presented. The cross-ratio (Rc) of a vertex and the co-base area ratio (RA) of an edge in a spatial planar polygon are proposed and used as the invariant primitives of the recognition eigenvector. The second distance error decision rule (SDEDR), which estimates the relative error of RA, is also introduced. The methods can recognize a spatial planar polygon with an arbitrary orientation from only a single perspective view. Experimental examples are given.
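Why a cross-ratio is useful as a recognition primitive can be shown with the classical cross-ratio of four collinear points, which is unchanged by any projective transformation. The sketch below is a generic illustration and not the paper's vertex cross-ratio Rc, whose exact construction the abstract does not give; the homography values in the usage note are arbitrary.

```python
import math


def apply_homography(H, p):
    """Map a 2D point through a 3x3 homography given as nested lists."""
    x, y = p
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (xh / w, yh / w)


def cross_ratio(a, b, c, d):
    """Cross-ratio (a, b; c, d) of four collinear 2D points.

    Points are reduced to signed positions along the line through a and d,
    then combined as ((c - a)(d - b)) / ((c - b)(d - a)).
    """
    ux, uy = d[0] - a[0], d[1] - a[1]
    norm = math.hypot(ux, uy)

    def pos(p):
        return ((p[0] - a[0]) * ux + (p[1] - a[1]) * uy) / norm

    ta, tb, tc, td = pos(a), pos(b), pos(c), pos(d)
    return ((tc - ta) * (td - tb)) / ((tc - tb) * (td - ta))
```

For example, four points on the line y = 2x + 1 at x = 0, 1, 2, 4 have cross-ratio 1.5, and the same value is obtained after mapping them through a homography such as `[[1.0, 0.2, 3.0], [0.1, 0.9, -2.0], [0.001, 0.002, 1.0]]`, which stands in for an arbitrary perspective view.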
Abstract: Securing copyright has recently become a hot research topic because of rapidly advancing information technology. Watermarking methods conceal or embed sensitive messages in a host cover in such a manner that they are undetectable to a human observer. Digital media covers may take many forms, including audio, video, photos, and even DNA data sequences. In this work, we present a new watermarking methodology for hiding secret data in 3-D objects. Blind extraction is used, based on reversing the steps of the data embedding process. The implemented technique uses features of the discrete cosine transform of the 3-D object's vertices to embed a grayscale image with high capacity. The vertex coefficients and the encrypted image pixels are used in the watermarking procedure. The extraction approach is fully blind and relies on the backward steps of the encoding procedure to recover the hidden data. Correlation distance, Euclidean distance, Manhattan distance, and cosine distance are used to evaluate and test the performance of the proposed approach. The visibility and imperceptibility of the proposed method are assessed to show the efficiency of our work compared with previous corresponding methods.
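The four evaluation measures can be written down in a few lines. This sketch uses their common textbook definitions (correlation distance as one minus the Pearson correlation, cosine distance as one minus the cosine similarity); the abstract does not state which exact definitions or which data (vertex coordinates or extracted pixels) the authors compare, so treat the inputs as flat numeric sequences of equal length.

```python
import math


def euclidean(a, b):
    """Square root of the summed squared differences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """Sum of absolute differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """1 - cosine similarity; undefined (division by zero) for all-zero input."""
    dot = sum(x * y for x, y in zip(a, b))
    return 1.0 - dot / (math.hypot(*a) * math.hypot(*b))


def correlation_distance(a, b):
    """1 - Pearson correlation, computed as the cosine distance of the
    mean-centred vectors; undefined for constant input."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return cosine_distance([x - mean_a for x in a], [y - mean_b for y in b])
```

All four return zero for identical inputs, so smaller values indicate that the watermarked object (or the extracted image) is closer to the original.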
Funding: Supported by the Ministerial Level Advanced Research Foundation (9140A01010411BQ01) and the National Twelfth Five-Year Project (40405050303).
Abstract: Automatic target recognition (ATR) is an important issue for military applications, and ATR systems belong to the field of pattern recognition and classification. In this paper, we present an approach for building an ATR system with an improved artificial neural network to recognize and classify typical targets on the battlefield. Hu invariant moments and roundness are selected as the inputs of the neural network because they are invariant to rotation, translation and scaling. Because training the artificial neural network requires a sufficient number of pictures, the target pictures are generated from 3-D models to improve the recognition rate. The simulations show that the approach can be implemented in an ATR system, achieves a high recognition rate, and can be applied in real time.
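The invariance that motivates these features can be checked on a small example. The sketch below computes only the first two of Hu's seven moment invariants from the foreground pixel coordinates of a binary silhouette; roundness and the neural network are left out, and the function names are illustrative rather than taken from the paper.

```python
def central_moment(pixels, p, q):
    """Central moment mu_pq of a set of foreground pixel coordinates."""
    n = len(pixels)
    cx = sum(x for x, _ in pixels) / n
    cy = sum(y for _, y in pixels) / n
    return sum((x - cx) ** p * (y - cy) ** q for x, y in pixels)


def hu_first_two(pixels):
    """First two Hu invariants of a binary shape.

    phi1 = eta20 + eta02
    phi2 = (eta20 - eta02)^2 + 4 * eta11^2
    where eta_pq = mu_pq / mu00^(1 + (p + q) / 2) and mu00 is the pixel count.
    """
    mu00 = float(len(pixels))

    def eta(p, q):
        return central_moment(pixels, p, q) / mu00 ** (1 + (p + q) / 2)

    e20, e02, e11 = eta(2, 0), eta(0, 2), eta(1, 1)
    return (e20 + e02, (e20 - e02) ** 2 + 4 * e11 ** 2)
```

Shifting the shape or rotating it by 90 degrees on the pixel grid leaves both values unchanged, which is the property that makes such features suitable network inputs. Scale invariance holds exactly for continuous shapes and only approximately for sampled pixel sets.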
Abstract: The recognition of 3-D objects is a difficult task for computer vision systems. This paper presents a new object framework that uses densely sampled grids with different resolutions to represent the local information of the input image. A Markov random field model is then created to model the geometric distribution of the object key nodes. Flexible matching, which aims to find an accurate correspondence map between the key points of two images, is performed by combining the local similarities and the geometric relations using the highest confidence first method. A global similarity is then calculated for object recognition. Experimental results on the COIL-100 object database, which consists of 7200 images of 100 objects, are presented. When the number of templates per object is 4, 8, 18 or 36, with the remaining images forming the test set, the object recognition rates are 95.75%, 99.30%, 100.0% and 100.0%, respectively. This performance is much better than that of the other cited references, which indicates that the approach is well suited to appearance-based object recognition.
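The size of each test set follows directly from the figures in the abstract: 7200 images of 100 objects means 72 views per object. The short sketch below works out the split for each template count. The angular spacing it reports assumes that templates are sampled uniformly over a full turn, which the abstract does not state.

```python
TOTAL_IMAGES = 7200
NUM_OBJECTS = 100
VIEWS_PER_OBJECT = TOTAL_IMAGES // NUM_OBJECTS  # 72 views of each object


def split_sizes(templates_per_object):
    """Template/test sizes when a fixed number of views per object is kept as
    templates and every remaining view goes into the test set."""
    test_per_object = VIEWS_PER_OBJECT - templates_per_object
    return {
        "templates_total": NUM_OBJECTS * templates_per_object,
        "test_total": NUM_OBJECTS * test_per_object,
        # Assumption: templates are spread evenly over 360 degrees.
        "template_spacing_deg": 360.0 / templates_per_object,
    }


for k in (4, 8, 18, 36):
    print(k, split_sizes(k))
```

With 4 templates per object, for instance, 400 images serve as templates and the remaining 6800 are tested, so the 95.75% rate is measured on the sparsest template set.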
Funding: Supported by the National Natural Science Foundation of China (Nos. 61572507, 61622212, and 61532003) and by the China Scholarship Council.
Abstract: Active vision is inherently attention-driven: an agent actively selects views to attend to in order to rapidly perform a vision task while improving its internal representation of the scene being observed. Inspired by the recent success of attention-based models in 2D vision tasks based on single RGB images, we address multi-view depth-based active object recognition using an attention mechanism, by means of an end-to-end recurrent 3D attentional network. The architecture takes advantage of a recurrent neural network to store and update an internal representation. Our model, trained with 3D shape datasets, is able to iteratively attend to the best views of an object of interest in order to recognize it. To realize 3D view selection, we derive a 3D spatial transformer network. It is differentiable, allowing training with backpropagation and so achieving much faster convergence than the reinforcement learning employed by most existing attention-based models. Experiments show that our method, with only depth input, achieves state-of-the-art next-best-view performance in terms of both time taken and recognition accuracy.
Abstract: Holoscopic 3D imaging is a true 3D imaging system that mimics the fly's-eye technique to acquire a true 3D optical model of a real scene. To reconstruct the 3D image computationally, an efficient implementation of an Auto-Feature-Edge (AFE) descriptor algorithm is required that provides an individual feature detector for integrating 3D information to locate objects in the scene. The AFE descriptor plays a key role in simplifying the detection of both edge-based and region-based objects. The detector is based on a Multi-Quantize Adaptive Local Histogram Analysis (MQALHA) algorithm. It is distinctive for each Feature-Edge (FE) block; that is, the large contrast changes (gradients) in an FE block are easier to localise. The novelty of this work lies in generating a noise-free 3D-Map (3DM) according to a correlation analysis of region contours. This automatically combines the available depth estimation technique with an edge-based feature shape recognition technique. The approach is applied in two different domains, which demonstrate its efficiency and robustness: a) extracting a set of setting feature-edges for both the tracking and mapping processes in 3D depth-map estimation, and b) separation and recognition of focus objects in the scene. Experimental results show that the proposed 3DM technique performs efficiently compared with state-of-the-art algorithms.
Abstract: View-based 3-D object retrieval has become an emerging topic in recent years, especially with the fast development of visual content acquisition devices such as mobile phones with cameras. Extensive research effort has been dedicated to this task, yet it is still difficult to measure the relevance between two objects with multiple views. In recent years, learning-based methods, such as graph-based learning, have been investigated for view-based 3-D object retrieval. Graph-based methods suffer from the high computational cost of graph construction and the corresponding learning process. In this paper, we introduce a general framework to accelerate learning-based view-based 3-D object matching on large-scale data. Given a query object Q and one object O from a 3-D dataset D, the first step is to extract a small set of candidate relevant 3-D objects for object O. Multiple hypergraphs are then constructed from this small set of 3-D objects, and learning on the fused hypergraph is conducted to generate the relevance between Q and O, which can be further used in the retrieval procedure. Experiments demonstrate the effectiveness of the proposed framework.