Abstract: This paper presents a new approach to outdoor road scene understanding using omni-view images and backpropagation networks. The integrated system determines both the road direction, used for vehicle heading, and the road category, used for vehicle localization. The work has three main features. First, an omni-view image sensor is used to extract image samples, and the original image is preprocessed so that the inputs of the network are rotation-invariant and simple. Second, the network size, especially the number of hidden units, is decided by analyzing systematic experimental results. Finally, the internal representation, which reveals the properties of the neural network, is analyzed from the viewpoint of visual signal processing. Experimental results with real scene images are encouraging.
Abstract: This paper reviews our recent fMRI and psychophysical findings on: 1) perceived size representation in V1; 2) border ownership representation in V2; and 3) neural processing of partially occluded faces. These findings demonstrate that the human early visual cortex not only performs local feature analyses, but also contributes significantly to high-level visual computation with the assistance of attention-enabled cortical feedback. Moreover, by taking advantage of recent findings on the early visual cortex from neuroscience and cognitive science, we build a biologically plausible attention model that predicts human scanpaths on natural images well.
Funding: National Natural Science Foundation of China (No. 19675005).
Abstract: Multi-modality medical image fusion has increasingly important applications in medical image analysis and understanding. In this paper, we develop and apply a multi-resolution method based on the wavelet pyramid to fuse medical images from different modalities, such as PET-MRI and CT-MRI. In particular, we evaluate the fusion results obtained with different selection rules and obtain the optimum combination of fusion parameters.
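To make the idea of a selection rule concrete, the following is a minimal sketch and not the authors' implementation. It assumes a single-level Haar transform in place of a full wavelet pyramid, images stored as lists of lists with even height and width, and one commonly used rule pair: average the approximation band and keep the detail coefficient with the larger absolute value. The function names `haar2d`, `ihaar2d`, and `fuse` are hypothetical.

```python
def haar2d(img):
    """One-level 2D Haar transform; img must have even height and width."""
    approx, d_col, d_row, d_diag = [], [], [], []
    for r in range(0, len(img), 2):
        ra, rc, rr, rd = [], [], [], []
        for c in range(0, len(img[0]), 2):
            a, b = img[r][c], img[r][c + 1]
            d, e = img[r + 1][c], img[r + 1][c + 1]
            ra.append((a + b + d + e) / 4.0)  # local mean (approximation)
            rc.append((a - b + d - e) / 4.0)  # left column minus right column
            rr.append((a + b - d - e) / 4.0)  # top row minus bottom row
            rd.append((a - b - d + e) / 4.0)  # diagonal difference
        approx.append(ra)
        d_col.append(rc)
        d_row.append(rr)
        d_diag.append(rd)
    return approx, d_col, d_row, d_diag


def ihaar2d(approx, d_col, d_row, d_diag):
    """Exact inverse of haar2d."""
    out = []
    for r in range(len(approx)):
        top, bottom = [], []
        for c in range(len(approx[0])):
            s, h, v, g = approx[r][c], d_col[r][c], d_row[r][c], d_diag[r][c]
            top += [s + h + v + g, s - h + v - g]
            bottom += [s + h - v - g, s - h - v + g]
        out += [top, bottom]
    return out


def _average(x, y):
    return (x + y) / 2.0


def _max_abs(x, y):
    return x if abs(x) >= abs(y) else y


def fuse(img_a, img_b):
    """Fuse two registered images of equal size (assumed rule pair, see lead-in)."""
    bands_a, bands_b = haar2d(img_a), haar2d(img_b)
    rules = [_average, _max_abs, _max_abs, _max_abs]
    fused = [
        [[rule(x, y) for x, y in zip(row_a, row_b)]
         for row_a, row_b in zip(band_a, band_b)]
        for rule, band_a, band_b in zip(rules, bands_a, bands_b)
    ]
    return ihaar2d(*fused)
```

Swapping `_average` or `_max_abs` for another function is how different selection rules could be compared in this sketch; a real pipeline would also need image registration and several decomposition levels.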
Funding: This work is supported by the National Natural Science Foundation of China (6187223).
Abstract: Image caption generation is an essential task in computer vision and image understanding. Contemporary image caption generation models usually use the encoder-decoder model as the underlying network structure. However, traditional encoder-decoder architectures extract only the global features of the images, while the local information of the images is not well utilized. This paper proposes an encoder-decoder model based on fused features and a novel mechanism for correcting the generated caption text. We first use VGG16 and Faster R-CNN to extract global and local features in the encoder. Then, we train the bidirectional LSTM network with the fused features in the decoder. Finally, the extracted local features are used to correct the caption text. The experimental results demonstrate the effectiveness of the proposed method.
Abstract: Although line drawings consist only of line segments on a plane, they convey much information about three-dimensional object structures. For a computer to interpret line drawings, some intelligent mechanism is required to extract three-dimensional information from the two-dimensional drawings. In this paper, a new labeling theory and method are proposed for the two-dimensional line drawing with hidden-part-draw of a three-dimensional planar object with trihedral vertices. Some rules for labeling line drawings are established. There are 24 kinds of possible junctions for a line drawing with hidden-part-draw, of which 8 are possible Y junctions and 16 are possible W junctions. Three problems that exist in Sugihara's line drawing labeling technique are solved. By analyzing the projections of the holes in a manifold planar object, we put forward a labeling method for the line drawing. Our labeling theory and method can discriminate between correct and incorrect hidden-part-draw natural line drawings. Hidden-part-draw natural line drawings can be labeled correctly by our labeling theory and method, whereas the labeling theory of Sugihara can only label hidden-part-draw unnatural line drawings, in which some visible lines must be drawn as hidden lines and some invisible lines must be drawn as continuous lines.
Abstract: Information fusion methods are studied thoroughly for the concrete information fusion tasks of a computer pan vision (CPV) system, and some research progress is presented. Recognition of the vision testing object is realized by fusing vision information with non-vision auxiliary information. The applications include recognition of material defects, autonomous recognition of parts by intelligent robots, and automatic computer understanding and recognition of defect images.
Abstract: Determining the size of dynamical structures from spectral images raises the question of where to fix the shape's boundary. Here, we propose a method, suitable for nearly elliptical shapes, based on fitting a 2D Gaussian to the pixel intensities of the spectral image. This method has been tested on a vortex structure embedded in the wake of Saturn's 2010 giant storm. On January 4th, 2012, the Visual and Infrared Mapping Spectrometer (VIMS), onboard Cassini, observed a giant vortex in Saturn's northern hemisphere. The structure was embedded in the wake of the storm system detected in December 2010 by Fletcher et al. [1]. Therefore, all the VIMS observations focused on Saturn's storm have been analyzed to investigate its morphology and development. VIMS detected the vortex from May 2011 to January 2012. The evolution of shape and size has been determined for the vortex cloud top, visible at 890 nm. The largest size was about 4000 km, and the vortex appeared to shrink continuously until January 2012, while the shape varied in the second half of the year. The vortex oscillated over 2 degrees of latitude around 37°N planetocentric latitude, and drifted westward in longitude by ~0.75 deg/day.
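As a rough illustration of how a 2D Gaussian can turn pixel intensities into a size, the sketch below estimates the Gaussian's center and widths from intensity-weighted image moments. This is a simplified stand-in for a proper least-squares fit and is not the authors' procedure. It assumes a background-subtracted image, an axis-aligned ellipse, and a window large enough to contain the structure. Reporting the size as the full width at half maximum (FWHM) is also an assumption, and the function names are hypothetical.

```python
import math


def gaussian_size_from_moments(img):
    """Estimate center, sigma and FWHM (in pixels) of a blob from image moments.

    img is a list of rows of non-negative, background-subtracted intensities.
    """
    total = sum(sum(row) for row in img)
    # Intensity-weighted centroid
    cx = sum(x * v for row in img for x, v in enumerate(row)) / total
    cy = sum(y * v for y, row in enumerate(img) for v in row) / total
    # Intensity-weighted variances along each axis
    var_x = sum((x - cx) ** 2 * v for row in img for x, v in enumerate(row)) / total
    var_y = sum((y - cy) ** 2 * v for y, row in enumerate(img) for v in row) / total
    sx, sy = math.sqrt(var_x), math.sqrt(var_y)
    k = 2.0 * math.sqrt(2.0 * math.log(2.0))  # FWHM = k * sigma for a Gaussian
    return {"center": (cx, cy), "sigma": (sx, sy), "fwhm": (k * sx, k * sy)}


def synthetic_blob(size=41, cx=20.0, cy=20.0, sx=3.0, sy=5.0):
    """Noise-free elliptical Gaussian used only to exercise the estimator."""
    return [
        [math.exp(-((x - cx) ** 2 / (2 * sx ** 2) + (y - cy) ** 2 / (2 * sy ** 2)))
         for x in range(size)]
        for y in range(size)
    ]


if __name__ == "__main__":
    print(gaussian_size_from_moments(synthetic_blob()))
```

Converting the pixel widths to kilometres would require the image's spatial scale, which the abstract does not give. With noisy data or a tilted ellipse, a nonlinear least-squares fit of the full Gaussian model would be the more faithful choice.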
Funding: Supported by the National Basic Research Program (973) of China (No. 2012CB821206), the National Natural Science Foundation of China (No. 71201004), the Scientific Research Common Program of Beijing Municipal Commission of Education (No. KM201310011009), and the Funding Project for Innovation on Science, Technology and Graduate Education in Institutions of Higher Learning under the Jurisdiction of Beijing Municipality (Nos. PXM2012_014213_000037 and PXM2012_014213_000079).
Abstract: Object representation based on local features is a topical subject in image understanding and computer vision. We discuss the shortcomings of global features in present methods and the advantages of local features in object recognition, and briefly explore state-of-the-art recognition methods that use local features, especially the main approaches to local feature extraction and object representation. To explain these methods clearly, the problem of local feature extraction is divided into feature region detection, feature region description, and feature space optimization. The main components and merits of these steps are presented. Technologies for object representation are classified into three types: vector space, sliding window, and structure relationship models. Future development trends are discussed briefly.
Abstract: Rule selection has long been a challenging problem that must be solved when developing a rule-based knowledge learning system. Many methods have been proposed to evaluate the eligibility of a single rule based on some criteria. However, a knowledge learning system usually contains a set of rules. These rules are not independent but interactive: they tend to affect each other and form a rule system. In such a case, it is no longer reasonable to isolate each rule from the others for evaluation. The best rule according to a certain criterion is not always the best one for the whole system. Furthermore, the real-world data from which people want to create their learning system are often ill-defined and inconsistent. In this case, the completeness and consistency criteria for rule selection are no longer essential. In this paper, some ideas on how to solve the rule-selection problem in a systematic way are proposed. These ideas have been applied in the design of a Chinese business card layout analysis system and achieved a good result on a training data set of 425 images. The implementation of the system and the result are presented in this paper.
Abstract: In this paper, a novel data- and rule-driven system for 3D scene description and segmentation in an unknown environment is presented. This system generates hierarchies of features that correspond to structural elements such as boundaries and shape classes of individual objects, as well as relationships between objects. It is implemented as an added high-level component to an existing low-level binocular vision system [1]. Based on a pair of matched stereo images produced by that system, 3D segmentation is first performed to group object boundary data into several edge-sets, each of which is believed to belong to a particular object. Then gross features of each object are extracted and stored in an object record. The final structural description of the scene is accomplished with information in the object record, a set of rules, and a rule implementor. The system is designed to handle partially occluded objects of different shapes and sizes on the 2D imager. Experimental results have shown its success in computing both object-level and structural-level descriptions of common man-made objects.