Funding: Natural Science Foundation of Jiangsu Province (Grant No. BK20170765); National Natural Science Foundation of China (Grant No. 61703201); Future Network Scientific Research Fund Project (Grant No. FNSRFP2021YB26); Science Foundation of Nanjing Institute of Technology (Grant Nos. ZKJ202002, ZKJ202003, and YKJ202019).
Abstract: Unconstrained face images are affected by many factors, such as illumination, pose, expression, occlusion, age, and accessories, which introduce random noise into the original samples. To improve sample quality, a weighted block cooperative sparse representation algorithm based on a visual saliency dictionary is proposed. First, the algorithm uses the biological visual attention mechanism to quickly and accurately locate the salient facial target and constructs a visual saliency dictionary. Then, a block cooperation framework is presented to perform sparse coding on different local structures of the face, and a weighted regularization term is introduced into the sparse representation process to enhance the discriminative information hidden in the coding coefficients. Finally, by combining the sparse representation results of all visually salient block dictionaries, the global coding residual is obtained and the class label is assigned. Experimental results on four databases (AR, Extended Yale B, LFW, and PubFig) indicate that the combination of a visual saliency dictionary, block cooperative sparse representation, and weighted constraint coding can effectively improve the accuracy of the sparse representation of test samples and the performance of unconstrained face recognition.
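The residual-based classification step described above can be illustrated with a minimal sketch. Note that this implements plain collaborative representation with a ridge penalty rather than the paper's weighted block cooperative scheme over a visual saliency dictionary; the function name and the regularization value are our own illustrative choices.

```python
import numpy as np

def crc_classify(y, D, labels, lam=0.01):
    """Collaborative-representation classification sketch.

    y: (d,) test sample; D: (d, n) dictionary whose columns are
    L2-normalised training samples; labels: (n,) class label per column.
    Solves the ridge-regularised least squares
        x* = argmin ||y - D x||^2 + lam ||x||^2
    in closed form, then assigns the class whose coefficients give the
    smallest reconstruction residual.
    """
    n = D.shape[1]
    # Closed-form ridge solution: x = (D^T D + lam I)^{-1} D^T y
    x = np.linalg.solve(D.T @ D + lam * np.eye(n), D.T @ y)
    residuals = {}
    for c in np.unique(labels):
        xc = np.where(labels == c, x, 0.0)   # keep only class-c coefficients
        residuals[c] = np.linalg.norm(y - D @ xc)
    return min(residuals, key=residuals.get)
```

Used on a toy two-class dictionary, the test sample is assigned to the class whose atoms reconstruct it with the smallest residual, which is the decision rule the abstract describes.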
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U1564201, 61573171, 61403172, 51305167); China Postdoctoral Science Foundation (Grant Nos. 2015T80511, 2014M561592); Jiangsu Provincial Natural Science Foundation of China (Grant No. BK20140555); Six Talent Peaks Project of Jiangsu Province, China (Grant Nos. 2015-JXQC-012, 2014-DZXX-040); Jiangsu Postdoctoral Science Foundation, China (Grant No. 1402097C); Jiangsu University Scientific Research Foundation for Senior Professionals, China (Grant No. 14JDG028).
Abstract: Traditional vehicle detection algorithms use traversal-search-based vehicle candidate generation and handcrafted-feature-based classifier training for candidate verification. Such methods generally suffer from long processing times and poor detection performance. To address this, a vehicle detection algorithm based on visual saliency and a deep sparse convolution hierarchical model is proposed. A visual saliency calculation is first used to generate a small set of vehicle candidate areas. The candidate sub-images are then fed into a sparse deep convolution hierarchical model with an SVM-based classifier to perform the final detection. Experimental results demonstrate that the proposed method achieves a 94.81% correct detection rate and a 0.78% false detection rate on existing datasets and on real road pictures captured by our group, outperforming existing state-of-the-art algorithms. Moreover, the deep sparse convolution network generates highly discriminative multi-scale features, which have broad application prospects for target recognition in the field of intelligent vehicles.
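The candidate-generation idea — a cheap saliency map that keeps search small — can be sketched as follows. The abstract does not specify the saliency formulation, so the well-known spectral-residual method stands in here purely for illustration.

```python
import numpy as np

def spectral_residual_saliency(img):
    """Spectral-residual saliency sketch (in the style of Hou & Zhang);
    the paper's own saliency calculation may differ. img: 2-D float
    array (grayscale). Returns a same-shaped map whose high values
    suggest regions worth proposing as detection candidates."""
    def box3(a):
        # 3x3 box filter with wrap-around borders (the FFT spectrum is
        # periodic, so wrap-around is appropriate in the frequency domain)
        out = np.zeros_like(a)
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                out += np.roll(np.roll(a, dy, axis=0), dx, axis=1)
        return out / 9.0

    f = np.fft.fft2(img)
    log_amp = np.log1p(np.abs(f))
    phase = np.angle(f)
    residual = log_amp - box3(log_amp)      # spectral residual
    sal = np.abs(np.fft.ifft2(np.expm1(residual) * np.exp(1j * phase))) ** 2
    return box3(sal)                        # light post-smoothing
```

Thresholding such a map and taking bounding boxes of the surviving regions yields the "small vehicle candidate area" the pipeline then verifies with the learned classifier.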
Funding: Supported by the National Natural Science Foundation of China (Grant No. 41201486); the National Key Technologies R&D Program of China (Grant No. SQ2013GX07E00985); and the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) project of the Collaborative Innovation Center of Modern Grain Circulation and Security, Nanjing University of Finance and Economics.
Abstract: Street-level visualization is an important application of 3D city models. Its challenges include the visual clutter caused by finely detailed buildings and overall rendering performance. In this paper, a novel method for street-level visualization based on visual saliency evaluation is proposed. The basic idea is to preserve the salient buildings in a scene while removing the non-salient ones. The method comprises pre-processing procedures and real-time visualization. The first pre-processing step converts 3D building models at higher Levels of Detail (LoDs) into LoD1 models with simplified ground plans. Then, a number of index viewpoints are created along the streets; these indices record both the position and the direction of each street site. A visual saliency value is computed for each building with respect to the index site, based on the visual difference between the original model and the generalized model. Three visual saliency measures are computed and evaluated: local difference, global difference, and minimum projection area. Real-time visualization begins by mapping the observer to its closest indices; the street view is then generated from the building information stored in those indices. A user study shows that the local visual saliency method performs better than the global visual saliency, area-based, and image-based methods, and that the proposed framework can improve the performance of 3D visualization.
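The per-building saliency score — a visual difference between renders of the original and generalized models — can be sketched as below. This is a minimal assumed formulation (mean absolute per-pixel difference over the building's footprint); the paper's local-difference measure may be defined differently.

```python
import numpy as np

def local_visual_saliency(original, generalized):
    """Local-difference saliency sketch for building generalisation.

    original, generalized: 2-D arrays rendered from the same index
    viewpoint (e.g. intensity or depth buffers of one building). The
    building's salience is taken as the mean absolute per-pixel
    difference over the pixels its footprint covers, so a building
    whose simplified (LoD1) model looks very different from its
    detailed model scores high and is kept in the street view.
    """
    diff = np.abs(original.astype(float) - generalized.astype(float))
    covered = (original > 0) | (generalized > 0)   # pixels the building touches
    if not covered.any():
        return 0.0
    return float(diff[covered].mean())
```

Buildings are then ranked by this score per index viewpoint, and only the top-ranked (salient) ones survive into the real-time street view.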
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61210012).
Abstract: Craters are salient terrain features on planetary surfaces and provide useful information about the relative dating of a planet's geological units. In addition, they are ideal landmarks for spacecraft navigation. Because of low contrast and uneven illumination, the automatic extraction of craters remains a challenging task. This paper presents a saliency detection method for crater edges and a feature matching algorithm based on edge information. Craters are extracted through salient edge detection, edge extraction and selection, feature matching of edges belonging to the same crater, and robust ellipse fitting. In the edge matching algorithm, a crater feature model is proposed by analyzing the relationship between highlight-region edges and shadow-region edges; crater edges are then paired by this matching algorithm. Experiments on real planetary images show that the proposed approach is robust to different illumination conditions and topographies, with a detection rate greater than 90%.
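The final ellipse-fitting step can be sketched with a plain algebraic least-squares conic fit. The paper uses a robust fit; this unweighted version, with our own function names, is illustrative only.

```python
import numpy as np

def fit_ellipse(x, y):
    """Algebraic conic fit sketch: a x^2 + b xy + c y^2 + d x + e y = 1.

    x, y: 1-D arrays of matched crater-edge points. Returns the
    coefficient vector (a, b, c, d, e) from unweighted least squares.
    (A robust fit would down-weight outlier edge points first.)
    """
    A = np.column_stack([x * x, x * y, y * y, x, y])
    coef, *_ = np.linalg.lstsq(A, np.ones_like(x), rcond=None)
    return coef

def ellipse_center(coef):
    """Centre of the fitted conic: the point where the gradient of
    F(x, y) = a x^2 + b xy + c y^2 + d x + e y - 1 vanishes, i.e.
    2a x + b y = -d and b x + 2c y = -e."""
    a, b, c, d, e = coef
    M = np.array([[2 * a, b], [b, 2 * c]])
    return np.linalg.solve(M, [-d, -e])
```

On noise-free points sampled from an ellipse, the fit recovers the conic exactly, so the centre computation returns the true crater centre.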
Funding: Projects 60234030 and 60404021 supported by the National Natural Science Foundation of China.
Abstract: A new place recognition system based on salient visual regions was presented for mobile robot navigation in unknown environments. The system uses a monocular camera to acquire omni-directional images of the robot's environment. Salient local regions are detected in these images using a center-surround difference method that computes color and texture opponencies across multi-scale image spaces. These regions are then organized with a hidden Markov model (HMM) to form the vertices of a topological map, so that localization (place recognition, in our system) can be converted into an evaluation of the HMM. Experimental results show that the saliency detection is robust to changes of scale, 2D rotation, viewpoint, etc. The created topological map is compact, and a higher recognition rate is obtained.
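The center-surround difference operator can be sketched for a single feature channel as a difference of Gaussians. This is a minimal one-channel, two-scale illustration; the system described above combines colour and texture opponencies over several scale pairs.

```python
import numpy as np

def center_surround_saliency(img, center_sigma=1.0, surround_sigma=4.0):
    """Center-surround difference sketch (DoG on one feature channel).

    img: 2-D float array holding one feature channel (intensity, a
    colour opponency, or a texture response). Salience is the absolute
    difference between a fine ('center') and a coarse ('surround')
    Gaussian-smoothed copy; the sigma values are illustrative.
    """
    def gauss_blur(a, sigma):
        # Separable Gaussian blur via two 1-D convolutions
        r = int(3 * sigma)
        xs = np.arange(-r, r + 1)
        k = np.exp(-xs ** 2 / (2 * sigma ** 2))
        k /= k.sum()
        a = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 0, a)
        a = np.apply_along_axis(lambda m: np.convolve(m, k, mode='same'), 1, a)
        return a

    return np.abs(gauss_blur(img, center_sigma) - gauss_blur(img, surround_sigma))
```

Regions where this response is high across channels and scales become the salient local regions fed into the HMM-based topological map.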
Funding: This work was supported in part by the Qatar National Library, Doha, Qatar, and in part by Qatar University Internal Grant IRCC-2021-010.
Abstract: Video summarization is applied to reduce redundancy and develop a concise representation of the key frames in a video; more recently, video summaries have been produced through visual attention modeling. In these schemes, the frames that stand out visually are extracted as key frames based on human attention modeling theories. Such visual attention models have proven effective for video summaries. Nevertheless, the high computational cost of these techniques restricts their usability in everyday situations. In this context, we propose a key frame extraction (KFE) method built on an efficient and accurate visual attention model. The computational effort is minimized by deriving dynamic visual saliency from the temporal gradient instead of traditional optical flow techniques. In addition, an efficient discrete cosine transform based technique is used for the static visual salience. The dynamic and static visual attention metrics are merged by means of a non-linear weighted fusion technique. The system's results are compared with several existing state-of-the-art techniques for accuracy. The experimental results indicate that the proposed model is efficient and produces high-quality key frames as output.
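The pipeline above can be sketched end to end: a temporal-gradient dynamic map, a DCT-based static map (here the "image signature" sign-of-DCT idea stands in, since the exact transform-domain measure is not specified), and a non-linear fusion. The fusion exponent `gamma` is an assumed parameter, not taken from the paper.

```python
import numpy as np

def dct_matrix(N):
    # Orthonormal DCT-II matrix (row index = frequency k, column = sample n)
    n = np.arange(N)
    C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C[0, :] = np.sqrt(1.0 / N)
    return C

def static_saliency(frame):
    # Image-signature static salience: keep only DCT signs, invert, square
    Cn, Cm = dct_matrix(frame.shape[0]), dct_matrix(frame.shape[1])
    sig = np.sign(Cn @ frame @ Cm.T)
    recon = Cn.T @ sig @ Cm          # inverse of the orthonormal DCT
    return recon ** 2

def dynamic_saliency(prev_frame, frame):
    # Temporal-gradient dynamic salience: absolute frame difference
    return np.abs(frame.astype(float) - prev_frame.astype(float))

def fused_attention(prev_frame, frame, gamma=2.0):
    # Non-linear weighted fusion: normalise each map, power-law combine
    def norm01(m):
        m = m - m.min()
        return m / m.max() if m.max() > 0 else m
    s = norm01(static_saliency(frame))
    d = norm01(dynamic_saliency(prev_frame, frame))
    return norm01(s ** gamma + d ** gamma)
```

Averaging the fused map per frame gives a scalar attention curve whose peaks are natural key-frame candidates.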
Abstract: BACKGROUND: Cognitive issues such as Alzheimer's disease and other dementias confer a substantial negative impact. Problems relating to sensitivity, subjectivity, and inherent bias can limit the usefulness of many traditional methods of assessing cognitive impairment. AIM: To determine cut-off scores for the classification of cognitive impairment, and to assess Cognivue® safety and efficacy in a large validation study. METHODS: Adults (aged 55-95 years) at risk for age-related cognitive decline or dementia were invited via posters and email to participate in two cohort studies conducted at various outpatient clinics and assisted- and independent-living facilities. In the cut-off score determination study (n = 92), optimization analyses by positive percent agreement (PPA) and negative percent agreement (NPA), and by accuracy and error bias, were conducted. In the clinical validation study (n = 401), regression, rank linear regression, and factor analyses were conducted; participants in this study also completed other neuropsychological tests. RESULTS: In the cut-off score determination study, 92 participants completed the St. Louis University Mental Status (SLUMS, reference standard) and Cognivue® tests. Analyses showed that SLUMS cut-off scores of <21 (impairment) and >26 (no impairment) corresponded to Cognivue® scores of 54.5 (NPA = 0.92; PPA = 0.64) and 78.5 (NPA = 0.5; PPA = 0.79), respectively. Therefore, conservatively, Cognivue® scores of 55-64 corresponded to impairment, and 74-79 to no impairment. In the clinical validation study, 401 participants completed ≥1 testing session, and 358 completed 2 sessions 1-2 wk apart. Cognivue® classification scores were validated, demonstrating good agreement with SLUMS scores (weighted κ = 0.57; 95%CI: 0.50-0.63). Reliability analyses showed similar scores across repeated testing for Cognivue® (R² = 0.81; r = 0.90) and SLUMS (R² = 0.67; r = 0.82). Psychometric validity of Cognivue® was demonstrated against traditional neuropsychological tests; scores were most closely correlated with measures of verbal processing, manual dexterity/speed, visual contrast sensitivity, visuospatial/executive function, and speed/sequencing. CONCLUSION: Cognivue® scores ≤50 avoid misclassification of impairment, and scores ≥75 avoid misclassification of non-impairment. The validation study demonstrates good agreement between Cognivue® and SLUMS, superior reliability, and good psychometric validity.
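The conservative decision rule stated in the conclusion can be written out directly. The three-way labels below are our own illustrative names, not the instrument's wording; only the two thresholds (≤50, ≥75) come from the abstract.

```python
def classify_cognivue(score):
    """Three-way classification sketch using the published cut-offs:
    scores <= 50 are treated as impaired, scores >= 75 as unimpaired,
    and the band in between is left indeterminate (where the
    determination study placed its 55-64 and 74-79 ranges)."""
    if score <= 50:
        return "impaired"
    if score >= 75:
        return "unimpaired"
    return "indeterminate"
```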
Funding: Supported in part by the National High-Tech Research and Development 863 Program of China under Grant No. 2009AA01Z330, and by the National Natural Science Foundation of China under Grant Nos. 61033012 and 60970100.
Abstract: In this paper, we present a video coding scheme that applies visual saliency computation to adjust image fidelity before compression. To extract visually salient features, we construct a spatio-temporal saliency map by analyzing the video with a combined bottom-up and top-down visual saliency model. We then use an extended bilateral filter, in which the local intensity and spatial scales are adjusted according to visual saliency, to adaptively alter the image fidelity. Our implementation is based on the H.264 video encoder JM12.0. Besides evaluating our scheme against the H.264 reference software, we also compare it to a more traditional foreground-background segmentation-based method and to a foveation-based approach that employs Gaussian blurring. Our results show that the proposed algorithm can improve the compression ratio significantly while effectively preserving perceptual visual quality.
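The saliency-adaptive bilateral filter can be sketched as below: per pixel, the spatial and range scales interpolate between a mild setting (salient regions, detail preserved) and a strong setting (non-salient regions, smoothed for better compressibility). The interpolation scheme and all parameter values are our own assumptions, not the paper's.

```python
import numpy as np

def saliency_adaptive_bilateral(img, sal, radius=2,
                                sigma_s=(1.0, 3.0), sigma_r=(0.05, 0.4)):
    """Saliency-adaptive bilateral filter sketch.

    img: 2-D float image in [0, 1]; sal: saliency map in [0, 1] of the
    same shape. sigma_s/sigma_r hold (salient, non-salient) scales:
    high saliency gives mild smoothing, low saliency strong smoothing,
    so perceptually important detail survives while the background is
    simplified before encoding.
    """
    H, W = img.shape
    out = np.zeros_like(img)
    for i in range(H):
        for j in range(W):
            # Interpolate filter scales by local saliency
            ss = sigma_s[0] * sal[i, j] + sigma_s[1] * (1 - sal[i, j])
            sr = sigma_r[0] * sal[i, j] + sigma_r[1] * (1 - sal[i, j])
            i0, i1 = max(0, i - radius), min(H, i + radius + 1)
            j0, j1 = max(0, j - radius), min(W, j + radius + 1)
            patch = img[i0:i1, j0:j1]
            yy, xx = np.mgrid[i0:i1, j0:j1]
            w = (np.exp(-((yy - i) ** 2 + (xx - j) ** 2) / (2 * ss ** 2))
                 * np.exp(-(patch - img[i, j]) ** 2 / (2 * sr ** 2)))
            out[i, j] = (w * patch).sum() / w.sum()
    return out
```

With a step-edge image, running the filter under all-ones saliency leaves the edge nearly untouched, while all-zeros saliency smooths it noticeably — the fidelity adjustment the scheme relies on.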
Funding: Supported by the Natural Science Foundation of Hebei Province (Grant Nos. F2015201033, F2017201069) and the Foundation of H3C (Grant No. 2017A20004).
Abstract: Because of the varying scale and shapes of down in an image, it is difficult for traditional image recognition methods, and even for a traditional convolutional neural network (TCNN), to correctly recognize the type of a down image with the required accuracy. To deal with these problems, a deep convolutional neural network (DCNN) for down image classification is constructed, and a new weight initialization method is proposed. First, the salient regions of a down image are cut from the image using a visual saliency model. Then, these salient regions are used to train a sparse autoencoder and obtain a collection of convolutional filters that accord with the statistical characteristics of the dataset. Finally, a DCNN with an Inception module and its variants is constructed, and the depth of the network is increased to improve recognition accuracy. The experimental results indicate that the constructed DCNN increases recognition accuracy by 2.7% compared with the TCNN when recognizing down in images, and that the convergence rate of the proposed DCNN with the new weight initialization method is improved by 25.5% compared with the TCNN.
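The data-driven weight-initialization idea — learn filters from salient-region patches, then use them as initial convolution kernels — can be sketched as follows. The paper trains a sparse autoencoder; here the principal components of the patch set stand in as a simpler, clearly substituted filter learner, and all names are ours.

```python
import numpy as np

def patches_from_salient_regions(img, mask, k=5, n=500, rng=None):
    """Sample n random k-by-k patches whose centres lie inside the
    boolean saliency mask (and fully inside the image)."""
    rng = np.random.default_rng(rng)
    ys, xs = np.where(mask)
    keep = ((ys >= k // 2) & (ys < img.shape[0] - k // 2)
            & (xs >= k // 2) & (xs < img.shape[1] - k // 2))
    ys, xs = ys[keep], xs[keep]
    idx = rng.integers(0, len(ys), size=n)
    return np.stack([img[y - k // 2:y + k // 2 + 1,
                         x - k // 2:x + k // 2 + 1].ravel()
                     for y, x in zip(ys[idx], xs[idx])])

def init_filters_from_patches(P, n_filters=8):
    """Filter-initialisation sketch: take the top principal directions
    of the centred salient patches as initial convolution kernels, so
    the kernels reflect the statistics of the dataset (a PCA stand-in
    for the paper's sparse autoencoder)."""
    P = P - P.mean(axis=0)
    _, _, Vt = np.linalg.svd(P, full_matrices=False)   # rows = components
    k = int(np.sqrt(P.shape[1]))
    return Vt[:n_filters].reshape(n_filters, k, k)
```

The resulting kernel bank would seed the first convolutional layer before standard end-to-end training; because the rows of `Vt` are orthonormal, the initial filters are decorrelated by construction.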