Funding: Funded by the Open Fund of the Key Laboratory of Jianghuai Arable Land Resources Protection and Eco-restoration (Ministry of Natural Resources) (No. 2022-ARPE-KF04) and the Open Fund of the Key Laboratory of Urban Land Resources Monitoring and Simulation (Ministry of Natural Resources) (No. KF-2020-05-084).
Abstract: Remote sensing image scene classification and remote sensing technology applications are active research topics. Although CNN-based models have reached high average accuracy, some classes are still misclassified, such as "freeway," "sparse residential," and "commercial_area." These classes contain typical decisive features, spatial-relation features, or a mixture of both, which limits high-quality image scene classification. To address this issue, this paper proposes a hybrid Grad-CAM and capsule network method for image scene classification. The Grad-CAM and capsule network structures have the potential to recognize decisive features and spatial-relation features, respectively. By using a pre-trained model, a hybrid structure, and structure adjustment, the proposed model can recognize both decisive and spatial-relation features. A group of experiments is designed on three popular data sets of increasing classification difficulty. In the most challenging experiment, an average accuracy of 92.67% is achieved. Specifically, accuracies of 83%, 75%, and 86% are obtained for the classes "church," "palace," and "commercial_area," respectively. This research demonstrates that the hybrid structure can effectively improve performance by considering both decisive and spatial-relation features. Therefore, Grad-CAM-CapsNet is a promising and powerful structure for image scene classification.
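The gap between a high average accuracy and weak individual classes, as described in the abstract above, can be illustrated with a small confusion matrix. This is only a sketch: the three-class matrix and the function name `per_class_and_overall_accuracy` are hypothetical and are not taken from the paper's data or code.

```python
import numpy as np


def per_class_and_overall_accuracy(confusion):
    """Return (per-class accuracy, overall accuracy) for a confusion matrix.

    Rows are true classes and columns are predicted classes.
    """
    confusion = np.asarray(confusion, dtype=float)
    # Correct predictions for each class divided by that class's sample count.
    per_class = np.diag(confusion) / confusion.sum(axis=1)
    # All correct predictions divided by all samples.
    overall = np.trace(confusion) / confusion.sum()
    return per_class, overall


# Hypothetical counts for three classes with 100 test images each.
confusion = [
    [98, 1, 1],
    [2, 97, 1],
    [15, 10, 75],
]
per_class, overall = per_class_and_overall_accuracy(confusion)
print(per_class, overall)
```

In this made-up example the overall accuracy is 90% even though the third class reaches only 75%, which is why per-class figures are reported alongside the average.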
Abstract: Crime scene investigation (CSI) images are key carriers of evidence during criminal investigation, and CSI image retrieval can help the police obtain criminal clues. With the rapid development of deep learning, the data-driven paradigm has become the mainstream method of CSI image feature extraction and representation, and in this process datasets provide effective support for CSI retrieval performance. However, there is a lack of systematic research on CSI image retrieval methods and datasets. Therefore, we present an overview of existing work on one-class and multi-class CSI image retrieval based on deep learning. Based on their technical functionalities and implementation methods, CSI image retrieval methods are roughly classified into five categories: feature representation, metric learning, generative adversarial networks, autoencoder networks, and attention networks. Furthermore, we analyze the remaining challenges and discuss future work directions in this field.
Abstract: To solve the heterogeneous image scene matching problem, a non-linear pre-processing method applied to the original images before intensity-based correlation is proposed. The results show that the probability of correct matching is raised greatly. The effect is especially remarkable for image pairs with a low signal-to-noise ratio (S/N).
Funding: Supported by the National Natural Science Foundation of China (61004139), the Beijing Municipal Natural Science Foundation (4101001), and the 2008 Yangtze Fund Scholar and Innovative Research Team Development Schemes of the Ministry of Education.
Abstract: Speedometer identification has been researched for many years. The common approaches to the problem are usually based on image subtraction, which does not adapt to image offsets caused by camera vibration. To meet the requirements for speed, robustness, and accuracy of this kind of work in dynamic scenes, a fast speedometer identification algorithm is proposed. It uses a phase correlation method based on regional entire-template translation to estimate the offset between images. To effectively reduce unnecessary computation and the false detection rate, an improved linear Hough transform method with two optimization strategies is presented for pointer line detection. Experiments on the VC++ 6.0 software platform with the OpenCV library show that the algorithm is fast and precise.
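The offset estimation step mentioned in the abstract above can be sketched with plain phase correlation. This is a minimal illustration under stated assumptions, not the authors' regional template variant: it uses NumPy rather than OpenCV, handles only integer circular shifts, and the function name `phase_correlation_shift` and the random test frame are hypothetical.

```python
import numpy as np


def phase_correlation_shift(reference, shifted):
    """Estimate the integer (dy, dx) translation from `reference` to `shifted`."""
    f_ref = np.fft.fft2(reference)
    f_shf = np.fft.fft2(shifted)
    # Normalized cross-power spectrum: keep only the phase difference.
    cross_power = f_shf * np.conj(f_ref)
    cross_power /= np.maximum(np.abs(cross_power), 1e-12)
    # The inverse transform peaks at the translation between the two images.
    correlation = np.fft.ifft2(cross_power).real
    peak = np.unravel_index(np.argmax(correlation), correlation.shape)
    # Peaks past the midpoint correspond to negative shifts (FFT wrap-around).
    return tuple(
        int(p) if p <= size // 2 else int(p) - size
        for p, size in zip(peak, correlation.shape)
    )


# Hypothetical test: shift a random frame by 3 rows down and 5 columns left.
rng = np.random.default_rng(0)
frame = rng.random((64, 64))
moved = np.roll(frame, shift=(3, -5), axis=(0, 1))
estimated = phase_correlation_shift(frame, moved)
print(estimated)
```

Because the estimate depends only on the phase of the spectra, it is less sensitive to uniform brightness changes than direct image subtraction, which is one reason the technique suits frames offset by camera vibration.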
Abstract: Scene text detection and recognition is an important research area in image processing. Scene text can appear in different languages, fonts, sizes, colours, orientations, and structures. Moreover, the aspect ratios and layouts of scene text may differ significantly. All these variations pose significant challenges for detection and recognition algorithms applied to text in natural scenes. In this paper, a new intelligent method is proposed for detecting and recognizing text in natural scenes by applying a newly proposed Convolutional Neural Network that incorporates Conditional Random Field-based fuzzy rules (CR-CNN). Moreover, we propose a new text detection method for detecting the exact text in the input natural scene images. To improve the edge detection process, image pre-processing steps such as edge detection and color modeling are applied in this work. In addition, we generate new fuzzy rules for making effective decisions in the text detection and recognition processes. The experiments were conducted on standard benchmark datasets such as ICDAR 2003, ICDAR 2005, ICDAR 2011, and SVT, and achieved better accuracy in text detection and recognition. Using these datasets, five different experiments were conducted to evaluate the proposed model. We also compared the proposed system with other classifiers such as the SVM, the MLP, and the CNN. In these comparisons, the proposed model achieved better classification accuracy than the other existing works.
Funding: Supported by the Open Project of the Key Laboratory of Audio and Video Restoration and Evaluation (2021KFKT005).
Abstract: Fine-grained emotion recognition and prediction for images is of great significance for the retrieval, analysis, and generation of film and television scene images in the field of intelligent editing. In this paper, a fusion of traditional perceptual features, art features, and multi-channel deep learning features is used to reflect the emotional expression of the image at different levels. In addition, an ensemble learning model with a stacking architecture based on linear regression coefficients and sentiment correlations, called the LS-stacking model, is proposed according to the factor associations between multi-dimensional emotions. The experimental results show that the mixed features and the LS-stacking model predict well on the 16 emotion categories of the self-built image dataset. This study improves the ability of computers to recognize image emotion at a fine-grained level, which helps to increase the intelligence and automation of visual retrieval and post-production systems.
Funding: This work was funded by the Deanship of Scientific Research at Jouf University (Kingdom of Saudi Arabia) under Grant No. DSR-2021-02-0392.
Abstract: Detecting and recognizing text in natural scene images is challenging because the image quality depends on the conditions in which the image is captured, such as viewing angle, blurring, and sensor noise. In this paper, a prototype for text detection and recognition from natural scene images is proposed. The prototype is based on the Raspberry Pi 4 and a Universal Serial Bus (USB) camera and embeds our text detection and recognition model, which was developed in Python. Our model uses the deep learning-based Efficient and Accurate Scene Text Detector (EAST) model for text localization and detection and Tesseract-OCR as the Optical Character Recognition (OCR) engine for text recognition. The prototype is controlled from a computer through the Virtual Network Computing (VNC) tool over a wireless connection. The experimental results show that the recognition rate of our prototype for images captured by the camera can reach 99.75% with low computational complexity. Furthermore, our prototype outperforms the Tesseract software in terms of recognition rate. It also matches the recognition rate of the EasyOCR software on the Raspberry Pi 4 board while reducing execution time by an average of 89%.