Salient object detection (SOD) in RGB and depth images has attracted increasing research interest. Existing RGB-D SOD models usually adopt fusion strategies to learn a shared representation from the RGB and depth modalities, while few methods explicitly consider how to preserve modality-specific characteristics. In this study, we propose a novel framework, the specificity-preserving network (SPNet), which improves SOD performance by exploring both the shared information and modality-specific properties. Specifically, we use two modality-specific networks and a shared learning network to generate individual and shared saliency prediction maps. To effectively fuse cross-modal features in the shared learning network, we propose a cross-enhanced integration module (CIM) and propagate the fused feature to the next layer to integrate cross-level information. Moreover, to capture rich complementary multi-modal information to boost SOD performance, we use a multi-modal feature aggregation (MFA) module to integrate the modality-specific features from each individual decoder into the shared decoder. By using skip connections between encoder and decoder layers, hierarchical features can be fully combined. Extensive experiments demonstrate that our SPNet outperforms cutting-edge approaches on six popular RGB-D SOD and three camouflaged object detection benchmarks. The project is publicly available at https://github.com/taozh2017/SPNet.
Visual attention is a mechanism that enables the visual system to detect potentially important objects in complex environments. Most computational visual attention models are designed with inspiration from mammalian visual systems. However, electrophysiological and behavioral evidence indicates that avian species have high visual capability and can process complex information accurately in real time. Therefore, the visual system of avian species, especially the nuclei related to the visual attention mechanism, is investigated in this paper. Afterwards, a hierarchical visual attention model is proposed for saliency detection. In the first hierarchy, the optic tectum neuron responses are computed and self-information is used to compute primary saliency maps. In the second hierarchy, the "winner-take-all" network in the tecto-isthmal projection is simulated and final saliency maps are estimated with regularized random walks ranking. Comparison results verify that the proposed model, which can define the focus of attention accurately, outperforms several state-of-the-art models. This study provides insights into the relationship between the visual attention mechanism and the avian visual pathways. The computational visual attention model may reveal the underlying neural mechanism of the nuclei for biological visual attention.
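The self-information step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the neuron responses are already available as a 2-D grid of numbers and estimates their probabilities with a simple equal-width histogram; the function name `self_information_map` and the `bins` parameter are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

def self_information_map(responses, bins=4):
    """Return -log2 p(bin) per pixel, with p estimated from a histogram.

    Illustrative assumption: responses are quantized into equal-width bins.
    """
    flat = [v for row in responses for v in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1.0  # avoid division by zero on constant input

    def bin_of(v):
        # Map a response to one of `bins` equal-width bins.
        return min(int((v - lo) / span * bins), bins - 1)

    counts = Counter(bin_of(v) for v in flat)
    total = len(flat)
    return [[-math.log2(counts[bin_of(v)] / total) for v in row]
            for row in responses]

# A single unusual response among 16 pixels is the rarest, hence most salient.
responses = [
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.9, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.1],
]
saliency = self_information_map(responses)
```

Rare responses receive high values because their estimated probability is low; the single outlier above occurs once in 16 pixels, so its self-information is 4 bits.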
Craters are salient terrain features on planetary surfaces and provide useful information about the relative dating of geological units on planets. In addition, they are ideal landmarks for spacecraft navigation. Due to low contrast and uneven illumination, automatic extraction of craters remains a challenging task. This paper presents a saliency detection method for crater edges and a feature matching algorithm based on edge information. The craters are extracted through salient edge detection, edge extraction and selection, feature matching of edges belonging to the same crater, and robust ellipse fitting. In the edge matching algorithm, a crater feature model is proposed by analyzing the relationship between the edges of highlight regions and those of shadow regions. Crater edges are then paired through the matching algorithm. Experiments on real planetary images show that the proposed approach is robust to different lighting conditions and topographies, and that the detection rate exceeds 90%.
Reliable saliency detection can be used to quickly and effectively locate objects in images. In this paper, a novel algorithm for saliency detection based on superpixel clustering and stereo disparity (SDC) is proposed. Firstly, we use an improved superpixel clustering method to decompose the given image. Then, the disparity of each superpixel is computed by a modified stereo correspondence algorithm. Finally, a new measure that combines stereo disparity with color contrast and spatial coherence is defined to evaluate the saliency of each superpixel. The experiments show that regions with high disparity receive higher saliency values, that the saliency maps have the same resolution as the source images, and that objects in the maps have clear boundaries. Due to the use of superpixel and stereo disparity information, the proposed method is computationally efficient and outperforms some state-of-the-art color-based saliency detection methods.
Pest detection is an important research subject in the field of grain storage. In the past decades, many edge detection methods have been applied to the edge detection of stored grain pests. Although some of them can detect stored grain pests, their precision and robustness are not good enough. Spectral residual (SR) saliency edge detection treats the residual of the image's logarithmic spectrum as the novel part of the image information. The remaining spectrum is converted back to the spatial domain to obtain the edge detection results. The SR algorithm is based entirely on frequency domain processing. It can not only effectively simplify the target detection algorithm but also improve the effectiveness of target recognition. The experimental results show that the edges of stored grain pests detected by the SR method are effective and stable.
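The spectral residual idea in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it follows the commonly described SR recipe (log amplitude spectrum minus its local average, recombined with the original phase and transformed back), uses a naive DFT that is only practical for tiny grids, and omits the Gaussian smoothing usually applied to the output. The names `dft2`, `box3` and `spectral_residual_saliency` are hypothetical.

```python
import cmath
import math

def dft2(grid, inverse=False):
    """Naive 2-D DFT (O(N^4)); for illustration on tiny grids only."""
    h, w = len(grid), len(grid[0])
    sign = 1 if inverse else -1
    out = [[0j] * w for _ in range(h)]
    for u in range(h):
        for v in range(w):
            acc = 0j
            for y in range(h):
                for x in range(w):
                    angle = sign * 2 * math.pi * (u * y / h + v * x / w)
                    acc += grid[y][x] * cmath.exp(1j * angle)
            out[u][v] = acc / (h * w) if inverse else acc
    return out

def box3(grid):
    """3x3 mean filter with wrap-around borders (the spectrum is periodic)."""
    h, w = len(grid), len(grid[0])
    return [[sum(grid[(y + dy) % h][(x + dx) % w]
                 for dy in (-1, 0, 1) for dx in (-1, 0, 1)) / 9.0
             for x in range(w)] for y in range(h)]

def spectral_residual_saliency(img):
    spec = dft2(img)
    # Log amplitude and phase of the spectrum (epsilon avoids log(0)).
    log_amp = [[math.log(abs(c) + 1e-8) for c in row] for row in spec]
    phase = [[cmath.phase(c) for c in row] for row in spec]
    # Spectral residual = log amplitude minus its local average.
    smooth = box3(log_amp)
    residual = [[cmath.exp((la - sm) + 1j * ph)
                 for la, sm, ph in zip(lr, sr, pr)]
                for lr, sr, pr in zip(log_amp, smooth, phase)]
    # Back to the spatial domain; squared magnitude is the saliency value.
    back = dft2(residual, inverse=True)
    return [[abs(c) ** 2 for c in row] for row in back]

# Tiny test image: a 2x2 bright patch on a dark 8x8 background.
image = [[0.0] * 8 for _ in range(8)]
for y in (3, 4):
    for x in (3, 4):
        image[y][x] = 1.0
sr_map = spectral_residual_saliency(image)
```

In practice an FFT routine from a numerical library would replace `dft2`; the naive version is used here only to keep the sketch free of external dependencies.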
Traditional vehicle detection algorithms use traversal-search-based vehicle candidate generation and classifiers trained on hand-crafted features for vehicle candidate verification. These types of methods generally have high processing times and low vehicle detection performance. To address this issue, a vehicle detection algorithm based on visual saliency and a deep sparse convolution hierarchical model is proposed. A visual saliency calculation is first used to generate a small vehicle candidate area. The vehicle candidate sub-images are then loaded into a sparse deep convolution hierarchical model with an SVM-based classifier to perform the final detection. The experimental results demonstrate that the proposed method achieves a 94.81% correct detection rate and a 0.78% false detection rate on existing datasets and on real road pictures captured by our group, which outperforms existing state-of-the-art algorithms. More importantly, highly discriminative multi-scale features are generated by the deep sparse convolution network, which has broad application prospects for target recognition in the field of intelligent vehicles.
In order to better represent infrared target features under different environments, a saliency detection method based on region covariance and global features is proposed. Firstly, the region covariance features on different scale spaces and different image regions are extracted and transformed into sigma features; these are then combined with a central position feature to generate the local saliency map. Next, a global saliency map is generated by gray contrast and density estimation. Finally, the saliency detection result for infrared images is obtained by fusing the local and global saliency maps. The experimental results show that the saliency map of the proposed method has complete target features and distinct edges, and that the proposed method is better than state-of-the-art methods both qualitatively and quantitatively.
Saliency detection models, which are used to extract salient regions in visual scenes, are widely used in various multimedia processing applications. They have attracted much attention in the area of computer vision over the past decades. Since most images and videos on the Internet are stored in compressed domains, such as images in JPEG format and videos in MPEG2, H.264, and MPEG4 Visual formats, many saliency detection models have recently been proposed in the compressed domain. In this paper, we provide a review of our work on saliency detection models in the compressed domain. In addition, we introduce some commonly used fusion strategies that combine the spatial saliency map and the temporal saliency map to compute the final video saliency map.
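The fusion step at the end of this abstract can be illustrated with a small sketch. The abstract does not list which strategies the review covers, so the three pixel-wise rules below (mean, maximum, product) are assumptions chosen as typical examples; the function name `fuse` and its `mode` parameter are hypothetical.

```python
def fuse(spatial, temporal, mode="mean"):
    """Pixel-wise fusion of two saliency maps with values in [0, 1]."""
    rules = {
        "mean": lambda s, t: (s + t) / 2.0,  # average of both cues
        "max": max,                          # keep the stronger cue
        "product": lambda s, t: s * t,       # salient only if both agree
    }
    rule = rules[mode]
    return [[rule(s, t) for s, t in zip(srow, trow)]
            for srow, trow in zip(spatial, temporal)]

spatial = [[0.2, 0.8], [0.5, 0.0]]
temporal = [[0.6, 0.4], [0.5, 1.0]]

fused_mean = fuse(spatial, temporal, "mean")
fused_max = fuse(spatial, temporal, "max")
fused_product = fuse(spatial, temporal, "product")
```

The product rule suppresses pixels where only one cue fires (the bottom-right pixel becomes 0), while the maximum rule keeps them, which is the main trade-off between these simple choices.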
Advances in machine vision systems have revolutionized applications such as autonomous driving, robotic navigation, and augmented reality. Despite substantial progress, challenges persist, including dynamic backgrounds, occlusion, and limited labeled data. To address these challenges, we introduce a comprehensive methodology to enhance image classification and object detection accuracy. The proposed approach involves the integration of multiple methods in a complementary way. The process commences with the application of Gaussian filters to mitigate the impact of noise interference. These images are then processed for segmentation using Fuzzy C-Means segmentation in parallel with saliency mapping techniques to find the most prominent regions. The Binary Robust Independent Elementary Features (BRIEF) characteristics are then extracted from data derived from saliency maps and segmented images. For precise object separation, Oriented FAST and Rotated BRIEF (ORB) algorithms are employed. Genetic Algorithms (GAs) are used to optimize Random Forest classifier parameters, which leads to improved performance. Our method stands out due to its comprehensive approach, adeptly addressing challenges such as changing backdrops, occlusion, and limited labeled data concurrently. A significant enhancement has been achieved by integrating Genetic Algorithms (GAs) to precisely optimize parameters. This minor adjustment not only boosts the uniqueness of our system but also amplifies its overall efficacy. The proposed methodology has demonstrated notable classification accuracies of 90.9% and 89.0% on the challenging Corel-1k and MSRC datasets, respectively. Furthermore, detection accuracies of 87.2% and 86.6% have been attained. Although our method performed well on both datasets, it may face difficulties with real-world data, especially where datasets have highly complex backgrounds. Despite these limitations, GA integration for parameter optimization shows a notable strength in enhancing the overall adaptability and performance of our system.
Background: Co-salient object detection (Co-SOD) aims to identify and segment commonly salient objects in a set of related images. However, most current Co-SOD methods encounter issues with the inclusion of irrelevant information in the co-representation. These issues hamper their ability to locate co-salient objects and significantly restrict the accuracy of detection. Methods: To address this issue, this study introduces a novel Co-SOD method with iterative purification and predictive optimization (IPPO) comprising a common salient purification module (CSPM), a predictive optimizing module (POM), and a diminishing mixed enhancement block (DMEB). Results: These components are designed to explore noise-free joint representations, assist the model in enhancing the quality of the final prediction results, and significantly improve the performance of the Co-SOD algorithm. Furthermore, through a comprehensive evaluation of IPPO and state-of-the-art algorithms focusing on the roles of CSPM, POM, and DMEB, our experiments confirmed that these components are pivotal in enhancing the performance of the model, substantiating the significant advancements of our method over existing benchmarks. Experiments on several challenging benchmark co-saliency datasets demonstrate that the proposed IPPO achieves state-of-the-art performance.
This study explores the challenges posed by pedestrian detection and occlusion in AR applications, employing a novel approach that utilizes RGB-D-based skeleton reconstruction to reduce the overhead of classical pedestrian detection algorithms during training. Furthermore, it is dedicated to addressing occlusion issues in pedestrian detection by using Azure Kinect for body tracking and integrating a robust occlusion management algorithm, significantly enhancing detection efficiency. In experiments, an average latency of 204 milliseconds was measured, and the detection accuracy reached an outstanding level of 97%. Additionally, this approach has been successfully applied in creating a simple yet captivating augmented reality game, demonstrating the practical application of the algorithm.
The accurate detection of cooperative targets plays a key and foundational role in the autonomous landing of unmanned aerial vehicles (UAVs). The standard method based on a fixed threshold is too susceptible to both illumination variations and interference. To overcome these issues, a robust detection algorithm with triple constraints for cooperative targets based on spectral residual (TCSR) is proposed. Firstly, an asymmetric cooperative target is designed, comprising a red background, a green H, and a triangle target; the captured original image is converted into the Lab color space, and its saliency map is obtained by constructing the spectral residual. Then, the triple constraints are developed according to prior knowledge of the cooperative target. Finally, a salient region in the saliency map that meets the triple constraints is considered to be the cooperative target. Experimental results in complex environments show that the proposed TCSR outperforms the standard methods, with higher detection accuracy and a lower false alarm rate.
This paper concerns the problem of real-time object segmentation for picking systems. A region proposal method inspired by the human glance and based on a convolutional neural network is proposed to select promising regions, so that further processing is reserved for these regions only. The speed of object segmentation is significantly improved by the region proposal method. By combining the region proposal method based on the convolutional neural network with a superpixel method, the category and location information can be used to segment objects, and image redundancy is significantly reduced. The processing time is thereby reduced considerably, allowing real-time operation. Experiments show that the proposed method can segment the target object of interest in real time on an ordinary laptop.
Melanoma, due to its higher mortality rate, is considered one of the most pernicious types of skin cancer, mostly affecting white populations. It has been reported a number of times, and is now widely accepted, that early detection of melanoma increases the subject's chances of survival. Computer-aided diagnostic systems help experts diagnose skin lesions at earlier stages using machine learning techniques. In this work, we propose a framework that accurately segments, and later classifies, the lesion using improved image segmentation and fusion methods. The proposed technique takes an image and passes it through two methods simultaneously: one is a weighted visual saliency-based method, and the other is an improved HDCT-based saliency estimation. The resultant image maps are later fused using the proposed image fusion technique to generate a localized lesion region. The resultant binary image is then mapped back to the RGB image and fed into the pre-trained Inception-ResNet-V2 model, which is trained by applying transfer learning. The simulation results show improved performance compared to several existing methods.
Straightforward image resizing operators that do not consider image content (e.g., uniform scaling) usually cannot produce satisfactory results, while content-aware image retargeting aims to arbitrarily change image size while preserving visually prominent features. In this paper, a cluster-based, saliency-guided seam carving algorithm for content-aware image retargeting is proposed. To cope with the main drawback of the original seam carving algorithm, which relies only on a gradient-based image importance map, we integrate a gradient-based map and a cluster-based saliency map to generate a more reliable importance map, resulting in better single-image retargeting results. Experimental results have demonstrated the efficacy of the proposed algorithm.
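The idea of carving seams through a combined importance map can be sketched as follows. This is a minimal illustration under stated assumptions: the two maps are merged by a weighted sum with a hypothetical weight `alpha` (the abstract does not say how the paper integrates them), and the seam search is the standard dynamic-programming formulation of seam carving. All function names are hypothetical.

```python
def combine_maps(gradient, saliency, alpha=0.5):
    """Weighted sum of a gradient map and a saliency map (assumed rule)."""
    return [[alpha * g + (1 - alpha) * s for g, s in zip(grow, srow)]
            for grow, srow in zip(gradient, saliency)]

def min_vertical_seam(importance):
    """Return one column index per row for the lowest-importance seam."""
    h, w = len(importance), len(importance[0])
    cost = [row[:] for row in importance]
    # Forward pass: cheapest cumulative cost to reach each pixel from the top.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backward pass: trace the cheapest path from the bottom row upward.
    x = min(range(w), key=lambda i: cost[-1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam

def remove_seam(image, seam):
    """Drop one pixel per row, narrowing the image by one column."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]

gradient = [[5, 1, 5, 5]] * 3
saliency = [[4, 0, 4, 9]] * 3
image = [[10, 11, 12, 13]] * 3

importance = combine_maps(gradient, saliency)
seam = min_vertical_seam(importance)
narrowed = remove_seam(image, seam)
```

In this toy input the second column has low gradient and low saliency, so the seam runs straight down it and that column is the one removed.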
This paper proposes a cascade deep convolutional neural network to address the loosening detection problem of bolts on axlebox covers. Firstly, an SSD network based on ResNet50 and the CBAM module, which improves bolt image features, is proposed for locating bolts on axlebox covers. Then, the A2-PFN is proposed according to the slender features of the marker lines for extracting more accurate marker line regions of the bolts. Finally, a rectangular approximation method is proposed to regularize the marker line regions as a way to calculate the angle of the marker line and plot all the angle values into an angle table, according to which the criteria of the angle table can determine whether the bolt with the marker line is in danger of loosening. Meanwhile, our improved algorithm is compared with the pre-improved algorithm in the object localization stage. The results show that our proposed method has a significant improvement in both detection accuracy and detection speed, where our mAP (IoU=0.75) reaches 0.77 and fps reaches 16.6. In the saliency detection stage, after qualitative and quantitative comparison, our method significantly outperforms other state-of-the-art methods, where our MAE reaches 0.092, F-measure reaches 0.948 and AUC reaches 0.943. Ultimately, according to the angle table, out of 676 bolt samples, a total of 60 bolts are loose, 69 bolts are at risk of loosening, and 547 bolts are tightened.
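Two of the saliency metrics reported in this abstract, MAE and F-measure, can be computed as in the sketch below. It uses the definitions that are common in the saliency detection literature; the binarization threshold of 0.5 and the weight `beta_sq = 0.3` are assumptions, since the abstract does not state the settings used in the paper, and the function names are hypothetical.

```python
def mae(pred, gt):
    """Mean absolute error between a saliency map and a binary ground truth."""
    n = sum(len(row) for row in pred)
    total = sum(abs(p - g)
                for prow, grow in zip(pred, gt)
                for p, g in zip(prow, grow))
    return total / n

def f_measure(pred, gt, threshold=0.5, beta_sq=0.3):
    """Weighted F-measure after thresholding the predicted map."""
    tp = fp = fn = 0
    for prow, grow in zip(pred, gt):
        for p, g in zip(prow, grow):
            hit = p >= threshold
            if hit and g == 1:
                tp += 1
            elif hit:
                fp += 1
            elif g == 1:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)

pred = [[0.9, 0.2], [0.6, 0.1]]
gt = [[1, 0], [0, 0]]

mae_value = mae(pred, gt)
f_value = f_measure(pred, gt)
```

For this toy input the thresholded map has one true positive and one false positive, so precision is 0.5 and recall is 1.0. A lower MAE and a higher F-measure both indicate a better saliency map.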
Target detection against low-light backgrounds is one of the main tasks of night patrol robots in airport terminals. However, algorithms that can run on a robot platform with limited computing resources have difficulty ensuring the detection accuracy of human bodies in the airport terminal. A novel thermal infrared salient human detection model combined with thermal features, called TFSHD, is proposed. The TFSHD model is still based on U-Net, but the decoder module structure has been redesigned and the model has been made lightweight. In order to improve the detection accuracy of the algorithm in complex scenes, a fusion module composed of a thermal branch and a saliency branch is added to the decoder of the TFSHD model. Furthermore, a predictive loss function that is more sensitive to high-temperature regions of the image is designed. Additionally, to reduce the computing resource requirements of the algorithm, a lightweight scheme is adopted that simplifies the encoder network structure and controls the number of decoder channels. The experimental results on four data sets show that the proposed method can not only ensure high detection accuracy and robustness but also meet the real-time detection needs of patrol robots, with a detection speed above 40 f/s.
Funding (SPNet RGB-D salient object detection study): Supported in part by the National Natural Science Foundation of China under Grant No. 62172228, and in part by an Open Project of the Key Laboratory of System Control and Information Processing, Ministry of Education (Shanghai Jiao Tong University, No. Scip202102).
Funding (avian visual attention model study): Supported by the Natural Science Foundation of China (61425008, 61333004, 61273054).
Funding (crater detection study): Supported by the National Natural Science Foundation of China (61210012).
Funding (superpixel clustering and stereo disparity study): Supported by the NSFC Joint Fund with Guangdong under Key Project (U1201258), the National Natural Science Foundation of China (61402261, 61303088, 61572286), the Scientific Research Foundation of Shandong Province of Outstanding Young Scientist Award (BS2013DX048), and the Shandong Ji'nan Science and Technology Development Project (201202015).
Funding (stored grain pest detection study): Financially supported by the National Natural Science Foundation of China (No. 61871176), the Key Scientific and Technological Project of the Science and Technology Department of Henan Province (No. 172102210030, 182102110099), the Key Scientific Research Project Program of Universities of Henan Province (No. 18B520025), and the Open Fund of the Key Laboratory of Grain Information Processing and Control (No. KFJJ-2018-102); also supported by the Collaborative Innovation Center of Grain Storage and Security of Henan Province.
Funding (vehicle detection study): Supported by the National Natural Science Foundation of China (Grant Nos. U1564201, 61573171, 61403172, 51305167), the China Postdoctoral Science Foundation (Grant Nos. 2015T80511, 2014M561592), the Jiangsu Provincial Natural Science Foundation of China (Grant No. BK20140555), the Six Talent Peaks Project of Jiangsu Province, China (Grant Nos. 2015-JXQC-012, 2014-DZXX-040), the Jiangsu Postdoctoral Science Foundation, China (Grant No. 1402097C), and the Jiangsu University Scientific Research Foundation for Senior Professionals, China (Grant No. 14JDG028).
Funding (infrared saliency detection study): Supported by the National Natural Science Foundation of China (61303192) and the China Postdoctoral Science Foundation (2015M572694, 2016T90979).
Funding: Supported by a grant from the Basic Science Research Program through the National Research Foundation (NRF) (2021R1F1A1063634), funded by the Ministry of Science and ICT (MSIT), Republic of Korea. This research is supported and funded by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2024R410), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia. The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding program Grant Code (NU/RG/SERC/12/6).
Abstract: Advances in machine vision systems have revolutionized applications such as autonomous driving, robotic navigation, and augmented reality. Despite substantial progress, challenges persist, including dynamic backgrounds, occlusion, and limited labeled data. To address these challenges, we introduce a comprehensive methodology to enhance image classification and object detection accuracy. The proposed approach integrates multiple methods in a complementary way. The process begins with the application of Gaussian filters to mitigate the impact of noise. The filtered images are then segmented using Fuzzy C-Means segmentation in parallel with saliency mapping techniques to find the most prominent regions. Binary Robust Independent Elementary Features (BRIEF) are then extracted from the data derived from the saliency maps and segmented images. For precise object separation, Oriented FAST and Rotated BRIEF (ORB) algorithms are employed. Genetic Algorithms (GAs) are used to optimize the Random Forest classifier parameters, which leads to improved performance. Our method stands out for its comprehensive approach, addressing challenges such as changing backdrops, occlusion, and limited labeled data concurrently. A significant enhancement has been achieved by integrating GAs to precisely optimize parameters. This minor adjustment not only boosts the uniqueness of our system but also amplifies its overall efficacy. The proposed methodology achieved classification accuracies of 90.9% and 89.0% on the challenging Corel-1k and MSRC datasets, respectively. Furthermore, detection accuracies of 87.2% and 86.6% were attained. Although our method performed well on both datasets, it may face difficulties on real-world data, especially where the data have highly complex backgrounds. Despite these limitations, GA integration for parameter optimization is a notable strength that enhances the overall adaptability and performance of our system.
Funding: Supported by the National Natural Science Foundation of China under Grant (62301330, 62101346), the Guangdong Basic and Applied Basic Research Foundation (2024A1515010496, 2022A1515110101), the Stable Support Plan for Shenzhen Higher Education Institutions (20231121103807001), and the Guangdong Provincial Key Laboratory (2023B1212060076).
Abstract: Background: Co-salient object detection (Co-SOD) aims to identify and segment commonly salient objects in a set of related images. However, most current Co-SOD methods include irrelevant information in the co-representation, which hampers their ability to locate co-salient objects and significantly restricts detection accuracy. Methods: To address this issue, this study introduces a novel Co-SOD method with iterative purification and predictive optimization (IPPO), comprising a common salient purification module (CSPM), a predictive optimizing module (POM), and a diminishing mixed enhancement block (DMEB). Results: These components are designed to explore noise-free joint representations, help the model enhance the quality of the final prediction results, and significantly improve the performance of the Co-SOD algorithm. Furthermore, a comprehensive evaluation of IPPO against state-of-the-art algorithms, focusing on the roles of CSPM, POM, and DMEB, confirmed that these components are pivotal in enhancing the performance of the model and substantiated the advancement of our method over existing approaches. Experiments on several challenging benchmark co-saliency datasets demonstrate that the proposed IPPO achieves state-of-the-art performance.
Abstract: This study explores the challenges posed by pedestrian detection and occlusion in AR applications, employing a novel approach that uses RGB-D-based skeleton reconstruction to reduce the training overhead of classical pedestrian detection algorithms. Furthermore, it addresses occlusion in pedestrian detection by using Azure Kinect for body tracking and integrating a robust occlusion management algorithm, significantly enhancing detection efficiency. In experiments, an average latency of 204 milliseconds was measured, and the detection accuracy reached 97%. Additionally, this approach has been successfully applied in creating a simple yet captivating augmented reality game, demonstrating the practical application of the algorithm.
Funding: Supported by the National Natural Science Foundation of China (61135001), the Scientific Research Program of Shaanxi Provincial Department of Education (16JK1499), the Doctoral Fund of Xi'an University of Science and Technology (2015QDJ007), the Cultivation of Xi'an University of Science and Technology (2014015), and the Ministry of Education Key Laboratory of Information Fusion Technology (LIFT2015-G-1).
Abstract: The accurate detection of cooperative targets plays a key and foundational role in autonomous unmanned aerial vehicle (UAV) landing. The standard method based on a fixed threshold is too susceptible to both illumination variations and interference. To overcome these issues, a robust detection algorithm with triple constraints for cooperative targets based on spectral residual (TCSR) is proposed. First, an asymmetric cooperative target is designed, comprising a red background, a green H, and a triangle target; the captured original image is converted into the Lab color space, and its saliency map is obtained by constructing the spectral residual. Then, the triple constraints are developed according to prior knowledge of the cooperative target. Finally, the salient region in the saliency map that meets the triple constraints is taken as the cooperative target. Experimental results in complex environments show that the proposed TCSR outperforms the standard methods, with higher detection accuracy and a lower false alarm rate.
Funding: Supported by the National Natural Science Foundation of China (61233010, 61305106), the Shanghai Natural Science Foundation (17ZR1409700, 18ZR1415300), and the basic research project of Shanghai Municipal Science and Technology Commission (16JC1400900).
Abstract: This paper concerns the problem of real-time object segmentation for a picking system. A region proposal method inspired by the human glance and based on a convolutional neural network is proposed to select promising regions, so that further processing is reserved for these regions only. The region proposal method significantly improves the speed of object segmentation. By combining the CNN-based region proposal method with a superpixel method, the category and location information can be used to segment objects, and image redundancy is significantly reduced. This reduces the processing time considerably and makes real-time operation possible. Experiments show that the proposed method can segment the target object of interest in real time on an ordinary laptop.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at King Saud University for funding this work through Research Group No. (RG-1438-034) and co-authors K.A. and M.A.
Abstract: Melanoma, due to its higher mortality rate, is considered one of the most pernicious types of skin cancer, mostly affecting white populations. It has been reported a number of times, and is now widely accepted, that early detection of melanoma increases the subject's chances of survival. Computer-aided diagnostic systems help experts diagnose skin lesions at earlier stages using machine learning techniques. In this work, we propose a framework that accurately segments, and later classifies, the lesion using improved image segmentation and fusion methods. The proposed technique takes an image and passes it through two methods simultaneously: a weighted visual saliency-based method and an improved HDCT-based saliency estimation. The resultant image maps are then fused using the proposed image fusion technique to generate a localized lesion region. The resultant binary image is mapped back to the RGB image and fed into the pre-trained Inception-ResNet-V2 model, which is trained by applying transfer learning. The simulation results show improved performance compared to several existing methods.
Funding: Supported by the National Natural Science Foundation of China (No. 91320201 and No. 61471262) and the International (Regional) Collaborative Key Research Projects (No. 61520106002).
Funding: Supported by "MOST" under Grants No. 105-2628-E-224-001-MY3 and No. 103-2221-E-224-034-MY2.
Abstract: Straightforward image resizing operators that do not consider image content (e.g., uniform scaling) usually cannot produce satisfactory results, whereas content-aware image retargeting aims to change image size arbitrarily while preserving visually prominent features. In this paper, a cluster-based saliency-guided seam carving algorithm for content-aware image retargeting is proposed. To overcome the main drawback of the original seam carving algorithm, which relies only on a gradient-based image importance map, we integrate a gradient-based map and a cluster-based saliency map to generate a more reliable importance map, resulting in better single-image retargeting results. Experimental results demonstrate the efficacy of the proposed algorithm.
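To make the idea in the abstract above concrete, here is a minimal sketch of seam carving driven by a combined importance map. It assumes the gradient-based map and the saliency map are already computed and are simply blended with a hypothetical `weight` parameter; the paper's cluster-based saliency computation and its actual integration rule are not reproduced here. The seam search itself is the standard dynamic-programming formulation.

```python
def combine_importance(gradient_map, saliency_map, weight=0.5):
    """Blend two importance maps pixel-wise (assumed linear blend)."""
    return [
        [weight * g + (1 - weight) * s for g, s in zip(g_row, s_row)]
        for g_row, s_row in zip(gradient_map, saliency_map)
    ]


def find_vertical_seam(importance):
    """Return one column index per row for the minimum-cost 8-connected seam."""
    h, w = len(importance), len(importance[0])
    cost = [row[:] for row in importance]
    # Forward pass: accumulate the cheapest path cost from the top row.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(0, x - 1), min(w, x + 2)
            cost[y][x] += min(cost[y - 1][lo:hi])
    # Backward pass: trace the cheapest path from the bottom row upwards.
    x = min(range(w), key=lambda i: cost[h - 1][i])
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(0, x - 1), min(w, x + 2)
        x = min(range(lo, hi), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete one pixel per row, narrowing the image by one column."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]


# A flat gradient map alone cannot tell the columns apart; the saliency map
# marks the middle column as important, so the seam avoids it.
gradient = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
saliency = [[0, 5, 1], [0, 5, 1], [0, 5, 1]]
seam = find_vertical_seam(combine_importance(gradient, saliency))
narrower = remove_seam([[1, 2, 3], [4, 5, 6], [7, 8, 9]], seam)
```

Repeating the find-and-remove step once per column to be dropped gives the retargeted image; the importance map would normally be recomputed or updated after each removal.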
Abstract: This paper proposes a cascade deep convolutional neural network to address the problem of detecting loosened bolts on axlebox covers. First, an SSD network based on ResNet50 and a CBAM module, which improves bolt image features, is proposed for locating bolts on axlebox covers. Then, the A2-PFN is proposed, based on the slender shape of the marker lines, to extract more accurate marker line regions on the bolts. Finally, a rectangular approximation method is proposed to regularize the marker line regions as a way to calculate the angle of each marker line. All angle values are plotted into an angle table, and the criteria of the angle table determine whether a bolt with a marker line is in danger of loosening. In the object localization stage, our improved algorithm is compared with the algorithm before improvement. The results show that the proposed method significantly improves both detection accuracy and detection speed, with mAP (IoU=0.75) reaching 0.77 and speed reaching 16.6 fps. In the saliency detection stage, qualitative and quantitative comparisons show that our method significantly outperforms other state-of-the-art methods, with MAE reaching 0.092, F-measure reaching 0.948, and AUC reaching 0.943. Ultimately, according to the angle table, out of 676 bolt samples, 60 bolts are loose, 69 bolts are at risk of loosening, and 547 bolts are tightened.
Funding: Supported in part by the National Key Research and Development Program of China (No. 2018YFC0309104) and the Construction System Science and Technology Project of Jiangsu Province (No. 2021JH03).
Abstract: Target detection against low-light backgrounds is one of the main tasks of night patrol robots in airport terminals. However, algorithms that can run on a robot platform with limited computing resources struggle to ensure accurate detection of human bodies in the airport terminal. A novel thermal infrared salient human detection model combined with thermal features, called TFSHD, is proposed. The TFSHD model is still based on U-Net, but its decoder module structure has been redesigned and the model has been made lightweight. To improve detection accuracy in complex scenes, a fusion module composed of a thermal branch and a saliency branch is added to the decoder of the TFSHD model. Furthermore, a prediction loss function that is more sensitive to high-temperature regions of the image is designed. Additionally, to reduce the computing resource requirements of the algorithm, a lightweight scheme is adopted that simplifies the encoder network structure and controls the number of decoder channels. The experimental results on four datasets show that the proposed method not only ensures high detection accuracy and robustness, but also meets the real-time detection needs of patrol robots, with a detection speed above 40 frames/s.
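The abstract above does not give the form of its temperature-sensitive loss. As a minimal sketch of one way such a loss could be built, assuming a pixel-wise binary cross-entropy whose per-pixel weight grows with a normalized temperature value in [0, 1], consider the following. The function name `thermal_weighted_bce`, the `gain` parameter, and the weighting rule `1 + gain * temperature` are all hypothetical and are not taken from the TFSHD paper.

```python
import math


def thermal_weighted_bce(pred, target, thermal, gain=1.0, eps=1e-7):
    """Weighted binary cross-entropy over 2-D maps given as lists of lists.

    pred    : predicted saliency probabilities in [0, 1]
    target  : ground-truth labels (0 or 1)
    thermal : normalized temperature in [0, 1]; hotter pixels get more weight
    gain    : assumed scaling of the extra weight (gain=0 gives plain BCE)
    """
    total, weight_sum = 0.0, 0.0
    for p_row, t_row, h_row in zip(pred, target, thermal):
        for p, t, h in zip(p_row, t_row, h_row):
            p = min(max(p, eps), 1 - eps)  # avoid log(0)
            w = 1.0 + gain * h
            total += -w * (t * math.log(p) + (1 - t) * math.log(1 - p))
            weight_sum += w
    return total / weight_sum


pred = [[0.1, 0.1]]    # first pixel is a confident miss, second is correct
target = [[1, 0]]
loss_hot_miss = thermal_weighted_bce(pred, target, thermal=[[1.0, 0.0]])
loss_cold_miss = thermal_weighted_bce(pred, target, thermal=[[0.0, 1.0]])
```

With this weighting, the same prediction error costs more when it falls on a hot pixel than on a cold one, which is the qualitative behaviour the abstract describes.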