Road traffic monitoring is an important topic that is widely discussed among researchers. Systems used to monitor traffic frequently rely on cameras mounted on bridges or roadsides. However, aerial images provide the flexibility to use mobile platforms to detect the location and motion of vehicles over a larger area. To this end, different models have shown the ability to recognize and track vehicles. However, these methods are not mature enough to produce accurate results in complex road scenes. Therefore, this paper presents an algorithm that combines state-of-the-art techniques for identifying and tracking vehicles in conjunction with image bursts. The extracted frames were converted to grayscale, followed by the application of a georeferencing algorithm to embed coordinate information into the images. A masking technique eliminated irrelevant data and reduced the computational cost of the overall monitoring system. Next, Sobel edge detection combined with Canny edge detection and the Hough line transform was applied for noise reduction. After preprocessing, a blob detection algorithm was used to detect the vehicles. Vehicles of varying sizes were detected by implementing a dynamic thresholding scheme. Detection was performed on the first image of every burst. Then, to track vehicles, a model of each vehicle was matched against the succeeding images using the template matching algorithm. To further improve tracking accuracy by incorporating motion information, Scale Invariant Feature Transform (SIFT) features were used to find the best match among multiple candidates. The method achieved 87% detection accuracy and 80% tracking accuracy on the A1 Motorway Netherland dataset, and 86% detection accuracy and 78% tracking accuracy on the Vehicle Aerial Imaging from Drone (VAID) dataset.
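The template matching step mentioned in the abstract above can be illustrated with a minimal sketch. It assumes normalized cross-correlation as the similarity score and tiny list-of-lists grayscale images; the abstract does not say which score the authors used, and the names `ncc`, `match_template`, `frame`, and `vehicle` are hypothetical.

```python
import math


def ncc(patch, template):
    """Normalized cross-correlation between two equal-size 2-D lists."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    num = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    den = math.sqrt(
        sum((x - mean_a) ** 2 for x in a) * sum((y - mean_b) ** 2 for y in b)
    )
    # A flat patch has zero variance; treat it as "no correlation".
    return num / den if den else 0.0


def match_template(image, template):
    """Slide the template over the image; return (row, col, score) of the best match."""
    th, tw = len(template), len(template[0])
    best = (0, 0, -2.0)  # any real score is >= -1
    for r in range(len(image) - th + 1):
        for c in range(len(image[0]) - tw + 1):
            patch = [row[c:c + tw] for row in image[r:r + th]]
            score = ncc(patch, template)
            if score > best[2]:
                best = (r, c, score)
    return best


# Hypothetical 5x5 frame containing the 2x2 vehicle model at row 2, column 2.
frame = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 9, 5, 0],
    [0, 0, 5, 9, 0],
    [0, 0, 0, 0, 0],
]
vehicle = [[9, 5], [5, 9]]
row, col, score = match_template(frame, vehicle)
print(row, col, round(score, 3))
```

When several positions score almost equally well, a second cue is needed to pick one, which is the role the abstract assigns to SIFT features.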
Recently, there has been a notable surge of interest in scientific research regarding spectral images. The potential of these images to revolutionize the digital photography industry, like aerial photography through Unmanned Aerial Vehicles (UAVs), has captured considerable attention. One encouraging aspect is their combination with machine learning and deep learning algorithms, which have demonstrated remarkable outcomes in image classification. As a result of this powerful amalgamation, the adoption of spectral images has experienced exponential growth across various domains, with agriculture being one of the prominent beneficiaries. This paper presents an extensive survey encompassing multispectral and hyperspectral images, focusing on their applications for classification challenges in diverse agricultural areas, including plants, grains, fruits, and vegetables. By meticulously examining primary studies, we delve into the specific agricultural domains where multispectral and hyperspectral images have found practical use. Additionally, our attention is directed towards utilizing machine learning techniques for effectively classifying hyperspectral images within the agricultural context. The findings of our investigation reveal that deep learning and support vector machines have emerged as widely employed methods for hyperspectral image classification in agriculture. Nevertheless, we also shed light on the various issues and limitations of working with spectral images. This comprehensive analysis aims to provide valuable insights into the current state of spectral imaging in agriculture and its potential for future advancements.
Object detection in unmanned aerial vehicle (UAV) aerial images has become increasingly important in military and civil applications. General object detection models are not robust enough against the interclass similarity and intraclass variability of small objects, or against UAV-specific nuisances such as uncontrolled weather conditions. Unlike previous approaches focusing on high-level semantic information, we report the importance of underlying features for improving detection accuracy and robustness from the information-theoretic perspective. Specifically, we propose a robust and discriminative feature learning approach through mutual information maximization (RD-MIM), which can be integrated into numerous object detection methods for aerial images. Firstly, we present the rank sample mining method to reduce underlying feature differences between the natural image domain and the aerial image domain. Then, we design a momentum contrast learning strategy to make object features similar within the same category and dissimilar across different categories. Finally, we construct a transformer-based global attention mechanism to boost object location semantics by leveraging the high interrelation of different receptive fields. We conduct extensive experiments on the VisDrone and Unmanned Aerial Vehicle Benchmark Object Detection and Tracking (UAVDT) datasets to prove the effectiveness of the proposed method. The experimental results show that our approach brings considerable robustness gains to basic detectors and advanced detection methods, achieving relative growth rates of 51.0% and 39.4% in corruption robustness, respectively. Our code is available at https://github.com/cq100/RD-MIM (accessed on 2 August 2024).
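The idea of pulling same-category features together and pushing different-category features apart can be sketched with an InfoNCE-style contrastive loss, which is the loss commonly paired with momentum contrast. This is an assumption for illustration only: the abstract does not state the exact loss used in RD-MIM, and `cosine`, `info_nce`, the temperature value, and the toy vectors are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(query, positive, negatives, temperature=0.07):
    """InfoNCE loss: low when the query is close to the positive and far from the negatives."""
    logits = [cosine(query, positive) / temperature]
    logits += [cosine(query, n) / temperature for n in negatives]
    # Log-sum-exp with the maximum subtracted for numerical stability.
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_sum - logits[0]


# Toy 2-D "object features": the query and its same-category feature are nearly aligned.
query = [1.0, 0.0]
same_category = [0.9, 0.1]
other_categories = [[0.0, 1.0], [-1.0, 0.0]]

good = info_nce(query, same_category, other_categories)
# Swapping roles simulates a feature space where a different category sits closest.
bad = info_nce(query, other_categories[0], [same_category, other_categories[1]])
print(good < bad)
```

Minimizing such a loss is one standard way to maximize a lower bound on the mutual information between paired features, which is why it is a natural illustration for a mutual information maximization approach.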
In rice production, the prevention and management of pests and diseases have always received special attention. Traditional methods require human experts, which is costly and time-consuming. Because of the structural complexity of rice diseases and pests, recognizing and locating them quickly and reliably is difficult. Recently, deep learning (DL) technology has been employed to detect and identify rice diseases and pests. This paper introduces common publicly available datasets; summarizes the applications to rice diseases and pests in terms of image recognition, object detection, image segmentation, attention mechanisms, and few-shot learning methods, grouped according to differences in network structure; and compares the performance of existing studies. Finally, the current issues and challenges are explored from the perspective of data acquisition, data processing, and application, and possible solutions and suggestions are provided. This study aims to review various DL models and provide improved insight into DL techniques and their cutting-edge progress in the prevention and management of rice diseases and pests.
Rapid and accurate identification of potential structural deficiencies is a crucial task in evaluating the seismic vulnerability of large building inventories in a region. In the case of multi-story structures, abrupt vertical variations of story stiffness are known to significantly increase the likelihood of collapse during moderate or severe earthquakes. Identifying and retrofitting buildings with such irregularities (generally termed soft-story buildings) is, therefore, vital in earthquake preparedness and loss mitigation efforts. Soft-story building identification through conventional means is a labor-intensive and time-consuming process. In this study, an automated procedure based on deep learning techniques was devised for identifying soft-story buildings from street-view images at a regional scale. A database containing a large number of building images and a semi-automated image labeling approach that effectively annotates new database entries were developed to train the deep learning model. Extensive computational experiments were carried out to examine the effectiveness of the proposed procedure and to gain insights into automated soft-story building identification.
This study presents an innovative approach to improving the performance of the YOLO-v8 model for small object detection in radar images. Initially, a local histogram equalization technique was applied to the original images, resulting in a notable enhancement in both contrast and detail representation. Subsequently, the YOLO-v8 backbone network was augmented by incorporating convolutional kernels based on a multidimensional attention mechanism and a parallel processing strategy, which facilitated more effective feature information fusion. At the model’s head, an upsampling layer was added, along with the fusion of outputs from the shallow network and a detection head specifically tailored for small object detection, thereby further improving accuracy. Additionally, the loss function was modified to incorporate focal-intersection over union (IoU) in conjunction with scaled-IoU, which enhanced the model’s performance. A weighting strategy was also introduced, effectively improving detection accuracy for small targets. Experimental results demonstrate that the customized model outperforms traditional approaches across various evaluation metrics, including recall, precision, F1-score, and the receiver operating characteristic (ROC) curve, validating its efficacy and innovation in small object detection within radar imagery. The results indicate a substantial improvement in accuracy compared to conventional methods such as image segmentation and standard convolutional neural networks.
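The IoU term that the modified loss builds on can be sketched as follows. This shows only plain intersection over union and the basic `1 - IoU` loss; the focal and scaled variants named in the abstract are not reproduced here because the abstract does not define them. The function names and the `(x1, y1, x2, y2)` box layout are assumptions.

```python
def iou(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2) with x1 < x2 and y1 < y2."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    # Clamp at zero so that disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def iou_loss(pred, target):
    """Basic IoU loss: 0 for a perfect match, 1 for no overlap."""
    return 1.0 - iou(pred, target)


# Two 4x4 boxes overlapping in a 2x2 region: IoU = 4 / (16 + 16 - 4) = 1/7.
pred = (0, 0, 4, 4)
target = (2, 2, 6, 6)
print(iou(pred, target), iou_loss(pred, target))
```

For small targets a shift of a few pixels changes the IoU sharply, which is one reason small-object detectors often reweight or rescale this term.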
This paper introduces some of the image processing techniques developed in the Canada Research Chair in Advanced Geomatics Image Processing Laboratory (CRC-AGIP Lab) and in the Department of Geodesy and Geomatics Engineering (GGE) at the University of New Brunswick (UNB), Canada. The techniques were developed by innovatively ("smartly") utilizing the characteristics of the available very high resolution optical remote sensing images to solve important problems or create new applications in photogrammetry and remote sensing. The techniques to be introduced are: automated image fusion (UNB-PanSharp), satellite image online mapping, street view technology, moving vehicle detection using single set satellite imagery, supervised image segmentation, image matching in smooth areas, and change detection using images from different viewing angles. Because of their broad application potential, some of the techniques have made a global impact, and some have demonstrated the potential for a global impact.
Tabletop integral imaging display with a more realistic and immersive experience has always been a hot spot in three-dimensional imaging technology, widely used in biomedical imaging and visualization to enhance medical diagnosis. However, the traditional structural characteristics of integral imaging display inevitably introduce the flipping effect outside the effective viewing angle. Here, a full-parallax tabletop integral imaging display without the flipping effect, based on a space-multiplexed voxel screen and a compound lens array, is demonstrated, and two holographic functional screens with different parameters are optically designed and fabricated. To eliminate the flipping effect in the reconstruction process, the space-multiplexed voxel screen, consisting of a projector array and the holographic functional screen, is presented to constrain light beams passing through the corresponding lens. To greatly promote imaging quality within the viewing area, the aspherical structure of the compound lens is optimized to balance the aberrations. It cooperates with the holographic functional screen to modulate the light field spatial distribution. Compared with the simulation results, the distortion rate of the imaging display is reduced to less than 9% from more than 30%. In the experiment, the floating high-quality reconstructed three-dimensional image without the flipping effect can be observed with the correct 3D perception at a 96° × 96° viewing angle, where 44,100 viewpoints are employed.
The transmission of video content over a network raises various issues relating to copyright authenticity, ethics, legality, and privacy. The protection of copyrighted video content is a significant issue in the video industry, and it is essential to find effective solutions to prevent tampering with and modification of digital video content during its transmission through digital media. However, there are still many unresolved challenges. This paper aims to address those challenges by proposing a new technique for detecting moving objects in digital videos, which can help prove the credibility of video content by detecting any fake objects inserted by hackers. The proposed technique uses two methods, H.264 and color feature extraction, to embed and extract watermarks in video frames. The study tested the performance of the system against various attacks and found it to be robust. The evaluation used metrics such as Peak Signal-to-Noise Ratio (PSNR), Mean Squared Error (MSE), Structural Similarity Index Measure (SSIM), Bit Correction Ratio (BCR), and Normalized Correlation. The accuracy of identifying moving objects was high, ranging from 96.3% to 98.7%. The system was also able to embed a fragile watermark with a success rate of over 93.65% and had an average hiding capacity of 78.67. The reconstructed video frames had high quality, with a PSNR of at least 65.45 dB and an SSIM of over 0.97, making the watermark imperceptible to the human eye. The system also had an acceptable average time difference (T=1.227/s) compared with other state-of-the-art methods.
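Two of the quality metrics listed above, MSE and PSNR, follow directly from their standard definitions and can be sketched in a few lines. The sketch assumes 8-bit frames (peak value 255) stored as lists of lists; the frames and the names `mse`, `psnr`, `original`, and `watermarked` are hypothetical and not taken from the paper.

```python
import math


def mse(original, processed):
    """Mean squared error between two equal-size 2-D frames."""
    a = [v for row in original for v in row]
    b = [v for row in processed for v in row]
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def psnr(original, processed, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical frames."""
    err = mse(original, processed)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Hypothetical 2x2 frame in which watermark embedding changed two pixels by one level.
original = [[100, 100], [100, 100]]
watermarked = [[101, 100], [100, 99]]
print(mse(original, watermarked), round(psnr(original, watermarked), 2))
```

A higher PSNR means the watermarked frame is closer to the original, so a reported PSNR of at least 65.45 dB corresponds to a very small mean squared error per pixel.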
Collaborative robotics is a high-interest research topic in both academia and industry. It has been progressively utilized in numerous applications, particularly in intelligent surveillance systems. It allows the deployment of smart cameras or optical sensors with computer vision techniques, which may serve in several object detection and tracking tasks. These tasks have been considered challenging, high-level perceptual problems, frequently dominated by relative information about the environment, where concerns such as occlusion, illumination, background, object deformation, and object class variations are commonplace. To show the importance of top view surveillance, a collaborative robotics framework is presented. It can assist in the detection and tracking of multiple objects in top view surveillance. The framework consists of a smart robotic camera embedded with a visual processing unit. The existing pre-trained deep learning models SSD and YOLO have been adopted for object detection and localization. The detection models are further combined with different tracking algorithms, including GOTURN, MEDIANFLOW, TLD, KCF, MIL, and BOOSTING. These algorithms, along with the detection models, help to track and predict the trajectories of detected objects. Because pre-trained models are employed, their generalization performance is also investigated by testing them on various sequences of a top view data set. The detection models achieved maximum True Detection Rates of 90% to 93% with a maximum False Detection Rate of 0.6%. The tracking results of the different algorithms are nearly identical, with tracking accuracy ranging from 90% to 94%. Furthermore, the output results are discussed along with future guidelines.
In many image analysis and processing problems, discriminating the size and shape of each individual object in an aggregate pile projected in an image is an important task. It is relatively easy to distinguish these features among objects that are already separated from each other. The problem is considerably more complex and challenging if the objects touch and/or overlap. This letter presents an algorithm that can be used to separate touching and overlapping objects within a 2-D image. The approach is first to convert the gray-scale image to its corresponding binary image and then to a 3-D topographic surface using erosion operations. A template (or mask) is engineered to search the topographic surface for the saddle point, from which the segmenting orientation is determined, followed by the desired separating operation. The algorithm is tested on a real image, and the result is satisfactory and encouraging.
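The binary-to-topographic step can be sketched by counting how many erosions each foreground pixel survives, then looking for a saddle on the resulting surface. This is a minimal sketch under stated assumptions: a 4-neighbour structuring element, pixels outside the image treated as background, and a simple neighbour comparison standing in for the letter's engineered template, whose details the abstract does not give. All function names and the toy mask are hypothetical.

```python
def erode(mask):
    """One binary erosion with a 4-neighbour cross; outside the image counts as background."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if mask[r][c] and all(
                0 <= y < h and 0 <= x < w and mask[y][x]
                for y, x in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
            ):
                out[r][c] = 1
    return out


def topographic_surface(mask):
    """Height of a pixel = number of erosions needed to remove it (0 for background)."""
    height = [[0] * len(mask[0]) for _ in mask]
    current = [row[:] for row in mask]
    while any(any(row) for row in current):
        for r, row in enumerate(current):
            for c, v in enumerate(row):
                height[r][c] += 1 if v else 0
        current = erode(current)
    return height


def find_saddles(height):
    """Pixels that are a minimum along one axis and a maximum along the other."""
    h, w = len(height), len(height[0])

    def at(r, c):
        return height[r][c] if 0 <= r < h and 0 <= c < w else 0

    saddles = []
    for r in range(h):
        for c in range(w):
            v = height[r][c]
            if v == 0:
                continue
            up, down, left, right = at(r - 1, c), at(r + 1, c), at(r, c - 1), at(r, c + 1)
            if left > v and right > v and up < v and down < v:
                saddles.append((r, c, "cut vertically"))
            elif up > v and down > v and left < v and right < v:
                saddles.append((r, c, "cut horizontally"))
    return saddles


# Two 3x3 blobs touching through a one-pixel neck at row 2, column 4.
mask = [
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
]
surface = topographic_surface(mask)
saddles = find_saddles(surface)
print(surface[2], saddles)
```

In this toy case each blob forms a peak of height 2 and the neck stays at height 1, so the saddle marks where the separating cut should pass and in which direction.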
A virtual reality (VR) environment can provide an immersive experience to viewers. In a VR environment, providing a good quality of experience is extremely important. Therefore, in this paper, we present an image quality assessment (IQA) study on omnidirectional images. We first build an omnidirectional IQA (OIQA) database, including 16 source images with their corresponding 320 distorted images. We add four commonly encountered distortions: JPEG compression, JPEG2000 compression, Gaussian blur, and Gaussian noise. Then we conduct a subjective quality evaluation study in the VR environment based on the OIQA database. Considering that visual attention is more important in a VR environment, head and eye movement data are also tracked and collected during the quality rating experiments. The 16 raw images and their corresponding distorted images, the subjective quality assessment scores, and the head-orientation and eye-gaze data together constitute the OIQA database. Based on the OIQA database, we test some state-of-the-art full-reference IQA (FR-IQA) measures on equirectangular format or cubic format omnidirectional images. The results show that applying FR-IQA metrics on cubic format omnidirectional images could improve their performance. The performance of some FR-IQA metrics combined with three different types of saliency weights is also tested on our database. Some new phenomena that differ from traditional IQA are observed.
Light field 3D display technology is considered a revolutionary technology for addressing the critical visual fatigue issues in existing 3D displays. Tabletop light field 3D display provides a brand-new display form that satisfies multi-user shared viewing and collaborative work, and it is poised to become a potential alternative to the traditional wall and portable display forms. However, a large radial viewing angle and correct radial perspective and parallax are still out of reach for most current tabletop light field 3D displays because of the limited amount of spatial information. To address the viewing angle and perspective issues, a novel integral imaging-based tabletop light field 3D display with a simple flat-panel structure is proposed and developed by applying a compound lens array, two spliced 8K liquid crystal display panels, and a light shaping diffuser screen. The compound lens array is composed of multiple three-piece compound lens units designed with a reverse design scheme, which greatly extends the radial viewing angle given a limited amount of spatial information and balances other important 3D display parameters. The proposed display has a radial viewing angle of 68.7° at a large display size of 43.5 inches, which is larger than that of conventional tabletop light field 3D displays. The radial perspective and parallax are correct, and high-resolution 3D images can be reproduced in large radial viewing positions. We envision that this proposed display opens up possibilities for redefining the display forms of consumer electronics.
This paper presents a robust image feature that can be used to automatically establish match correspondences between aerial images of suburban areas with large view variations. Unlike most commonly used invariant image features, this feature is view variant. The geometrical structure of the feature allows predicting its visual appearance according to the observer’s view. This feature is named 2EC (2 Edges and a Corner) as it utilizes two line segments or edges and their intersection or corner. These lines are constrained to correspond to the boundaries of rooftops. The description of each feature includes the two edges’ length, their intersection, orientation, and the image patch surrounded by a parallelogram that is constructed with the two edges. Potential match candidates are obtained by comparing features, while accounting for the geometrical changes that are expected due to large view variation. Once the putative matches are obtained, the outliers are filtered out using a projective matrix optimization method. Based on the results of the optimization process, a second round of matching is conducted within a more confined search space, which leads to more accurate match establishment. We demonstrate how establishing match correspondences using these features leads to more accurate camera parameters and fundamental matrix, and therefore to more accurate image registration and 3D reconstruction.
Cone photoreceptor cell identification is important for the early diagnosis of retinopathy. In this study, an object detection algorithm is used for cone cell identification in confocal adaptive optics scanning laser ophthalmoscope (AOSLO) images. An effectiveness evaluation of identification using the proposed method reveals a precision, recall, and F1-score of 95.8%, 96.5%, and 96.1%, respectively, with manual identification as the ground truth. Various object detection and identification results from images with different cone photoreceptor cell distributions further demonstrate the performance of the proposed method. Overall, the proposed method can accurately identify cone photoreceptor cells in confocal AOSLO images, with performance comparable to manual identification.
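The three reported metrics are linked by the standard definition of the F1-score as the harmonic mean of precision and recall, so the figures in the abstract can be checked against each other. The sketch below uses only that definition; the function names and the example counts passed to `detection_metrics` are hypothetical.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


def detection_metrics(tp, fp, fn):
    """Precision, recall, and F1 from true positive, false positive, and false negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall, f1_score(precision, recall)


# Consistency check on the reported figures: precision 95.8%, recall 96.5%.
reported_f1 = f1_score(0.958, 0.965)
print(f"{reported_f1 * 100:.1f}%")  # 96.1%
```

The computed value rounds to 96.1%, which agrees with the F1-score stated in the abstract.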
A femtosecond optical Kerr gate time-gated ballistic imaging method is demonstrated to image a transparent object in a turbid medium. The shape features of the object are obtained by time-resolved selection of the ballistic photons with different optical path lengths, the thickness distribution of the object is mapped, and the maximum is less than 3.6%. This time-resolved ballistic imaging has potential applications in studying properties of the liquid core in the near field of the fuel spray.
Measurement of vegetation coverage on a small scale is the foundation for the monitoring of changes in vegetation coverage and of the inversion model of monitoring vegetation coverage on a large scale by remote sensing. Using the object-oriented analytical software, Definiens Professional 5, a new method for calculating vegetation coverage based on high-resolution images (aerial photographs or near-surface photography) is proposed. Our research supplies references to remote sensing measurements of vegetation coverage on a small scale and accurate fundamental data for the inversion model of vegetation coverage on a large and intermediate scale to improve the accuracy of remote sensing monitoring of changes in vegetation coverage.
A new method of view synthesis based on Delaunay triangulation is proposed. First, the Delaunay triangulation of two reference images is constructed. Second, the image points are matched using the epipolar geometry constraint. Finally, the third view is constructed by pixel transfer under the trilinear constraint. The method avoids the classic, time-consuming dense matching technique and takes advantage of Delaunay triangulation, so it can both save computation time and enhance the quality of the synthesized view. The significance of this method is that it can be used directly in the fields of video coding, image compression, and virtual reality.
Funding (road traffic monitoring study): Supported by a grant from the Basic Science Research Program through the National Research Foundation (NRF) (2021R1F1A1063634), funded by the Ministry of Science and ICT (MSIT), Republic of Korea. The authors are thankful to the Deanship of Scientific Research at Najran University for funding this work under the Research Group Funding Program, Grant Code (NU/RG/SERC/13/40). Also, the authors are thankful to Prince Satam bin Abdulaziz University for supporting this study via funding from Prince Satam bin Abdulaziz University, project number (PSAU/2024/R/1445). This work was also supported by Princess Nourah bint Abdulrahman University Researchers Supporting Project Number (PNURSP2023R54), Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
Funding (UAV object detection study): Supported by the National Natural Science Foundation of China under Grant 61671219.
Funding: Funded by the Hunan Provincial Natural Science Foundation of China (Grant Numbers 2022JJ50016 and 2023JJ50096); the Innovation Platform Open Fund of Hengyang Normal University (Grant 2021HSKFJJ039); and the Hengyang Science and Technology Plan Guiding Project (Number 202222025902).
Abstract: In rice production, the prevention and management of pests and diseases have always received special attention. Traditional methods require human experts, which is costly and time-consuming. Due to the complexity of the structure of rice diseases and pests, quickly and reliably recognizing and locating them is difficult. Recently, deep learning technology has been employed to detect and identify rice diseases and pests. This paper introduces common publicly available datasets; summarizes the applications on rice diseases and pests from the aspects of image recognition, object detection, image segmentation, attention mechanism, and few-shot learning methods according to the network structure differences; and compares the performances of existing studies. Finally, the current issues and challenges are explored from the perspective of data acquisition, data processing, and application, providing possible solutions and suggestions. This study aims to review various DL models and provide improved insight into DL techniques and their cutting-edge progress in the prevention and management of rice diseases and pests.
Funding: Supported by the US National Science Foundation under Grant No. 1612843; NHERI Design Safe (Rathje et al., 2017); Texas Advanced Computing Center (TACC).
Abstract: Rapid and accurate identification of potential structural deficiencies is a crucial task in evaluating seismic vulnerability of large building inventories in a region. In the case of multi-story structures, abrupt vertical variations of story stiffness are known to significantly increase the likelihood of collapse during moderate or severe earthquakes. Identifying and retrofitting buildings with such irregularities (generally termed soft-story buildings) is, therefore, vital in earthquake preparedness and loss mitigation efforts. Soft-story building identification through conventional means is a labor-intensive and time-consuming process. In this study, an automated procedure was devised based on deep learning techniques for identifying soft-story buildings from street-view images at a regional scale. A database containing a large number of building images and a semi-automated image labeling approach that effectively annotates new database entries were developed for building the deep learning model. Extensive computational experiments were carried out to examine the effectiveness of the proposed procedure, and to gain insights into automated soft-story building identification.
Funding: Supported by the National Natural Science Foundation of China Joint Fund (No. U21B2028); the National Key R&D Program of China (No. 2021YFC2100100); the Shanghai Science and Technology Project (Nos. 21JC1403400, 23JC1402300).
Abstract: This study presents an approach to improving the performance of the YOLO-v8 model for small object detection in radar images. Initially, a local histogram equalization technique was applied to the original images, resulting in a notable enhancement in both contrast and detail representation. Subsequently, the YOLO-v8 backbone network was augmented by incorporating convolutional kernels based on a multidimensional attention mechanism and a parallel processing strategy, which facilitated more effective feature information fusion. At the model’s head, an upsampling layer was added, along with the fusion of outputs from the shallow network, and a detection head specifically tailored for small object detection, thereby further improving accuracy. Additionally, the loss function was modified to incorporate focal-intersection over union (IoU) in conjunction with scaled-IoU, which enhanced the model’s performance. A weighting strategy was also introduced, effectively improving detection accuracy for small targets. Experimental results demonstrate that the customized model outperforms traditional approaches across various evaluation metrics, including recall, precision, F1-score, and the receiver operating characteristic (ROC) curve, validating its efficacy in small object detection within radar imagery. The results indicate a substantial improvement in accuracy compared to conventional methods such as image segmentation and standard convolutional neural networks.
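The focal-IoU and scaled-IoU loss terms mentioned in this abstract both build on the plain intersection-over-union overlap measure. The abstract does not give their formulas, so the sketch below shows only the basic IoU; the function name `iou` and the `(x1, y1, x2, y2)` box layout are assumptions made for illustration, not the authors' implementation.

```python
def iou(box_a, box_b):
    """Plain intersection over union for axis-aligned boxes (x1, y1, x2, y2).

    Illustrative helper only; the focal and scaled variants named in the
    abstract are not reproduced here.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlap rectangle, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0


# A one-pixel shift costs a small box far more overlap than a large one.
small = iou((0, 0, 4, 4), (1, 0, 5, 4))      # 0.6
large = iou((0, 0, 40, 40), (1, 0, 41, 40))  # about 0.95
```

The comparison at the end hints at why small targets may need a dedicated loss and detection head: the same localization error lowers IoU much more for a small box than for a large one.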
Abstract: This paper introduces some of the image processing techniques developed in the Canada Research Chair in Advanced Geomatics Image Processing Laboratory (CRC-AGIP Lab) and in the Department of Geodesy and Geomatics Engineering (GGE) at the University of New Brunswick (UNB), Canada. The techniques were developed by innovatively/“smartly” utilizing the characteristics of the available very high resolution optical remote sensing images to solve important problems or create new applications in photogrammetry and remote sensing. The techniques to be introduced are: automated image fusion (UNB-PanSharp), satellite image online mapping, street view technology, moving vehicle detection using single set satellite imagery, supervised image segmentation, image matching in smooth areas, and change detection using images from different viewing angles. Because of their broad application potential, some of the techniques have made a global impact, and some have demonstrated the potential for a global impact.
Funding: The Basic Research Fund of Central-Level Nonprofit Scientific Research Institutes (No. TKS20220304); the Key Research and Development Projects of Guangxi Science and Technology Department (No. 2021AB05087).
Abstract: Tabletop integral imaging display with a more realistic and immersive experience has always been a hot spot in three-dimensional imaging technology, widely used in biomedical imaging and visualization to enhance medical diagnosis. However, the traditional structural characteristics of integral imaging display inevitably introduce the flipping effect outside the effective viewing angle. Here, a full-parallax tabletop integral imaging display without the flipping effect based on a space-multiplexed voxel screen and a compound lens array is demonstrated, and two holographic functional screens with different parameters are optically designed and fabricated. To eliminate the flipping effect in the reconstruction process, the space-multiplexed voxel screen consisting of a projector array and the holographic functional screen is presented to constrain light beams passing through the corresponding lens. To greatly improve imaging quality within the viewing area, the aspherical structure of the compound lens is optimized to balance the aberrations. It cooperates with the holographic functional screen to modulate the light field spatial distribution. Compared with the simulation results, the distortion rate of the imaging display is reduced to less than 9% from more than 30%. In the experiment, the floating high-quality reconstructed three-dimensional image without the flipping effect can be observed with the correct 3D perception at a 96°×96° viewing angle, where 44,100 viewpoints are employed.
Abstract: The transmission of video content over a network raises various issues relating to copyright authenticity, ethics, legality, and privacy. The protection of copyrighted video content is a significant issue in the video industry, and it is essential to find effective solutions to prevent tampering and modification of digital video content during its transmission through digital media. However, there are still many unresolved challenges. This paper aims to address those challenges by proposing a new technique for detecting moving objects in digital videos, which can help prove the credibility of video content by detecting any fake objects inserted by hackers. The proposed technique involves using two methods, the H.264 and the extraction color features methods, to embed and extract watermarks in video frames. The study tested the performance of the system against various attacks and found it to be robust. The evaluation was done using different metrics such as Peak-Signal-to-Noise Ratio (PSNR), Mean Squared Error (MSE), Structural Similarity Index Measure (SSIM), Bit Correction Ratio (BCR), and Normalized Correlation. The accuracy of identifying moving objects was high, ranging from 96.3% to 98.7%. The system was also able to embed a fragile watermark with a success rate of over 93.65% and had an average hiding capacity of 78.67. The reconstructed video frames had high quality with a PSNR of at least 65.45 dB and SSIM of over 0.97, making the changes imperceptible to the human eye. The system also had an acceptable average time difference (T=1.227/s) compared with other state-of-the-art methods.
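Two of the metrics listed in this abstract, MSE and PSNR, have standard definitions that can be sketched briefly. The example below assumes 8-bit grayscale frames stored as nested lists; the function names `mse` and `psnr` and the tiny frames are made up for illustration and do not come from the paper.

```python
import math


def mse(frame_a, frame_b):
    """Mean squared error between two equally sized grayscale frames."""
    total = 0
    count = 0
    for row_a, row_b in zip(frame_a, frame_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    return total / count


def psnr(frame_a, frame_b, peak=255):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit samples."""
    err = mse(frame_a, frame_b)
    if err == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / err)


# Hypothetical 2x2 frame before and after embedding a watermark.
original = [[10, 20], [30, 40]]
watermarked = [[11, 19], [31, 39]]
quality_db = psnr(original, watermarked)
```

Because PSNR grows as MSE shrinks, a higher value means the watermarked frame stays closer to the original.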
Funding: The Framework of International Cooperation Program managed by the National Research Foundation of Korea (2019K1A3A1A8011295711).
Abstract: Collaborative robotics is one of the high-interest research topics in academia and industry. It has been progressively utilized in numerous applications, particularly in intelligent surveillance systems. It allows the deployment of smart cameras or optical sensors with computer vision techniques, which may serve in several object detection and tracking tasks. These tasks have been considered challenging and high-level perceptual problems, frequently dominated by relative information about the environment, where main concerns such as occlusion, illumination, background, object deformation, and object class variations are commonplace. In order to show the importance of top view surveillance, a collaborative robotics framework has been presented. It can assist in the detection and tracking of multiple objects in top view surveillance. The framework consists of a smart robotic camera embedded with the visual processing unit. The existing pre-trained deep learning models named SSD and YOLO have been adopted for object detection and localization. The detection models are further combined with different tracking algorithms, including GOTURN, MEDIANFLOW, TLD, KCF, MIL, and BOOSTING. These algorithms, along with detection models, help to track and predict the trajectories of detected objects. Because pre-trained models are employed, the generalization performance is also investigated by testing the models on various sequences of a top view data set. The detection models achieved a maximum True Detection Rate of 90% to 93% with a maximum False Detection Rate of 0.6%. The tracking results of different algorithms are nearly identical, with tracking accuracy ranging from 90% to 94%. Furthermore, the output results are discussed along with future guidelines.
Funding: Supported by the Scientific Research Start-up Foundation of Ningbo University (No. 2004037); Zhejiang Provincial Foundation for Returned Overseas Students and Scholars (No. 2004884).
Abstract: In many image analysis and processing problems, discriminating the size and shape of each individual object in an aggregate pile projected in an image is an important practice. It is relatively easy to distinguish these features among objects already separated from each other. The problem is more complex and challenging if the objects are touching and/or overlapping. This letter presents an algorithm that can be used to separate the touches and overlaps existing among the objects within a 2-D image. The approach is first to convert the gray-scale image to its corresponding binary one and then to a 3-D topographic one using erosion operations. A template (or mask) is engineered to search the topographic surface for the saddle point, from which the segmenting orientation is determined, followed by the desired separating operation. The algorithm is tested on a real image, and the result is satisfactory and encouraging.
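The binary-to-topographic step described in this abstract can be sketched as repeated erosion, recording for each pixel how many erosions it survives. The version below is a minimal illustration assuming a 4-neighbour structuring element and a nested-list binary image; the name `erosion_depth` is hypothetical, and the letter's actual template and saddle-point search are not reproduced.

```python
def erosion_depth(binary):
    """Map each foreground pixel to the number of erosion levels it belongs to.

    Assumes a 4-neighbour erosion where pixels outside the image count as
    background. Higher values lie deeper inside an object, so two touching
    objects appear as two peaks joined by a lower saddle.
    """
    h, w = len(binary), len(binary[0])
    current = [row[:] for row in binary]
    depth = [[0] * w for _ in range(h)]
    level = 0
    while any(any(row) for row in current):
        level += 1
        # Every pixel still present is at least this deep.
        for y in range(h):
            for x in range(w):
                if current[y][x]:
                    depth[y][x] = level
        # Erode: keep a pixel only if all four neighbours are foreground.
        eroded = [[0] * w for _ in range(h)]
        for y in range(h):
            for x in range(w):
                if current[y][x] and all(
                    0 <= y + dy < h and 0 <= x + dx < w and current[y + dy][x + dx]
                    for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1))
                ):
                    eroded[y][x] = 1
        current = eroded
    return depth


# Two 3x5 blobs touching through a single pixel in the middle row.
touching = [
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
    [1, 1, 1, 0, 1, 1, 1],
]
surface = erosion_depth(touching)
```

In `surface`, the bridging pixel in the middle row keeps the lowest foreground value while the interiors of both blobs rise above it, which is the kind of saddle a separating cut would be placed through.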
Abstract: A virtual reality (VR) environment can provide an immersive experience to viewers. In a VR environment, providing a good quality of experience is extremely important. Therefore, in this paper, we present an image quality assessment (IQA) study on omnidirectional images. We first build an omnidirectional IQA (OIQA) database, including 16 source images with their corresponding 320 distorted images. We add four commonly encountered distortions: JPEG compression, JPEG2000 compression, Gaussian blur, and Gaussian noise. Then we conduct a subjective quality evaluation study in the VR environment based on the OIQA database. Considering that visual attention is more important in a VR environment, head and eye movement data are also tracked and collected during the quality rating experiments. The 16 raw images and their corresponding distorted images, subjective quality assessment scores, and the head-orientation data and eye-gaze data together constitute the OIQA database. Based on the OIQA database, we test some state-of-the-art full-reference IQA (FR-IQA) measures on equirectangular format or cubic format omnidirectional images. The results show that applying FR-IQA metrics on cubic format omnidirectional images could improve their performance. The performance of some FR-IQA metrics combined with saliency weights of three different types is also tested based on our database. Some new phenomena different from traditional IQA are observed.
Funding: We are grateful for financial support from the National Key R&D Program of China (Grant No. 2021YFB2802300) and the National Natural Science Foundation of China (Grant Nos. 62105014, 62105016, and 62020106010).
Abstract: Light field 3D display technology is considered a revolutionary technology to address the critical visual fatigue issues in existing 3D displays. Tabletop light field 3D display provides a brand-new display form that satisfies multi-user shared viewing and collaborative work, and it is poised to become a potential alternative to the traditional wall and portable display forms. However, a large radial viewing angle and correct radial perspective and parallax are still out of reach for most current tabletop light field 3D displays due to the limited amount of spatial information. To address the viewing angle and perspective issues, a novel integral imaging-based tabletop light field 3D display with a simple flat-panel structure is proposed and developed by applying a compound lens array, two spliced 8K liquid crystal display panels, and a light shaping diffuser screen. The compound lens array is designed to be composed of multiple three-piece compound lens units by employing a reverse design scheme, which greatly extends the radial viewing angle in the case of a limited amount of spatial information and balances other important 3D display parameters. The proposed display has a radial viewing angle of 68.7° in a large display size of 43.5 inches, which is larger than that of conventional tabletop light field 3D displays. The radial perspective and parallax are correct, and high-resolution 3D images can be reproduced in large radial viewing positions. We envision that this proposed display opens up the possibility of redefining the display forms of consumer electronics.
Abstract: This paper presents a robust image feature that can be used to automatically establish match correspondences between aerial images of suburban areas with large view variations. Unlike most commonly used invariant image features, this feature is view variant. The geometrical structure of the feature allows predicting its visual appearance according to the observer’s view. This feature is named 2EC (2 Edges and a Corner) as it utilizes two line segments or edges and their intersection or corner. These lines are constrained to correspond to the boundaries of rooftops. The description of each feature includes the two edges’ length, their intersection, orientation, and the image patch surrounded by a parallelogram that is constructed with the two edges. Potential match candidates are obtained by comparing features, while accounting for the geometrical changes that are expected due to large view variation. Once the putative matches are obtained, the outliers are filtered out using a projective matrix optimization method. Based on the results of the optimization process, a second round of matching is conducted within a more confined search space that leads to a more accurate match establishment. We demonstrate how establishing match correspondences using these features leads to computing more accurate camera parameters and fundamental matrix and therefore more accurate image registration and 3D reconstruction.
Funding: The Natural Science Foundation of Jiangsu Province (BK20200214); National Key R&D Program of China (2017YFB0403701); Jiangsu Province Key R&D Program (BE2019682 and BE2018667); National Natural Science Foundation of China (61605210, 61675226, and 62075235); Youth Innovation Promotion Association of Chinese Academy of Sciences (2019320); Frontier Science Research Project of the Chinese Academy of Sciences (QYZDB-SSW-JSC03); Strategic Priority Research Program of the Chinese Academy of Sciences (XDB02060000); and Entrepreneurship and Innovation Talents in Jiangsu Province (Innovation of Scientific Research Institutes).
Abstract: Cone photoreceptor cell identification is important for the early diagnosis of retinopathy. In this study, an object detection algorithm is used for cone cell identification in confocal adaptive optics scanning laser ophthalmoscope (AOSLO) images. An effectiveness evaluation of identification using the proposed method reveals precision, recall, and F1-score of 95.8%, 96.5%, and 96.1%, respectively, considering manual identification as the ground truth. Various object detection and identification results from images with different cone photoreceptor cell distributions further demonstrate the performance of the proposed method. Overall, the proposed method can accurately identify cone photoreceptor cells on confocal adaptive optics scanning laser ophthalmoscope images, being comparable to manual identification.
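The three figures reported in this abstract are linked by the standard definition of the F1-score as the harmonic mean of precision and recall. The short sketch below checks that relationship; the function names and the example counts passed to `precision_recall` are illustrative assumptions, not data from the study.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, false-negative counts."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Plugging in the reported precision (95.8%) and recall (96.5%).
reported_f1 = f1_score(0.958, 0.965)  # about 0.961, matching the reported 96.1%
```

The harmonic mean of 95.8% and 96.5% rounds to 96.1%, so the reported F1-score is consistent with the reported precision and recall.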
Funding: Supported by the National Natural Science Foundation of China under Grant Nos 61427816 and 61690221, and the Collaborative Innovation Center of Suzhou Nano Science and Technology.
Abstract: A femtosecond optical Kerr gate time-gated ballistic imaging method is demonstrated to image a transparent object in a turbid medium. The shape features of the object are obtained by time-resolved selection of the ballistic photons with different optical path lengths, the thickness distribution of the object is mapped, and the maximum is less than 3.6%. This time-resolved ballistic imaging has potential applications in studying properties of the liquid core in the near field of the fuel spray.
Funding: Funded by the National Natural Science Foundation of China (Grant No. 40571029).
Abstract: Measurement of vegetation coverage on a small scale is the foundation for monitoring changes in vegetation coverage and for the inversion model used to monitor vegetation coverage on a large scale by remote sensing. Using the object-oriented analytical software Definiens Professional 5, a new method for calculating vegetation coverage based on high-resolution images (aerial photographs or near-surface photography) is proposed. Our research supplies references for remote sensing measurements of vegetation coverage on a small scale and accurate fundamental data for the inversion model of vegetation coverage on a large and intermediate scale to improve the accuracy of remote sensing monitoring of changes in vegetation coverage.
Abstract: A new method of view synthesis is proposed based on Delaunay triangulation. The first step of this method is to compute the Delaunay triangulation of two reference images. The second step is to match the image points using the epipolar geometry constraint. Finally, the third view is constructed by pixel transfer under the trilinear constraint. The method avoids the classic, time-consuming dense matching technique and takes advantage of Delaunay triangulation, so it can both save computation time and enhance the quality of the synthesized view. The significance of this method is that it can be used directly in the fields of video coding, image compression, and virtual reality.