Geological discontinuity (GD) plays a pivotal role in determining the catastrophic mechanical failure of jointed rock masses. Accurate and efficient acquisition of GD networks is essential for characterizing and understanding the progressive damage mechanisms of slopes based on monitoring image data. Inspired by recent advances in computer vision, deep learning (DL) models have been widely utilized for image-based fracture identification. The multi-scale characteristics, image resolution and annotation quality of images will cause a scale-space effect (SSE) that makes features indistinguishable from noise, directly affecting the accuracy. However, this effect has not received adequate attention. Herein, we try to address this gap by collecting slope images at various proportional scales and constructing multi-scale datasets using image processing techniques. Next, we quantify the intensity of feature signals using metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). Combining these metrics with the scale-space theory, we investigate the influence of the SSE on the differentiation of multi-scale features and the accuracy of recognition. It is found that augmenting the image's detail capacity does not always yield benefits for vision-based recognition models. In light of these observations, we propose a scale hybridization approach based on the diffusion mechanism of scale-space representation. The results show that scale hybridization strengthens the tolerance of multi-scale feature recognition under complex environmental noise interference and significantly enhances the recognition accuracy of GD. It also facilitates the objective understanding, description and analysis of the rock behavior and stability of slopes from the perspective of image data.
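The PSNR metric named in this abstract can be illustrated with a minimal sketch. It assumes 8-bit grayscale images stored as nested lists; the function name `psnr` and the `max_value` parameter are illustrative choices, not taken from the paper.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images.

    Images are nested lists of pixel intensities (an assumption of this sketch).
    """
    ref_pixels = [p for row in reference for p in row]
    test_pixels = [p for row in test for p in row]
    if len(ref_pixels) != len(test_pixels):
        raise ValueError("images must contain the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_pixels, test_pixels)) / len(ref_pixels)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the test image deviates less from the reference; comparing a rescaled image against its original is one plausible way such a metric could be applied across scales.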
In recent times, an image enhancement approach, which learns the global transformation function using deep neural networks, has gained attention. However, many existing methods based on this approach have a limitation: their transformation functions are too simple to imitate complex colour transformations between low-quality images and manually retouched high-quality images. In order to address this limitation, a simple yet effective approach for image enhancement is proposed. The proposed algorithm is based on a channel-wise intensity transformation; however, this transformation is applied in a learnt embedding space instead of a specific colour space, and the enhanced features are then returned to colours. To this end, the authors define the continuous intensity transformation (CIT) to describe the mapping between input and output intensities on the embedding space. Then, the enhancement network is developed, which produces multi-scale feature maps from input images, derives the set of transformation functions, and performs the CIT to obtain enhanced images. Extensive experiments on the MIT-Adobe 5K dataset demonstrate that the authors' approach improves the performance of conventional intensity transforms on colour space metrics. Specifically, the authors achieved a 3.8% improvement in peak signal-to-noise ratio, a 1.8% improvement in structural similarity index measure, and a 27.5% improvement in learned perceptual image patch similarity. Also, the authors' algorithm outperforms state-of-the-art alternatives on three image enhancement datasets: MIT-Adobe 5K, Low-Light, and Google HDR+.
We have developed a novel method for co-adding multiple under-sampled images that combines the iteratively reweighted least squares and divide-and-conquer algorithms. Our approach not only allows for the anti-aliasing of the images but also enables Point-Spread Function (PSF) deconvolution, resulting in enhanced restoration of extended sources, the highest peak signal-to-noise ratio, and reduced ringing artefacts. To test our method, we conducted numerical simulations that replicated observation runs of the China Space Station Telescope/the VLT Survey Telescope (VST) and compared our results to those obtained using previous algorithms. The simulation showed that our method outperforms previous approaches in several ways, such as restoring the profile of extended sources and minimizing ringing artefacts. Additionally, because our method relies on the inherent advantages of least squares fitting, it is more versatile and does not depend on the local uniformity hypothesis for the PSF. However, the new method consumes much more computation than the other approaches.
Algal blooms, the spread of algae on the surface of water bodies, have adverse effects not only on aquatic ecosystems but also on human life. The adverse effects of harmful algal blooms (HABs) necessitate a convenient solution for detection and monitoring. Unmanned aerial vehicles (UAVs) have recently emerged as a tool for algal bloom detection, efficiently providing on-demand images at high spatiotemporal resolutions. This study developed an image processing method for algal bloom area estimation from the aerial images (obtained from the internet) captured using UAVs. As a remote sensing method of HAB detection, analysis, and monitoring, a combination of histogram and texture analyses was used to efficiently estimate the area of HABs. Statistical features like entropy (using the Kullback-Leibler method) were emphasized with the aid of a gray-level co-occurrence matrix. The results showed that the orthogonal images demonstrated fewer errors, and the morphological filter best detected algal blooms in real time, with a precision of 80%. This study provided efficient image processing approaches using on-board UAVs for HAB monitoring.
As a branch of quantum image processing, quantum image scaling has been widely studied. However, most of the existing quantum image scaling algorithms are based on nearest-neighbor interpolation and bilinear interpolation; the quantum version of bicubic interpolation has not yet been studied. In this work, we present the first quantum image scaling scheme for bicubic interpolation based on the novel enhanced quantum representation (NEQR). Our scheme can realize synchronous enlargement and reduction of an image of size 2^n×2^n by an integral multiple. Firstly, the image is represented by NEQR and the original image coordinates are obtained through multiple CNOT modules. Then, 16 neighborhood pixels are obtained by quantum operation circuits, and the corresponding weights of these pixels are calculated by quantum arithmetic modules. Finally, a quantum matrix operation, instead of a classical convolution operation, is used to realize the sum of convolution of these pixels. Through simulation experiments and complexity analysis, we demonstrate that our scheme achieves exponential speedup over the classical bicubic interpolation algorithm, and has a better effect than the quantum version of bilinear interpolation.
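For readers unfamiliar with the classical baseline, the weighting of the 16 neighbouring pixels can be sketched as follows. This is the classical cubic-convolution (Keys) kernel with the commonly used parameter a = -0.5, which is an assumption of this sketch; the abstract does not state which kernel the quantum circuit implements, and the function names are illustrative.

```python
def cubic_kernel(x, a=-0.5):
    """Keys cubic-convolution kernel; a = -0.5 is a common choice (assumed here)."""
    x = abs(x)
    if x <= 1.0:
        return (a + 2.0) * x ** 3 - (a + 3.0) * x ** 2 + 1.0
    if x < 2.0:
        return a * x ** 3 - 5.0 * a * x ** 2 + 8.0 * a * x - 4.0 * a
    return 0.0


def bicubic_weights(t, a=-0.5):
    """Weights of the four neighbours at offsets -1, 0, 1, 2 for a fraction t in [0, 1)."""
    return [
        cubic_kernel(t + 1.0, a),
        cubic_kernel(t, a),
        cubic_kernel(1.0 - t, a),
        cubic_kernel(2.0 - t, a),
    ]


def bicubic_sample(patch, tx, ty, a=-0.5):
    """Interpolate inside a 4x4 patch; (tx, ty) is the offset from patch[1][1]."""
    wx = bicubic_weights(tx, a)
    wy = bicubic_weights(ty, a)
    # Separable weighted sum over the 16 neighbourhood pixels.
    return sum(wy[r] * wx[c] * patch[r][c] for r in range(4) for c in range(4))
```

The four weights along each axis sum to one, so a constant patch is reproduced exactly, and a linear ramp is interpolated without error.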
Oscillation detection has been a hot research topic in industries due to the high incidence of oscillation loops and their negative impact on plant profitability. Although numerous automatic detection techniques have been proposed, most of them can only address part of the practical difficulties. An oscillation is heuristically defined as a visually apparent periodic variation. However, manual visual inspection is labor-intensive and prone to missed detection. Convolutional neural networks (CNNs), inspired by animal visual systems, have emerged with powerful feature extraction capabilities. In this work, an exploration of the typical CNN models for visual oscillation detection is performed. Specifically, we tested MobileNet-V1, ShuffleNet-V2, EfficientNet-B0, and GhostNet models, and found that such a visual framework is well-suited for oscillation detection. The feasibility and validity of this framework are verified utilizing extensive numerical and industrial cases. Compared with state-of-the-art oscillation detectors, the suggested framework is more straightforward and more robust to noise and mean-nonstationarity. In addition, this framework generalizes well and is capable of handling features that are not present in the training data, such as multiple oscillations and outliers.
Diagnosing various diseases such as glaucoma, age-related macular degeneration, cardiovascular conditions, and diabetic retinopathy involves segmenting retinal blood vessels. The task is particularly challenging when dealing with color fundus images due to issues like non-uniform illumination, low contrast, and variations in vessel appearance, especially in the presence of different pathologies. Furthermore, the speed of the retinal vessel segmentation system is of utmost importance. With the surge of now available big data, the speed of the algorithm becomes increasingly important, carrying almost equivalent weight to the accuracy of the algorithm. To address these challenges, we present a novel approach for retinal vessel segmentation, leveraging efficient and robust techniques based on multiscale line detection and mathematical morphology. Our algorithm's performance is evaluated on two publicly available datasets, namely the Digital Retinal Images for Vessel Extraction dataset (DRIVE) and the Structure Analysis of Retina (STARE) dataset. The experimental results demonstrate the effectiveness of our method, with mean accuracy values of 0.9467 for DRIVE and 0.9535 for STARE datasets, as well as sensitivity values of 0.6952 for DRIVE and 0.6809 for STARE datasets. Notably, our algorithm exhibits competitive performance with state-of-the-art methods. Importantly, it operates at an average speed of 3.73 s per image for DRIVE and 3.75 s for STARE datasets. It is worth noting that these results were achieved using Matlab scripts containing multiple loops. This suggests that the processing time can be further reduced by replacing loops with vectorization. Thus the proposed algorithm can be deployed in real-time applications. In summary, our proposed system strikes a fine balance between swift computation and accuracy that is on par with the best available methods in the field.
As a part of quantum image processing, quantum image filtering is a crucial technology in the development of quantum computing. Low-pass filtering can effectively achieve anti-aliasing effects on images. Currently, most quantum image filtering methods are based on classical domains and grayscale images, and there are relatively few studies on anti-aliasing in the quantum domain. This paper proposes a scheme for anti-aliasing filtering based on quantum grayscale and color image scaling in the spatial domain. It achieves the effect of anti-aliasing filtering on quantum images during the scaling process. First, we use the novel enhanced quantum representation (NEQR) and the improved quantum representation of color images (INCQI) to represent classical images. Since aliasing phenomena are more pronounced when images are scaled down, this paper focuses only on the anti-aliasing effects in the case of reduction. Subsequently, we perform anti-aliasing filtering on the quantum representation of the original image and then use bilinear interpolation to scale down the image, achieving the anti-aliasing effect. The constructed pyramid model is then used to select an appropriate image for upscaling to the original image size. Finally, the complexity of the circuit is analyzed. Compared to the images experiencing aliasing effects solely due to scaling, applying anti-aliasing filtering to the images results in smoother and clearer outputs. Additionally, the anti-aliasing filtering allows for manual intervention to select the desired level of image smoothness.
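Why low-pass filtering before reduction matters can be seen in a small classical sketch. It assumes a grayscale image stored as nested lists and uses a simple box (block-average) filter as the low-pass step, which is a stand-in for illustration and not the paper's quantum filter or its bilinear scaling; the function names are hypothetical.

```python
def decimate(image, factor=2):
    """Naive reduction: keep every `factor`-th pixel, with no filtering."""
    return [row[::factor] for row in image[::factor]]


def box_downscale(image, factor=2):
    """Low-pass (block average) and reduce in one step."""
    height = len(image) // factor
    width = len(image[0]) // factor
    out = []
    for i in range(height):
        row = []
        for j in range(width):
            block = [
                image[i * factor + di][j * factor + dj]
                for di in range(factor)
                for dj in range(factor)
            ]
            row.append(sum(block) / len(block))
        out.append(row)
    return out


# A one-pixel checkerboard is the highest spatial frequency an image can hold.
checker = [[255 * ((r + c) % 2) for c in range(4)] for r in range(4)]
aliased = decimate(checker)        # every kept pixel is black: the pattern is lost
smoothed = box_downscale(checker)  # every output pixel is mid-gray (127.5)
```

Naive decimation turns the checkerboard into a uniformly black image, a classic aliasing artefact, while averaging first yields the mid-gray a viewer would perceive at the smaller size.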
The mechanical properties and failure mechanism of lightweight aggregate concrete (LWAC) are a hot topic in the engineering field, and the relationship between its microstructure and macroscopic mechanical properties is also a frontier research topic in the academic field. In this study, image processing technology is used to establish a micro-structure model of lightweight aggregate concrete. Through the information extraction and processing of the section images of actual lightweight aggregate concrete specimens, a mesostructural model of lightweight aggregate concrete with real aggregate characteristics is established. The numerical simulations of the uniaxial tensile test, uniaxial compression test and three-point bending test of lightweight aggregate concrete are carried out using a new finite element method, the base force element method. Firstly, image processing technology is used to produce beam specimens, uniaxial compression specimens and uniaxial tensile specimens of lightweight aggregate concrete, which can better simulate the aggregate shape and random distribution of real lightweight aggregate concrete. Secondly, the three-point bending test is numerically simulated. Thirdly, the uniaxial compression specimen generated by image processing technology is numerically simulated. Fourthly, the uniaxial tensile specimen generated by image processing technology is numerically simulated. The mechanical behavior and damage mode of the specimens during loading are analyzed. The results of the numerical simulation are compared with those of relevant experiments. The feasibility and correctness of the micromodel established in this study for analyzing the micromechanics of lightweight aggregate concrete materials are verified. Image processing technology has broad application prospects in the field of concrete mesoscopic damage analysis.
Semantic segmentation of driving scene images is crucial for autonomous driving. While deep learning technology has significantly improved daytime image semantic segmentation, nighttime images pose challenges due to factors like poor lighting and overexposure, making it difficult to recognize small objects. To address this, we propose an Image Adaptive Enhancement (IAEN) module comprising a parameter predictor (Edip), multiple image processing filters (Mdif), and a Detail Processing Module (DPM). Edip combines image processing filters to predict parameters like exposure and hue, optimizing image quality. We adopt a novel image encoder to enhance parameter prediction accuracy by enabling Edip to handle features at different scales. DPM strengthens overlooked image details, extending the IAEN module's functionality. After the segmentation network, we integrate a Depth Guided Filter (DGF) to refine segmentation outputs. The entire network is trained end-to-end, with segmentation results guiding parameter prediction optimization, promoting self-learning and network improvement. This lightweight and efficient network architecture is particularly suitable for addressing challenges in nighttime image segmentation. Extensive experiments validate significant performance improvements of our approach on the ACDC-night and Nightcity datasets.
Due to hardware limitations, existing hyperspectral (HS) cameras often suffer from low spatial/temporal resolution. Recently, it has been prevalent to super-resolve a low resolution (LR) HS image into a high resolution (HR) HS image with a HR RGB (or multispectral) image guidance. Previous approaches for this guided super-resolution task often model the intrinsic characteristic of the desired HR HS image using hand-crafted priors. Recently, researchers have paid more attention to deep learning methods with direct supervised or unsupervised learning, which exploit deep prior only from training dataset or testing data. In this article, an efficient convolutional neural network-based method is presented to progressively super-resolve HS image with RGB image guidance. Specifically, a progressive HS image super-resolution network is proposed, which progressively super-resolves the LR HS image with pixel shuffled HR RGB image guidance. Then, the super-resolution network is progressively trained with supervised pre-training and unsupervised adaptation, where supervised pre-training learns the general prior on training data and unsupervised adaptation generalises the general prior to specific prior for variant testing scenes. The proposed method can effectively exploit prior from training dataset and testing HS and RGB images with spectral-spatial constraint. It has a good generalisation capability, especially for blind HS image super-resolution. Comprehensive experimental results show that the proposed deep progressive learning method outperforms the existing state-of-the-art methods for HS image super-resolution in non-blind and blind cases.
Obtaining high precision is an important consideration for astrometric studies using images from the Narrow Angle Camera (NAC) of the Cassini Imaging Science Subsystem (ISS). Selecting the best centering algorithm is key to enhancing astrometric accuracy. In this study, we compared the accuracy of five centering algorithms: Gaussian fitting, the modified moments method, and three point-spread function (PSF) fitting methods (effective PSF (ePSF), PSFEx, and extended PSF (xPSF) from the Cassini Imaging Central Laboratory for Operations (CICLOPS)). We assessed these algorithms using 70 ISS NAC star field images taken with CL1 and CL2 filters across different stellar magnitudes. The ePSF method consistently demonstrated the highest accuracy, achieving precision below 0.03 pixels for stars of magnitude 8-9. Compared to the modified moments method, previously considered the best, the ePSF method improved overall accuracy by about 10% and 21% in the sample and line directions, respectively. Surprisingly, the xPSF model provided by CICLOPS had lower precision than the ePSF. Specifically, the ePSF exhibits an improvement in measurement precision of 23% and 17% in the sample and line directions, respectively, over the xPSF. This discrepancy might be attributed to the xPSF focusing on photometry rather than astrometry. These findings highlight the necessity of constructing PSF models specifically tailored for astrometric purposes in NAC images and provide guidance for enhancing astrometric measurements using these ISS NAC images.
Hyperspectral images typically have high spectral resolution but low spatial resolution, which impacts the reliability and accuracy of subsequent applications, for example, remote sensing classification and mineral identification. But in traditional methods via deep convolution neural networks, indiscriminately extracting and fusing spectral and spatial features makes it challenging to utilize the differentiated information across adjacent spectral channels. Thus, we proposed a multi-branch interleaved iterative upsampling hyperspectral image super-resolution reconstruction network (MIIUSR) to address the above problems. We reinforce spatial feature extraction by integrating detailed features from different receptive fields across adjacent channels. Furthermore, we propose an interleaved iterative upsampling process during the reconstruction stage, which progressively fuses incremental information among adjacent frequency bands. Additionally, we add two parallel three dimensional (3D) feature extraction branches to the backbone network to extract spectral and spatial features of varying granularity. We further enhance the backbone network's construction results by leveraging the difference between two dimensional (2D) channel-grouping spatial features and 3D multi-granularity features. The results obtained by applying the proposed network model to the CAVE test set show that, at a scaling factor of ×4, the peak signal to noise ratio, spectral angle mapping, and structural similarity are 37.310 dB, 3.525 and 0.9438, respectively. Besides, extensive experiments conducted on the Harvard and Foster datasets demonstrate the superior potential of the proposed model in hyperspectral super-resolution reconstruction.
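The spectral angle metric reported above can be sketched for a single pixel as follows. The sketch assumes spectra stored as plain lists and reports the angle in degrees by default; the abstract does not state the unit of its 3.525 figure or how angles are averaged over an image, so both are assumptions, as is the function name.

```python
import math


def spectral_angle(reference, estimate, degrees=True):
    """Angle between two spectra treated as vectors; 0 means identical spectral shape."""
    if len(reference) != len(estimate):
        raise ValueError("spectra must have the same number of bands")
    dot = sum(a * b for a, b in zip(reference, estimate))
    norm_ref = math.sqrt(sum(a * a for a in reference))
    norm_est = math.sqrt(sum(b * b for b in estimate))
    if norm_ref == 0.0 or norm_est == 0.0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] to guard against floating-point rounding before acos.
    cosine = max(-1.0, min(1.0, dot / (norm_ref * norm_est)))
    angle = math.acos(cosine)
    return math.degrees(angle) if degrees else angle
```

Because the angle ignores vector length, a reconstructed spectrum that is uniformly brighter or darker than the reference still scores close to zero; the metric responds to changes in spectral shape only.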
Underwater images often have biased colours and reduced contrast because of the absorption and scattering effects when light propagates in water. Such degraded images cannot meet the needs of underwater operations. The main problem in classic underwater image restoration or enhancement methods is that they require long calculation times, and often the colour or contrast of the resulting images is still unsatisfactory. Instead of using the complicated physical model of underwater imaging degradation, we propose a new method to deal with underwater images by imitating the colour constancy mechanism of human vision using double-opponency. Firstly, the original image is converted to the LMS space. Then the signals are linearly combined, and Gaussian convolutions are performed to imitate the function of receptive fields (RFs). Next, two RFs with different sizes work together to constitute the double-opponency response. Finally, the underwater light is estimated to correct the colours in the image. Further contrast stretching on the luminance is optional. Experiments show that the proposed method can obtain clarified underwater images with higher quality than before, and it takes significantly less time than other previously published typical methods.
Object tracking is one of the major tasks for mobile robots in many real-world applications. Also, artificial intelligence and automatic control techniques play an important role in enhancing the performance of mobile robot navigation. In contrast to previous simulation studies, this paper presents a new intelligent mobile robot for accomplishing multi-tasks by tracking red-green-blue (RGB) colored objects in a real experimental field. Moreover, a practical smart controller is developed based on adaptive fuzzy logic and custom proportional-integral-derivative (PID) schemes to achieve accurate tracking results, considering robot command delay and tolerance errors. The design of the developed controllers implies some motion rules to mimic the knowledge of experienced operators. Twelve scenarios of three colored object combinations have been successfully tested and evaluated by using the developed controlled image-based robot tracker. Classical PID control failed to handle some tracking scenarios in this study. The proposed adaptive fuzzy PID control achieved the most accurate results, with the minimum average final error of 13.8 cm to reach the colored targets, while our designed custom PID control is efficient in saving both average time and traveling distance of 6.6 s and 14.3 cm, respectively. These promising results demonstrate the feasibility of applying our developed image-based robotic system in a colored object-tracking environment to reduce human workloads.
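As background for the classical PID baseline mentioned above, a minimal discrete PID loop might look like the sketch below. It is a textbook form, not the paper's adaptive fuzzy or custom PID design; the class name, gains, the one-dimensional plant, and the `track` helper are all hypothetical.

```python
class PID:
    """Textbook discrete PID controller (illustrative, not the paper's controller)."""

    def __init__(self, kp, ki, kd):
        self.kp = kp
        self.ki = ki
        self.kd = kd
        self.integral = 0.0
        self.prev_error = None

    def update(self, error, dt):
        """Return the control command for the current tracking error."""
        self.integral += error * dt
        # No derivative term on the first call, when there is no previous error.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def track(controller, target, steps=100, dt=0.1):
    """Drive a toy 1-D robot whose velocity equals the command (assumed plant)."""
    position = 0.0
    for _ in range(steps):
        command = controller.update(target - position, dt)
        position += command * dt
    return position
```

In an image-based tracker the error would typically be derived from the target's offset in the camera frame; here it is simply the distance to a scalar target.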
The Solar Polar-orbit Observatory (SPO), proposed by Chinese scientists, is designed to observe the solar polar regions in an unprecedented way with a spacecraft traveling in a large solar inclination angle and a small ellipticity. However, one of the most significant challenges lies in ultra-long-distance data transmission, particularly for the Magnetic and Helioseismic Imager (MHI), which is the most important payload and generates the largest volume of data in SPO. In this paper, we propose a tailored lossless data compression method based on the measurement mode and characteristics of MHI data. The background outside the solar disk is removed to decrease the number of pixels in an image under compression. Multiple predictive coding methods are combined to eliminate the redundancy by utilizing the correlation (space, spectrum, and polarization) in the data set, improving the compression ratio. Experimental results demonstrate that our method achieves an average compression ratio of 3.67. The compression time is also less than the general observation period. The method exhibits strong feasibility and can be easily adapted to MHI.
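The idea behind lossless predictive coding can be shown with the simplest possible predictor. The sketch below predicts each pixel from its left neighbour and stores only the residual; this is an assumed stand-in, since the abstract says only that several predictors exploiting spatial, spectral, and polarization correlation are combined. The function names are hypothetical.

```python
def encode_rows(image):
    """Replace each pixel by its difference from the previous pixel in the row."""
    residuals = []
    for row in image:
        previous = 0  # the first pixel of each row is predicted as 0
        encoded = []
        for pixel in row:
            encoded.append(pixel - previous)
            previous = pixel
        residuals.append(encoded)
    return residuals


def decode_rows(residuals):
    """Invert encode_rows exactly, so no information is lost."""
    image = []
    for row in residuals:
        previous = 0
        decoded = []
        for residual in row:
            previous += residual
            decoded.append(previous)
        image.append(decoded)
    return image
```

On smooth data the residuals cluster near zero, so a subsequent entropy coder (not shown) can represent them in fewer bits than the raw pixel values.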
Attitude is one of the crucial parameters for space objects and plays a vital role in collision prediction and debris removal. Analyzing light curves to determine attitude is the most commonly used method. In photometric observations, outliers may exist in the obtained light curves due to various reasons. Therefore, preprocessing is required to remove these outliers to obtain high quality light curves. Through statistical analysis, the reasons leading to outliers can be categorized into two main types: first, the brightness of the object significantly increases due to the passage of a star nearby, referred to as "stellar contamination," and second, the brightness markedly decreases due to cloud cover, referred to as "cloudy contamination." The traditional approach of manually inspecting images for contamination is time-consuming and labor-intensive. Instead, we propose the utilization of machine learning methods as a substitute. Convolutional Neural Networks and SVMs are employed to identify cases of stellar contamination and cloudy contamination, achieving F1 scores of 1.00 and 0.98 on a test set, respectively. We also explore other machine learning methods such as ResNet-18 and Light Gradient Boosting Machine, then conduct comparative analyses of the results.
The quality of synthetic aperture radar (SAR) images degrades in the case of multiple imaging projection planes (IPPs) and multiple overlapping ship targets, which can in turn degrade the performance of target classification and recognition. To address this issue, a method for extracting ship targets with overlaps via the expectation maximization (EM) algorithm is proposed. First, the scatterers of ship targets are obtained via the target detection technique. Then, the EM algorithm is applied to extract the scatterers of a single ship target with a single IPP. Afterwards, a novel image amplitude estimation approach is proposed, with which the radar image of a single target with a single IPP can be generated. The proposed method can accomplish IPP selection and target separation in the image domain, which can improve the image quality and preserve as much target information as possible. Results of simulated and real measured data demonstrate the effectiveness of the proposed method.
The Internet of Multimedia Things (IoMT) refers to a network of interconnected multimedia devices that communicate with each other over the Internet. Recently, smart healthcare has emerged as a significant application of the IoMT, particularly in the context of knowledge-based learning systems. Smart healthcare systems leverage knowledge-based learning to become more context-aware, adaptable, and auditable while maintaining the ability to learn from historical data. In smart healthcare systems, devices capture images, such as X-rays and magnetic resonance imaging (MRI) scans. The security and integrity of these images are crucial for the databases used in knowledge-based learning systems to foster structured decision-making and enhance the learning abilities of AI. Moreover, in knowledge-driven systems, the storage and transmission of HD medical images exert a burden on the limited bandwidth of the communication channel, leading to data transmission delays. To address the security and latency concerns, this paper presents a lightweight medical image encryption scheme utilising bit-plane decomposition and chaos theory. The results of the experiment yield entropy, energy, and correlation values of 7.999, 0.0156, and 0.0001, respectively. This validates the effectiveness of the encryption system proposed in this paper, which offers high-quality encryption, a large key space, key sensitivity, and resistance to statistical attacks.
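How bit-plane decomposition and a chaotic sequence can be combined is sketched below. The sketch splits 8-bit pixels into bit planes and XORs each plane with bits drawn from a logistic map; the map, its parameters, the thresholding rule, and all function names are assumptions chosen for illustration, not the scheme proposed in the paper, and the code is not suitable for real security use.

```python
def logistic_bits(x0, r, count, burn_in=100):
    """Generate `count` bits by thresholding the logistic map x -> r * x * (1 - x)."""
    x = x0
    for _ in range(burn_in):  # discard the transient so bits depend strongly on the key
        x = r * x * (1.0 - x)
    bits = []
    for _ in range(count):
        x = r * x * (1.0 - x)
        bits.append(1 if x >= 0.5 else 0)
    return bits


def to_bit_planes(pixels):
    """Split 8-bit pixels into 8 planes; plane k holds bit k of every pixel."""
    return [[(p >> k) & 1 for p in pixels] for k in range(8)]


def from_bit_planes(planes):
    """Recombine 8 bit planes into 8-bit pixels."""
    return [sum(planes[k][i] << k for k in range(8)) for i in range(len(planes[0]))]


def xor_cipher(pixels, x0=0.3141, r=3.99):
    """XOR every bit plane with the chaotic keystream; applying it twice decrypts."""
    n = len(pixels)
    planes = to_bit_planes(pixels)
    stream = logistic_bits(x0, r, 8 * n)
    mixed = [[planes[k][i] ^ stream[k * n + i] for i in range(n)] for k in range(8)]
    return from_bit_planes(mixed)
```

The pair (`x0`, `r`) plays the role of the key: calling `xor_cipher` again with the same values restores the original pixels, because XOR with an identical keystream is its own inverse.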
Person image generation aims to generate images that maintain the original human appearance in different target poses. Recent works have revealed that the critical element in achieving this task is the alignment of the appearance domain and the pose domain. Previous alignment methods, such as appearance flow warping, correspondence learning and cross attention, often encounter challenges when it comes to producing fine texture details. These approaches suffer from limitations in accurately estimating appearance flows due to the lack of a global receptive field. Alternatively, they can only perform cross-domain alignment on high-level feature maps with small spatial dimensions, since the computational complexity increases quadratically with larger feature sizes. In this article, the significance of multi-scale alignment, in both low-level and high-level domains, for ensuring reliable cross-domain alignment of appearance and pose is demonstrated. To this end, a novel and effective method, named Multi-scale Cross-domain Alignment (MCA), is proposed. Firstly, MCA adopts a global context aggregation transformer to model multi-scale interaction between pose and appearance inputs, which employs pair-wise window-based cross attention. Furthermore, leveraging the integrated global source information for each target position, MCA applies a flexible flow prediction head and point correlation to effectively conduct warping and fusing for final transformed person image generation. Our proposed MCA outperforms other methods on two popular datasets, which verifies the effectiveness of our approach.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 52090081) and the State Key Laboratory of Hydro-science and Hydraulic Engineering (Grant No. 2021-KY-04).
Abstract: Geological discontinuity (GD) plays a pivotal role in determining the catastrophic mechanical failure of jointed rock masses. Accurate and efficient acquisition of GD networks is essential for characterizing and understanding the progressive damage mechanisms of slopes based on monitoring image data. Inspired by recent advances in computer vision, deep learning (DL) models have been widely utilized for image-based fracture identification. The multi-scale characteristics, image resolution and annotation quality of images cause a scale-space effect (SSE) that makes features indistinguishable from noise, directly affecting recognition accuracy. However, this effect has not received adequate attention. Herein, we address this gap by collecting slope images at various proportional scales and constructing multi-scale datasets using image processing techniques. Next, we quantify the intensity of feature signals using metrics such as peak signal-to-noise ratio (PSNR) and structural similarity (SSIM). Combining these metrics with scale-space theory, we investigate the influence of the SSE on the differentiation of multi-scale features and on recognition accuracy. It is found that augmenting the image's detail capacity does not always yield benefits for vision-based recognition models. In light of these observations, we propose a scale hybridization approach based on the diffusion mechanism of scale-space representation. The results show that scale hybridization strengthens the tolerance of multi-scale feature recognition under complex environmental noise interference and significantly enhances the recognition accuracy of GD. It also facilitates the objective understanding, description and analysis of the rock behavior and stability of slopes from the perspective of image data.
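As an illustration of the PSNR metric named in this abstract, the following is a minimal sketch using the standard definition, PSNR = 10 log10(MAX^2 / MSE). The function name `psnr`, the list-of-rows image layout and the 8-bit default `max_value` are assumptions made for this example; the abstract does not describe the authors' implementation.

```python
import math


def psnr(reference, distorted, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale
    images given as lists of rows. Hypothetical helper for illustration."""
    if len(reference) != len(distorted) or any(
        len(r) != len(d) for r, d in zip(reference, distorted)
    ):
        raise ValueError("images must have the same shape")
    n_pixels = sum(len(row) for row in reference)
    # Mean squared error over all pixels.
    mse = sum(
        (a - b) ** 2
        for r, d in zip(reference, distorted)
        for a, b in zip(r, d)
    ) / n_pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A higher PSNR means the distorted image is closer to the reference; identical images give infinity under this definition.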
Funding: National Research Foundation of Korea, Grant/Award Numbers: 2022R1I1A3069113, RS-2023-00221365; Electronics and Telecommunications Research Institute, Grant/Award Number: 2014-3-00123.
Abstract: In recent times, an image enhancement approach that learns a global transformation function using deep neural networks has gained attention. However, many existing methods based on this approach have a limitation: their transformation functions are too simple to imitate complex colour transformations between low-quality images and manually retouched high-quality images. To address this limitation, a simple yet effective approach for image enhancement is proposed. The proposed algorithm is based on channel-wise intensity transformation; however, the transformation is applied in a learnt embedding space rather than a specific colour space, and the enhanced features are then mapped back to colours. To this end, the authors define the continuous intensity transformation (CIT) to describe the mapping between input and output intensities in the embedding space. Then, an enhancement network is developed that produces multi-scale feature maps from input images, derives the set of transformation functions, and performs the CIT to obtain enhanced images. Extensive experiments on the MIT-Adobe 5K dataset demonstrate that the authors' approach improves on the performance of conventional intensity transforms on colour space metrics. Specifically, the authors achieved a 3.8% improvement in peak signal-to-noise ratio, a 1.8% improvement in structural similarity index measure, and a 27.5% improvement in learned perceptual image patch similarity. Also, the authors' algorithm outperforms state-of-the-art alternatives on three image enhancement datasets: MIT-Adobe 5K, Low-Light, and Google HDR+.
Funding: Supported by the GHfund A (202302017475); the Foundation for Distinguished Young Scholars of Jiangsu Province (No. BK20140050); the National Natural Science Foundation of China (Nos. 11973070, 11333008, 11273061, 11825303, and 11673065); the China Manned Space Project (Nos. CMS-CSST-2021-A01, CMS-CSST-2021-A03, CMS-CSST-2021-B01); the Joint Funds of the National Natural Science Foundation of China (No. U1931210); the Key Research Program of Frontier Sciences, CAS (Grant No. ZDBS-LY-7013); the Program of Shanghai Academic/Technology Research Leader; and the science research grants from the China Manned Space Project (Nos. CMS-CSST-2021-A04, CMS-CSST-2021-A07).
Abstract: We have developed a novel method for co-adding multiple under-sampled images that combines the iteratively reweighted least squares and divide-and-conquer algorithms. Our approach not only allows for the anti-aliasing of the images but also enables Point-Spread Function (PSF) deconvolution, resulting in enhanced restoration of extended sources, the highest peak signal-to-noise ratio, and reduced ringing artefacts. To test our method, we conducted numerical simulations that replicated observation runs of the China Space Station Telescope/the VLT Survey Telescope (VST) and compared our results to those obtained using previous algorithms. The simulation showed that our method outperforms previous approaches in several ways, such as restoring the profile of extended sources and minimizing ringing artefacts. Additionally, because our method relies on the inherent advantages of least squares fitting, it is more versatile and does not depend on the local uniformity hypothesis for the PSF. However, the new method requires much more computation than the other approaches.
Abstract: Algal blooms, the spread of algae on the surface of water bodies, have adverse effects not only on aquatic ecosystems but also on human life. The adverse effects of harmful algal blooms (HABs) necessitate a convenient solution for detection and monitoring. Unmanned aerial vehicles (UAVs) have recently emerged as a tool for algal bloom detection, efficiently providing on-demand images at high spatiotemporal resolutions. This study developed an image processing method for algal bloom area estimation from aerial images (obtained from the internet) captured using UAVs. As a remote sensing method of HAB detection, analysis, and monitoring, a combination of histogram and texture analyses was used to efficiently estimate the area of HABs. Statistical features like entropy (using the Kullback-Leibler method) were emphasized with the aid of a gray-level co-occurrence matrix. The results showed that the orthogonal images demonstrated fewer errors, and the morphological filter best detected algal blooms in real time, with a precision of 80%. This study provided efficient image processing approaches using on-board UAVs for HAB monitoring.
Funding: Project supported by the Scientific Research Fund of Hunan Provincial Education Department, China (Grant No. 21A0470); the Natural Science Foundation of Hunan Province, China (Grant No. 2023JJ50268); the National Natural Science Foundation of China (Grant Nos. 62172268 and 62302289); and the Shanghai Science and Technology Project, China (Grant Nos. 21JC1402800 and 23YF1416200).
Abstract: As a branch of quantum image processing, quantum image scaling has been widely studied. However, most existing quantum image scaling algorithms are based on nearest-neighbor interpolation and bilinear interpolation; the quantum version of bicubic interpolation has not yet been studied. In this work, we present the first quantum image scaling scheme for bicubic interpolation based on the novel enhanced quantum representation (NEQR). Our scheme can realize synchronous enlargement and reduction of an image of size 2^n × 2^n by integer multiples. Firstly, the image is represented by NEQR and the original image coordinates are obtained through multiple CNOT modules. Then, 16 neighborhood pixels are obtained by quantum operation circuits, and the corresponding weights of these pixels are calculated by quantum arithmetic modules. Finally, a quantum matrix operation, instead of a classical convolution operation, is used to realize the sum of convolution of these pixels. Through simulation experiments and complexity analysis, we demonstrate that our scheme achieves exponential speedup over the classical bicubic interpolation algorithm and performs better than the quantum version of bilinear interpolation.
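For readers unfamiliar with the classical baseline, the 16-neighbor weighting that bicubic interpolation performs can be sketched as follows. This is a classical illustration only, not the quantum circuit of the paper; it assumes the widely used Keys cubic convolution kernel with a = -0.5 and clamps coordinates at the image border, and the names `cubic_kernel` and `bicubic_sample` are hypothetical.

```python
import math


def cubic_kernel(x, a=-0.5):
    """Keys cubic convolution kernel (assumed variant, a = -0.5)."""
    x = abs(x)
    if x <= 1:
        return (a + 2) * x ** 3 - (a + 3) * x ** 2 + 1
    if x < 2:
        return a * x ** 3 - 5 * a * x ** 2 + 8 * a * x - 4 * a
    return 0.0


def bicubic_sample(image, y, x):
    """Interpolate a grayscale image (list of rows) at fractional (y, x)
    from its 4 x 4 = 16 neighboring pixels, clamping at the borders."""
    h, w = len(image), len(image[0])
    y0, x0 = math.floor(y), math.floor(x)
    value = 0.0
    for j in range(-1, 3):
        wy = cubic_kernel(y - (y0 + j))          # vertical weight
        row = image[min(max(y0 + j, 0), h - 1)]
        for i in range(-1, 3):
            wx = cubic_kernel(x - (x0 + i))      # horizontal weight
            value += wy * wx * row[min(max(x0 + i, 0), w - 1)]
    return value
```

The weight of each of the 16 pixels is the product of a vertical and a horizontal kernel value, which is the quantity the abstract says is computed by quantum arithmetic modules in the proposed scheme.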
Funding: The National Natural Science Foundation of China (62003298, 62163036); the Major Project of Science and Technology of Yunnan Province (202202AD080005, 202202AH080009); and the Yunnan University Professional Degree Graduate Practice Innovation Fund Project (ZC-22222770).
Abstract: Oscillation detection has been a hot research topic in industry due to the high incidence of oscillating loops and their negative impact on plant profitability. Although numerous automatic detection techniques have been proposed, most of them can only address part of the practical difficulties. An oscillation is heuristically defined as a visually apparent periodic variation. However, manual visual inspection is labor-intensive and prone to missed detection. Convolutional neural networks (CNNs), inspired by animal visual systems, have emerged with powerful feature extraction capabilities. In this work, typical CNN models are explored for visual oscillation detection. Specifically, we tested the MobileNet-V1, ShuffleNet-V2, EfficientNet-B0, and GhostNet models, and found that such a visual framework is well-suited for oscillation detection. The feasibility and validity of this framework are verified using extensive numerical and industrial cases. Compared with state-of-the-art oscillation detectors, the suggested framework is more straightforward and more robust to noise and mean-nonstationarity. In addition, this framework generalizes well and is capable of handling features that are not present in the training data, such as multiple oscillations and outliers.
Abstract: Diagnosing various diseases such as glaucoma, age-related macular degeneration, cardiovascular conditions, and diabetic retinopathy involves segmenting retinal blood vessels. The task is particularly challenging when dealing with color fundus images due to issues like non-uniform illumination, low contrast, and variations in vessel appearance, especially in the presence of different pathologies. Furthermore, the speed of the retinal vessel segmentation system is of utmost importance. With the surge of available big data, the speed of the algorithm becomes increasingly important, carrying almost as much weight as its accuracy. To address these challenges, we present a novel approach for retinal vessel segmentation, leveraging efficient and robust techniques based on multiscale line detection and mathematical morphology. Our algorithm's performance is evaluated on two publicly available datasets, namely the Digital Retinal Images for Vessel Extraction (DRIVE) dataset and the Structure Analysis of Retina (STARE) dataset. The experimental results demonstrate the effectiveness of our method, with mean accuracy values of 0.9467 for DRIVE and 0.9535 for STARE, as well as sensitivity values of 0.6952 for DRIVE and 0.6809 for STARE. Notably, our algorithm exhibits competitive performance with state-of-the-art methods. Importantly, it operates at an average speed of 3.73 s per image for DRIVE and 3.75 s for STARE. It is worth noting that these results were achieved using Matlab scripts containing multiple loops. This suggests that the processing time can be further reduced by replacing loops with vectorization. Thus the proposed algorithm can be deployed in real-time applications. In summary, our proposed system strikes a fine balance between swift computation and accuracy that is on par with the best available methods in the field.
Funding: Project supported by the National Natural Science Foundation of China (Grant Nos. 62172268 and 62302289) and the Shanghai Science and Technology Project (Grant Nos. 21JC1402800 and 23YF1416200).
Abstract: As a part of quantum image processing, quantum image filtering is a crucial technology in the development of quantum computing. Low-pass filtering can effectively achieve anti-aliasing effects on images. Currently, most quantum image filtering methods are based on classical domains and grayscale images, and there are relatively few studies on anti-aliasing in the quantum domain. This paper proposes a scheme for anti-aliasing filtering based on quantum grayscale and color image scaling in the spatial domain. It achieves the effect of anti-aliasing filtering on quantum images during the scaling process. First, we use the novel enhanced quantum representation (NEQR) and the improved quantum representation of color images (INCQI) to represent classical images. Since aliasing phenomena are more pronounced when images are scaled down, this paper focuses only on the anti-aliasing effects in the case of reduction. Subsequently, we perform anti-aliasing filtering on the quantum representation of the original image and then use bilinear interpolation to scale down the image, achieving the anti-aliasing effect. The constructed pyramid model is then used to select an appropriate image for upscaling to the original image size. Finally, the complexity of the circuit is analyzed. Compared to images that experience aliasing solely due to scaling, images with anti-aliasing filtering applied are smoother and clearer. Additionally, the anti-aliasing filtering allows for manual intervention to select the desired level of image smoothness.
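The reason low-pass filtering before reduction suppresses aliasing can be seen in a small classical sketch. It assumes a simple box (block-mean) filter as the low-pass stage, which is a stand-in chosen for brevity; the paper works on quantum image representations and uses bilinear interpolation for the reduction, and the function names here are hypothetical.

```python
def subsample(image, factor):
    """Naive reduction: keep every `factor`-th pixel with no filtering."""
    return [row[::factor] for row in image[::factor]]


def box_filter_downscale(image, factor):
    """Low-pass then reduce: average each factor x factor block.
    Trailing rows/columns that do not fill a block are dropped."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(0, h - h % factor, factor):
        out_row = []
        for x in range(0, w - w % factor, factor):
            block = [
                image[y + dy][x + dx]
                for dy in range(factor)
                for dx in range(factor)
            ]
            out_row.append(sum(block) / len(block))
        out.append(out_row)
    return out
```

On an image of one-pixel-wide vertical stripes, `subsample` with a factor of 2 keeps only one stripe colour and the pattern disappears entirely, whereas `box_filter_downscale` returns the mid-gray average that a viewer would perceive from a distance.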
Funding: Supported by the National Science Foundation of China (10972015, 11172015) and the Beijing Natural Science Foundation (8162008).
Abstract: The mechanical properties and failure mechanism of lightweight aggregate concrete (LWAC) are a hot topic in the engineering field, and the relationship between its microstructure and macroscopic mechanical properties is also a frontier research topic in the academic field. In this study, image processing technology is used to establish a micro-structure model of LWAC. Through information extraction and processing of section images of actual LWAC specimens, a mesostructural model of LWAC with real aggregate characteristics is established. Numerical simulations of the uniaxial tensile test, uniaxial compression test and three-point bending test of LWAC are carried out using a new finite element method, the base force element method. Firstly, image processing technology is used to produce beam specimens, uniaxial compression specimens and uniaxial tensile specimens of LWAC, which can better simulate the aggregate shape and random distribution of real LWAC. Secondly, the three-point bending test is numerically simulated. Thirdly, the uniaxial compression specimen generated by image processing technology is numerically simulated. Fourthly, the uniaxial tensile specimen generated by image processing technology is numerically simulated. The mechanical behavior and damage mode of the specimens during loading are analyzed. The results of the numerical simulations are compared with those of relevant experiments. The feasibility and correctness of the micromodel established in this study for analyzing the micromechanics of LWAC materials are verified. Image processing technology has broad application prospects in the field of concrete mesoscopic damage analysis.
Funding: This work is supported in part by the National Natural Science Foundation of China (Grant Number 61971078), which provided domain expertise and computational power that greatly assisted the activity. This work was financially supported by Chongqing Municipal Education Commission Grants for Major Science and Technology Project (Grant Number gzlcx20243175).
Abstract: Semantic segmentation of driving scene images is crucial for autonomous driving. While deep learning technology has significantly improved daytime image semantic segmentation, nighttime images pose challenges due to factors like poor lighting and overexposure, making it difficult to recognize small objects. To address this, we propose an Image Adaptive Enhancement (IAEN) module comprising a parameter predictor (Edip), multiple image processing filters (Mdif), and a Detail Processing Module (DPM). Edip combines image processing filters to predict parameters like exposure and hue, optimizing image quality. We adopt a novel image encoder to enhance parameter prediction accuracy by enabling Edip to handle features at different scales. DPM strengthens overlooked image details, extending the IAEN module's functionality. After the segmentation network, we integrate a Depth Guided Filter (DGF) to refine segmentation outputs. The entire network is trained end-to-end, with segmentation results guiding parameter prediction optimization, promoting self-learning and network improvement. This lightweight and efficient network architecture is particularly suitable for addressing challenges in nighttime image segmentation. Extensive experiments validate significant performance improvements of our approach on the ACDC-night and Nightcity datasets.
Funding: National Key R&D Program of China, Grant/Award Number: 2022YFC3300704; National Natural Science Foundation of China, Grant/Award Numbers: 62171038, 62088101, 62006023.
Abstract: Due to hardware limitations, existing hyperspectral (HS) cameras often suffer from low spatial/temporal resolution. Recently, it has become prevalent to super-resolve a low resolution (LR) HS image into a high resolution (HR) HS image with HR RGB (or multispectral) image guidance. Previous approaches for this guided super-resolution task often model the intrinsic characteristics of the desired HR HS image using hand-crafted priors. Recently, researchers have paid more attention to deep learning methods with direct supervised or unsupervised learning, which exploit a deep prior only from the training dataset or the testing data. In this article, an efficient convolutional neural network-based method is presented to progressively super-resolve HS images with RGB image guidance. Specifically, a progressive HS image super-resolution network is proposed, which progressively super-resolves the LR HS image with pixel-shuffled HR RGB image guidance. Then, the super-resolution network is progressively trained with supervised pre-training and unsupervised adaptation, where supervised pre-training learns the general prior on training data and unsupervised adaptation generalises the general prior to a specific prior for variant testing scenes. The proposed method can effectively exploit priors from the training dataset and from testing HS and RGB images with a spectral-spatial constraint. It has good generalisation capability, especially for blind HS image super-resolution. Comprehensive experimental results show that the proposed deep progressive learning method outperforms the existing state-of-the-art methods for HS image super-resolution in non-blind and blind cases.
Funding: Supported by the National Natural Science Foundation of China (Nos. 12373073, U2031104, and 12173015) and the Guangdong Basic and Applied Basic Research Foundation (No. 2023A1515011340).
Abstract: Obtaining high precision is an important consideration for astrometric studies using images from the Narrow Angle Camera (NAC) of the Cassini Imaging Science Subsystem (ISS). Selecting the best centering algorithm is key to enhancing astrometric accuracy. In this study, we compared the accuracy of five centering algorithms: Gaussian fitting, the modified moments method, and three point-spread function (PSF) fitting methods (effective PSF (ePSF), PSFEx, and extended PSF (xPSF) from the Cassini Imaging Central Laboratory for Operations (CICLOPS)). We assessed these algorithms using 70 ISS NAC star field images taken with CL1 and CL2 filters across different stellar magnitudes. The ePSF method consistently demonstrated the highest accuracy, achieving precision below 0.03 pixels for stars of magnitude 8-9. Compared to the modified moments method, previously considered the best, the ePSF method improved overall accuracy by about 10% and 21% in the sample and line directions, respectively. Surprisingly, the xPSF model provided by CICLOPS had lower precision than the ePSF. Specifically, the ePSF exhibits an improvement in measurement precision of 23% and 17% in the sample and line directions, respectively, over the xPSF. This discrepancy might be attributed to the xPSF focusing on photometry rather than astrometry. These findings highlight the necessity of constructing PSF models specifically tailored for astrometric purposes in NAC images and provide guidance for enhancing astrometric measurements using these ISS NAC images.
Funding: The National Natural Science Foundation of China (Nos. 61471263, 61872267 and U21B2024); the Natural Science Foundation of Tianjin, China (No. 16JCZDJC31100); and the Tianjin University Innovation Foundation (No. 2021XZC0024).
Abstract: Hyperspectral images typically have high spectral resolution but low spatial resolution, which impacts the reliability and accuracy of subsequent applications, for example, remote sensing classification and mineral identification. But in traditional methods via deep convolution neural networks, indiscriminately extracting and fusing spectral and spatial features makes it challenging to utilize the differentiated information across adjacent spectral channels. Thus, we proposed a multi-branch interleaved iterative upsampling hyperspectral image super-resolution reconstruction network (MIIUSR) to address the above problems. We reinforce spatial feature extraction by integrating detailed features from different receptive fields across adjacent channels. Furthermore, we propose an interleaved iterative upsampling process during the reconstruction stage, which progressively fuses incremental information among adjacent frequency bands. Additionally, we add two parallel three dimensional (3D) feature extraction branches to the backbone network to extract spectral and spatial features of varying granularity. We further enhance the backbone network's construction results by leveraging the difference between two dimensional (2D) channel-grouping spatial features and 3D multi-granularity features. The results obtained by applying the proposed network model to the CAVE test set show that, at a scaling factor of ×4, the peak signal to noise ratio, spectral angle mapping, and structural similarity are 37.310 dB, 3.525 and 0.9438, respectively. Besides, extensive experiments conducted on the Harvard and Foster datasets demonstrate the superior potential of the proposed model in hyperspectral super-resolution reconstruction.
Abstract: Underwater images often have biased colours and reduced contrast because of absorption and scattering effects when light propagates in water. Such degraded images cannot meet the needs of underwater operations. The main problem with classic underwater image restoration or enhancement methods is that they require long calculation times, and often the colour or contrast of the resulting images is still unsatisfactory. Instead of using the complicated physical model of underwater imaging degradation, we propose a new method to deal with underwater images by imitating the colour constancy mechanism of human vision using double-opponency. Firstly, the original image is converted to the LMS space. Then the signals are linearly combined, and Gaussian convolutions are performed to imitate the function of receptive fields (RFs). Next, two RFs with different sizes work together to constitute the double-opponency response. Finally, the underwater light is estimated to correct the colours in the image. Further contrast stretching on the luminance is optional. Experiments show that the proposed method can obtain clarified underwater images with higher quality than before, and it requires significantly less computation time than other previously published typical methods.
Funding: The authors extend their appreciation to the Deanship of Scientific Research at Shaqra University for funding this research work through the Project Number (SU-ANN-2023016).
Abstract: Object tracking is one of the major tasks for mobile robots in many real-world applications. Also, artificial intelligence and automatic control techniques play an important role in enhancing the performance of mobile robot navigation. In contrast to previous simulation studies, this paper presents a new intelligent mobile robot for accomplishing multi-tasks by tracking red-green-blue (RGB) colored objects in a real experimental field. Moreover, a practical smart controller is developed based on adaptive fuzzy logic and custom proportional-integral-derivative (PID) schemes to achieve accurate tracking results, considering robot command delay and tolerance errors. The design of the developed controllers implies some motion rules to mimic the knowledge of experienced operators. Twelve scenarios of three colored object combinations have been successfully tested and evaluated using the developed controlled image-based robot tracker. Classical PID control failed to handle some tracking scenarios in this study. The proposed adaptive fuzzy PID control achieved the most accurate results, with the minimum average final error of 13.8 cm to reach the colored targets, while our designed custom PID control is efficient in saving both average time and traveling distance, by 6.6 s and 14.3 cm, respectively. These promising results demonstrate the feasibility of applying our developed image-based robotic system in a colored object-tracking environment to reduce human workloads.
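As background for the controllers compared in this abstract, a classical discrete PID update can be sketched as below. This is the textbook form only; the gains, the class name `PID` and the time step are placeholders, and the paper's custom PID and adaptive fuzzy PID schemes are not reproduced here.

```python
class PID:
    """Textbook discrete PID controller (illustrative sketch).

    output = kp * error + ki * integral(error) + kd * d(error)/dt
    """

    def __init__(self, kp, ki, kd):
        self.kp, self.ki, self.kd = kp, ki, kd
        self.integral = 0.0
        self.prev_error = None  # no derivative on the first call

    def step(self, error, dt):
        # Accumulate the integral term with a rectangular rule.
        self.integral += error * dt
        # Backward-difference derivative; zero until a previous error exists.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative
```

In an image-based tracker, `error` could be, for example, the horizontal offset between the detected colored object and the image center, with the controller output driving the steering command; that mapping is an assumption for illustration.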
Funding: Supported by the National Key R&D Program of China (Grant No. 2022YFF0503800); the National Natural Science Foundation of China (NSFC) (Grant No. 11427901); the Strategic Priority Research Program of the Chinese Academy of Sciences (CAS-SPP) (Grant No. XDA15320102); and the Youth Innovation Promotion Association (CAS No. 2022057).
Abstract: The Solar Polar-orbit Observatory (SPO), proposed by Chinese scientists, is designed to observe the solar polar regions in an unprecedented way with a spacecraft traveling at a large solar inclination angle and a small ellipticity. However, one of the most significant challenges lies in ultra-long-distance data transmission, particularly for the Magnetic and Helioseismic Imager (MHI), which is the most important payload and generates the largest volume of data in SPO. In this paper, we propose a tailored lossless data compression method based on the measurement mode and characteristics of MHI data. The background outside the solar disk is removed to decrease the number of pixels in an image under compression. Multiple predictive coding methods are combined to eliminate redundancy by utilizing the correlation (space, spectrum, and polarization) in the data set, improving the compression ratio. Experimental results demonstrate that our method achieves an average compression ratio of 3.67. The compression time is also less than the general observation period. The method exhibits strong feasibility and can be easily adapted to MHI.
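The idea behind lossless predictive coding can be shown with the simplest possible predictor, where each pixel is predicted by its left neighbor and only the residual is stored. This is a generic sketch, not the combination of predictors described in the abstract; the function names are hypothetical and no entropy coder is included.

```python
def encode_row(row):
    """Previous-pixel prediction: keep the first value, then store the
    difference between each pixel and its left neighbor."""
    if not row:
        return []
    return [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]


def decode_row(residuals):
    """Invert encode_row exactly by accumulating the residuals."""
    row = []
    previous = 0
    for r in residuals:
        previous += r
        row.append(previous)
    return row
```

Because neighboring pixels are correlated, the residuals cluster near zero and can be represented with fewer bits by a subsequent entropy coder, while decoding recovers the original values exactly.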
Funding: Funded by the National Natural Science Foundation of China (NSFC, Nos. 12373086 and 12303082); the CAS "Light of West China" Program; the Yunnan Revitalization Talent Support Program in Yunnan Province; and the National Key R&D Program of China, Gravitational Wave Detection Project (No. 2022YFC2203800).
Abstract: Attitude is one of the crucial parameters for space objects and plays a vital role in collision prediction and debris removal. Analyzing light curves to determine attitude is the most commonly used method. In photometric observations, outliers may exist in the obtained light curves due to various reasons. Therefore, preprocessing is required to remove these outliers to obtain high quality light curves. Through statistical analysis, the reasons leading to outliers can be categorized into two main types: first, the brightness of the object significantly increases due to the passage of a star nearby, referred to as "stellar contamination," and second, the brightness markedly decreases due to cloud cover, referred to as "cloudy contamination." The traditional approach of manually inspecting images for contamination is time-consuming and labor-intensive. We therefore propose machine learning methods as a substitute. Convolutional Neural Networks and SVMs are employed to identify cases of stellar contamination and cloudy contamination, achieving F1 scores of 1.00 and 0.98 on a test set, respectively. We also explore other machine learning methods such as ResNet-18 and Light Gradient Boosting Machine, and then conduct comparative analyses of the results.
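For reference, the F1 score quoted above is the harmonic mean of precision and recall, which reduces to 2TP / (2TP + FP + FN). A minimal sketch follows; the helper names and the example counts used with it are illustrative and are not taken from the paper's test set.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true positives, false positives and false negatives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, fp, fn


def f1_score(tp, fp, fn):
    """F1 = 2TP / (2TP + FP + FN); defined as 0.0 when there are no positives."""
    denominator = 2 * tp + fp + fn
    return 0.0 if denominator == 0 else 2 * tp / denominator
```

For instance, 49 true positives with one false positive and one false negative would give an F1 score of 0.98 under this definition.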
Funding: This work was supported by the National Science Fund for Distinguished Young Scholars (62325104).
Abstract: The quality of synthetic aperture radar (SAR) images degrades in the case of multiple imaging projection planes (IPPs) and multiple overlapping ship targets, which can in turn affect the performance of target classification and recognition. To address this issue, a method for extracting overlapping ship targets via the expectation maximization (EM) algorithm is proposed. First, the scatterers of ship targets are obtained via the target detection technique. Then, the EM algorithm is applied to extract the scatterers of a single ship target with a single IPP. Afterwards, a novel image amplitude estimation approach is proposed, with which the radar image of a single target with a single IPP can be generated. The proposed method can accomplish IPP selection and target separation in the image domain, which can improve the image quality and preserve the target information as much as possible. Results on simulated and real measured data demonstrate the effectiveness of the proposed method.
Abstract: The Internet of Multimedia Things (IoMT) refers to a network of interconnected multimedia devices that communicate with each other over the Internet. Recently, smart healthcare has emerged as a significant application of the IoMT, particularly in the context of knowledge-based learning systems. Smart healthcare systems leverage knowledge-based learning to become more context-aware, adaptable, and auditable while maintaining the ability to learn from historical data. In smart healthcare systems, devices capture images such as X-rays and Magnetic Resonance Imaging scans. The security and integrity of these images are crucial for the databases used in knowledge-based learning systems to foster structured decision-making and enhance the learning abilities of AI. Moreover, in knowledge-driven systems, the storage and transmission of HD medical images place a burden on the limited bandwidth of the communication channel, leading to data transmission delays. To address the security and latency concerns, this paper presents a lightweight medical image encryption scheme utilising bit-plane decomposition and chaos theory. The experimental results yield entropy, energy, and correlation values of 7.999, 0.0156, and 0.0001, respectively. This validates the effectiveness of the encryption system proposed in this paper, which offers high-quality encryption, a large key space, key sensitivity, and resistance to statistical attacks.
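The two building blocks named in this abstract, bit-plane decomposition and a chaotic keystream, can be illustrated with a toy sketch. It assumes a logistic map x -> r x (1 - x) as the chaotic source and a plain XOR as the cipher step; this is not the paper's scheme and is not secure. All names and parameter values are hypothetical. A Shannon entropy helper is included because the abstract reports entropy, for which 8 bits is the ideal for 8-bit data.

```python
import math
from collections import Counter


def bit_planes(pixels):
    """Split 8-bit pixel values into 8 binary planes (plane 0 = LSB)."""
    return [[(p >> k) & 1 for p in pixels] for k in range(8)]


def from_bit_planes(planes):
    """Recombine 8 binary planes into 8-bit pixel values."""
    return [sum(planes[k][i] << k for k in range(8)) for i in range(len(planes[0]))]


def logistic_keystream(x0, r, n, burn_in=100):
    """Byte keystream from the logistic map; (x0, r) act as a toy key."""
    x = x0
    for _ in range(burn_in):       # discard the initial transient
        x = r * x * (1 - x)
    stream = []
    for _ in range(n):
        x = r * x * (1 - x)
        stream.append(int(x * 256) % 256)
    return stream


def xor_cipher(pixels, x0=0.3141, r=3.99):
    """XOR pixels with the keystream; applying it twice restores the input."""
    keystream = logistic_keystream(x0, r, len(pixels))
    return [p ^ k for p, k in zip(pixels, keystream)]


def shannon_entropy(values):
    """Shannon entropy in bits per symbol of a sequence of values."""
    counts = Counter(values)
    total = len(values)
    return -sum(c / total * math.log2(c / total) for c in counts.values())
```

Decryption in this toy setting is the same call with the same `x0` and `r`, since XOR with an identical keystream is its own inverse.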
Funding: National Natural Science Foundation of China, Grant/Award Number: 62274142; Hangzhou Major Technology Innovation Project of Artificial Intelligence, Grant/Award Number: 2022AIZD0060.
Abstract: Person image generation aims to generate images that maintain the original human appearance in different target poses. Recent works have revealed that the critical element in achieving this task is the alignment of the appearance domain and the pose domain. Previous alignment methods, such as appearance flow warping, correspondence learning and cross attention, often encounter challenges when it comes to producing fine texture details. These approaches suffer from limitations in accurately estimating appearance flows due to the lack of a global receptive field. Alternatively, they can only perform cross-domain alignment on high-level feature maps with small spatial dimensions, since the computational complexity increases quadratically with larger feature sizes. In this article, the significance of multi-scale alignment, in both low-level and high-level domains, for ensuring reliable cross-domain alignment of appearance and pose is demonstrated. To this end, a novel and effective method named Multi-scale Crossdomain Alignment (MCA) is proposed. Firstly, MCA adopts a global context aggregation transformer to model multi-scale interaction between pose and appearance inputs, which employs pair-wise window-based cross attention. Furthermore, leveraging the integrated global source information for each target position, MCA applies a flexible flow prediction head and point correlation to effectively conduct warping and fusing for final transformed person image generation. The proposed MCA outperforms other methods on two popular datasets, which verifies the effectiveness of the approach.