A new spectral matching algorithm is proposed using the nonsubsampled contourlet transform (NSCT) and the scale-invariant feature transform (SIFT). The nonsubsampled contourlet transform decomposes an image into a low-frequency image and several high-frequency images, and the scale-invariant feature transform extracts feature points from the low-frequency image. A proximity matrix is constructed for the feature points of two related images. Singular value decomposition of the proximity matrix yields a matching matrix (or matching result) that reflects the degree of matching among feature points. Experimental results indicate that the proposed algorithm reduces time complexity and achieves higher accuracy.
To meet the needs of fundus examination, including widening the field of view and tracking pathology, this paper describes a robust feature-based method for fully automatic mosaicking of curved human retinal images photographed by a fundus microscope. The kernel of the new algorithm is the Scale-Invariant Feature Transform, a scale-, rotation-, and illumination-invariant interest point detector and feature descriptor. After interest points are matched according to the second-nearest-neighbor strategy, the parameters of the model are estimated from the correct matches, which are extracted from the putative sets by a new inlier identification scheme based on the Sampson distance. To preserve image features, bilinear warping and multi-band blending are used to create panoramic retinal images. Experiments show that the proposed method works well, with a rejection error within 0.3 pixels, even for retinal images without discernible vascular structure, in contrast to state-of-the-art algorithms.
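The second-nearest-neighbor matching step in the retinal mosaic abstract above can be illustrated with a small sketch. It assumes the strategy is the common ratio test (accept a match only when the nearest descriptor is clearly closer than the second nearest); the function name `ratio_test_matches`, the 0.8 ratio, and the toy 2-D descriptors are illustrative assumptions, not details taken from the paper.

```python
import math

def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Return (i, j) pairs where desc_a[i]'s nearest neighbour in desc_b
    is clearly closer than its second-nearest neighbour.
    The ratio threshold is an assumed value, not one from the paper."""
    matches = []
    for i, da in enumerate(desc_a):
        # Distances from descriptor i to every candidate, sorted ascending.
        dists = sorted((math.dist(da, db), j) for j, db in enumerate(desc_b))
        if len(dists) < 2:
            continue  # the test needs at least two candidates
        (d1, j1), (d2, _) = dists[0], dists[1]
        if d1 < ratio * d2:
            matches.append((i, j1))
    return matches

# Toy 2-D "descriptors"; real SIFT descriptors have 128 dimensions.
desc_a = [(0.0, 0.0), (5.0, 5.0), (1.25, 1.25)]
desc_b = [(0.1, 0.0), (5.0, 5.2), (2.5, 2.5)]
matches = ratio_test_matches(desc_a, desc_b)
print(matches)
```

The third query point lies almost midway between two candidates, so it is rejected as ambiguous. In the pipeline described above, the surviving putative matches would then be passed to the inlier identification step.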
Content-based satellite image registration is a difficult issue in the fields of remote sensing and image processing. The difficulty is more significant when matching multisource remote sensing images, which suffer from illumination, rotation, and source differences. The scale-invariant feature transform (SIFT) algorithm has been used successfully in satellite image registration problems. Many researchers have also applied a local SIFT descriptor to improve the image retrieval process. Despite its robustness, this algorithm has some difficulties with the quality and quantity of the extracted local feature points in multisource remote sensing. Furthermore, the high dimensionality of the local features extracted by SIFT results in time-consuming computation and high storage requirements, which are important factors in content-based image retrieval (CBIR) applications. In this paper, a novel method is introduced to transform the local SIFT features into global features for multisource remote sensing. The quality and quantity of SIFT local features are enhanced by applying contrast equalization to the images in a pre-processing stage. Treating the local features of each image in the reference database as a separate class, linear discriminant analysis (LDA) is used to transform the local features into global features while reducing the dimensionality of the feature space. This also significantly reduces the computational time and storage required. Applying the trained kernel to verification data and mapping them showed a successful retrieval rate of 91.67% for test feature points.
In this paper, we propose a registration method that combines morphological component analysis (MCA) with the scale-invariant feature transform (SIFT) algorithm. The method uses perception dictionaries and combines the Basis-Pursuit algorithm with the Total-Variation regularization scheme to extract the cartoon part, which contains the basic geometrical information, from the original image; this extraction is stable and insensitive to noise. A smaller number of distinctive key points is then obtained by applying the SIFT algorithm to the cartoon part of the original image. Matching the key points by the constrained Euclidean distance gives a more correct and robust matching result. The experimental results show that the geometrical transform parameters inferred from the key points matched by the MCA+SIFT registration method are more exact than those obtained with the direct SIFT algorithm.
To address the need for robust digital watermarking in copyright protection, a new digital watermarking algorithm in the non-subsampled contourlet transform (NSCT) domain is proposed. The sub-band with the largest energy after NSCT is selected for embedding the watermark. The watermark is embedded into scale-invariant feature transform (SIFT) regions. During embedding, the initial region is divided into annular sub-regions of equal area, and each watermark bit is embedded into one sub-region. Extensive simulation results and comparisons show that the algorithm achieves a good trade-off among invisibility, robustness, and capacity, preserving good image quality while effectively resisting common image processing, geometric, and combined attacks, and normalized similarity is reached in almost all cases.
Recently, there have been some attempts to apply Transformers to 3D point cloud classification. To reduce computation, most existing methods focus on local spatial attention but ignore point content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based): sampled points with similar features are clustered into the same class, and self-attention is computed within each class, enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in separate branches. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method reaches 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Photoacoustic imaging (PAI) is an emerging noninvasive imaging method based on the photoacoustic effect that provides valuable assistance for medical diagnosis. It offers large imaging depth and high contrast. However, limited by equipment cost and reconstruction time requirements, existing PAI systems with annular array transducers struggle to deliver both image quality and imaging speed. In this paper, a triple-path feature transform network (TFT-Net) for ring-array photoacoustic tomography is proposed to enhance imaging quality from limited-view and sparse measurement data. Specifically, the network takes both the raw photoacoustic pressure signals and conventional linear reconstruction images as input, and uses the photoacoustic physical model as prior information to guide the reconstruction process. In addition, to strengthen signal feature extraction, a residual block and a squeeze-and-excitation block are introduced into the TFT-Net. For further efficient reconstruction, the final output of photoacoustic signals uses a 'filter-then-upsample' operation with a pixel-shuffle multiplexer and a max out module. Experimental results on simulated and in-vivo data demonstrate that the constructed TFT-Net restores target boundaries clearly, reduces background noise, and achieves fast, high-quality photoacoustic image reconstruction from limited-view, sparsely sampled data.
Olive trees are susceptible to a variety of diseases that can cause significant crop damage and economic losses. Early detection of these diseases is essential for effective management. We propose a novel transformed wavelet, feature-fused, pre-trained deep learning model for detecting olive leaf diseases. The proposed model combines wavelet transforms with pre-trained deep learning models to extract discriminative features from olive leaf images. The model has four main phases: preprocessing using data augmentation, three-level wavelet transformation, learning using pre-trained deep learning models, and a fused deep learning model. In the preprocessing phase, the image dataset is augmented using techniques such as resizing, rescaling, flipping, rotation, zooming, and contrast adjustment. In the wavelet transformation phase, the augmented images are decomposed into three frequency levels. Three pre-trained deep learning models, EfficientNet-B7, DenseNet-201, and ResNet-152-V2, are used in the learning phase. The models are trained on the approximation images of the third-level sub-band of the wavelet transform. In the fusion phase, the fused model consists of a merge layer, three dense layers, and two dropout layers. The proposed model was evaluated on a dataset of images of healthy and infected olive leaves. It achieved an accuracy of 99.72% in the diagnosis of olive leaf diseases, which exceeds the accuracy of other methods reported in the literature. This finding suggests that the proposed method is a promising tool for the early detection of olive leaf diseases.
To address the challenges posed by nonlinear and non-stationary vibrations in rotating machinery, where weak fault characteristic signals hinder accurate representation of the fault state, we propose a novel feature extraction method that combines the Flexible Analytic Wavelet Transform (FAWT) with Nonlinear Quantum Permutation Entropy. FAWT, leveraging fractional orders and arbitrary scaling and translation factors, exhibits superior translational invariance and adjustable fundamental oscillatory characteristics. This flexibility enables FAWT to provide well-suited wavelet shapes that match subtle fault components and avoid the performance degradation associated with fixed frequency partitioning and low-oscillation bases when detecting weak faults. In our approach, gearbox vibration signals undergo FAWT to obtain sub-bands. Quantum theory is then introduced into permutation entropy to define Nonlinear Quantum Permutation Entropy, a feature that more accurately characterizes the operational state of vibration simulation signals. The nonlinear quantum permutation entropy extracted from the sub-bands is used to characterize the operating state of rotating machinery. A comprehensive analysis of vibration signals from rolling bearings and gearboxes validates the feasibility of the proposed method. Comparative assessments against parameters derived from traditional permutation entropy, sample entropy, wavelet transform (WT), and empirical mode decomposition (EMD) underscore the superior effectiveness of this approach in fault detection and classification for rotating machinery.
In a crowd density estimation dataset, annotating crowd locations is an extremely laborious task, and the locations are not used in the evaluation metrics. In this paper, we aim to reduce the annotation cost of crowd datasets and propose a crowd density estimation method based on weakly-supervised learning which, in the absence of crowd position supervision, directly regresses the crowd count using the number of pedestrians in the image as the supervision signal. For this purpose, we design a new training method that exploits the correlation between global and local image features through incremental learning to train the network. Specifically, we design a parent-child network (PC-Net) whose two parts focus on the global and local image respectively, and propose a linear feature calibration structure to train the PC-Net simultaneously: the child network learns feature transfer factors and feature bias weights, and uses them to linearly calibrate the features extracted by the parent network, improving the convergence of the network by using local features hidden in the crowd images. In addition, we use the pyramid vision transformer as the backbone of the PC-Net to extract crowd features at different levels, and design a global-local feature loss function (L2). We combine it with a crowd counting loss (LC) to enhance the sensitivity of the network to crowd features during training, which effectively improves the accuracy of crowd density estimation. The experimental results show that the PC-Net significantly reduces the gap between fully-supervised and weakly-supervised crowd density estimation, and outperforms the comparison methods on five datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50, UCF_QNRF, and JHU-CROWD++.
A method is proposed for the analysis of vibration signals from components of rotating machines, based on the wavelet packet transformation (WPT) and the underlying physical concepts of the modulation mechanism. The method provides a finer analysis and better time-frequency localization than other analysis methods. Both details and approximations are split into finer components, resulting in better-localized frequency ranges corresponding to each node of a wavelet packet tree. For the purpose of feature extraction, a hard threshold is given and the energy of the coefficients above the threshold is used as the criterion for selecting the best vector. Feature extraction from a vibration signal is accomplished by computing the reconstructed signal and its spectrum. When applied to feature extraction from a rolling bearing vibration signal, the proposed method proves very effective.
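The node-selection criterion in the wavelet packet abstract above (energy of the coefficients that exceed a hard threshold) can be sketched as follows. The Haar filter, the two-level depth, the threshold value, and all function names are assumptions made for illustration; the abstract does not state which wavelet or threshold the authors used.

```python
import math

def haar_split(x):
    """One Haar step: return (approximation, detail), each half as long."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x) - 1, 2)]
    return approx, detail

def wavelet_packet(x, levels):
    """Full packet tree: split approximations AND details at every level."""
    nodes = [list(x)]
    for _ in range(levels):
        nodes = [half for node in nodes for half in haar_split(node)]
    return nodes

def thresholded_energy(coeffs, threshold):
    """Energy of the coefficients whose magnitude exceeds a hard threshold."""
    return sum(c * c for c in coeffs if abs(c) > threshold)

# Toy signal: a constant segment followed by a fast oscillation.
signal = [1.0, 1.0, 1.0, 1.0, 4.0, -4.0, 4.0, -4.0]
nodes = wavelet_packet(signal, levels=2)
energies = [thresholded_energy(n, threshold=0.5) for n in nodes]
best = max(range(len(nodes)), key=energies.__getitem__)
print(best, energies)
```

Here the node that carries the oscillation holds almost all of the thresholded energy, so it would be the one chosen for reconstruction and spectral analysis.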
Effective extraction of the vibration signal is an important precondition for machine fault diagnosis. Based on the characteristics of the noise mixed in while sampling the vibration signal, the independent component analysis (ICA) method is combined with wavelets for de-noising. First, the sampled signal is separated with ICA; then the frequency-band selection capability of the multi-resolution wavelet transform is used to judge whether a stochastic disturbance (singular) signal has been mixed in. In this way, the vibration signals can be extracted effectively, which provides favorable conditions for subsequent feature detection and fault diagnosis.
Computer-aided diagnosis of pneumonia based on deep learning is a research hotspot. However, features of different sizes and directions are insufficiently captured when extracting features from lung X-ray images. A pneumonia classification model based on multi-scale directional feature enhancement, MSD-Net, is proposed in this paper. The main innovations are as follows. First, the Multi-scale Residual Feature Extraction Module (MRFEM) is designed, which uses dilated convolutions with different dilation rates to enlarge the receptive field and extract multi-scale features effectively. Second, the Multi-scale Directional Feature Perception Module (MDFPM) is designed, which uses a three-branch structure of convolutions of different sizes to transmit directional features layer by layer and focuses on the target region to enhance the feature information. Third, the Axial Compression Former Module (ACFM) is designed to perform global calculations that enhance the perception of global features in different directions. To verify the effectiveness of MSD-Net, comparative and ablation experiments are carried out. On the COVID-19 RADIOGRAPHY DATABASE, the Accuracy, Recall, Precision, F1 Score, and Specificity of MSD-Net are 97.76%, 95.57%, 95.52%, 95.52%, and 98.51%, respectively. On the chest X-ray dataset, the Accuracy, Recall, Precision, F1 Score, and Specificity of MSD-Net are 97.78%, 95.22%, 96.49%, 95.58%, and 98.11%, respectively. The model effectively improves the accuracy of lung image recognition and provides an important clinical reference for computer-aided diagnosis of pneumonia.
Gliomas have the highest mortality rate of all brain tumors. Correctly classifying the glioma risk period can help doctors make reasonable treatment plans and improve patients' survival rates. This paper proposes a hierarchical multi-scale attention feature fusion medical image classification network (HMAC-Net), which effectively combines global and local features. The network framework consists of three parallel layers: the global feature extraction layer, the local feature extraction layer, and the multi-scale feature fusion layer. A linear sparse attention mechanism is designed in the global feature extraction layer to reduce information redundancy. In the local feature extraction layer, a bilateral local attention mechanism is introduced to improve the extraction of relevant information between adjacent slices. In the multi-scale feature fusion layer, a channel fusion block combining a convolutional attention mechanism and a residual inverse multi-layer perceptron is proposed to prevent vanishing gradients and network degradation and to improve feature representation capability. A double-branch iterative multi-scale classification block is used to improve classification performance. On the brain glioma risk grading dataset, the ablation and comparison experiments show that the proposed HMAC-Net performs best in both qualitative analysis of heat maps and quantitative analysis of evaluation indicators. On a skin cancer classification dataset, the generalization experiments show that the proposed HMAC-Net generalizes well.
Precise and automatic segmentation of prostate magnetic resonance imaging (MRI) images is vital for assisting doctors in diagnosing prostate diseases. In recent years, many advanced methods have been applied to prostate segmentation, but because of the variability caused by prostate diseases, automatic segmentation of the prostate remains a significant challenge. In this paper, we propose an attention-guided multi-scale feature fusion network (AGMSF-Net) to segment prostate MRI images. We propose an attention mechanism for extracting multi-scale features, and introduce a 3D transformer module in the transition phase from encoder to decoder to enhance global feature representation. In the decoder stage, a feature fusion module is proposed to obtain global context information. We evaluate our model on prostate MRI images acquired from a local hospital. The relative volume difference (RVD) and dice similarity coefficient (DSC) between the automatic prostate segmentation results and the ground truth were 1.21% and 93.68%, respectively. The proposed AGMSF-Net supports quantitative evaluation of prostate volume on MRI, which is of considerable clinical significance. Performance evaluation and validation experiments demonstrate the effectiveness of our method in automatic prostate segmentation.
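The two metrics reported in the prostate segmentation abstract above can be computed as in the sketch below. It assumes flat binary masks (one 0/1 entry per voxel) and an unsigned RVD expressed as a fraction of the reference volume; some papers use a signed RVD instead, and the abstract does not say which variant was used. The function names and toy masks are illustrative.

```python
def dice_coefficient(pred, truth):
    """DSC = 2 * |P intersect T| / (|P| + |T|) for flat binary masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    total = sum(pred) + sum(truth)
    # Two empty masks agree perfectly by convention.
    return 1.0 if total == 0 else 2.0 * inter / total

def relative_volume_difference(pred, truth):
    """Unsigned RVD = | |P| - |T| | / |T| (assumed variant)."""
    return abs(sum(pred) - sum(truth)) / sum(truth)

# Toy masks: the prediction covers the reference plus one extra voxel.
pred = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]
truth = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0]
dsc = dice_coefficient(pred, truth)
rvd = relative_volume_difference(pred, truth)
print(f"DSC = {dsc:.4f}, RVD = {rvd:.2%}")
```

DSC rewards voxel-wise overlap, while RVD only compares total volumes, which is why the two are usually reported together.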
Feature extraction is an important part of signal processing and is significant for signal detection, classification, and recognition. Nonlinear dynamic analysis methods can extract the nonlinear characteristics of signals and are widely used in different fields. Reverse dispersion entropy (RDE), a nonlinear dynamic analysis method that we proposed recently, offers fast computation and strong noise resistance, and is more suitable for measuring signal complexity than traditional permutation entropy (PE) and dispersion entropy (DE). The empirical wavelet transform (EWT), based on the theory of wavelet analysis, can decompose a complex non-stationary signal into a number of empirical wavelet functions (EWFs) with compactly supported spectra, and has better decomposition performance than empirical mode decomposition (EMD) and its improved variants. Considering the advantages of RDE and EWT, on the one hand, we introduce EWT into underwater acoustic signal processing and fault diagnosis to improve signal decomposition accuracy; on the other hand, we use the RDE of the EWFs as features to improve signal separability and stability. On this basis, we propose a novel signal feature extraction technique based on EWT and RDE. Experimental results show that the proposed technique can effectively extract the complexity features of real signals. Moreover, it distinguishes different types of signals better than five recent feature extraction techniques.
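For reference, the traditional permutation entropy (PE) that the abstract above uses as a baseline can be sketched in a few lines. This is the standard ordinal-pattern definition, not the authors' reverse dispersion entropy; the default order and delay and the toy series are illustrative choices.

```python
import math
from collections import Counter

def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Shannon entropy of the ordinal patterns found in series x."""
    n = len(x) - (order - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this order and delay")
    patterns = Counter()
    for i in range(n):
        window = [x[i + k * delay] for k in range(order)]
        # The ordinal pattern is the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    h = -sum((c / n) * math.log(c / n) for c in patterns.values())
    # Normalizing by log(order!) maps the result into [0, 1].
    return h / math.log(math.factorial(order)) if normalize else h

series = [4, 7, 9, 10, 6, 11, 3]
pe = permutation_entropy(series, order=2)
print(round(pe, 4))
```

With order 2 the series has four rising and two falling pairs, so the normalized entropy is that of a 2/3 versus 1/3 split. A strictly monotonic series has a single pattern and therefore zero entropy.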
Semantic segmentation methods based on CNNs have made great progress, but they still have shortcomings when applied to remote sensing image segmentation; for example, their small receptive field cannot effectively capture global context. To solve this problem, this paper proposes a hybrid model based on ResNet50 and the Swin Transformer that directly captures long-range dependencies and fuses features through a Cross Feature Modulation Module (CFMM). On two publicly available datasets, Vaihingen and Potsdam, the model achieves mIoU of 70.27% and 76.63%, respectively. Thus, the proposed CFM-UNet maintains high segmentation performance compared with other competitive networks.
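The mIoU figure quoted in the remote sensing abstract above is typically derived from a per-class confusion matrix, as in the sketch below. Skipping classes that are absent from both prediction and ground truth is an assumed convention, and the 2-class matrix is made-up toy data rather than anything from the paper.

```python
def mean_iou(conf):
    """Mean intersection-over-union from a confusion matrix.
    conf[i][j] = number of pixels of true class i predicted as class j."""
    n = len(conf)
    ious = []
    for c in range(n):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp                        # class c pixels missed
        fp = sum(conf[r][c] for r in range(n)) - tp   # others labelled as c
        denom = tp + fp + fn
        if denom:  # assumed convention: ignore classes that never occur
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0

conf = [[8, 2],
        [1, 9]]
miou = mean_iou(conf)
print(round(miou, 4))
```

For this toy matrix the per-class IoUs are 8/11 and 9/12, and mIoU is their average.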
As new-system radars are put into practical use, the characteristics of complex radar signals are changing and developing. Traditional analysis in a one-dimensional transform domain is no longer applicable to modern radar signal processing, and new methods must be sought in the two-dimensional transform domain. Time-frequency analysis is the most widely used approach in the two-dimensional transform domain. In this paper, two typical time-frequency analysis methods, the short-time Fourier transform and the Wigner-Ville distribution, are studied by analyzing the time-frequency transform of a typical radar reconnaissance linear frequency modulation signal. To address the low accuracy and noise sensitivity of these common methods, an improved wavelet transform algorithm is proposed.
Recently, a reversible image transformation (RIT) technology that transforms a secret image into a freely selected target image has been proposed. It can generate a stego-image that looks similar to the target image and can also recover the secret image without any loss. It has also proved very useful in image content protection and reversible data hiding in encrypted images. However, RIT methods use the standard deviation (SD) as the only feature when matching secret and target image blocks; the matching result is unsatisfactory and needs further improvement, because the SD distributions of the two images may not be very similar. Therefore, this paper proposes a gray level co-occurrence matrix (GLCM) based approach for reversible image transformation, in which an effective feature extraction algorithm increases the accuracy of block matching and thereby the visual quality of the transformed image, without increasing the auxiliary information used to record the transformation parameters. Experimental results show that the root mean square error of the stego-image can be reduced by 4.24% compared with the previous method.
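To make the GLCM idea in the abstract above concrete, the sketch below builds a co-occurrence matrix for one pixel offset and derives a single texture feature from it. The choice of offset, the contrast feature, and the function names are assumptions for illustration; the abstract does not specify which GLCM features drive the block matching.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count how often gray level i has gray level j at offset (dx, dy)."""
    m = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                m[image[r][c]][image[r2][c2]] += 1
    return m

def glcm_contrast(m):
    """Contrast: sum of (i - j)^2 weighted by co-occurrence probability."""
    total = sum(map(sum, m))
    return sum((i - j) ** 2 * v / total
               for i, row in enumerate(m) for j, v in enumerate(row))

# A 4x4 block with 4 gray levels, horizontal neighbour offset.
block = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]
matrix = glcm(block, levels=4)
contrast = glcm_contrast(matrix)
print(matrix, contrast)
```

Unlike the standard deviation, a feature of this kind depends on the spatial arrangement of gray levels, so two blocks with the same SD can still be told apart.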
The widespread availability of digital multimedia data has created a new challenge in digital forensics. Traditional source camera identification algorithms usually rely on various traces left by the capturing process. However, these traces have become increasingly difficult to extract because of the wide availability of image processing algorithms. Algorithms based on Convolutional Neural Networks (CNNs) have demonstrated good discriminative capability across different brands and even different models of camera devices. However, their performance is not ideal when distinguishing between individual devices of the same model, because cameras of the same model typically use the same optical lens, image sensor, and image processing algorithms, which results in minimal overall differences. In this paper, we propose a camera forensics algorithm based on multi-scale feature fusion to address these issues. The proposed algorithm extracts different local features from feature maps of different scales and then fuses them to obtain a comprehensive feature representation. This representation is then fed into a subsequent camera fingerprint classification network. Building upon the Swin-T network, we use Transformer Blocks and Graph Convolutional Network (GCN) modules to fuse multi-scale features from different stages of the backbone network. Furthermore, we conduct experiments on established datasets to demonstrate the feasibility and effectiveness of the proposed approach.
Funding for the NSCT and SIFT spectral matching work above: supported by the National Natural Science Foundation of China (61172127, 11071002), the Specialized Research Fund for the Doctoral Program of Higher Education (20113401110006), and the Innovative Research Team of 211 Project in Anhui University (KJTD007A).
Funding for the retinal image mosaic work above: Program for New Century Excellent Talents in University (grant number 50051) and the Key Project for Technology Research of the Ministry of Education of China (grant number 106030).
Funding: the National Science Foundation of China (No. 61471185); the Natural Science Foundation of Shandong Province (No. ZR2016FM21); Shandong Province Science and Technology Plan Project (No. 2015GSF116001); Yantai City Key Research and Development Plan Project (Nos. 2014ZH157 and 2016ZH057).
Abstract: In this paper, we propose a registration method that combines morphological component analysis (MCA) with the scale-invariant feature transform (SIFT) algorithm. The method uses perception dictionaries and combines the Basis-Pursuit algorithm with the Total-Variation regularization scheme to extract from the original image the cartoon part, which contains the basic geometrical information; this extraction is stable and insensitive to noise. A smaller number of distinctive key points is then obtained by applying the SIFT algorithm to the cartoon part of the original image. Matching the key points by the constrained Euclidean distance yields a more correct and robust matching result. The experimental results show that the geometrical transform parameters inferred from the key points matched by the MCA+SIFT registration method are more exact than those obtained with the direct SIFT algorithm.
Funding: Supported by the National Natural Science Foundation of China (61379010) and the Natural Science Basic Research Plan in Shaanxi Province of China (2015JM6293).
Abstract: To address the need for robust digital watermarking in copyright protection, a new digital watermarking algorithm in the non-subsampled contourlet transform (NSCT) domain is proposed. The sub-band with the largest energy after NSCT is selected for watermark embedding. The watermark is embedded into scale-invariant feature transform (SIFT) regions. During embedding, the initial region is divided into several ring-shaped sub-regions of equal area, and each watermark bit is embedded into one sub-region. Extensive simulation results and comparisons show that the algorithm achieves a good trade-off among invisibility, robustness, and capacity: it preserves good image quality while effectively resisting common image processing, geometric, and combined attacks, and near-complete normalized similarity is achieved.
Funding: Supported in part by the National Natural Science Foundation of China (61876011); the National Key Research and Development Program of China (2022YFB4703700); the Key Research and Development Program 2020 of Guangzhou (202007050002); the Key-Area Research and Development Program of Guangdong Province (2020B090921003).
Abstract: Recently, there have been some attempts to apply Transformers to 3D point cloud classification. To reduce computation, most existing methods focus on local spatial attention, but they ignore point content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based): sampled points with similar features are clustered into the same class, and self-attention is computed within each class, enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method achieves 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Funding: Supported by the National Key R&D Program of China [2022YFC2402400]; the National Natural Science Foundation of China [Grant No. 62275062]; Guangdong Provincial Key Laboratory of Biomedical Optical Imaging Technology [Grant No. 2020B121201010-4].
Abstract: Photoacoustic imaging (PAI) is an emerging noninvasive imaging method based on the photoacoustic effect, which provides necessary assistance for medical diagnosis. It offers large imaging depth and high contrast. However, limited by equipment cost and reconstruction time requirements, existing PAI systems with annular array transducers struggle to achieve both image quality and imaging speed. In this paper, a triple-path feature transform network (TFT-Net) for ring-array photoacoustic tomography is proposed to enhance the imaging quality from limited-view and sparse measurement data. Specifically, the network combines the raw photoacoustic pressure signals and conventional linear reconstruction images as input data, and takes the photoacoustic physical model as prior information to guide the reconstruction process. In addition, to enhance the ability to extract signal features, the residual block and the squeeze-and-excitation block are introduced into the TFT-Net. For further efficient reconstruction, the final output of photoacoustic signals uses a ‘filter-then-upsample’ operation with a pixel-shuffle multiplexer and a max out module. Experimental results on simulated and in-vivo data demonstrate that the constructed TFT-Net can restore the target boundary clearly, reduce background noise, and realize fast and high-quality photoacoustic image reconstruction from limited-view, sparsely sampled data.
Abstract: Olive trees are susceptible to a variety of diseases that can cause significant crop damage and economic losses. Early detection of these diseases is essential for effective management. We propose a novel transformed wavelet, feature-fused, pre-trained deep learning model for detecting olive leaf diseases. The proposed model combines wavelet transforms with pre-trained deep learning models to extract discriminative features from olive leaf images. The model has four main phases: preprocessing using data augmentation, three-level wavelet transformation, learning using pre-trained deep learning models, and a fused deep learning model. In the preprocessing phase, the image dataset is augmented using techniques such as resizing, rescaling, flipping, rotation, zooming, and contrasting. In wavelet transformation, the augmented images are decomposed into three frequency levels. Three pre-trained deep learning models, EfficientNet-B7, DenseNet-201, and ResNet-152-V2, are used in the learning phase. The models were trained using the approximate images of the third-level sub-band of the wavelet transform. In the fused phase, the fused model consists of a merge layer, three dense layers, and two dropout layers. The proposed model was evaluated using a dataset of images of healthy and infected olive leaves. It achieved an accuracy of 99.72% in the diagnosis of olive leaf diseases, which exceeds the accuracy of other methods reported in the literature. This finding suggests that the proposed method is a promising tool for the early detection of olive leaf diseases.
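To make the "third-level approximation image" concrete, the sketch below computes the approximation (low-low) sub-band of a 2-D wavelet decomposition at a chosen level. It assumes the orthonormal Haar wavelet, which the abstract does not specify, and the function names `haar_approximation` and `approximation_at_level` are hypothetical.

```python
def haar_approximation(image):
    """One level of the 2-D orthonormal Haar transform, approximation band only.

    Each output pixel is the sum of a 2x2 block divided by 2. The Haar wavelet
    is an assumption; the abstract does not name the wavelet family.
    """
    rows, cols = len(image), len(image[0])
    if rows % 2 or cols % 2:
        raise ValueError("image dimensions must be even at every level")
    return [
        [
            (image[r][c] + image[r][c + 1]
             + image[r + 1][c] + image[r + 1][c + 1]) / 2.0
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]

def approximation_at_level(image, level=3):
    """Apply the approximation step repeatedly, halving each side per level."""
    for _ in range(level):
        image = haar_approximation(image)
    return image

# An 8x8 constant image shrinks to 4x4, 2x2, and finally 1x1.
flat = [[1.0] * 8 for _ in range(8)]
level3 = approximation_at_level(flat, level=3)
```

Each level halves both image dimensions, so the third-level approximation of an input image is 1/8 of its size per side; that smaller image is what a downstream network would receive in a pipeline of this kind.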
Funding: Supported financially by the Fundamental Research Program of Shanxi Province (No. 202103021223056).
Abstract: To address the challenges posed by nonlinear and non-stationary vibrations in rotating machinery, where weak fault characteristic signals hinder accurate fault state representation, we propose a novel feature extraction method that combines the Flexible Analytic Wavelet Transform (FAWT) with Nonlinear Quantum Permutation Entropy. FAWT, leveraging fractional orders and arbitrary scaling and translation factors, exhibits superior translational invariance and adjustable fundamental oscillatory characteristics. This flexibility enables FAWT to provide well-suited wavelet shapes, effectively matching subtle fault components and avoiding the performance degradation associated with fixed frequency partitioning and low-oscillation bases in detecting weak faults. In our approach, gearbox vibration signals undergo FAWT to obtain sub-bands. Quantum theory is then introduced into permutation entropy to propose Nonlinear Quantum Permutation Entropy, a feature that more accurately characterizes the operational state of vibration simulation signals. The nonlinear quantum permutation entropy extracted from the sub-bands is used to characterize the operating state of rotating machinery. A comprehensive analysis of vibration signals from rolling bearings and gearboxes validates the feasibility of the proposed method. Comparative assessments with parameters derived from traditional permutation entropy, sample entropy, wavelet transform (WT), and empirical mode decomposition (EMD) underscore the superior effectiveness of this approach in fault detection and classification for rotating machinery.
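For reference, the traditional permutation entropy that the abstract uses as a comparison baseline can be sketched as below. This is the classical ordinal-pattern measure, not the paper's Nonlinear Quantum Permutation Entropy, whose definition the abstract does not give; the function name and the defaults `order=3` and `delay=1` are assumptions.

```python
import math
from collections import Counter

def permutation_entropy(signal, order=3, delay=1, normalize=True):
    """Classical permutation entropy of a 1-D sequence (natural log).

    Counts the ordinal pattern of each embedded window and returns the
    Shannon entropy of the pattern distribution, optionally divided by
    log(order!) so the result lies in [0, 1].
    """
    n = len(signal) - (order - 1) * delay
    if n <= 0:
        raise ValueError("signal too short for the chosen order and delay")
    patterns = Counter()
    for start in range(n):
        window = [signal[start + k * delay] for k in range(order)]
        # Ordinal pattern: the index order that sorts the window.
        pattern = tuple(sorted(range(order), key=window.__getitem__))
        patterns[pattern] += 1
    entropy = -sum((c / n) * math.log(c / n) for c in patterns.values())
    if normalize:
        entropy /= math.log(math.factorial(order))
    return entropy

# A monotone ramp has a single ordinal pattern, so its entropy is zero.
ramp_entropy = permutation_entropy(list(range(10)))
# A short irregular sequence mixes several patterns.
mixed_entropy = permutation_entropy([4, 7, 9, 10, 6, 11, 3], normalize=False)
```

In a sub-band pipeline like the one described, a measure of this kind would be evaluated once per sub-band to form a feature vector.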
Funding: the Humanities and Social Science Fund of the Ministry of Education of China (21YJAZH077).
Abstract: In a crowd density estimation dataset, annotating crowd locations is an extremely laborious task, yet the locations are not used in the evaluation metrics. In this paper, we aim to reduce the annotation cost of crowd datasets and propose a crowd density estimation method based on weakly-supervised learning. In the absence of crowd position supervision, it directly regresses the crowd count using the number of pedestrians in the image as the supervision signal. For this purpose, we design a new training method that exploits the correlation between global and local image features through incremental learning to train the network. Specifically, we design a parent-child network (PC-Net) whose two parts focus on the global and local image respectively, and propose a linear feature calibration structure to train the PC-Net simultaneously. The child network learns feature transfer factors and feature bias weights, and uses them to linearly calibrate the features extracted from the parent network, improving the convergence of the network by using local features hidden in the crowd images. In addition, we use the pyramid vision transformer as the backbone of the PC-Net to extract crowd features at different levels, and design a global-local feature loss function (L2). We combine it with a crowd counting loss (LC) to enhance the sensitivity of the network to crowd features during training, which effectively improves the accuracy of crowd density estimation. The experimental results show that the PC-Net significantly reduces the gap between fully-supervised and weakly-supervised crowd density estimation, and outperforms the comparison methods on five datasets: ShanghaiTech Part A, ShanghaiTech Part B, UCF_CC_50, UCF_QNRF, and JHU-CROWD++.
Abstract: A method is proposed for the analysis of vibration signals from components of rotating machines, based on the wavelet packet transformation (WPT) and the underlying physical concepts of the modulation mechanism. The method provides a finer analysis and better time-frequency localization capabilities than other analysis methods. Both details and approximations are split into finer components, resulting in better-localized frequency ranges corresponding to each node of a wavelet packet tree. For the purpose of feature extraction, a hard threshold is applied and the energy of the coefficients above the threshold is used as a criterion for selecting the best vector. The feature extraction of a vibration signal is accomplished by computing the reconstructed signal and its spectrum. When applied to feature extraction from a rolling bearing vibration signal, the proposed method proves very effective.
Funding: This project is supported by the National Natural Science Foundation of China (No. 50275154) and the Municipal Natural Science Foundation of Chongqing, China (No. 8773).
Abstract: Effective extraction of the vibration signal is an important precondition for machine fault diagnosis. Based on the characteristics of the noise introduced while sampling the vibration signal, the independent component analysis (ICA) method is combined with wavelets for de-noising. First, the sampled signal is separated with ICA; then the frequency band selection capability of the multi-resolution wavelet transform is used to judge whether a stochastic disturbance singular signal is present. In this way, the vibration signals can be extracted effectively, which provides favorable conditions for subsequent feature detection of the vibration signal and for fault diagnosis.
Funding: Supported in part by the National Natural Science Foundation of China (Grant No. 62062003) and the Natural Science Foundation of Ningxia (Grant No. 2023AAC03293).
Abstract: Computer-aided diagnosis of pneumonia based on deep learning is a research hotspot. However, features of different sizes and different directions are not sufficiently captured when extracting features from lung X-ray images. A pneumonia classification model based on multi-scale directional feature enhancement, MSD-Net, is proposed in this paper. The main innovations are as follows. First, the Multi-scale Residual Feature Extraction Module (MRFEM) is designed, which uses dilated convolutions with different dilation rates to increase the receptive field and extract multi-scale features effectively. Second, the Multi-scale Directional Feature Perception Module (MDFPM) is designed, which uses a three-branch structure of convolutions of different sizes to transmit directional features layer by layer and focuses on the target region to enhance the feature information. Third, the Axial Compression Former Module (ACFM) is designed to perform global calculations that enhance the perception of global features in different directions. To verify the effectiveness of the MSD-Net, comparative experiments and ablation experiments are carried out. On the COVID-19 RADIOGRAPHY DATABASE, the Accuracy, Recall, Precision, F1 Score, and Specificity of MSD-Net are 97.76%, 95.57%, 95.52%, 95.52%, and 98.51%, respectively. On the chest X-ray dataset, the Accuracy, Recall, Precision, F1 Score, and Specificity of MSD-Net are 97.78%, 95.22%, 96.49%, 95.58%, and 98.11%, respectively. This model effectively improves the accuracy of lung image recognition and provides an important clinical reference for computer-aided diagnosis of pneumonia.
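The five reported metrics can be computed from confusion-matrix counts as sketched below. The example is binary for simplicity; how the paper averages over classes is not stated in the abstract, and the function name and the counts used are made up for illustration.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics from confusion-matrix counts.

    tp/fp/tn/fn are true positives, false positives, true negatives and
    false negatives. Multi-class averaging is not covered here.
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "recall": tp / (tp + fn),         # sensitivity
        "precision": tp / (tp + fp),
        "f1": 2 * tp / (2 * tp + fp + fn),
        "specificity": tn / (tn + fp),
    }

# Illustrative counts only; these are not results from the paper.
metrics = classification_metrics(tp=90, fp=10, tn=80, fn=20)
```

The F1 score is written in its count form, 2TP / (2TP + FP + FN), which equals the harmonic mean of precision and recall.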
Funding: Major Program of National Natural Science Foundation of China (NSFC12292980, NSFC12292984); National Key R&D Program of China (2023YFA1009000, 2023YFA1009004, 2020YFA0712203, 2020YFA0712201); Major Program of National Natural Science Foundation of China (NSFC12031016); Beijing Natural Science Foundation (BNSFZ210003); Department of Science, Technology and Information of the Ministry of Education (8091B042240).
Abstract: Gliomas have the highest mortality rate of all brain tumors. Correctly classifying the glioma risk period can help doctors make reasonable treatment plans and improve patients' survival rates. This paper proposes a hierarchical multi-scale attention feature fusion medical image classification network (HMAC-Net), which effectively combines global features and local features. The network framework consists of three parallel layers: the global feature extraction layer, the local feature extraction layer, and the multi-scale feature fusion layer. A linear sparse attention mechanism is designed in the global feature extraction layer to reduce information redundancy. In the local feature extraction layer, a bilateral local attention mechanism is introduced to improve the extraction of relevant information between adjacent slices. In the multi-scale feature fusion layer, a channel fusion block combining a convolutional attention mechanism and a residual inverse multi-layer perceptron is proposed to prevent vanishing gradients and network degradation and to improve feature representation capability. A double-branch iterative multi-scale classification block is used to improve the classification performance. On the brain glioma risk grading dataset, the results of the ablation and comparison experiments show that the proposed HMAC-Net has the best performance in both qualitative analysis of heat maps and quantitative analysis of evaluation indicators. On the skin cancer classification dataset, the generalization experiments show that the proposed HMAC-Net generalizes well.
Funding: This work was supported in part by the National Natural Science Foundation of China (Grant #: 82260362); in part by the National Key R&D Program of China (Grant #: 2021ZD0111000); in part by the Key R&D Project of Hainan Province (Grant #: ZDYF2021SHFZ243); in part by the Major Science and Technology Project of Haikou (Grant #: 2020-009).
Abstract: The precise and automatic segmentation of prostate magnetic resonance imaging (MRI) images is vital for assisting doctors in diagnosing prostate diseases. In recent years, many advanced methods have been applied to prostate segmentation, but because of the variability caused by prostate diseases, automatic segmentation of the prostate remains a significant challenge. In this paper, we propose an attention-guided multi-scale feature fusion network (AGMSF-Net) to segment prostate MRI images. We propose an attention mechanism for extracting multi-scale features, and introduce a 3D transformer module in the transition phase from encoder to decoder to enhance global feature representation. In the decoder stage, a feature fusion module is proposed to obtain global context information. We evaluate our model on MRI images of the prostate acquired from a local hospital. The relative volume difference (RVD) and Dice similarity coefficient (DSC) between the automatic prostate segmentation results and the ground truth were 1.21% and 93.68%, respectively. The proposed AGMSF-Net thus supports quantitative evaluation of prostate volume on MRI, which is of great clinical significance. The performance evaluation and validation experiments demonstrate the effectiveness of our method in automatic prostate segmentation.
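The two reported metrics can be sketched for binary masks as follows. The masks are flattened to lists of 0/1 values; the function names are hypothetical, and the signed RVD convention used here (relative to the ground-truth volume) is an assumption, since the abstract does not state which variant the authors use.

```python
def dice_coefficient(pred, truth):
    """Dice similarity coefficient of two equal-length binary masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * intersection / total

def relative_volume_difference(pred, truth):
    """Signed volume difference relative to the ground-truth volume.

    Assumed convention; some papers report the absolute value instead.
    """
    truth_volume = sum(1 for t in truth if t)
    if truth_volume == 0:
        raise ValueError("ground-truth mask is empty")
    pred_volume = sum(1 for p in pred if p)
    return (pred_volume - truth_volume) / truth_volume

# Toy masks: same volume, shifted by one voxel.
pred = [1, 1, 1, 0, 0, 0]
truth = [0, 1, 1, 1, 0, 0]
dsc = dice_coefficient(pred, truth)
rvd = relative_volume_difference(pred, truth)
```

The toy case shows why both metrics are reported together: the volumes agree exactly (RVD of zero) even though the overlap is imperfect.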
Funding: Supported by the National Natural Science Foundation of China (Nos. 61871318 and 11574250) and Scientific Research Plan Projects of Shaanxi Education Department (No. 19JK0568).
Abstract: Feature extraction is an important part of signal processing and is significant for signal detection, classification, and recognition. The nonlinear dynamic analysis method can extract the nonlinear characteristics of signals and is widely used in different fields. Reverse dispersion entropy (RDE), which we proposed recently as a nonlinear dynamic analysis method, has the advantages of fast computing speed and strong anti-noise ability, and is more suitable for measuring the complexity of a signal than traditional permutation entropy (PE) and dispersion entropy (DE). The empirical wavelet transform (EWT), based on the theory of wavelet analysis, can decompose a complex non-stationary signal into a number of empirical wavelet functions (EWFs) with compactly supported spectra, and has better decomposition performance than empirical mode decomposition (EMD) and its improved algorithms. Considering the advantages of RDE and EWT, on the one hand, we introduce EWT into the field of underwater acoustic signal processing and fault diagnosis to improve the signal decomposition accuracy; on the other hand, we use RDE as the feature of the EWFs to improve signal separability and stability. Finally, we propose a novel signal feature extraction technology based on EWT and RDE in this paper. Experimental results show that the proposed feature extraction technology can effectively extract the complexity features of actual signals. Moreover, it distinguishes different types of signals better than five recent feature extraction technologies.
Funding: Young Innovative Talents Project of Guangdong Ordinary Universities (No. 2022KQNCX225); School-level Teaching and Research Project of Guangzhou City Polytechnic (No. 2022xky046).
Abstract: Semantic segmentation methods based on CNNs have made great progress, but they still have shortcomings when applied to remote sensing image segmentation; for example, the small receptive field cannot effectively capture global context. To solve this problem, this paper proposes a hybrid model based on ResNet50 and the Swin Transformer that directly captures long-range dependencies and fuses features through a Cross Feature Modulation Module (CFMM). Experimental results on two publicly available datasets, Vaihingen and Potsdam, give mIoU values of 70.27% and 76.63%, respectively. Thus, CFM-UNet maintains high segmentation performance compared with other competitive networks.
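The mIoU figure reported above is the per-class intersection-over-union averaged over classes. A minimal sketch on flattened label maps is given below; the function name is hypothetical, and skipping classes that are absent from both prediction and ground truth is one common convention, not necessarily the one used in the paper.

```python
def mean_iou(pred, truth, num_classes):
    """Mean intersection-over-union over classes for flattened label maps.

    pred and truth are equal-length sequences of integer class labels.
    Classes absent from both sequences are skipped (assumed convention).
    """
    if len(pred) != len(truth):
        raise ValueError("label maps must have the same number of pixels")
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:
            ious.append(inter / union)
    if not ious:
        raise ValueError("no class is present in either label map")
    return sum(ious) / len(ious)

# Four pixels, two classes: IoU is 1/2 for class 0 and 2/3 for class 1.
miou = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

Because every class contributes equally to the mean, a small class that is segmented poorly lowers mIoU as much as a large one would.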
Abstract: As radars based on new systems are put into practical use, the characteristics of complex radar signals are changing and developing. The traditional one-dimensional transform-domain analysis method is no longer applicable to modern radar signal processing, and it is necessary to seek new methods in the two-dimensional transform domain. Time-frequency analysis is the most widely used method in the two-dimensional transform domain. In this paper, two typical time-frequency analysis methods, the short-time Fourier transform and the Wigner-Ville distribution, are studied by analyzing the time-frequency transform of a typical radar reconnaissance linear frequency modulation signal. To address the low accuracy and noise sensitivity of these common methods, an improved wavelet transform algorithm is proposed.
Funding: This work is supported by the National Key R&D Program of China under grant 2018YFB1003205; by the National Natural Science Foundation of China under grants 61502242, U1536206, U1405254, 61772283, 61602253, and 61672294; by the Jiangsu Basic Research Programs-Natural Science Foundation under grant numbers BK20150925 and BK20151530; by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD) fund; by the Collaborative Innovation Center of Atmospheric Environment and Equipment Technology (CICAEET) fund, China.
Abstract: Recently, a reversible image transformation (RIT) technology that transforms a secret image into a freely selected target image was proposed. It not only generates a stego-image that looks similar to the target image, but also recovers the secret image without any loss. It has also proved very useful in image content protection and reversible data hiding in encrypted images. However, the standard deviation (SD) is the only feature used when matching the secret and target image blocks in RIT methods, so the matching result is not ideal and needs further improvement, since the SD distributions of the two images may not be very similar. Therefore, this paper proposes a gray level co-occurrence matrix (GLCM) based approach for reversible image transformation, in which an effective feature extraction algorithm is used to increase the accuracy of block matching and thus improve the visual quality of the transformed image, while the auxiliary information that records the transformation parameters is not increased. Experimental results show that the root mean square error of the stego-image can be reduced by 4.24% compared with the previous method.
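As background for the block feature named above, the sketch below builds a gray level co-occurrence matrix for one image block and derives a contrast value from it. It is a generic GLCM illustration under assumed settings (a single horizontal offset, unnormalized counts, four gray levels); the abstract does not say which offsets or GLCM statistics the method uses, and the function names are hypothetical.

```python
def glcm(block, levels, dx=1, dy=0):
    """Gray level co-occurrence counts for one pixel offset (dx, dy).

    block is a 2-D list of integer gray levels in range(levels).
    Entry [i][j] counts pixel pairs where the first pixel has level i
    and the pixel at the given offset has level j.
    """
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(block), len(block[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[block[r][c]][block[r2][c2]] += 1
    return matrix

def glcm_contrast(matrix):
    """Contrast: sum of (i - j)^2 weighted by normalized co-occurrence."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("co-occurrence matrix is empty")
    return sum(
        (i - j) ** 2 * count / total
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )

# A 4x4 block with four gray levels.
block = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(block, levels=4)
contrast = glcm_contrast(counts)
```

Unlike the standard deviation, which depends only on the histogram of a block, a co-occurrence statistic also reflects how gray levels are arranged spatially, which is the kind of extra information a texture-based block feature can supply.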
Funding: This work was funded by the National Natural Science Foundation of China (Grant No. 62172132); the Public Welfare Technology Research Project of Zhejiang Province (Grant No. LGF21F020014); the Opening Project of Key Laboratory of Public Security Information Application Based on Big-Data Architecture, Ministry of Public Security of Zhejiang Police College (Grant No. 2021DSJSYS002).
Abstract: The widespread availability of digital multimedia data has led to a new challenge in digital forensics. Traditional source camera identification algorithms usually rely on various traces left by the capturing process. However, these traces have become increasingly difficult to extract because of the wide availability of image processing algorithms. Convolutional Neural Network (CNN)-based algorithms have demonstrated good discriminative capabilities for different brands and even different models of camera devices. However, their performance is not ideal when distinguishing between individual devices of the same model, because cameras of the same model typically use the same optical lens, image sensor, and image processing algorithms, which results in minimal overall differences. In this paper, we propose a camera forensics algorithm based on multi-scale feature fusion to address these issues. The proposed algorithm extracts different local features from feature maps of different scales and then fuses them to obtain a comprehensive feature representation. This representation is then fed into a subsequent camera fingerprint classification network. Building upon the Swin-T network, we use Transformer Blocks and Graph Convolutional Network (GCN) modules to fuse multi-scale features from different stages of the backbone network. Furthermore, we conduct experiments on established datasets to demonstrate the feasibility and effectiveness of the proposed approach.