Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware reso...Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.展开更多
We design a new hybrid quantum-classical convolutional neural network(HQCCNN)model based on parameter quantum circuits.In this model,we use parameterized quantum circuits(PQCs)to redesign the convolutional layer in cl...We design a new hybrid quantum-classical convolutional neural network(HQCCNN)model based on parameter quantum circuits.In this model,we use parameterized quantum circuits(PQCs)to redesign the convolutional layer in classical convolutional neural networks,forming a new quantum convolutional layer to achieve unitary transformation of quantum states,enabling the model to more accurately extract hidden information from images.At the same time,we combine the classical fully connected layer with PQCs to form a new hybrid quantum-classical fully connected layer to further improve the accuracy of classification.Finally,we use the MNIST dataset to test the potential of the HQCCNN.The results indicate that the HQCCNN has good performance in solving classification problems.In binary classification tasks,the classification accuracy of numbers 5 and 7 is as high as 99.71%.In multivariate classification,the accuracy rate also reaches 98.51%.Finally,we compare the performance of the HQCCNN with other models and find that the HQCCNN has better classification performance and convergence speed.展开更多
Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale regi...Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale region of interest(ROI).However,some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space.Also,existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information,which is crucial in pulse extraction problems,resulting in insufficient spatiotemporal feature modelling.Methods Here,we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution(SSTC)and dimension separable attention(DSAT).First,to solve the problem of a single-scale ROI,we constructed a multi-scale feature space for initial signal separation.Second,SSTC and DSAT were designed for efficient spatiotemporal correlation modeling,which increased the information interaction between the long-span time and space dimensions;this placed more emphasis on temporal features.Results The signal-to-noise ratio(SNR)of the proposed network reached 9.58dB on the PURE dataset and 6.77dB on the UBFC-rPPG dataset,outperforming state-of-the-art algorithms.Conclusions The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals.The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.展开更多
Nuclearmagnetic resonance imaging of breasts often presents complex backgrounds.Breast tumors exhibit varying sizes,uneven intensity,and indistinct boundaries.These characteristics can lead to challenges such as low a...Nuclearmagnetic resonance imaging of breasts often presents complex backgrounds.Breast tumors exhibit varying sizes,uneven intensity,and indistinct boundaries.These characteristics can lead to challenges such as low accuracy and incorrect segmentation during tumor segmentation.Thus,we propose a two-stage breast tumor segmentation method leveraging multi-scale features and boundary attention mechanisms.Initially,the breast region of interest is extracted to isolate the breast area from surrounding tissues and organs.Subsequently,we devise a fusion network incorporatingmulti-scale features and boundary attentionmechanisms for breast tumor segmentation.We incorporate multi-scale parallel dilated convolution modules into the network,enhancing its capability to segment tumors of various sizes through multi-scale convolution and novel fusion techniques.Additionally,attention and boundary detection modules are included to augment the network’s capacity to locate tumors by capturing nonlocal dependencies in both spatial and channel domains.Furthermore,a hybrid loss function with boundary weight is employed to address sample class imbalance issues and enhance the network’s boundary maintenance capability through additional loss.Themethod was evaluated using breast data from 207 patients at RuijinHospital,resulting in a 6.64%increase in Dice similarity coefficient compared to the benchmarkU-Net.Experimental results demonstrate the superiority of the method over other segmentation techniques,with fewer model parameters.展开更多
A face-mask object detection model incorporating hybrid dilation convolutional network termed ResNet Hybrid-dilation-convolution Face-mask-detector (RHF) is proposed in this paper. Furthermore, a lightweight face-mask...A face-mask object detection model incorporating hybrid dilation convolutional network termed ResNet Hybrid-dilation-convolution Face-mask-detector (RHF) is proposed in this paper. Furthermore, a lightweight face-mask dataset named Light Masked Face Dataset (LMFD) and a medium-sized face-mask dataset named Masked Face Dataset (MFD) with data augmentation methods applied is also constructed in this paper. The hybrid dilation convolutional network is able to expand the perception of the convolutional kernel without concern about the discontinuity of image information during the convolution process. For the given two datasets being constructed above, the trained models are significantly optimized in terms of detection performance, training time, and other related metrics. By using the MFD dataset of 55,905 images, the RHF model requires roughly 10 hours less training time compared to ResNet50 with better detection results with mAP of 93.45%.展开更多
Aquatic medicine knowledge graph is an effective means to realize intelligent aquaculture.Graph completion technology is key to improving the quality of knowledge graph construction.However,the difficulty of semantic ...Aquatic medicine knowledge graph is an effective means to realize intelligent aquaculture.Graph completion technology is key to improving the quality of knowledge graph construction.However,the difficulty of semantic discrimination among similar entities and inconspicuous semantic features result in low accuracy when completing aquatic medicine knowledge graph with complex relationships.In this study,an aquatic medicine knowledge graph completion method(TransH+HConvAM)is proposed.Firstly,TransH is applied to split the vector plane between entities and relations,ameliorating the poor completion effect caused by low semantic resolution of entities.Then,hybrid convolution is introduced to obtain the global interaction of triples based on the complete interaction between head/tail entities and relations,which improves the semantic features of triples and enhances the completion effect of complex relationships in the graph.Experiments are conducted to verify the performance of the proposed method.The MR,MRR and Hit@10 of the TransH+HConvAM are found to be 674,0.339,and 0.361,respectively.This study shows that the model effectively overcomes the poor completion effect of complex relationships and improves the construction quality of the aquatic medicine knowledge graph,providing technical support for intelligent aquaculture.展开更多
Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on ...Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on the standard convolutional auto-encoder.In this model,the parallel convolutional and deconvolutional kernels of different scales are used to extract the features from the input signal and reconstruct the input signal;then the feature map extracted by multi-scale convolutional kernels is used as the input of the classifier;and finally the parameters of the whole model are fine-tuned using labeled data.Experiments on one set of simulation fault data and two sets of rolling bearing fault data are conducted to validate the proposed method.The results show that the model can achieve 99.75%,99.3%and 100%diagnostic accuracy,respectively.In addition,the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with traditional machine learning,convolutional neural networks and a traditional convolutional auto-encoder.The final results show that the proposed model has a better recognition effect for rolling bearing fault data.展开更多
As an integrated application of modern information technologies and artificial intelligence,Prognostic and Health Management(PHM)is important for machine health monitoring.Prediction of tool wear is one of the symboli...As an integrated application of modern information technologies and artificial intelligence,Prognostic and Health Management(PHM)is important for machine health monitoring.Prediction of tool wear is one of the symbolic applications of PHM technology in modern manufacturing systems and industry.In this paper,a multi-scale Convolutional Gated Recurrent Unit network(MCGRU)is proposed to address raw sensory data for tool wear prediction.At the bottom of MCGRU,six parallel and independent branches with different kernel sizes are designed to form a multi-scale convolutional neural network,which augments the adaptability to features of different time scales.These features of different scales extracted from raw data are then fed into a Deep Gated Recurrent Unit network to capture long-term dependencies and learn significant representations.At the top of the MCGRU,a fully connected layer and a regression layer are built for cutting tool wear prediction.Two case studies are performed to verify the capability and effectiveness of the proposed MCGRU network and results show that MCGRU outperforms several state-of-the-art baseline models.展开更多
Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances,varied poses and different camera views. A multiscale and multi-label c...Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances,varied poses and different camera views. A multiscale and multi-label convolutional neural network( MSMLCNN) is proposed to predict multiple pedestrian attributes simultaneously. The pedestrian attribute classification problem is firstly transformed into a multi-label problem including multiple binary attributes needed to be classified. Then,the multi-label problem is solved by fully connecting all binary attributes to multi-scale features with logistic regression functions. Moreover,the multi-scale features are obtained by concatenating those featured maps produced from multiple pooling layers of the MSMLCNN at different scales. Extensive experiment results show that the proposed MSMLCNN outperforms state-of-the-art pedestrian attribute classification methods with a large margin.展开更多
Local and global optimization methods are widely used in geophysical inversion but each has its own advantages and disadvantages. The combination of the two methods will make it possible to overcome their weaknesses. ...Local and global optimization methods are widely used in geophysical inversion but each has its own advantages and disadvantages. The combination of the two methods will make it possible to overcome their weaknesses. Based on the simulated annealing genetic algorithm (SAGA) and the simplex algorithm, an efficient and robust 2-D nonlinear method for seismic travel-time inversion is presented in this paper. First we do a global search over a large range by SAGA and then do a rapid local search using the simplex method. A multi-scale tomography method is adopted in order to reduce non-uniqueness. The velocity field is divided into different spatial scales and velocities at the grid nodes are taken as unknown parameters. The model is parameterized by a bi-cubic spline function. The finite-difference method is used to solve the forward problem while the hybrid method combining multi-scale SAGA and simplex algorithms is applied to the inverse problem. The algorithm has been applied to a numerical test and a travel-time perturbation test using an anomalous low-velocity body. For a practical example, it is used in the study of upper crustal velocity structure of the A'nyemaqen suture zone at the north-east edge of the Qinghai-Tibet Plateau. The model test and practical application both prove that the method is effective and robust.展开更多
Due to the lack of long-range association and spatial location information,fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods.Th...Due to the lack of long-range association and spatial location information,fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods.This paper presents a convolutional structure with multi-scale fusion to optimize the step of clothing feature extraction and a self-attention module to capture long-range association information.The structure enables the self-attention mechanism to directly participate in the process of information exchange through the down-scaling projection operation of the multi-scale framework.In addition,the improved self-attention module introduces the extraction of 2-dimensional relative position information to make up for its lack of ability to extract spatial position features from clothing images.The experimental results based on the colorful fashion parsing dataset(CFPD)show that the proposed network structure achieves 53.68%mean intersection over union(mIoU)and has better performance on the clothing parsing task.展开更多
The vibration problem of a fluid conveying cylindrical shell consisted of newly developed multi-scale hybrid nanocomposites is solved in the present manuscript within the framework of an analytical solution.The consis...The vibration problem of a fluid conveying cylindrical shell consisted of newly developed multi-scale hybrid nanocomposites is solved in the present manuscript within the framework of an analytical solution.The consistent material is considered to be made from an initial matrix strengthened via both macro-and nano-scale reinforcements.The influence of nanofillers’agglomeration,generated due to the high surface to volume ratio in nanostructures,is included by implementing Eshelby-Mori-Tanaka homogenization scheme.Afterwards,the equivalent material properties of the carbon nanotube reinforced(CNTR)nanocomposite are coupled with those of CFs within the framework of a modified rule of mixture.On the other hand,the influences of viscous flow are covered by extending the Navier-Stokes equation for cylinders.A cylindrical coordinate system is chosen and mixed with the infinitesimal strains of first-order shear deformation theory of shells to obtain the motion equations on the basis of the dynamic form of principle of virtual work.Next,the achieved governing equations will be solved by Galerkin’s method to reach the natural frequency of the structure for both simply supported and clamped boundary conditions.Presenting a set of illustrations,effects of each parameter on the dimensionless frequency of nanocomposite shells will be shown graphically.展开更多
We propose a multiscale approach to study the influence of carbon nanotubes’agglomeration on the stability of hybrid nanocomposite plates.The hybrid nanocomposite consists of both macro-and nano-scale reinforcing fib...We propose a multiscale approach to study the influence of carbon nanotubes’agglomeration on the stability of hybrid nanocomposite plates.The hybrid nanocomposite consists of both macro-and nano-scale reinforcing fibers dispersed in a polymer matrix.The equivalent material properties are calculated by coupling the Eshelby-Mori-Tanaka model with the rule of mixture accounting for effects of CNTs inside the generated clusters.Furthermore,an energy based approach is implemented to obtain the governing equations of the problem utilizing a refined higher-order plate theorem.Subsequently,the derived equations are solved by Galerkin’s analytical method to predict the critical buckling load.The influence of various boundary conditions is studied as well.After validation,a set of numerical examples are presented to explain how each variant can affect the plate’s natural frequency.展开更多
Visual tracking is a classical computer vision problem with many applications.Efficient convolution operators(ECO)is one of the most outstanding visual tracking algorithms in recent years,it has shown great performanc...Visual tracking is a classical computer vision problem with many applications.Efficient convolution operators(ECO)is one of the most outstanding visual tracking algorithms in recent years,it has shown great performance using discriminative correlation filter(DCF)together with HOG,color maps and VGGNet features.Inspired by new deep learning models,this paper propose a hybrid efficient convolution operators integrating fully convolution network(FCN)and residual network(ResNet)for visual tracking,where FCN and ResNet are introduced in our proposed method to segment the objects from backgrounds and extract hierarchical feature maps of objects,respectively.Compared with the traditional VGGNet,our approach has higher accuracy for dealing with the issues of segmentation and image size.The experiments show that our approach would obtain better performance than ECO in terms of precision plot and success rate plot on OTB-2013 and UAV123 datasets.展开更多
Geological discontinuity(GD)plays a pivotal role in determining the catastrophic mechanical failure of jointed rock masses.Accurate and efficient acquisition of GD networks is essential for characterizing and understa...Geological discontinuity(GD)plays a pivotal role in determining the catastrophic mechanical failure of jointed rock masses.Accurate and efficient acquisition of GD networks is essential for characterizing and understanding the progressive damage mechanisms of slopes based on monitoring image data.Inspired by recent advances in computer vision,deep learning(DL)models have been widely utilized for image-based fracture identification.The multi-scale characteristics,image resolution and annotation quality of images will cause a scale-space effect(SSE)that makes features indistinguishable from noise,directly affecting the accuracy.However,this effect has not received adequate attention.Herein,we try to address this gap by collecting slope images at various proportional scales and constructing multi-scale datasets using image processing techniques.Next,we quantify the intensity of feature signals using metrics such as peak signal-to-noise ratio(PSNR)and structural similarity(SSIM).Combining these metrics with the scale-space theory,we investigate the influence of the SSE on the differentiation of multi-scale features and the accuracy of recognition.It is found that augmenting the image's detail capacity does not always yield benefits for vision-based recognition models.In light of these observations,we propose a scale hybridization approach based on the diffusion mechanism of scale-space representation.The results show that scale hybridization strengthens the tolerance of multi-scale feature recognition under complex environmental noise interference and significantly enhances the recognition accuracy of GD.It also facilitates the objective understanding,description and analysis of the rock behavior and stability of slopes from the perspective of image data.展开更多
In recent years,wearable devices-based Human Activity Recognition(HAR)models have received significant attention.Previously developed HAR models use hand-crafted features to recognize human activities,leading to the e...In recent years,wearable devices-based Human Activity Recognition(HAR)models have received significant attention.Previously developed HAR models use hand-crafted features to recognize human activities,leading to the extraction of basic features.The images captured by wearable sensors contain advanced features,allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions.Poor lighting and limited sensor capabilities can impact data quality,making the recognition of human actions a challenging task.The unimodal-based HAR approaches are not suitable in a real-time environment.Therefore,an updated HAR model is developed using multiple types of data and an advanced deep-learning approach.Firstly,the required signals and sensor data are accumulated from the standard databases.From these signals,the wave features are retrieved.Then the extracted wave features and sensor data are given as the input to recognize the human activity.An Adaptive Hybrid Deep Attentive Network(AHDAN)is developed by incorporating a“1D Convolutional Neural Network(1DCNN)”with a“Gated Recurrent Unit(GRU)”for the human activity recognition process.Additionally,the Enhanced Archerfish Hunting Optimizer(EAHO)is suggested to fine-tune the network parameters for enhancing the recognition process.An experimental evaluation is performed on various deep learning networks and heuristic algorithms to confirm the effectiveness of the proposed HAR model.The EAHO-based HAR model outperforms traditional deep learning networks with an accuracy of 95.36,95.25 for recall,95.48 for specificity,and 95.47 for precision,respectively.The result proved that the developed model is effective in recognizing human action by taking less time.Additionally,it reduces the computation complexity and overfitting issue through using an optimization approach.展开更多
The tradeoff between efficiency and model size of the convolutional neural network(CNN)is an essential issue for applications of CNN-based algorithms to diverse real-world tasks.Although deep learning-based methods ha...The tradeoff between efficiency and model size of the convolutional neural network(CNN)is an essential issue for applications of CNN-based algorithms to diverse real-world tasks.Although deep learning-based methods have achieved significant improvements in image super-resolution(SR),current CNNbased techniques mainly contain massive parameters and a high computational complexity,limiting their practical applications.In this paper,we present a fast and lightweight framework,named weighted multi-scale residual network(WMRN),for a better tradeoff between SR performance and computational efficiency.With the modified residual structure,depthwise separable convolutions(DS Convs)are employed to improve convolutional operations’efficiency.Furthermore,several weighted multi-scale residual blocks(WMRBs)are stacked to enhance the multi-scale representation capability.In the reconstruction subnetwork,a group of Conv layers are introduced to filter feature maps to reconstruct the final high-quality image.Extensive experiments were conducted to evaluate the proposed model,and the comparative results with several state-of-the-art algorithms demonstrate the effectiveness of WMRN.展开更多
The background pattern of patterned fabrics is complex,which has a great interference in the extraction of defect features.Traditional machine vision algorithms rely on artificially designed features,which are greatly...The background pattern of patterned fabrics is complex,which has a great interference in the extraction of defect features.Traditional machine vision algorithms rely on artificially designed features,which are greatly affected by background patterns and are difficult to effectively extract flaw features.Therefore,a convolutional neural network(CNN)with automatic feature extraction is proposed.On the basis of the two-stage detection model Faster R-CNN,Resnet-50 is used as the backbone network,and the problem of flaws with extreme aspect ratio is solved by improving the initialization algorithm of the prior frame aspect ratio,and the improved multi-scale model is designed to improve detection of small defects.The cascade R-CNN is introduced to improve the accuracy of defect detection,and the online hard example mining(OHEM)algorithm is used to strengthen the learning of hard samples to reduce the interference of complex backgrounds on the defect detection of patterned fabrics,and construct the focal loss as a loss function to reduce the impact of sample imbalance.In order to verify the effectiveness of the improved algorithm,a defect detection comparison experiment was set up.The experimental results show that the accuracy of the defect detection algorithm of patterned fabrics in this paper can reach 95.7%,and it can accurately locate the defect location and meet the actual needs of the factory.展开更多
Cardiomyopathy is one of the most serious public health threats.The precise structural and functional cardiac measurement is an essential step for clinical diagnosis and follow-up treatment planning.Cardiologists are ...Cardiomyopathy is one of the most serious public health threats.The precise structural and functional cardiac measurement is an essential step for clinical diagnosis and follow-up treatment planning.Cardiologists are often required to draw endocardial and epicardial contours of the left ventricle(LV)manually in routine clinical diagnosis or treatment planning period.This task is time-consuming and error-prone.Therefore,it is necessary to develop a fully automated end-to-end semantic segmentation method on cardiac magnetic resonance(CMR)imaging datasets.However,due to the low image quality and the deformation caused by heartbeat,there is no effective tool for fully automated end-to-end cardiac segmentation task.In this work,we propose a multi-scale segmentation network(MSSN)for left ventricle segmentation.It can effectively learn myocardium and blood pool structure representations from 2D short-axis CMR image slices in a multi-scale way.Specifically,our method employs both parallel and serial of dilated convolution layers with different dilation rates to capture multi-scale semantic features.Moreover,we design graduated up-sampling layers with subpixel layers as the decoder to reconstruct lost spatial information and produce accurate segmentation masks.We validated our method using 164 T1 Mapping CMR images and showed that it outperforms the advanced convolutional neural network(CNN)models.In validation metrics,we archived the Dice Similarity Coefficient(DSC)metric of 78.96%.展开更多
Convolutional Neural Networks(CNNs)have recently attracted much attention in the ship detection from Synthetic Aperture Radar(SAR)images.However,compared with optical images,SAR ones are hard to understand.Moreover,du...Convolutional Neural Networks(CNNs)have recently attracted much attention in the ship detection from Synthetic Aperture Radar(SAR)images.However,compared with optical images,SAR ones are hard to understand.Moreover,due to the high similarity between the man-made targets near shore and inshore ships,the classical methods are unable to achieve effective detection of inshore ships.To mitigate the influence of onshore ship-like objects,this paper proposes an inshore ship detection method in SAR images by using hybrid features.Firstly,the sea-land segmentation is applied in the pre-processing to exclude obvious land regions from SAR images.Then,a CNN model is designed to extract deep features for identifying potential ship targets in both inshore and offshore water.On this basis,the high-energy point number of amplitude spectrum is further introduced as an important and delicate feature to suppress false alarms left.Finally,to verify the effectiveness of the proposed method,numerical and comparative studies are carried out in experiments on Sentinel-1 SAR images.展开更多
文摘Convolutional neural networks (CNNs) are widely used in image classification tasks, but their increasing model size and computation make them challenging to implement on embedded systems with constrained hardware resources. To address this issue, the MobileNetV1 network was developed, which employs depthwise convolution to reduce network complexity. MobileNetV1 employs a stride of 2 in several convolutional layers to decrease the spatial resolution of feature maps, thereby lowering computational costs. However, this stride setting can lead to a loss of spatial information, particularly affecting the detection and representation of smaller objects or finer details in images. To maintain the trade-off between complexity and model performance, a lightweight convolutional neural network with hierarchical multi-scale feature fusion based on the MobileNetV1 network is proposed. The network consists of two main subnetworks. The first subnetwork uses a depthwise dilated separable convolution (DDSC) layer to learn imaging features with fewer parameters, which results in a lightweight and computationally inexpensive network. Furthermore, depthwise dilated convolution in DDSC layer effectively expands the field of view of filters, allowing them to incorporate a larger context. The second subnetwork is a hierarchical multi-scale feature fusion (HMFF) module that uses parallel multi-resolution branches architecture to process the input feature map in order to extract the multi-scale feature information of the input image. Experimental results on the CIFAR-10, Malaria, and KvasirV1 datasets demonstrate that the proposed method is efficient, reducing the network parameters and computational cost by 65.02% and 39.78%, respectively, while maintaining the network performance compared to the MobileNetV1 baseline.
基金Project supported by the Natural Science Foundation of Shandong Province,China (Grant No.ZR2021MF049)the Joint Fund of Natural Science Foundation of Shandong Province (Grant Nos.ZR2022LLZ012 and ZR2021LLZ001)。
文摘We design a new hybrid quantum-classical convolutional neural network(HQCCNN)model based on parameter quantum circuits.In this model,we use parameterized quantum circuits(PQCs)to redesign the convolutional layer in classical convolutional neural networks,forming a new quantum convolutional layer to achieve unitary transformation of quantum states,enabling the model to more accurately extract hidden information from images.At the same time,we combine the classical fully connected layer with PQCs to form a new hybrid quantum-classical fully connected layer to further improve the accuracy of classification.Finally,we use the MNIST dataset to test the potential of the HQCCNN.The results indicate that the HQCCNN has good performance in solving classification problems.In binary classification tasks,the classification accuracy of numbers 5 and 7 is as high as 99.71%.In multivariate classification,the accuracy rate also reaches 98.51%.Finally,we compare the performance of the HQCCNN with other models and find that the HQCCNN has better classification performance and convergence speed.
基金Supported by the National Natural Science Foundation of China(61903336,61976190)the Natural Science Foundation of Zhejiang Province(LY21F030015)。
文摘Background The use of remote photoplethysmography(rPPG)to estimate blood volume pulse in a noncontact manner has been an active research topic in recent years.Existing methods are primarily based on a singlescale region of interest(ROI).However,some noise signals that are not easily separated in a single-scale space can be easily separated in a multi-scale space.Also,existing spatiotemporal networks mainly focus on local spatiotemporal information and do not emphasize temporal information,which is crucial in pulse extraction problems,resulting in insufficient spatiotemporal feature modelling.Methods Here,we propose a multi-scale facial video pulse extraction network based on separable spatiotemporal convolution(SSTC)and dimension separable attention(DSAT).First,to solve the problem of a single-scale ROI,we constructed a multi-scale feature space for initial signal separation.Second,SSTC and DSAT were designed for efficient spatiotemporal correlation modeling,which increased the information interaction between the long-span time and space dimensions;this placed more emphasis on temporal features.Results The signal-to-noise ratio(SNR)of the proposed network reached 9.58dB on the PURE dataset and 6.77dB on the UBFC-rPPG dataset,outperforming state-of-the-art algorithms.Conclusions The results showed that fusing multi-scale signals yielded better results than methods based on only single-scale signals.The proposed SSTC and dimension-separable attention mechanism will contribute to more accurate pulse signal extraction.
基金funded by the National Natural Foundation of China under Grant No.61172167the Science Fund Project of Heilongjiang Province(LH2020F035).
文摘Nuclearmagnetic resonance imaging of breasts often presents complex backgrounds.Breast tumors exhibit varying sizes,uneven intensity,and indistinct boundaries.These characteristics can lead to challenges such as low accuracy and incorrect segmentation during tumor segmentation.Thus,we propose a two-stage breast tumor segmentation method leveraging multi-scale features and boundary attention mechanisms.Initially,the breast region of interest is extracted to isolate the breast area from surrounding tissues and organs.Subsequently,we devise a fusion network incorporatingmulti-scale features and boundary attentionmechanisms for breast tumor segmentation.We incorporate multi-scale parallel dilated convolution modules into the network,enhancing its capability to segment tumors of various sizes through multi-scale convolution and novel fusion techniques.Additionally,attention and boundary detection modules are included to augment the network’s capacity to locate tumors by capturing nonlocal dependencies in both spatial and channel domains.Furthermore,a hybrid loss function with boundary weight is employed to address sample class imbalance issues and enhance the network’s boundary maintenance capability through additional loss.Themethod was evaluated using breast data from 207 patients at RuijinHospital,resulting in a 6.64%increase in Dice similarity coefficient compared to the benchmarkU-Net.Experimental results demonstrate the superiority of the method over other segmentation techniques,with fewer model parameters.
文摘A face-mask object detection model incorporating hybrid dilation convolutional network termed ResNet Hybrid-dilation-convolution Face-mask-detector (RHF) is proposed in this paper. Furthermore, a lightweight face-mask dataset named Light Masked Face Dataset (LMFD) and a medium-sized face-mask dataset named Masked Face Dataset (MFD) with data augmentation methods applied is also constructed in this paper. The hybrid dilation convolutional network is able to expand the perception of the convolutional kernel without concern about the discontinuity of image information during the convolution process. For the given two datasets being constructed above, the trained models are significantly optimized in terms of detection performance, training time, and other related metrics. By using the MFD dataset of 55,905 images, the RHF model requires roughly 10 hours less training time compared to ResNet50 with better detection results with mAP of 93.45%.
基金supported by the Key Laboratory of Environment Controlled Aquaculture(Dalian Ocean University)Ministry of Education(No.2021-MOEKLECA-KF-05)the National Natural Science Foundation of China Youth Science(No.61802046)。
文摘Aquatic medicine knowledge graph is an effective means to realize intelligent aquaculture.Graph completion technology is key to improving the quality of knowledge graph construction.However,the difficulty of semantic discrimination among similar entities and inconspicuous semantic features result in low accuracy when completing aquatic medicine knowledge graph with complex relationships.In this study,an aquatic medicine knowledge graph completion method(TransH+HConvAM)is proposed.Firstly,TransH is applied to split the vector plane between entities and relations,ameliorating the poor completion effect caused by low semantic resolution of entities.Then,hybrid convolution is introduced to obtain the global interaction of triples based on the complete interaction between head/tail entities and relations,which improves the semantic features of triples and enhances the completion effect of complex relationships in the graph.Experiments are conducted to verify the performance of the proposed method.The MR,MRR and Hit@10 of the TransH+HConvAM are found to be 674,0.339,and 0.361,respectively.This study shows that the model effectively overcomes the poor completion effect of complex relationships and improves the construction quality of the aquatic medicine knowledge graph,providing technical support for intelligent aquaculture.
基金The National Natural Science Foundation of China(No.51675098)
文摘Aiming at the difficulty of fault identification caused by manual extraction of fault features of rotating machinery,a one-dimensional multi-scale convolutional auto-encoder fault diagnosis model is proposed,based on the standard convolutional auto-encoder.In this model,the parallel convolutional and deconvolutional kernels of different scales are used to extract the features from the input signal and reconstruct the input signal;then the feature map extracted by multi-scale convolutional kernels is used as the input of the classifier;and finally the parameters of the whole model are fine-tuned using labeled data.Experiments on one set of simulation fault data and two sets of rolling bearing fault data are conducted to validate the proposed method.The results show that the model can achieve 99.75%,99.3%and 100%diagnostic accuracy,respectively.In addition,the diagnostic accuracy and reconstruction error of the one-dimensional multi-scale convolutional auto-encoder are compared with traditional machine learning,convolutional neural networks and a traditional convolutional auto-encoder.The final results show that the proposed model has a better recognition effect for rolling bearing fault data.
基金Supported in part by Natural Science Foundation of China(Grant Nos.51835009,51705398)Shaanxi Province 2020 Natural Science Basic Research Plan(Grant No.2020JQ-042)Aeronautical Science Foundation(Grant No.2019ZB070001).
文摘As an integrated application of modern information technologies and artificial intelligence,Prognostic and Health Management(PHM)is important for machine health monitoring.Prediction of tool wear is one of the symbolic applications of PHM technology in modern manufacturing systems and industry.In this paper,a multi-scale Convolutional Gated Recurrent Unit network(MCGRU)is proposed to address raw sensory data for tool wear prediction.At the bottom of MCGRU,six parallel and independent branches with different kernel sizes are designed to form a multi-scale convolutional neural network,which augments the adaptability to features of different time scales.These features of different scales extracted from raw data are then fed into a Deep Gated Recurrent Unit network to capture long-term dependencies and learn significant representations.At the top of the MCGRU,a fully connected layer and a regression layer are built for cutting tool wear prediction.Two case studies are performed to verify the capability and effectiveness of the proposed MCGRU network and results show that MCGRU outperforms several state-of-the-art baseline models.
基金Supported by the National Natural Science Foundation of China(No.61602191,61672521,61375037,61473291,61572501,61572536,61502491,61372107,61401167)the Natural Science Foundation of Fujian Province(No.2016J01308)+3 种基金the Scientific and Technology Funds of Quanzhou(No.2015Z114)the Scientific and Technology Funds of Xiamen(No.3502Z20173045)the Promotion Program for Young and Middle aged Teacher in Science and Technology Research of Huaqiao University(No.ZQN-PY418,ZQN-YX403)the Scientific Research Funds of Huaqiao University(No.16BS108)
文摘Pedestrian attribute classification from a pedestrian image captured in surveillance scenarios is challenging due to diverse clothing appearances,varied poses and different camera views. A multiscale and multi-label convolutional neural network( MSMLCNN) is proposed to predict multiple pedestrian attributes simultaneously. The pedestrian attribute classification problem is firstly transformed into a multi-label problem including multiple binary attributes needed to be classified. Then,the multi-label problem is solved by fully connecting all binary attributes to multi-scale features with logistic regression functions. Moreover,the multi-scale features are obtained by concatenating those featured maps produced from multiple pooling layers of the MSMLCNN at different scales. Extensive experiment results show that the proposed MSMLCNN outperforms state-of-the-art pedestrian attribute classification methods with a large margin.
基金supported by the National Natural Science Foundation of China (Grant Nos.40334040 and 40974033)the Promoting Foundation for Advanced Persons of Talent of NCWU
文摘Local and global optimization methods are widely used in geophysical inversion but each has its own advantages and disadvantages. The combination of the two methods will make it possible to overcome their weaknesses. Based on the simulated annealing genetic algorithm (SAGA) and the simplex algorithm, an efficient and robust 2-D nonlinear method for seismic travel-time inversion is presented in this paper. First we do a global search over a large range by SAGA and then do a rapid local search using the simplex method. A multi-scale tomography method is adopted in order to reduce non-uniqueness. The velocity field is divided into different spatial scales and velocities at the grid nodes are taken as unknown parameters. The model is parameterized by a bi-cubic spline function. The finite-difference method is used to solve the forward problem while the hybrid method combining multi-scale SAGA and simplex algorithms is applied to the inverse problem. The algorithm has been applied to a numerical test and a travel-time perturbation test using an anomalous low-velocity body. For a practical example, it is used in the study of upper crustal velocity structure of the A'nyemaqen suture zone at the north-east edge of the Qinghai-Tibet Plateau. The model test and practical application both prove that the method is effective and robust.
文摘Due to the lack of long-range association and spatial location information,fine details and accurate boundaries of complex clothing images cannot always be obtained by using the existing deep learning-based methods.This paper presents a convolutional structure with multi-scale fusion to optimize the step of clothing feature extraction and a self-attention module to capture long-range association information.The structure enables the self-attention mechanism to directly participate in the process of information exchange through the down-scaling projection operation of the multi-scale framework.In addition,the improved self-attention module introduces the extraction of 2-dimensional relative position information to make up for its lack of ability to extract spatial position features from clothing images.The experimental results based on the colorful fashion parsing dataset(CFPD)show that the proposed network structure achieves 53.68%mean intersection over union(mIoU)and has better performance on the clothing parsing task.
文摘The vibration problem of a fluid conveying cylindrical shell consisted of newly developed multi-scale hybrid nanocomposites is solved in the present manuscript within the framework of an analytical solution.The consistent material is considered to be made from an initial matrix strengthened via both macro-and nano-scale reinforcements.The influence of nanofillers’agglomeration,generated due to the high surface to volume ratio in nanostructures,is included by implementing Eshelby-Mori-Tanaka homogenization scheme.Afterwards,the equivalent material properties of the carbon nanotube reinforced(CNTR)nanocomposite are coupled with those of CFs within the framework of a modified rule of mixture.On the other hand,the influences of viscous flow are covered by extending the Navier-Stokes equation for cylinders.A cylindrical coordinate system is chosen and mixed with the infinitesimal strains of first-order shear deformation theory of shells to obtain the motion equations on the basis of the dynamic form of principle of virtual work.Next,the achieved governing equations will be solved by Galerkin’s method to reach the natural frequency of the structure for both simply supported and clamped boundary conditions.Presenting a set of illustrations,effects of each parameter on the dimensionless frequency of nanocomposite shells will be shown graphically.
文摘We propose a multiscale approach to study the influence of carbon nanotubes’agglomeration on the stability of hybrid nanocomposite plates.The hybrid nanocomposite consists of both macro-and nano-scale reinforcing fibers dispersed in a polymer matrix.The equivalent material properties are calculated by coupling the Eshelby-Mori-Tanaka model with the rule of mixture accounting for effects of CNTs inside the generated clusters.Furthermore,an energy based approach is implemented to obtain the governing equations of the problem utilizing a refined higher-order plate theorem.Subsequently,the derived equations are solved by Galerkin’s analytical method to predict the critical buckling load.The influence of various boundary conditions is studied as well.After validation,a set of numerical examples are presented to explain how each variant can affect the plate’s natural frequency.
文摘Visual tracking is a classical computer vision problem with many applications.Efficient convolution operators(ECO)is one of the most outstanding visual tracking algorithms in recent years,it has shown great performance using discriminative correlation filter(DCF)together with HOG,color maps and VGGNet features.Inspired by new deep learning models,this paper propose a hybrid efficient convolution operators integrating fully convolution network(FCN)and residual network(ResNet)for visual tracking,where FCN and ResNet are introduced in our proposed method to segment the objects from backgrounds and extract hierarchical feature maps of objects,respectively.Compared with the traditional VGGNet,our approach has higher accuracy for dealing with the issues of segmentation and image size.The experiments show that our approach would obtain better performance than ECO in terms of precision plot and success rate plot on OTB-2013 and UAV123 datasets.
基金supported by the National Natural Science Foundation of China(Grant No.52090081)the State Key Laboratory of Hydro-science and Hydraulic Engineering(Grant No.2021-KY-04).
文摘Geological discontinuity(GD)plays a pivotal role in determining the catastrophic mechanical failure of jointed rock masses.Accurate and efficient acquisition of GD networks is essential for characterizing and understanding the progressive damage mechanisms of slopes based on monitoring image data.Inspired by recent advances in computer vision,deep learning(DL)models have been widely utilized for image-based fracture identification.The multi-scale characteristics,image resolution and annotation quality of images will cause a scale-space effect(SSE)that makes features indistinguishable from noise,directly affecting the accuracy.However,this effect has not received adequate attention.Herein,we try to address this gap by collecting slope images at various proportional scales and constructing multi-scale datasets using image processing techniques.Next,we quantify the intensity of feature signals using metrics such as peak signal-to-noise ratio(PSNR)and structural similarity(SSIM).Combining these metrics with the scale-space theory,we investigate the influence of the SSE on the differentiation of multi-scale features and the accuracy of recognition.It is found that augmenting the image's detail capacity does not always yield benefits for vision-based recognition models.In light of these observations,we propose a scale hybridization approach based on the diffusion mechanism of scale-space representation.The results show that scale hybridization strengthens the tolerance of multi-scale feature recognition under complex environmental noise interference and significantly enhances the recognition accuracy of GD.It also facilitates the objective understanding,description and analysis of the rock behavior and stability of slopes from the perspective of image data.
文摘In recent years,wearable devices-based Human Activity Recognition(HAR)models have received significant attention.Previously developed HAR models use hand-crafted features to recognize human activities,leading to the extraction of basic features.The images captured by wearable sensors contain advanced features,allowing them to be analyzed by deep learning algorithms to enhance the detection and recognition of human actions.Poor lighting and limited sensor capabilities can impact data quality,making the recognition of human actions a challenging task.The unimodal-based HAR approaches are not suitable in a real-time environment.Therefore,an updated HAR model is developed using multiple types of data and an advanced deep-learning approach.Firstly,the required signals and sensor data are accumulated from the standard databases.From these signals,the wave features are retrieved.Then the extracted wave features and sensor data are given as the input to recognize the human activity.An Adaptive Hybrid Deep Attentive Network(AHDAN)is developed by incorporating a“1D Convolutional Neural Network(1DCNN)”with a“Gated Recurrent Unit(GRU)”for the human activity recognition process.Additionally,the Enhanced Archerfish Hunting Optimizer(EAHO)is suggested to fine-tune the network parameters for enhancing the recognition process.An experimental evaluation is performed on various deep learning networks and heuristic algorithms to confirm the effectiveness of the proposed HAR model.The EAHO-based HAR model outperforms traditional deep learning networks with an accuracy of 95.36,95.25 for recall,95.48 for specificity,and 95.47 for precision,respectively.The result proved that the developed model is effective in recognizing human action by taking less time.Additionally,it reduces the computation complexity and overfitting issue through using an optimization approach.
基金the National Natural Science Foundation of China(61772149,61866009,61762028,U1701267,61702169)Guangxi Science and Technology Project(2019GXNSFFA245014,ZY20198016,AD18281079,AD18216004)+1 种基金the Natural Science Foundation of Hunan Province(2020JJ3014)Guangxi Colleges and Universities Key Laboratory of Intelligent Processing of Computer Images and Graphics(GIIP202001).
文摘The tradeoff between efficiency and model size of the convolutional neural network(CNN)is an essential issue for applications of CNN-based algorithms to diverse real-world tasks.Although deep learning-based methods have achieved significant improvements in image super-resolution(SR),current CNNbased techniques mainly contain massive parameters and a high computational complexity,limiting their practical applications.In this paper,we present a fast and lightweight framework,named weighted multi-scale residual network(WMRN),for a better tradeoff between SR performance and computational efficiency.With the modified residual structure,depthwise separable convolutions(DS Convs)are employed to improve convolutional operations’efficiency.Furthermore,several weighted multi-scale residual blocks(WMRBs)are stacked to enhance the multi-scale representation capability.In the reconstruction subnetwork,a group of Conv layers are introduced to filter feature maps to reconstruct the final high-quality image.Extensive experiments were conducted to evaluate the proposed model,and the comparative results with several state-of-the-art algorithms demonstrate the effectiveness of WMRN.
基金National Key Research and Development Project,China(No.2018YFB1308800)。
文摘The background pattern of patterned fabrics is complex,which has a great interference in the extraction of defect features.Traditional machine vision algorithms rely on artificially designed features,which are greatly affected by background patterns and are difficult to effectively extract flaw features.Therefore,a convolutional neural network(CNN)with automatic feature extraction is proposed.On the basis of the two-stage detection model Faster R-CNN,Resnet-50 is used as the backbone network,and the problem of flaws with extreme aspect ratio is solved by improving the initialization algorithm of the prior frame aspect ratio,and the improved multi-scale model is designed to improve detection of small defects.The cascade R-CNN is introduced to improve the accuracy of defect detection,and the online hard example mining(OHEM)algorithm is used to strengthen the learning of hard samples to reduce the interference of complex backgrounds on the defect detection of patterned fabrics,and construct the focal loss as a loss function to reduce the impact of sample imbalance.In order to verify the effectiveness of the improved algorithm,a defect detection comparison experiment was set up.The experimental results show that the accuracy of the defect detection algorithm of patterned fabrics in this paper can reach 95.7%,and it can accurately locate the defect location and meet the actual needs of the factory.
基金This work was supported by the Project of Sichuan Outstanding Young Scientific and Technological Talents(19JCQN0003)the major Project of Education Department in Sichuan(17ZA0063 and 2017JQ0030)+1 种基金in part by the Natural Science Foundation for Young Scientists of CUIT(J201704)the Sichuan Science and Technology Program(2019JDRC0077).
文摘Cardiomyopathy is one of the most serious public health threats.The precise structural and functional cardiac measurement is an essential step for clinical diagnosis and follow-up treatment planning.Cardiologists are often required to draw endocardial and epicardial contours of the left ventricle(LV)manually in routine clinical diagnosis or treatment planning period.This task is time-consuming and error-prone.Therefore,it is necessary to develop a fully automated end-to-end semantic segmentation method on cardiac magnetic resonance(CMR)imaging datasets.However,due to the low image quality and the deformation caused by heartbeat,there is no effective tool for fully automated end-to-end cardiac segmentation task.In this work,we propose a multi-scale segmentation network(MSSN)for left ventricle segmentation.It can effectively learn myocardium and blood pool structure representations from 2D short-axis CMR image slices in a multi-scale way.Specifically,our method employs both parallel and serial of dilated convolution layers with different dilation rates to capture multi-scale semantic features.Moreover,we design graduated up-sampling layers with subpixel layers as the decoder to reconstruct lost spatial information and produce accurate segmentation masks.We validated our method using 164 T1 Mapping CMR images and showed that it outperforms the advanced convolutional neural network(CNN)models.In validation metrics,we archived the Dice Similarity Coefficient(DSC)metric of 78.96%.
基金Aeronautical Science Foundation of China(No.2018ZC51022)。
文摘Convolutional Neural Networks(CNNs)have recently attracted much attention in the ship detection from Synthetic Aperture Radar(SAR)images.However,compared with optical images,SAR ones are hard to understand.Moreover,due to the high similarity between the man-made targets near shore and inshore ships,the classical methods are unable to achieve effective detection of inshore ships.To mitigate the influence of onshore ship-like objects,this paper proposes an inshore ship detection method in SAR images by using hybrid features.Firstly,the sea-land segmentation is applied in the pre-processing to exclude obvious land regions from SAR images.Then,a CNN model is designed to extract deep features for identifying potential ship targets in both inshore and offshore water.On this basis,the high-energy point number of amplitude spectrum is further introduced as an important and delicate feature to suppress false alarms left.Finally,to verify the effectiveness of the proposed method,numerical and comparative studies are carried out in experiments on Sentinel-1 SAR images.