Graph Convolutional Neural Networks (GCNs) have been widely used in various fields due to their powerful capabilities in processing graph-structured data. However, GCNs encounter significant challenges when applied to scale-free graphs with power-law distributions, resulting in substantial distortions. Moreover, most existing GCN models are shallow structures, which restricts their ability to capture dependencies among distant nodes and more refined high-order node features in scale-free graphs with hierarchical structures. To apply GCNs more broadly and precisely to real-world graphs exhibiting scale-free or hierarchical structures, and to exploit the multi-level aggregation of GCNs for capturing high-level information in local representations, we propose the Hyperbolic Deep Graph Convolutional Neural Network (HDGCNN), an end-to-end deep graph representation learning framework that maps scale-free graphs from Euclidean space to hyperbolic space. In HDGCNN, we define the fundamental operations of deep graph convolutional neural networks in hyperbolic space. We also introduce a hyperbolic feature transformation method based on identity mapping and a dense connection scheme based on a novel non-local message passing framework, and we present a neighborhood aggregation method that combines initial structural features with hyperbolic attention coefficients. Through these methods, HDGCNN effectively leverages both the structural features and node features of graph data, enabling enhanced exploration of non-local structural features and more refined node features in scale-free or hierarchical graphs. Experimental results demonstrate that HDGCNN achieves remarkable performance improvements over state-of-the-art GCNs in node classification and link prediction tasks, even when utilizing low-dimensional embedding representations. Furthermore, compared to shallow hyperbolic graph convolutional neural network models, HDGCNN exhibits notable advantages and performance gains.
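The abstract gives no formulas, but as a rough illustration of the kind of hyperbolic operations such a framework builds on, here is a minimal NumPy sketch of Möbius addition, the exponential map at the origin, and the geodesic distance in the Poincaré ball of curvature -c. The function names and the choice of the Poincaré ball model are assumptions for illustration, not taken from the paper.

```python
import numpy as np

def mobius_add(x, y, c=1.0):
    """Mobius addition of two points in the Poincare ball of curvature -c."""
    xy, x2, y2 = np.dot(x, y), np.dot(x, x), np.dot(y, y)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den

def expmap0(v, c=1.0):
    """Exponential map at the origin: lift a Euclidean (tangent) vector into the ball."""
    sqrt_c = np.sqrt(c)
    n = np.linalg.norm(v) + 1e-15
    return np.tanh(sqrt_c * n) * v / (sqrt_c * n)

def hyperbolic_dist(x, y, c=1.0):
    """Geodesic distance, e.g. for scoring candidate links between embedded nodes."""
    sqrt_c = np.sqrt(c)
    arg = np.clip(sqrt_c * np.linalg.norm(mobius_add(-x, y, c)), 0.0, 1 - 1e-15)
    return 2.0 / sqrt_c * np.arctanh(arg)

# Euclidean node features are lifted into hyperbolic space before aggregation.
h1 = expmap0(np.array([0.1, -0.2, 0.05]))
h2 = expmap0(np.array([0.3, 0.1, 0.0]))
print(hyperbolic_dist(h1, h2))
```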
The development of defect prediction plays a significant role in improving software quality. Such predictions are used to identify defective modules before testing and to minimize time and cost. Software with defects negatively impacts operational costs and ultimately affects customer satisfaction. Numerous approaches exist to predict software defects; however, timely and accurate defect prediction remains the major challenge. To improve timely and accurate software defect prediction, a novel technique called Nonparametric Statistical feature scaled QuAdratic regressive convolution Deep nEural Network (SQADEN) is introduced. The proposed SQADEN technique mainly includes two major processes, namely metric (feature) selection and classification. First, SQADEN uses the nonparametric statistical Torgerson–Gower scaling technique to identify the relevant software metrics by measuring their similarity with the dice coefficient. The feature selection process is used to minimize the time complexity of software fault prediction. With the selected metrics, software fault prediction is performed with the help of Quadratic Censored regressive convolution deep neural network-based classification. The deep learning classifier analyzes the training and testing samples using the contingency correlation coefficient. The softstep activation function is used to provide the final fault prediction results. To minimize the error, the Nelder–Mead method is applied to solve the non-linear least-squares problems. Finally, accurate classification results with a minimum error are obtained at the output layer. Experimental evaluation is carried out with different quantitative metrics such as accuracy, precision, recall, F-measure, and time complexity. The analyzed results demonstrate the superior performance of the proposed SQADEN technique, with accuracy, sensitivity, and specificity higher by 3%, 3%, 2%, and 3%, and time and space lower by 13% and 15%, when compared with two state-of-the-art methods.
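The dice coefficient used in the metric-selection step is a standard set-overlap measure; the following minimal sketch assumes the software metrics have already been binarized (the helper name and example vectors are illustrative, not from the paper).

```python
import numpy as np

def dice_coefficient(a, b):
    """Dice similarity between two binary indicator vectors: 2|A∩B| / (|A| + |B|)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    denom = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / denom if denom else 1.0

# Example: similarity between two software-metric indicator vectors.
m1 = [1, 0, 1, 1, 0, 1]
m2 = [1, 1, 1, 0, 0, 1]
print(dice_coefficient(m1, m2))  # 0.75
```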
In this paper, we utilized the deep convolutional neural network D-LinkNet, a model for semantic segmentation, to analyze the Himawari-8 satellite data captured from 16 channels at a spatial resolution of 0.5 km, with a focus on the area over the Yellow Sea and the Bohai Sea (32°-42°N, 117°-127°E). The objective was to develop an algorithm for fusing and segmenting multi-channel images from geostationary meteorological satellites, specifically for monitoring sea fog in this region. Firstly, the extreme gradient boosting algorithm was adopted to evaluate the data from the 16 channels of the Himawari-8 satellite for sea fog detection, and we found that the top three channels in order of importance were channels 3, 4, and 14, which were fused into false color daytime images, while channels 7, 13, and 15 were fused into false color nighttime images. Secondly, the simple linear iterative super-pixel clustering algorithm was used for the pixel-level segmentation of the false color images, and based on super-pixel blocks, manual sea-fog annotation was performed to obtain fine-grained annotation labels. The deep convolutional neural network D-LinkNet was built on the ResNet backbone, and dilated convolutional layers with direct connections were added in the central part to form a string-and-combine structure with five branches having different depths and receptive fields. Results show that the accuracy rate of the fog area (proportion of detected real fog to detected fog) was 66.5%, the recognition rate of the fog zone (proportion of detected real fog to real fog or cloud cover) was 51.9%, and the detection accuracy rate (proportion of samples detected correctly to total samples) was 93.2%.
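A rough sketch of the channel-ranking and false-color fusion steps described above, assuming per-pixel channel values are arranged as a feature matrix with binary fog labels; the variable names, model settings, and placeholder data are assumptions, not the authors' pipeline.

```python
import numpy as np
from xgboost import XGBClassifier

# X: (n_pixels, 16) Himawari-8 channel values; y: (n_pixels,) 0/1 fog labels.
X = np.random.rand(1000, 16)          # placeholder data
y = np.random.randint(0, 2, 1000)

model = XGBClassifier(n_estimators=100, max_depth=4)
model.fit(X, y)

# Rank the 16 channels by importance (the paper reports channels 3, 4, 14 on top).
ranking = np.argsort(model.feature_importances_)[::-1] + 1   # back to 1-based indices
print("channels by importance:", ranking[:3])

# Fuse the three selected channels (1-based 3, 4, 14 -> indices 2, 3, 13) into R, G, B.
scene = np.random.rand(512, 512, 16)  # placeholder multi-channel scene
false_color = np.stack([scene[..., 2], scene[..., 3], scene[..., 13]], axis=-1)
```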
This paper proposes a cascade deep convolutional neural network to address the loosening detection problem of bolts on axlebox covers. Firstly, an SSD network based on ResNet50 and the CBAM module, which improves bolt image features, is proposed for locating bolts on axlebox covers. Then, the A2-PFN is proposed according to the slender features of the marker lines to extract more accurate marker line regions of the bolts. Finally, a rectangular approximation method is proposed to regularize the marker line regions so as to calculate the angle of each marker line and record all the angle values in an angle table, whose criteria determine whether the bolt with the marker line is in danger of loosening. Meanwhile, our improved algorithm is compared with the pre-improved algorithm in the object localization stage. The results show that our proposed method brings a significant improvement in both detection accuracy and detection speed, where our mAP (IoU=0.75) reaches 0.77 and fps reaches 16.6. In the saliency detection stage, after qualitative and quantitative comparison, our method significantly outperforms other state-of-the-art methods, where our MAE reaches 0.092, F-measure reaches 0.948, and AUC reaches 0.943. Ultimately, according to the angle table, out of 676 bolt samples, a total of 60 bolts are loose, 69 bolts are at risk of loosening, and 547 bolts are tightened.
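One way to realize the rectangular-approximation step is a minimum-area rotated-rectangle fit on the segmented marker-line mask, as sketched below with OpenCV; this is an assumed illustration of the idea, not the paper's exact procedure.

```python
import cv2
import numpy as np

def marker_line_angle(mask):
    """Fit a rotated rectangle to a binary marker-line mask and return its angle in degrees."""
    found = cv2.findContours(mask.astype(np.uint8),
                             cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    contours = found[-2]                      # handles both OpenCV 3.x and 4.x return shapes
    if not contours:
        return None
    largest = max(contours, key=cv2.contourArea)
    (_, _), (w, h), angle = cv2.minAreaRect(largest)
    # Report the orientation of the long side of the fitted rectangle.
    return angle if w >= h else angle + 90.0

# Synthetic diagonal stripe standing in for a segmented marker line.
mask = np.zeros((200, 200), np.uint8)
cv2.line(mask, (20, 180), (180, 20), 255, 8)
print(marker_line_angle(mask))
```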
Enabling high mobility applications in millimeter wave (mmWave) based systems opens up a slew of new possibilities, including vehicle communications as well as wireless virtual/augmented reality. The narrow beam usage, together with the sensitivity of millimeter waves, may limit the coverage and the reliability of the mobile links. In this research work, the improvement in the quality of experience faced by the user for multimedia-related applications over the millimeter-wave band is investigated. The high attenuation loss at high frequencies is compensated with a massive array structure named Multiple Input and Multiple Output (MIMO), which is utilized in a hyperdense environment called heterogeneous networks (HetNet). The optimization problem that arises while maximizing the Mean Opinion Score (MOS) is analyzed along with the QoE (Quality of Experience) metric by considering the Base Station (BS) powers in addition to the needed Quality of Service (QoS). Most approaches related to wireless network communication are not suitable for the millimeter-wave band because of its high complexity and dynamic nature. Hence, a deep reinforcement learning framework is developed for tackling this optimization problem. In this work, a Fuzzy-based Deep Convolutional Neural Network (FDCNN) is proposed in addition to a Deep Reinforcement Learning Framework (DRLF) for extracting the features of highly correlated data. The investigational results prove that the proposed method yields the highest user satisfaction by increasing the number of antennas along with the small-scale antennas at the base stations. The proposed work outperforms the alternatives in terms of MOS with multiple antennas.
In recent years, computer vision has found wide application in maritime surveillance with its sophisticated algorithms and advanced architecture. Automatic ship detection with computer vision techniques provides an efficient means to monitor as well as track ships in water bodies. Waterways, being an important medium of transport, require continuous monitoring for the protection of national security. The remote sensing satellite images of ships in harbours and water bodies are the image data that aid the neural network models to localize ships and to facilitate early identification of possible threats at sea. This paper proposes a deep learning based model capable of classifying between ships and no-ships as well as localizing ships in the original images using the bounding box technique. Furthermore, the classified ships are again segmented with a deep learning based auto-encoder model. The proposed model, in terms of classification, provides successful results, generating 99.5% and 99.2% validation and training accuracy, respectively. The auto-encoder model also produces 85.1% and 84.2% validation and training accuracies. Moreover, the IoU metric of the segmented images is found to be 0.77. The experimental results reveal that the model is accurate and can be implemented for automatic ship detection in water bodies considering remote sensing satellite images as input to the computer vision system.
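The IoU score reported for the segmentations is the standard intersection-over-union; a minimal sketch for binary masks follows (illustrative only, not the authors' evaluation code).

```python
import numpy as np

def iou(pred, target):
    """Intersection-over-Union between two binary segmentation masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    return np.logical_and(pred, target).sum() / union if union else 1.0

# Example with two small masks: 2 overlapping pixels out of 4 in the union.
pred = np.array([[1, 1, 0], [0, 1, 0]])
gt   = np.array([[1, 0, 0], [0, 1, 1]])
print(iou(pred, gt))  # 0.5
```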
The most widely farmed fruit in the world is the mango. Both the production and quality of mangoes are hampered by many diseases, which need to be effectively controlled and mitigated; therefore, a quick and accurate diagnosis of the disorders is essential. Deep convolutional neural networks, renowned for their independence in feature extraction, have established their value in numerous detection and classification tasks. However, they require large training datasets and several parameters that need careful adjustment. The proposed Modified Dense Convolutional Network (MDCN) provides a successful classification scheme for plant diseases affecting mango leaves. This model employs the strength of pre-trained networks and modifies them for the particular context of mango leaf diseases by incorporating transfer learning techniques. The data loader also builds mini-batches for training the models to reduce training time. Finally, optimization approaches help increase the overall model's efficiency and lower computing costs. MDCN is employed on the MangoLeafBD Dataset, which consists of a total of 4,000 images. Following the experimental results, the proposed system is compared with existing techniques, and it is clear that the proposed algorithm surpasses the existing algorithms by achieving high performance and overall throughput.
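As a hedged sketch of the transfer-learning idea described above: start from an ImageNet-pretrained dense network, freeze the backbone, and replace only the classifier head. The choice of DenseNet201, the assumed 8 leaf-disease classes, and the `weights="DEFAULT"` argument (torchvision >= 0.13) are assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 8  # assumed number of mango-leaf categories in MangoLeafBD

# Start from an ImageNet-pretrained DenseNet and replace only the classifier head.
model = models.densenet201(weights="DEFAULT")   # torchvision >= 0.13 API
for p in model.features.parameters():
    p.requires_grad = False                      # freeze the pretrained backbone
model.classifier = nn.Linear(model.classifier.in_features, NUM_CLASSES)

# Mini-batches would come from a standard DataLoader over the image folder, e.g.:
# loader = torch.utils.data.DataLoader(dataset, batch_size=32, shuffle=True)
optimizer = torch.optim.Adam(model.classifier.parameters(), lr=1e-3)
```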
The development of precision agriculture demands high accuracy and efficiency in cultivated land information extraction. As a new means of monitoring the ground in recent years, the unmanned aerial vehicle (UAV) low-height remote sensing technique, which is flexible, efficient, low-cost, and of high resolution, is widely applied to investigating various resources. On this basis, a novel extraction method for cultivated land information based on Deep Convolutional Neural Network and Transfer Learning (DTCLE) was proposed. First, linear features (roads, ridges, etc.) were excluded based on a Deep Convolutional Neural Network (DCNN). Next, the feature extraction method learned from the DCNN was applied to cultivated land information extraction by introducing a transfer learning mechanism. Last, cultivated land information extraction results were produced by DTCLE and by eCognition for cultivated land information extraction (ECLE). Pengzhou County and Guanghan County, Sichuan Province, were selected as the experimental areas. The experimental results showed that the overall precision for experimental images 1, 2, and 3 (for extracting cultivated land) with the DTCLE method was 91.7%, 88.1%, and 88.2%, respectively, and the overall precision of ECLE was 90.7%, 90.5%, and 87.0%, respectively. The accuracy of DTCLE was equivalent to that of ECLE, and DTCLE also outperformed ECLE in terms of integrity and continuity.
Compressive strength of concrete is a significant factor in assessing building structure health and safety. Therefore, various methods have been developed to evaluate the compressive strength of concrete structures. However, previous methods are costly, time-consuming, and unsafe. To address these drawbacks, this paper proposed a digital vision based concrete compressive strength evaluation model using a deep convolutional neural network (DCNN). The proposed model presents an alternative approach to evaluating concrete strength and contributes to improving efficiency and accuracy. The model was developed with 4,000 digital images and 61,996 images extracted from video recordings collected from concrete samples. The experimental results indicated a root mean square error (RMSE) value of 3.56 MPa, demonstrating strong feasibility that the proposed model can be utilized to predict concrete strength from digital images of concrete surfaces, with advantages that overcome the previous limitations. This experiment provides a basis that could be extended in future research on image analysis techniques and artificial neural networks for the diagnosis of concrete building structures.
In this study, we examined the efficacy of a deep convolutional neural network (DCNN) in recognizing concrete surface images and predicting the compressive strength of concrete. A digital single-lens reflex (DSLR) camera and microscope were simultaneously used to obtain the concrete surface images used as the input data for the DCNN. Thereafter, training, validation, and testing of the DCNNs were performed based on the DSLR camera and microscope image data. Results of the analysis indicated that the DCNN employing DSLR image data achieved a relatively higher accuracy. The accuracy of the DSLR-derived image data was attributed to the relatively wider range of the DSLR camera, which was beneficial for extracting a larger number of features. Moreover, the DSLR camera procured more realistic images than the microscope. Thus, when the compressive strength of concrete was evaluated using the DCNN employing a DSLR camera, time and cost were reduced, whereas the usefulness increased. Furthermore, an indirect comparison of the accuracy of the DCNN with that of existing non-destructive methods for evaluating the strength of concrete proved the reliability of DCNN-derived concrete strength predictions. In addition, it was determined that the DCNN used for concrete strength evaluations in this study can be further expanded to detect and evaluate various deteriorative factors that affect the durability of structures, such as salt damage, carbonation, sulfation, corrosion, and freezing-thawing.
Early diagnosis and detection are important tasks in controlling the spread of COVID-19. A number of deep learning techniques have been established by researchers to detect the presence of COVID-19 using CT scan images and X-rays. However, these methods suffer from biased results and inaccurate detection of the disease. Therefore, the current research article develops the Oppositional-based Chimp Optimization Algorithm and Deep Dense Convolutional Neural Network (OCOA-DDCNN) for COVID-19 prediction using CT images in an IoT environment. The proposed methodology works in two stages, namely pre-processing and prediction. Initially, CT scan images generated from prospective COVID-19 patients are collected from an open-source system using IoT devices. The collected images are then preprocessed using a Gaussian filter, which removes unwanted noise from the collected CT scan images. Afterwards, the preprocessed images are sent to the prediction phase, in which a Deep Dense Convolutional Neural Network (DDCNN) is applied to the pre-processed images. The proposed classifier is optimally designed with the Oppositional-based Chimp Optimization Algorithm (OCOA), which is utilized to select the optimal parameters of the proposed classifier. Finally, the proposed technique is used to predict COVID-19 and classify the results as either COVID-19 or non-COVID-19. The projected method was implemented in MATLAB and the performance was evaluated through statistical measurements. The proposed method was contrasted with conventional techniques such as the Convolutional Neural Network-Firefly Algorithm (CNN-FA) and Emperor Penguin Optimization (CNN-EPO). The results established the supremacy of the proposed model.
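As an illustration of the Gaussian-filter preprocessing step, here is a minimal sketch using SciPy; the sigma value, the rescaling to [0, 1], and the placeholder data are assumptions, not taken from the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def preprocess_ct_slice(ct_slice, sigma=1.0):
    """Smooth a CT slice with a Gaussian kernel to suppress acquisition noise."""
    smoothed = gaussian_filter(ct_slice.astype(np.float32), sigma=sigma)
    # Rescale to [0, 1] so the values match the network's assumed input range.
    lo, hi = smoothed.min(), smoothed.max()
    return (smoothed - lo) / (hi - lo) if hi > lo else smoothed

noisy = np.random.rand(512, 512).astype(np.float32)  # placeholder CT slice
clean = preprocess_ct_slice(noisy, sigma=1.5)
```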
Single image super-resolution (SISR) is a fundamentally challenging problem because a low-resolution (LR) image can correspond to a set of high-resolution (HR) images, most of which are not the expected one. Recently, SISR has been achieved by deep learning-based methods. By constructing a very deep super-resolution convolutional neural network (VDSRCNN), LR images can be improved to HR images. This study mainly pursues two objectives: image super-resolution (ISR) and deblurring of the images produced by VDSRCNN. Firstly, in analyzing ISR, we modify different training parameters to test the performance of VDSRCNN. Secondly, we add motion-blurred images to the training set to optimize the performance of VDSRCNN. Finally, we use image quality indexes to evaluate the difference between the images produced by classical methods and by VDSRCNN. The results indicate that the optimized VDSRCNN performs better in generating HR images from LR images.
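A minimal sketch of how motion-blurred training images can be synthesized with a linear blur kernel, as an assumed illustration of the augmentation step (kernel length, blur angle, and the use of OpenCV are assumptions, not the paper's settings).

```python
import cv2
import numpy as np

def motion_blur(image, length=9, angle_deg=0.0):
    """Apply a linear motion-blur kernel of a given length and direction."""
    kernel = np.zeros((length, length), np.float32)
    kernel[length // 2, :] = 1.0 / length            # horizontal line kernel
    center = (length / 2 - 0.5, length / 2 - 0.5)
    rot = cv2.getRotationMatrix2D(center, angle_deg, 1.0)
    kernel = cv2.warpAffine(kernel, rot, (length, length))
    kernel /= kernel.sum()                            # preserve overall brightness
    return cv2.filter2D(image, -1, kernel)

sharp = np.random.rand(64, 64).astype(np.float32)     # placeholder LR patch
blurred = motion_blur(sharp, length=7, angle_deg=30.0)
```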
The novel coronavirus 2019 (COVID-19) is rapidly spreading around the world and has turned into a pandemic; consequently, detecting COVID-19-affected patients is now the most critical task for medical specialists. The deficiency of medical testing kits leads to huge complexity in detecting COVID-19 patients worldwide, and the number of infected cases keeps expanding. Therefore, a significant study is necessary on detecting COVID-19 patients using an automated diagnosis method, which would hinder the spreading of the coronavirus. In this paper, the study suggests a Deep Convolutional Neural Network-based multi-classification framework (COV-MCNet) using eight different pre-trained architectures, namely VGG16, VGG19, ResNet50V2, DenseNet201, InceptionV3, MobileNet, InceptionResNetV2, and Xception, which are trained and tested on X-ray images of COVID-19, Normal, Viral Pneumonia, and Bacterial Pneumonia. The results of the 4-class task (Normal vs. COVID-19 vs. Viral Pneumonia vs. Bacterial Pneumonia) demonstrate that the pre-trained model DenseNet201 provides the highest classification performance (accuracy: 92.54%, precision: 93.05%, recall: 92.81%, F1-score: 92.83%, specificity: 97.47%). Notably, the DenseNet201 (4-class classification) pre-trained model in the proposed COV-MCNet framework showed higher accuracy compared to the remaining seven models. It is important to mention that the proposed COV-MCNet model showed comparatively high classification accuracy based on a small number of pre-processed datasets, which indicates that the designed system can produce superior results when more data become available. The proposed multi-classification network (COV-MCNet) significantly speeds up the existing radiology-based method, which will be helpful for the medical community and clinical specialists in the early diagnosis of COVID-19 cases during this pandemic.
As an important component of load transfer, the track develops various fatigue damages as the rail service life and train traffic increase gradually, such as rail corrugation, rail joint damage, uneven thermite welds, rail squats, fastener defects, etc. Real-time recognition of track defects plays a vital role in ensuring the safe and stable operation of rail transit. In this paper, an intelligent and innovative method is proposed to detect track defects by using axle-box vibration acceleration and a deep learning network, and the coexistence of the above-mentioned typical track defects in the track system is considered. Firstly, the dynamic relationship between the track defects (using the example of the fastening defects) and the axle-box vibration acceleration (ABVA) is investigated using the dynamic vehicle-track model. Then, a simulation model for the coupled dynamics of the vehicle and track with different track defects is established, and wavelet power spectrum (WPS) analysis is performed on the vibration acceleration signals of the axle box to extract the characteristic response. Lastly, using wavelet spectrum photos as input, an automatic detection technique based on a deep convolutional neural network (DCNN) is suggested to realize real-time intelligent detection and identification of various track problems. The findings demonstrate that the suggested approach achieves a 96.72% classification accuracy.
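A rough sketch of turning an axle-box acceleration signal into a wavelet power spectrum image that a CNN could take as input, assuming the PyWavelets API with a Morlet mother wavelet; the sampling rate, scale range, and synthetic signal are assumptions, not the paper's settings.

```python
import numpy as np
import pywt

fs = 2000.0                                # assumed sampling rate in Hz
t = np.arange(0, 1.0, 1.0 / fs)
# Placeholder axle-box vibration: a 50 Hz component plus an impact-like burst.
signal = np.sin(2 * np.pi * 50 * t) + (np.abs(t - 0.5) < 0.01) * 3.0

scales = np.arange(1, 129)
coeffs, freqs = pywt.cwt(signal, scales, "morl", sampling_period=1.0 / fs)
power = np.abs(coeffs) ** 2                # wavelet power spectrum (scales x time)

# `power` can be rendered as an image and fed to the DCNN classifier.
print(power.shape)                         # (128, 2000)
```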
Background: Myopic maculopathy (MM) has become a major cause of visual impairment and blindness worldwide, especially in East Asian countries. Deep learning approaches such as deep convolutional neural networks (DCNN) have been successfully applied to identify some common retinal diseases and show great potential for the intelligent analysis of MM. This study aimed to build a reliable approach for automated detection of MM from retinal fundus images using DCNN models. Methods: A dual-stream DCNN (DCNN-DS) model that perceives features from both original images and corresponding images processed by a color histogram distribution optimization method was designed for classification of no MM, tessellated fundus (TF), and pathologic myopia (PM). A total of 36,515 gradable images from four hospitals were used for DCNN model development, and 14,986 gradable images from two other hospitals were used for external testing. We also compared the performance of the DCNN-DS model and four ophthalmologists on 3000 randomly sampled fundus images. Results: The DCNN-DS model achieved sensitivities of 93.3% and 91.0%, specificities of 99.6% and 98.7%, and areas under the receiver operating characteristic curves (AUCs) of 0.998 and 0.994 for detecting PM, whereas it achieved sensitivities of 98.8% and 92.8%, specificities of 95.6% and 94.1%, and AUCs of 0.986 and 0.970 for detecting TF in the two external testing datasets. In the sampled testing dataset, the sensitivities of the four ophthalmologists ranged from 88.3% to 95.8% and 81.1% to 89.1%, and their specificities ranged from 95.9% to 99.2% and 77.8% to 97.3% for detecting PM and TF, respectively. Meanwhile, the DCNN-DS model achieved sensitivities of 90.8% and 97.9% and specificities of 99.1% and 94.0% for detecting PM and TF, respectively. Conclusions: The proposed DCNN-DS approach demonstrated reliable performance with high sensitivity, specificity, and AUC in classifying different MM levels on fundus photographs sourced from clinics. It can help identify MM automatically among large myopic groups and shows great potential for real-life applications.
Deep Convolutional Neural Networks (CNNs) have achieved high accuracy in image classification tasks; however, most existing models are trained on high-quality images that are not subject to image degradation. In practice, images are often affected by various types of degradation, which can significantly impact the performance of CNNs. In this work, we investigate the influence of image degradation on three typical image classification CNNs and propose a Degradation Type Adaptive Image Classification Model (DTA-ICM) to improve the existing CNNs' classification accuracy on degraded images. The proposed DTA-ICM comprises two key components: a Degradation Type Predictor (DTP) and a Degradation Type Specified Image Classifier (DTS-IC) set, trained on existing CNNs for specified types of degradation. The DTP predicts the degradation type of a test image, and the corresponding DTS-IC is then selected to classify the image. We evaluate the performance of both the proposed DTP and the DTA-ICM on the Caltech 101 database. The experimental results demonstrate that the proposed DTP achieves an average accuracy of 99.70%. Moreover, the proposed DTA-ICM, based on AlexNet, VGG19, and ResNet152, exhibits an average accuracy improvement of 20.63%, 18.22%, and 12.9%, respectively, compared with the original CNNs in classifying degraded images. This suggests that the proposed DTA-ICM can effectively improve the classification performance of existing CNNs on degraded images, which has important practical implications.
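The two-stage predict-then-dispatch design can be sketched as follows; the class name, interfaces, and degradation-type labels here are illustrative assumptions, not the authors' code.

```python
from typing import Callable, Dict

# Hypothetical interfaces: one callable maps an image to a degradation-type name,
# the others map an image to a class label.
DegradationTypePredictor = Callable[[object], str]
ImageClassifier = Callable[[object], int]

class DTAICM:
    """Route each test image to the classifier specialized for its degradation type."""

    def __init__(self, dtp: DegradationTypePredictor,
                 classifiers: Dict[str, ImageClassifier]):
        self.dtp = dtp
        self.classifiers = classifiers   # e.g. {"blur": ..., "noise": ..., "jpeg": ...}

    def predict(self, image) -> int:
        degradation_type = self.dtp(image)           # stage 1: predict degradation type
        classifier = self.classifiers[degradation_type]
        return classifier(image)                     # stage 2: type-specific classifier

# Toy usage with stand-in callables.
model = DTAICM(lambda img: "noise", {"noise": lambda img: 3, "blur": lambda img: 1})
print(model.predict(object()))  # 3
```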
Even though many advancements have been achieved with regard to the recognition of handwritten characters, researchers still face difficulties with the handwritten character recognition problem, especially with the advent of new datasets like the Extended Modified National Institute of Standards and Technology dataset (EMNIST). The EMNIST dataset represents a challenge for both machine-learning and deep-learning techniques due to inter-class similarity and intra-class variability. Inter-class similarity exists because of the similarity between the shapes of certain characters in the dataset. The presence of intra-class variability is mainly due to different shapes written by different writers for the same character. In this research, we have optimized a deep residual network to achieve higher accuracy than the published state-of-the-art results. This approach is mainly based on the prebuilt deep residual network model ResNet18, whose architecture has been enhanced by using the optimal number of residual blocks and the optimal size of the receptive field of the first convolutional filter, replacing the first max-pooling filter with an average pooling filter, and adding a drop-out layer before the fully connected layer. A distinctive modification has been introduced by replacing the final addition layer with a depth concatenation layer, which resulted in a novel deep architecture having higher accuracy than the pure residual architecture. Moreover, the sizes of the dataset images have been adjusted to optimize their visibility in the network. Finally, by tuning the training hyperparameters and using rotation and shear augmentations, the proposed model outperformed the state-of-the-art models by achieving average accuracies of 95.91% and 90.90% for the Letters and Balanced dataset sections, respectively. Furthermore, the average accuracies were improved to 95.9% and 91.06% for the Letters and Balanced sections, respectively, by using a group of 5 instances of the trained models and averaging the output class probabilities.
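The final ensembling step, averaging the output class probabilities over five trained instances, can be sketched like this (shapes, variable names, and the toy data are assumptions).

```python
import numpy as np

def ensemble_predict(prob_list):
    """Average per-model class probabilities and take the argmax per sample.

    prob_list: list of arrays, each of shape (n_samples, n_classes),
    e.g. the softmax outputs of 5 independently trained model instances.
    """
    mean_probs = np.mean(np.stack(prob_list, axis=0), axis=0)
    return mean_probs.argmax(axis=1)

# Toy example: 2 samples, 3 classes, 5 model instances.
rng = np.random.default_rng(0)
outputs = [rng.dirichlet(np.ones(3), size=2) for _ in range(5)]
print(ensemble_predict(outputs))
```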
Traditional vehicle detection algorithms use traverse-search based vehicle candidate generation and hand-crafted classifier training for vehicle candidate verification. These types of methods generally have high processing times and low vehicle detection performance. To address this issue, a vehicle detection algorithm based on visual saliency and a deep sparse convolution hierarchical model is proposed. A visual saliency calculation is first used to generate a small vehicle candidate area. The vehicle candidate sub-images are then loaded into a sparse deep convolution hierarchical model with an SVM-based classifier to perform the final detection. The experimental results demonstrate that the proposed method achieves a 94.81% correct rate and a 0.78% false detection rate on the existing datasets and on real road pictures captured by our group, which outperforms the existing state-of-the-art algorithms. More importantly, highly discriminative multi-scale features are generated by the deep sparse convolution network, which has broad application prospects for target recognition in the field of intelligent vehicles.
Few-shot semantic segmentation aims at training a model that can segment novel classes in a query image with only a few densely annotated support exemplars. It remains a challenge because of large intra-class variations between the support and query images. Existing approaches utilize 4D convolutions to mine semantic correspondence between the support and query images. However, they still suffer from heavy computation, sparse correspondence, and large memory. We propose the axial assembled correspondence network (AACNet) to alleviate these issues. The key point of AACNet is the proposed axial assembled 4D kernel, which constructs the basic block for the semantic correspondence encoder (SCE). Furthermore, we propose deblurring equations to provide more robust correspondence for the aforementioned SCE and design a novel fusion module to mix correspondences in a learnable manner. Experiments on PASCAL-5^i reveal that our AACNet achieves a mean intersection-over-union score of 65.9% for 1-shot segmentation and 70.6% for 5-shot segmentation, surpassing the state-of-the-art method by 5.8% and 5.0%, respectively.
Offensive messages on social media have recently been frequently used to harass and criticize people. In recent studies, many promising algorithms have been developed to identify offensive texts. Most algorithms analyze text in a unidirectional manner, whereas a bidirectional method can maximize performance and capture the semantic and contextual information in sentences. In addition, there are many separate models for identifying offensive texts in either monolingual or multilingual settings, but only a few models can detect both monolingual and multilingual offensive texts. In this study, a detection system has been developed for both monolingual and multilingual offensive texts by combining a deep convolutional neural network with bidirectional encoder representations from transformers (Deep-BERT) to identify offensive posts on social media that are used to harass others. This paper explores a variety of ways to deal with multilingualism, including collaborative multilingual and translation-based approaches. The Deep-BERT is then tested on the Bengali and English datasets, including different bidirectional encoder representations from transformers (BERT) pre-trained word-embedding techniques, and the proposed Deep-BERT's efficacy is found to outperform all existing offensive text classification algorithms, reaching an accuracy of 91.83%. The proposed model is a state-of-the-art model that can classify both monolingual-based and multilingual-based offensive texts.
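As a hedged sketch of the general idea of feeding BERT token embeddings into a convolutional layer for binary offensive-text classification: this is not the authors' Deep-BERT architecture, and the multilingual checkpoint name, layer sizes, and pooling choice are assumptions for illustration.

```python
import torch
import torch.nn as nn
from transformers import AutoTokenizer, AutoModel

# Multilingual BERT backbone followed by a small 1D-convolutional classifier head.
tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
bert = AutoModel.from_pretrained("bert-base-multilingual-cased")

conv = nn.Conv1d(in_channels=768, out_channels=128, kernel_size=3, padding=1)
classifier = nn.Linear(128, 2)   # offensive vs. non-offensive

def predict(texts):
    enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = bert(**enc).last_hidden_state        # (batch, seq_len, 768)
    feats = torch.relu(conv(hidden.transpose(1, 2)))  # (batch, 128, seq_len)
    pooled = feats.max(dim=2).values                  # global max pooling over tokens
    return classifier(pooled).softmax(dim=-1)

print(predict(["you are awful", "have a nice day"]).shape)  # torch.Size([2, 2])
```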
基金supported by the National Natural Science Foundation of China-China State Railway Group Co.,Ltd.Railway Basic Research Joint Fund (Grant No.U2268217)the Scientific Funding for China Academy of Railway Sciences Corporation Limited (No.2021YJ183).
文摘Graph Convolutional Neural Networks(GCNs)have been widely used in various fields due to their powerful capabilities in processing graph-structured data.However,GCNs encounter significant challenges when applied to scale-free graphs with power-law distributions,resulting in substantial distortions.Moreover,most of the existing GCN models are shallow structures,which restricts their ability to capture dependencies among distant nodes and more refined high-order node features in scale-free graphs with hierarchical structures.To more broadly and precisely apply GCNs to real-world graphs exhibiting scale-free or hierarchical structures and utilize multi-level aggregation of GCNs for capturing high-level information in local representations,we propose the Hyperbolic Deep Graph Convolutional Neural Network(HDGCNN),an end-to-end deep graph representation learning framework that can map scale-free graphs from Euclidean space to hyperbolic space.In HDGCNN,we define the fundamental operations of deep graph convolutional neural networks in hyperbolic space.Additionally,we introduce a hyperbolic feature transformation method based on identity mapping and a dense connection scheme based on a novel non-local message passing framework.In addition,we present a neighborhood aggregation method that combines initial structural featureswith hyperbolic attention coefficients.Through the above methods,HDGCNN effectively leverages both the structural features and node features of graph data,enabling enhanced exploration of non-local structural features and more refined node features in scale-free or hierarchical graphs.Experimental results demonstrate that HDGCNN achieves remarkable performance improvements over state-ofthe-art GCNs in node classification and link prediction tasks,even when utilizing low-dimensional embedding representations.Furthermore,when compared to shallow hyperbolic graph convolutional neural network models,HDGCNN exhibits notable advantages and performance enhancements.
文摘The development of defect prediction plays a significant role in improving software quality. Such predictions are used to identify defective modules before the testing and to minimize the time and cost. The software with defects negatively impacts operational costs and finally affects customer satisfaction. Numerous approaches exist to predict software defects. However, the timely and accurate software bugs are the major challenging issues. To improve the timely and accurate software defect prediction, a novel technique called Nonparametric Statistical feature scaled QuAdratic regressive convolution Deep nEural Network (SQADEN) is introduced. The proposed SQADEN technique mainly includes two major processes namely metric or feature selection and classification. First, the SQADEN uses the nonparametric statistical Torgerson–Gower scaling technique for identifying the relevant software metrics by measuring the similarity using the dice coefficient. The feature selection process is used to minimize the time complexity of software fault prediction. With the selected metrics, software fault perdition with the help of the Quadratic Censored regressive convolution deep neural network-based classification. The deep learning classifier analyzes the training and testing samples using the contingency correlation coefficient. The softstep activation function is used to provide the final fault prediction results. To minimize the error, the Nelder–Mead method is applied to solve non-linear least-squares problems. Finally, accurate classification results with a minimum error are obtained at the output layer. Experimental evaluation is carried out with different quantitative metrics such as accuracy, precision, recall, F-measure, and time complexity. The analyzed results demonstrate the superior performance of our proposed SQADEN technique with maximum accuracy, sensitivity and specificity by 3%, 3%, 2% and 3% and minimum time and space by 13% and 15% when compared with the two state-of-the-art methods.
基金National Key R&D Program of China(2021YFC3000905)Open Research Program of the State Key Laboratory of Severe Weather(2022LASW-B09)National Natural Science Foundation of China(42375010)。
文摘In this paper,we utilized the deep convolutional neural network D-LinkNet,a model for semantic segmentation,to analyze the Himawari-8 satellite data captured from 16 channels at a spatial resolution of 0.5 km,with a focus on the area over the Yellow Sea and the Bohai Sea(32°-42°N,117°-127°E).The objective was to develop an algorithm for fusing and segmenting multi-channel images from geostationary meteorological satellites,specifically for monitoring sea fog in this region.Firstly,the extreme gradient boosting algorithm was adopted to evaluate the data from the 16 channels of the Himawari-8 satellite for sea fog detection,and we found that the top three channels in order of importance were channels 3,4,and 14,which were fused into false color daytime images,while channels 7,13,and 15 were fused into false color nighttime images.Secondly,the simple linear iterative super-pixel clustering algorithm was used for the pixel-level segmentation of false color images,and based on super-pixel blocks,manual sea-fog annotation was performed to obtain fine-grained annotation labels.The deep convolutional neural network D-LinkNet was built on the ResNet backbone and the dilated convolutional layers with direct connections were added in the central part to form a string-and-combine structure with five branches having different depths and receptive fields.Results show that the accuracy rate of fog area(proportion of detected real fog to detected fog)was 66.5%,the recognition rate of fog zone(proportion of detected real fog to real fog or cloud cover)was 51.9%,and the detection accuracy rate(proportion of samples detected correctly to total samples)was 93.2%.
文摘This paper proposes a cascade deep convolutional neural network to address the loosening detection problem of bolts on axlebox covers.Firstly,an SSD network based on ResNet50 and CBAM module by improving bolt image features is proposed for locating bolts on axlebox covers.And then,theA2-PFN is proposed according to the slender features of the marker lines for extracting more accurate marker lines regions of the bolts.Finally,a rectangular approximationmethod is proposed to regularize themarker line regions asaway tocalculate the angle of themarker line and plot all the angle values into an angle table,according to which the criteria of the angle table can determine whether the bolt with the marker line is in danger of loosening.Meanwhile,our improved algorithm is compared with the pre-improved algorithmin the object localization stage.The results show that our proposed method has a significant improvement in both detection accuracy and detection speed,where ourmAP(IoU=0.75)reaches 0.77 and fps reaches 16.6.And in the saliency detection stage,after qualitative comparison and quantitative comparison,our method significantly outperforms other state-of-the-art methods,where our MAE reaches 0.092,F-measure reaches 0.948 and AUC reaches 0.943.Ultimately,according to the angle table,out of 676 bolt samples,a total of 60 bolts are loose,69 bolts are at risk of loosening,and 547 bolts are tightened.
文摘Enabling high mobility applications in millimeter wave(mmWave)based systems opens up a slew of new possibilities,including vehicle communi-cations in addition to wireless virtual/augmented reality.The narrow beam usage in addition to the millimeter waves sensitivity might block the coverage along with the reliability of the mobile links.In this research work,the improvement in the quality of experience faced by the user for multimedia-related applications over the millimeter-wave band is investigated.The high attenuation loss in high frequencies is compensated with a massive array structure named Multiple Input and Multiple Output(MIMO)which is utilized in a hyperdense environment called heterogeneous networks(HetNet).The optimization problem which arises while maximizing the Mean Opinion Score(MOS)is analyzed along with the QoE(Quality of Experience)metric by considering the Base Station(BS)powers in addition to the needed Quality of Service(QoS).Most of the approaches related to wireless network communication are not suitable for the millimeter-wave band because of its problems due to high complexity and its dynamic nature.Hence a deep reinforcement learning framework is developed for tackling the same opti-mization problem.In this work,a Fuzzy-based Deep Convolutional Neural Net-work(FDCNN)is proposed in addition to a Deep Reinforcing Learning Framework(DRLF)for extracting the features of highly correlated data.The investigational results prove that the proposed method yields the highest satisfac-tion to the user by increasing the number of antennas in addition with the small-scale antennas at the base stations.The proposed work outperforms in terms of MOS with multiple antennas.
文摘In recent years,computer visionfinds wide applications in maritime surveillance with its sophisticated algorithms and advanced architecture.Auto-matic ship detection with computer vision techniques provide an efficient means to monitor as well as track ships in water bodies.Waterways being an important medium of transport require continuous monitoring for protection of national security.The remote sensing satellite images of ships in harbours and water bodies are the image data that aid the neural network models to localize ships and to facilitate early identification of possible threats at sea.This paper proposes a deep learning based model capable enough to classify between ships and no-ships as well as to localize ships in the original images using bounding box tech-nique.Furthermore,classified ships are again segmented with deep learning based auto-encoder model.The proposed model,in terms of classification,provides suc-cessful results generating 99.5%and 99.2%validation and training accuracy respectively.The auto-encoder model also produces 85.1%and 84.2%validation and training accuracies.Moreover the IoU metric of the segmented images is found to be of 0.77 value.The experimental results reveal that the model is accu-rate and can be implemented for automatic ship detection in water bodies consid-ering remote sensing satellite images as input to the computer vision system.
文摘The most widely farmed fruit in the world is mango.Both the production and quality of the mangoes are hampered by many diseases.These diseases need to be effectively controlled and mitigated.Therefore,a quick and accurate diagnosis of the disorders is essential.Deep convolutional neural networks,renowned for their independence in feature extraction,have established their value in numerous detection and classification tasks.However,it requires large training datasets and several parameters that need careful adjustment.The proposed Modified Dense Convolutional Network(MDCN)provides a successful classification scheme for plant diseases affecting mango leaves.This model employs the strength of pre-trained networks and modifies them for the particular context of mango leaf diseases by incorporating transfer learning techniques.The data loader also builds mini-batches for training the models to reduce training time.Finally,optimization approaches help increase the overall model’s efficiency and lower computing costs.MDCN employed on the MangoLeafBD Dataset consists of a total of 4,000 images.Following the experimental results,the proposed system is compared with existing techniques and it is clear that the proposed algorithm surpasses the existing algorithms by achieving high performance and overall throughput.
基金supported by the Fundamental Research Funds for the Central Universities of China(Grant No.2013SCU11006)the Key Laboratory of Digital Mapping and Land Information Application of National Administration of Surveying,Mapping and Geoinformation of China(Grant NO.DM2014SC02)the Key Laboratory of Geospecial Information Technology,Ministry of Land and Resources of China(Grant NO.KLGSIT201504)
文摘The development of precision agriculture demands high accuracy and efficiency of cultivated land information extraction. As a new means of monitoring the ground in recent years, unmanned aerial vehicle (UAV) low-height remote sensing technique, which is flexible, efficient with low cost and with high resolution, is widely applied to investing various resources. Based on this, a novel extraction method for cultivated land information based on Deep Convolutional Neural Network and Transfer Learning (DTCLE) was proposed. First, linear features (roads and ridges etc.) were excluded based on Deep Convolutional Neural Network (DCNN). Next, feature extraction method learned from DCNN was used to cultivated land information extraction by introducing transfer learning mechanism. Last, cultivated land information extraction results were completed by the DTCLE and eCognifion for cultivated land information extraction (ECLE). The location of the Pengzhou County and Guanghan County, Sichuan Province were selected for the experimental purpose. The experimental results showed that the overall precision for the experimental image 1, 2 and 3 (of extracting cultivated land) with the DTCLE method was 91.7%, 88.1% and 88.2% respectively, and the overall precision of ECLE is 9o.7%, 90.5% and 87.0%, respectively. Accuracy of DTCLE was equivalent to that of ECLE, and also outperformed ECLE in terms of integrity and continuity.
基金This work was supported by the National Research Foundation of Korea(NRF)grant funded by the Korea government(MSIT)(NRF-2018R1A2B6007333).
文摘Compressive strength of concrete is a significant factor to assess building structure health and safety.Therefore,various methods have been developed to evaluate the compressive strength of concrete structures.However,previous methods have several challenges in costly,time-consuming,and unsafety.To address these drawbacks,this paper proposed a digital vision based concrete compressive strength evaluating model using deep convolutional neural network(DCNN).The proposed model presented an alternative approach to evaluating the concrete strength and contributed to improving efficiency and accuracy.The model was developed with 4,000 digital images and 61,996 images extracted from video recordings collected from concrete samples.The experimental results indicated a root mean square error(RMSE)value of 3.56(MPa),demonstrating a strong feasibility that the proposed model can be utilized to predict the concrete strength with digital images of their surfaces and advantages to overcome the previous limitations.This experiment contributed to provide the basis that could be extended to future research with image analysis technique and artificial neural network in the diagnosis of concrete building structures.
基金This work was supported by the National Research Foundation of Korea(NRF)grant funded by the Korea government(MSIT)(NRF-2018R1A2B6007333)This study was supported by 2018 Research Grant from Kangwon National University.
文摘In this study,we examined the efficacy of a deep convolutional neural network(DCNN)in recognizing concrete surface images and predicting the compressive strength of concrete.A digital single-lens reflex(DSLR)camera and microscope were simultaneously used to obtain concrete surface images used as the input data for the DCNN.Thereafter,training,validation,and testing of the DCNNs were performed based on the DSLR camera and microscope image data.Results of the analysis indicated that the DCNN employing DSLR image data achieved a relatively higher accuracy.The accuracy of the DSLR-derived image data was attributed to the relatively wider range of the DSLR camera,which was beneficial for extracting a larger number of features.Moreover,the DSLR camera procured more realistic images than the microscope.Thus,when the compressive strength of concrete was evaluated using the DCNN employing a DSLR camera,time and cost were reduced,whereas the usefulness increased.Furthermore,an indirect comparison of the accuracy of the DCNN with that of existing non-destructive methods for evaluating the strength of concrete proved the reliability of DCNN-derived concrete strength predictions.In addition,it was determined that the DCNN used for concrete strength evaluations in this study can be further expanded to detect and evaluate various deteriorative factors that affect the durability of structures,such as salt damage,carbonation,sulfation,corrosion,and freezing-thawing.
文摘Early diagnosis and detection are important tasks in controlling the spread of COVID-19.A number of Deep Learning techniques has been established by researchers to detect the presence of COVID-19 using CT scan images and X-rays.However,these methods suffer from biased results and inaccurate detection of the disease.So,the current research article developed Oppositional-based Chimp Optimization Algorithm and Deep Dense Convolutional Neural Network(OCOA-DDCNN)for COVID-19 prediction using CT images in IoT environment.The proposed methodology works on the basis of two stages such as pre-processing and prediction.Initially,CT scan images generated from prospective COVID-19 are collected from open-source system using IoT devices.The collected images are then preprocessed using Gaussian filter.Gaussian filter can be utilized in the removal of unwanted noise from the collected CT scan images.Afterwards,the preprocessed images are sent to prediction phase.In this phase,Deep Dense Convolutional Neural Network(DDCNN)is applied upon the pre-processed images.The proposed classifier is optimally designed with the consideration of Oppositional-basedChimp Optimization Algorithm(OCOA).This algorithm is utilized in the selection of optimal parameters for the proposed classifier.Finally,the proposed technique is used in the prediction of COVID-19 and classify the results as either COVID-19 or non-COVID-19.The projected method was implemented in MATLAB and the performances were evaluated through statistical measurements.The proposed method was contrasted with conventional techniques such as Convolutional Neural Network-Firefly Algorithm(CNN-FA),Emperor Penguin Optimization(CNN-EPO)respectively.The results established the supremacy of the proposed model.
文摘Single image super-resolution(SISR)is a fundamentally challenging problem because a low-resolution(LR)image can correspond to a set of high-resolution(HR)images,while most are not expected.Recently,SISR can be achieved by a deep learning-based method.By constructing a very deep super-resolution convolutional neural network(VDSRCNN),the LR images can be improved to HR images.This study mainly achieves two objectives:image super-resolution(ISR)and deblurring the image from VDSRCNN.Firstly,by analyzing ISR,we modify different training parameters to test the performance of VDSRCNN.Secondly,we add the motion blurred images to the training set to optimize the performance of VDSRCNN.Finally,we use image quality indexes to evaluate the difference between the images from classical methods and VDSRCNN.The results indicate that the VDSRCNN performs better in generating HR images from LR images using the optimized VDSRCNN in a proper method.
文摘The novel coronavirus 2019(COVID-19)rapidly spreading around the world and turns into a pandemic situation,consequently,detecting the coronavirus(COVID-19)affected patients are now the most critical task for medical specialists.The deficiency of medical testing kits leading to huge complexity in detecting COVID-19 patients worldwide,resulting in the number of infected cases is expanding.Therefore,a significant study is necessary about detecting COVID-19 patients using an automated diagnosis method,which hinders the spreading of coronavirus.In this paper,the study suggests a Deep Convolutional Neural Network-based multi-classification framework(COV-MCNet)using eight different pre-trained architectures such as VGG16,VGG19,ResNet50V2,DenseNet201,InceptionV3,MobileNet,InceptionResNetV2,Xception which are trained and tested on the X-ray images of COVID-19,Normal,Viral Pneumonia,and Bacterial Pneumonia.The results from 4-class(Normal vs.COVID-19 vs.Viral Pneumonia vs.Bacterial Pneumonia)demonstrated that the pre-trained model DenseNet201 provides the highest classification performance(accuracy:92.54%,precision:93.05%,recall:92.81%,F1-score:92.83%,specificity:97.47%).Notably,the DenseNet201(4-class classification)pre-trained model in the proposed COV-MCNet framework showed higher accuracy compared to the rest seven models.Important to mention that the proposed COV-MCNet model showed comparatively higher classification accuracy based on the small number of pre-processed datasets that specifies the designed system can produce superior results when more data become available.The proposed multi-classification network(COV-MCNet)significantly speeds up the existing radiology based method which will be helpful for the medical community and clinical specialists to early diagnosis the COVID-19 cases during this pandemic.
基金supported by the Doctoral Fund Project(Grant No.X22003Z).
文摘As an important component of load transfer,various fatigue damages occur in the track as the rail service life and train traffic increase gradually,such as rail corrugation,rail joint damage,uneven thermite welds,rail squats fas-tener defects,etc.Real-time recognition of track defects plays a vital role in ensuring the safe and stable operation of rail transit.In this paper,an intelligent and innovative method is proposed to detect the track defects by using axle-box vibration acceleration and deep learning network,and the coexistence of the above-mentioned typical track defects in the track system is considered.Firstly,the dynamic relationship between the track defects(using the example of the fastening defects)and the axle-box vibration acceleration(ABVA)is investigated using the dynamic vehicle-track model.Then,a simulation model for the coupled dynamics of the vehicle and track with different track defects is established,and the wavelet power spectrum(WPS)analysis is performed for the vibra-tion acceleration signals of the axle box to extract the characteristic response.Lastly,using wavelet spectrum photos as input,an automatic detection technique based on the deep convolution neural network(DCNN)is sug-gested to realize the real-time intelligent detection and identification of various track problems.Thefindings demonstrate that the suggested approach achieves a 96.72%classification accuracy.
Funding: The research has been supported by the Qingdao Science and Technology Demonstration and Guidance Project (Grant No. 20-3-4-45-nsh), the Academic Promotion Plan of Shandong First Medical University & Shandong Academy of Medical Sciences (Grant No. 2019ZL001), and the National Science and Technology Major Project of China (Grant No. 2017ZX09304010).
Abstract: Background: Myopic maculopathy (MM) has become a major cause of visual impairment and blindness worldwide, especially in East Asian countries. Deep learning approaches such as deep convolutional neural networks (DCNN) have been successfully applied to identify some common retinal diseases and show great potential for the intelligent analysis of MM. This study aimed to build a reliable approach for automated detection of MM from retinal fundus images using DCNN models. Methods: A dual-stream DCNN (DCNN-DS) model that perceives features from both the original images and corresponding images processed by a color histogram distribution optimization method was designed for classification of no MM, tessellated fundus (TF), and pathologic myopia (PM). A total of 36,515 gradable images from four hospitals were used for DCNN model development, and 14,986 gradable images from two other hospitals were used for external testing. We also compared the performance of the DCNN-DS model and four ophthalmologists on 3000 randomly sampled fundus images. Results: In the two external testing datasets, the DCNN-DS model achieved sensitivities of 93.3% and 91.0%, specificities of 99.6% and 98.7%, and areas under the receiver operating characteristic curves (AUCs) of 0.998 and 0.994 for detecting PM, and sensitivities of 98.8% and 92.8%, specificities of 95.6% and 94.1%, and AUCs of 0.986 and 0.970 for detecting TF. In the sampled testing dataset, the sensitivities of the four ophthalmologists ranged from 88.3% to 95.8% and 81.1% to 89.1%, and their specificities ranged from 95.9% to 99.2% and 77.8% to 97.3% for detecting PM and TF, respectively. Meanwhile, the DCNN-DS model achieved sensitivities of 90.8% and 97.9% and specificities of 99.1% and 94.0% for detecting PM and TF, respectively. Conclusions: The proposed DCNN-DS approach demonstrated reliable performance with high sensitivity, specificity, and AUC in classifying different MM levels on fundus photographs sourced from clinics. It can help identify MM automatically among large myopic groups and shows great potential for real-life applications.
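A hedged sketch of the dual-stream idea described in the Methods: two CNN streams, one for the original fundus image and one for its histogram-processed counterpart, with concatenated features feeding a 3-class head (no MM, TF, PM). The ResNet18 backbones and the fusion-by-concatenation design are assumptions; the paper's exact backbone and color histogram distribution optimization step are not reproduced here.

```python
import torch
import torch.nn as nn
from torchvision import models

class DualStreamDCNN(nn.Module):
    """Two CNN streams (original image and histogram-processed image) whose
    pooled features are concatenated for 3-way classification: no MM, TF, PM."""
    def __init__(self, num_classes: int = 3):
        super().__init__()
        def backbone():
            net = models.resnet18(weights=None)   # any CNN backbone; ResNet18 is illustrative
            net.fc = nn.Identity()                # keep the 512-d pooled features
            return net
        self.stream_raw = backbone()
        self.stream_proc = backbone()
        self.head = nn.Linear(512 * 2, num_classes)

    def forward(self, x_raw, x_proc):
        f = torch.cat([self.stream_raw(x_raw), self.stream_proc(x_proc)], dim=1)
        return self.head(f)

x_raw = torch.rand(2, 3, 224, 224)          # original fundus images
x_proc = torch.rand(2, 3, 224, 224)         # histogram-optimized counterparts
logits = DualStreamDCNN()(x_raw, x_proc)    # shape: (2, 3)
```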
Funding: This work was supported by the Special Funds for the Construction of an Innovative Province of Hunan (Grant No. 2020GK2028), the Natural Science Foundation of Hunan Province (Grant No. 2022JJ30002), the Scientific Research Project of Hunan Provincial Education Department (Grant No. 21B0833), the Scientific Research Key Project of Hunan Education Department (Grant No. 21A0592), and the Scientific Research Project of Hunan Provincial Education Department (Grant No. 22A0663).
Abstract: Deep Convolutional Neural Networks (CNNs) have achieved high accuracy in image classification tasks; however, most existing models are trained on high-quality images that are not subject to image degradation. In practice, images are often affected by various types of degradation, which can significantly impact the performance of CNNs. In this work, we investigate the influence of image degradation on three typical image classification CNNs and propose a Degradation Type Adaptive Image Classification Model (DTA-ICM) to improve the classification accuracy of existing CNNs on degraded images. The proposed DTA-ICM comprises two key components: a Degradation Type Predictor (DTP) and a set of Degradation Type Specified Image Classifiers (DTS-IC), each trained on existing CNNs for a specified type of degradation. The DTP predicts the degradation type of a test image, and the corresponding DTS-IC is then selected to classify the image. We evaluate the performance of both the proposed DTP and the DTA-ICM on the Caltech 101 database. The experimental results demonstrate that the proposed DTP achieves an average accuracy of 99.70%. Moreover, the proposed DTA-ICM, based on AlexNet, VGG19, and ResNet152, exhibits average accuracy improvements of 20.63%, 18.22%, and 12.9%, respectively, compared with the original CNNs in classifying degraded images. This suggests that the proposed DTA-ICM can effectively improve the classification performance of existing CNNs on degraded images, which has important practical implications.
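The routing logic of DTA-ICM (a degradation-type predictor dispatching each image to a degradation-specific classifier) can be sketched as below; the degradation types, the ResNet18 components, and the per-image dispatch loop are illustrative assumptions rather than the paper's configuration, which builds the DTS-ICs on AlexNet, VGG19, and ResNet152.

```python
import torch
import torch.nn as nn
from torchvision import models

DEGRADATION_TYPES = ["clean", "blur", "noise", "jpeg"]   # illustrative set of degradations

class DTAICM(nn.Module):
    """Degradation Type Predictor (DTP) routes each image to one of several
    Degradation Type Specified Image Classifiers (DTS-IC)."""
    def __init__(self, num_classes: int = 101):
        super().__init__()
        self.dtp = models.resnet18(weights=None)
        self.dtp.fc = nn.Linear(self.dtp.fc.in_features, len(DEGRADATION_TYPES))
        self.dts_ics = nn.ModuleList(
            [models.resnet18(weights=None, num_classes=num_classes)
             for _ in DEGRADATION_TYPES]
        )

    def forward(self, x):
        deg_type = self.dtp(x).argmax(dim=1)     # predicted degradation type per image
        logits = torch.stack([self.dts_ics[t](img.unsqueeze(0)).squeeze(0)
                              for img, t in zip(x, deg_type)])
        return logits, deg_type

x = torch.rand(2, 3, 224, 224)
logits, types = DTAICM()(x)    # logits: (2, 101), types: (2,)
```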
Abstract: Even though many advancements have been achieved in handwritten character recognition, researchers still face difficulties with this problem, especially with the advent of new datasets such as the Extended Modified National Institute of Standards and Technology dataset (EMNIST). The EMNIST dataset represents a challenge for both machine-learning and deep-learning techniques due to inter-class similarity and intra-class variability. Inter-class similarity exists because of the similarity between the shapes of certain characters in the dataset. Intra-class variability is mainly due to different shapes written by different writers for the same character. In this research, we have optimized a deep residual network to achieve higher accuracy than the published state-of-the-art results. This approach is mainly based on the prebuilt deep residual network model ResNet18, whose architecture has been enhanced by using the optimal number of residual blocks and the optimal size of the receptive field of the first convolutional filter, the replacement of the first max-pooling filter by an average pooling filter, and the addition of a drop-out layer before the fully connected layer. A distinctive modification has been introduced by replacing the final addition layer with a depth concatenation layer, which resulted in a novel deep architecture with higher accuracy than the pure residual architecture. Moreover, the sizes of the dataset images have been adjusted to optimize their visibility in the network. Finally, by tuning the training hyperparameters and using rotation and shear augmentations, the proposed model outperformed the state-of-the-art models by achieving average accuracies of 95.91% and 90.90% for the Letters and Balanced dataset sections, respectively. Furthermore, the average accuracies were improved to 95.9% and 91.06% for the Letters and Balanced sections, respectively, by using a group of 5 instances of the trained models and averaging the output class probabilities.
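A partial sketch of the ResNet18 modifications listed above, assuming torchvision's ResNet18 as the starting point: the first max-pooling layer is replaced by average pooling and a dropout layer is added before the fully connected classifier. The receptive-field tuning, the depth-concatenation block, and the exact input size are not reproduced; the grayscale 3x3 stem and 28x28 input shown here are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

def emnist_resnet18(num_classes: int = 26) -> nn.Module:
    """ResNet18 adapted for single-channel EMNIST images with two of the
    modifications described above: avg-pool instead of the first max-pool,
    and a dropout layer before the fully connected classifier."""
    net = models.resnet18(weights=None)
    net.conv1 = nn.Conv2d(1, 64, kernel_size=3, stride=1, padding=1, bias=False)  # small grayscale stem (assumption)
    net.maxpool = nn.AvgPool2d(kernel_size=3, stride=2, padding=1)                # max-pool -> avg-pool
    net.fc = nn.Sequential(nn.Dropout(0.5), nn.Linear(net.fc.in_features, num_classes))
    return net

model = emnist_resnet18(num_classes=26)   # EMNIST Letters section
x = torch.rand(8, 1, 28, 28)              # the paper adjusts image sizes; 28x28 is illustrative
print(model(x).shape)                     # torch.Size([8, 26])
```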
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. U1564201, 61573171, 61403172, 51305167), the China Postdoctoral Science Foundation (Grant Nos. 2015T80511, 2014M561592), the Jiangsu Provincial Natural Science Foundation of China (Grant No. BK20140555), the Six Talent Peaks Project of Jiangsu Province, China (Grant Nos. 2015-JXQC-012, 2014-DZXX-040), the Jiangsu Postdoctoral Science Foundation, China (Grant No. 1402097C), and the Jiangsu University Scientific Research Foundation for Senior Professionals, China (Grant No. 14JDG028).
Abstract: Traditional vehicle detection algorithms use traverse-search-based vehicle candidate generation and hand-crafted-feature-based classifier training for vehicle candidate verification. These types of methods generally have high processing times and low vehicle detection performance. To address this issue, a vehicle detection algorithm based on visual saliency and a deep sparse convolution hierarchical model is proposed. A visual saliency calculation is first used to generate a small vehicle candidate area. The vehicle candidate sub-images are then fed into a sparse deep convolution hierarchical model with an SVM-based classifier to perform the final detection. The experimental results demonstrate that the proposed method achieves a 94.81% correct rate and a 0.78% false detection rate on existing datasets and on real road pictures captured by our group, outperforming the existing state-of-the-art algorithms. More importantly, highly discriminative multi-scale features are generated by the deep sparse convolution network, which has broad application prospects for target recognition in the field of intelligent vehicles.
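The abstract does not specify how the visual saliency map is computed, so the sketch below uses the classic spectral-residual saliency of Hou and Zhang as a stand-in for the candidate-generation stage; the threshold and smoothing parameters are illustrative, and the subsequent sparse-convolution feature extraction and SVM verification are not shown.

```python
import numpy as np
from scipy.ndimage import uniform_filter, gaussian_filter

def spectral_residual_saliency(gray: np.ndarray) -> np.ndarray:
    """Spectral-residual saliency map, used here as a stand-in for the
    visual-saliency stage that generates vehicle candidate regions."""
    f = np.fft.fft2(gray)
    log_amp = np.log1p(np.abs(f))            # log1p for numerical safety near zero amplitude
    phase = np.angle(f)
    residual = log_amp - uniform_filter(log_amp, size=3)   # spectral residual
    saliency = np.abs(np.fft.ifft2(np.exp(residual + 1j * phase))) ** 2
    saliency = gaussian_filter(saliency, sigma=2.5)
    return (saliency - saliency.min()) / (np.ptp(saliency) + 1e-8)

frame = np.random.rand(240, 320)             # placeholder grayscale road image
sal = spectral_residual_saliency(frame)
candidates = sal > 0.5                       # threshold to obtain candidate regions (illustrative)
```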
Funding: Supported in part by the Key Research and Development Program of Guangdong Province (2021B0101200001) and the Guangdong Basic and Applied Basic Research Foundation (2020B1515120071).
Abstract: Few-shot semantic segmentation aims at training a model that can segment novel classes in a query image with only a few densely annotated support exemplars. It remains a challenge because of large intra-class variations between the support and query images. Existing approaches utilize 4D convolutions to mine semantic correspondence between the support and query images. However, they still suffer from heavy computation, sparse correspondence, and large memory. We propose the axial assembled correspondence network (AACNet) to alleviate these issues. The key point of AACNet is the proposed axial assembled 4D kernel, which constructs the basic block for the semantic correspondence encoder (SCE). Furthermore, we propose deblurring equations to provide more robust correspondence for the aforementioned SCE and design a novel fusion module to mix correspondences in a learnable manner. Experiments on PASCAL-5^i reveal that our AACNet achieves a mean intersection-over-union score of 65.9% for 1-shot segmentation and 70.6% for 5-shot segmentation, surpassing the state-of-the-art method by 5.8% and 5.0%, respectively.
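The axial assembled 4D kernel itself is not reproduced here; the sketch below only illustrates the general idea of factorizing a 4D convolution over a query-support correlation tensor into two 2D convolutions, one along the query spatial axes and one along the support spatial axes. Tensor shapes, channel counts, and kernel size are illustrative assumptions.

```python
import torch
import torch.nn as nn

class AxialSeparable4DConv(nn.Module):
    """Factorized 4-D convolution over a correlation tensor of shape
    (B, C, Hq, Wq, Hs, Ws): one 2-D conv along the query axes, one along the support axes."""
    def __init__(self, in_ch: int, out_ch: int, k: int = 3):
        super().__init__()
        self.conv_query = nn.Conv2d(in_ch, out_ch, k, padding=k // 2)
        self.conv_support = nn.Conv2d(out_ch, out_ch, k, padding=k // 2)

    def forward(self, corr):
        b, c, hq, wq, hs, ws = corr.shape
        # Convolve over the query spatial dims, treating support dims as batch.
        x = corr.permute(0, 4, 5, 1, 2, 3).reshape(b * hs * ws, c, hq, wq)
        x = self.conv_query(x)
        oc = x.shape[1]
        # Rearrange so the support spatial dims become the convolved plane.
        x = x.reshape(b, hs, ws, oc, hq, wq).permute(0, 4, 5, 3, 1, 2).reshape(b * hq * wq, oc, hs, ws)
        x = self.conv_support(x)
        x = x.reshape(b, hq, wq, oc, hs, ws).permute(0, 3, 1, 2, 4, 5)
        return x   # (B, out_ch, Hq, Wq, Hs, Ws)

corr = torch.rand(1, 4, 8, 8, 8, 8)       # toy query-support correlation tensor
out = AxialSeparable4DConv(4, 16)(corr)
print(out.shape)                           # torch.Size([1, 16, 8, 8, 8, 8])
```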
Abstract: Offensive messages on social media have recently been used frequently to harass and criticize people. In recent studies, many promising algorithms have been developed to identify offensive texts. Most algorithms analyze text in a unidirectional manner, whereas a bidirectional method can maximize performance and capture the semantic and contextual information in sentences. In addition, there are many separate models for identifying offensive texts in monolingual or multilingual settings, but few models can detect both monolingual and multilingual offensive texts. In this study, a detection system has been developed for both monolingual and multilingual offensive texts by combining a deep convolutional neural network and bidirectional encoder representations from transformers (Deep-BERT) to identify offensive posts on social media that are used to harass others. This paper explores a variety of ways to deal with multilingualism, including collaborative multilingual and translation-based approaches. The Deep-BERT is then tested on the Bengali and English datasets, including different bidirectional encoder representations from transformers (BERT) pre-trained word-embedding techniques, and it is found that the proposed Deep-BERT's efficacy outperforms all existing offensive text classification algorithms, reaching an accuracy of 91.83%. The proposed model is a state-of-the-art model that can classify both monolingual and multilingual offensive texts.
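A hedged sketch of the general "BERT encoder followed by convolutional layers" design suggested by the abstract, assuming the Hugging Face transformers library and the multilingual `bert-base-multilingual-cased` checkpoint as the backbone; the convolutional head, number of classes, and training procedure are illustrative and not the paper's exact Deep-BERT configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertCNNClassifier(nn.Module):
    """BERT token embeddings fed through 1-D convolutions for binary
    offensive/non-offensive classification (layer sizes are illustrative)."""
    def __init__(self, model_name: str = "bert-base-multilingual-cased", num_classes: int = 2):
        super().__init__()
        self.bert = AutoModel.from_pretrained(model_name)
        hidden = self.bert.config.hidden_size
        self.conv = nn.Sequential(
            nn.Conv1d(hidden, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveMaxPool1d(1),
        )
        self.fc = nn.Linear(128, num_classes)

    def forward(self, input_ids, attention_mask):
        h = self.bert(input_ids=input_ids, attention_mask=attention_mask).last_hidden_state
        h = self.conv(h.transpose(1, 2)).squeeze(-1)   # (B, hidden, T) -> (B, 128)
        return self.fc(h)

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
batch = tokenizer(["This is a test post."], return_tensors="pt", padding=True, truncation=True)
model = BertCNNClassifier()
logits = model(batch["input_ids"], batch["attention_mask"])   # shape: (1, 2)
```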