● AIM: To establish a classification for congenital cataracts that can facilitate individualized treatment and help identify individuals with a high likelihood of different visual outcomes. ● METHODS: Consecutive patients diagnosed with congenital cataracts and undergoing surgery between January 2005 and November 2021 were recruited. Data on visual outcomes and the phenotypic characteristics of ocular biometry and the anterior and posterior segments were extracted from the patients' medical records. A hierarchical cluster analysis was performed. The main outcome measure was the identification of distinct clusters of eyes with congenital cataracts. ● RESULTS: A total of 164 children (299 eyes) were divided into two clusters based on their ocular features. Cluster 1 (96 eyes) had a shorter axial length (mean ± SD, 19.44 ± 1.68 mm), a low prevalence of macular abnormalities (1.04%), and no retinal abnormalities or posterior cataracts. Cluster 2 (203 eyes) had a greater axial length (mean ± SD, 20.42 ± 2.10 mm) and a higher prevalence of macular abnormalities (8.37%), retinal abnormalities (98.52%), and posterior cataracts (4.93%). Compared with the eyes in Cluster 2 (57.14%), those in Cluster 1 (71.88%) had a 2.2 times higher chance of good best-corrected visual acuity [<0.7 logMAR; OR (95% CI), 2.20 (1.25–3.81); P = 0.006]. ● CONCLUSION: This retrospective study categorizes congenital cataracts into two distinct clusters, each associated with a different likelihood of visual outcomes. This innovative classification may enable the personalization and prioritization of early interventions for patients who may gain the greatest benefit, thereby making strides toward precision medicine in the field of congenital cataracts.
Deep learning has achieved excellent results in various computer vision tasks, especially in fine-grained visual categorization. This task aims to distinguish the subordinate categories of the label-level categories. Because of high intra-class variance and high inter-class similarity, fine-grained visual categorization is extremely challenging. This paper first briefly introduces and analyzes the related public datasets. After that, some of the latest methods are reviewed. Based on the feature types, the feature processing methods, and the overall structure used in the model, we divide them into three types: methods based on a general convolutional neural network (CNN) and strong supervision of parts, methods based on single feature processing, and methods based on multiple feature processing. Most methods of the first type have a relatively simple structure, which reflects their origin in the initial research. The methods of the other two types include models with special structures and training processes, which help to obtain discriminative features. We conduct a specific analysis of several methods with high accuracy on public datasets. In addition, we suggest that future research should focus on reducing the demand of existing methods for large amounts of data and computing power. On the technical side, extracting subtle feature information with the burgeoning vision transformer (ViT) network is also an important research direction.
Based on reviews and summaries of the naming schemes of fine-grained sedimentary rocks, and analysis of the characteristics of fine-grained sedimentary rocks, the problems existing in the classification and naming of fine-grained sedimentary rocks are discussed. On this basis, following the principle of three-level nomenclature, a new scheme of rock classification and naming for fine-grained sedimentary rocks is determined from two perspectives. First, fine-grained sedimentary rocks are divided into 12 types in two major categories, mudstone and siltstone, according to particle size (sand, silt and mud). Second, fine-grained sedimentary rocks are divided into 18 types in four categories (carbonate rock, fine-grained felsic sedimentary rock, clay rock and mixed fine-grained sedimentary rock) according to mineral composition, with carbonate minerals, felsic detrital minerals and clay minerals as the three end-members. Considering the importance of organic matter in unconventional oil and gas generation and evaluation, organic matter is taken as the fourth element in the scheme. Taking organic matter contents of 0.5% and 2% as dividing points, fine-grained sedimentary rocks are divided into three categories: organic-poor, organic-bearing, and organic-rich. The new scheme meets the requirements of current unconventional oil and gas exploration and development and resolves the conceptual confusion surrounding fine-grained sedimentary rocks, providing a unified basic term system for research in fine-grained sedimentology.
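The organic-matter tier of the scheme above can be sketched as a simple threshold function. This is a minimal illustration only: the function name is hypothetical, the input is assumed to be organic matter content in percent, and the abstract does not say which class the boundary values 0.5% and 2% belong to, so assigning them to the higher class is an assumption.

```python
def organic_matter_class(organic_content_percent: float) -> str:
    """Map organic matter content (percent) to one of three classes.

    Dividing points of 0.5% and 2% come from the abstract; treating a value
    exactly on a boundary as the higher class is an assumption.
    """
    if organic_content_percent < 0:
        raise ValueError("organic matter content cannot be negative")
    if organic_content_percent < 0.5:
        return "organic-poor"
    if organic_content_percent < 2.0:
        return "organic-bearing"
    return "organic-rich"


if __name__ == "__main__":
    for value in (0.3, 1.0, 2.5):
        print(value, organic_matter_class(value))
```

In the full scheme this label would be combined with the particle-size or mineral-composition name; that combination is not sketched here because the abstract does not spell out the naming rules.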
Fine-grained classification of ships in remote sensing images makes it possible to identify specific ship types, and it has broad application prospects in civil and military fields. However, current models do not account for the properties of ship targets in remote sensing images, namely mixed multi-granularity features and a complicated background, so there is still room to improve classification performance. To address the challenges posed by these characteristics, this paper proposes a Metaformer and Residual fusion network based on the Visual Attention Network (VAN-MR) for fine-grained classification tasks. For the complex background of remote sensing images, the VAN-MR model adopts a parallel structure of large kernel attention and spatial attention to enhance the model's ability to extract features of targets of interest and improve the classification performance for remote sensing ship targets. For the problem of multi-grained feature mixing in remote sensing images, the VAN-MR model uses a parallel network of a Metaformer structure and residual modules to extract ship features. The parallel branches have different depths, so both high-level and low-level semantic information is considered. The model achieves better classification performance on remote sensing ship images with multi-granularity mixing. Finally, the model achieves 88.73% and 94.56% accuracy on the public fine-grained ship collection-23 (FGSC-23) and FGSCR-42 datasets, respectively, with a parameter size of only 53.47 M and 9.9 G floating-point operations. The experimental results show that the classification performance of VAN-MR is superior to that of traditional CNN models and Transformer-based vision models with the same number of parameters.
The process of human natural scene categorization consists of two correlated stages: visual perception and visual cognition of natural scenes. Inspired by this fact, we propose a biologically plausible approach for natural scene image classification. This approach consists of one visual perception model and two visual cognition models. The visual perception model, composed of two steps, is used to extract discriminative features from natural scene images. In the first step, we mimic the oriented and bandpass properties of the human primary visual cortex with a special complex wavelet transform, which can decompose a natural scene image into a series of 2D spatial structure signals. In the second step, a hybrid statistical feature extraction method is used to generate gist features from those 2D spatial structure signals. Then we design a cognitive feedback model to adaptively optimize the visual perception model. Finally, we build a multiple-semantics-based cognition model to imitate the human cognitive mode in rapid natural scene categorization. Experiments on natural scene datasets show that the proposed method achieves high efficiency and accuracy for natural scene classification.
The Corona Virus Disease 2019 (COVID-19) pandemic has made telecommuting and remote learning the norm. The growing number of Internet-connected devices provides cyber attackers with more attack vectors. Malware developed by criminals also incorporates a number of sophisticated obfuscation techniques, making it difficult to classify and detect malware using conventional approaches. Therefore, this paper proposes a novel visualization-based malware classification system using transfer and ensemble learning (VMCTE). VMCTE has a strong anti-interference ability. Even if malware uses obfuscation, fuzzing, encryption, and other techniques to evade detection, it can be accurately classified into its corresponding malware family. Unlike traditional dynamic and static analysis techniques, VMCTE requires neither reverse engineering nor domain expert knowledge. The proposed classification system combines three strong deep convolutional neural networks (ResNet50, MobileNetV1, and MobileNetV2) as feature extractors, reduces the dimensionality of the extracted features using principal component analysis, and employs a support vector machine to establish the classification model. The semantic representations of malware images can be extracted using various convolutional neural network (CNN) architectures, obtaining higher-quality features than traditional methods. Integrating fine-tuned and non-fine-tuned classification models based on transfer learning can greatly enhance the capacity to classify various families of malware. The experimental findings on the Malimg dataset demonstrate that VMCTE attains 99.64%, 99.64%, 99.66%, and 99.64% accuracy, F1-score, precision, and recall, respectively.
A ransomware attack that interrupted the operation of Colonial Pipeline (a large U.S. oil pipeline company) showed that security threats from malware have become serious enough to affect industries and social infrastructure rather than individuals alone. To respond to such attacks, the agents and characteristics of the attacks should be identified and appropriate strategies established accordingly. For this purpose, the first task that must be performed is malware classification. Malware creators are well aware of this and apply various concealment and avoidance techniques, making it difficult to classify malware. This study focuses on new features and classification techniques to overcome these difficulties. We propose a behavioral performance visualization method using utilization patterns of system resources, such as the central processing unit, memory, and input/output, that are commonly used in performance analysis or tuning of programs. We extracted the usage patterns of the system resources for ransomware to perform behavioral performance visualization. The results of the classification performance evaluation using the visualization results indicate an accuracy of at least 98.94% with a 3.69% loss rate. Furthermore, we designed and implemented a framework that performs the entire process, from data extraction to behavioral performance visualization and classification performance measurement, which is expected to contribute to related studies in the future.
With the rapid development of deepfake technology, fake synthetic content of various types is becoming increasingly realistic, which brings potential security threats to people's daily life and social stability. Currently, most algorithms define deepfake detection as a binary classification problem, i.e., global features are first extracted using a backbone network and then fed into a binary classifier to distinguish real from fake. However, the differences between real and fake samples are often subtle and local, and such global feature-based detection algorithms are not optimal in efficiency and accuracy. To enhance the extraction of forgery details in deep forgery samples, we propose a multi-branch deepfake detection algorithm based on fine-grained features, approaching the problem from the perspective of fine-grained classification. First, to address the critical problem of locating discriminative feature regions in fine-grained classification tasks, we investigate a method for locating multiple different discriminative regions and design a lightweight feature localization module that obtains crucial feature representations by augmenting the most significant parts of the feature map. Second, using information complementation, we introduce a correlation-guided fusion module to enhance the discriminative feature information of different branches. Finally, we use a global attention module in the multi-branch model to improve the cross-dimensional interaction of spatial-domain and channel-domain information and increase the weights of crucial feature regions and feature channels. We conduct extensive ablation and comparative experiments. The experimental results show that the algorithm outperforms representative detection algorithms of recent years in detection accuracy and effectiveness on the FaceForensics++ and Celeb-DF-v2 datasets.
With the development of the short video industry, videos and bullet screen comments have become important ways to spread public opinion. Public attitudes can be obtained in a timely manner through sentiment analysis of bullet screen comments, which can also reduce the difficulty of managing online public opinion. A convolutional neural network model based on multi-head attention is proposed to solve the problem of how to effectively model relations among words and identify key words in emotion classification tasks where the text is short and complete context information is lacking. First, word positions are encoded so that the model can use the order information of input sequences. Second, a multi-head attention mechanism is used to obtain semantic expressions in different subspaces, effectively capture internal relevance, strengthen dependency relationships among words, and highlight the emotional weights of key emotional words. Then a dilated convolution is used to enlarge the receptive field and extract more features. On this basis, the multi-head attention mechanism is combined with a convolutional neural network to model and analyze the seven emotional categories of bullet screen comments. Experimental results, evaluated from the perspectives of both model and dataset, validate the effectiveness of our approach. Finally, the emotions of bullet screen comments are visualized to provide data support for the control of hot events and for other fields.
Pneumonia is one of the main diseases causing death in children. It is generally diagnosed through chest X-ray images. With the development of Deep Learning (DL), DL-based diagnosis of pneumonia has received extensive attention. However, because the difference between pneumonia and normal images is small, the performance of DL methods could be improved. This research proposes a new fine-grained Convolutional Neural Network (CNN) for children's pneumonia diagnosis (FG-CPD). First, fine-grained CNN classification, which can handle slight differences between images, is investigated. To process real-world chest X-ray data, the YOLOv4 algorithm is trained to detect and locate the chest region in the raw images. Second, a novel attention network named SGNet is proposed, which integrates the spatial information and channel information of the images to locate the discriminative parts of the chest image and thereby widen the difference between pneumonia and normal images. Third, an automatic data augmentation method is adopted to increase the diversity of the images and avoid overfitting of FG-CPD. FG-CPD has been tested on the public Chest X-ray 2017 dataset, and the results show that it performs well. FG-CPD is then tested on real chest X-ray images of children aged 3–12 years from Tongji Hospital. The results show that FG-CPD achieves up to 96.91% accuracy, which supports the potential of FG-CPD.
A Histogram Intersection Kernel Support Vector Machine (SVM) was used for the image classification problem. First, each image was split into blocks, and each block was represented by Scale Invariant Feature Transform (SIFT) descriptors. Second, the k-means clustering method was applied to separate the SIFT descriptors into groups, each group representing a visual keyword. Third, the SIFT descriptors in each image were counted to construct a histogram for each image. Finally, the Histogram Intersection Kernel was built from these histograms. In our experimental study, we use Corel-low images to test our method. The Histogram Intersection Kernel SVM performs better than a typical RBF kernel SVM.
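The final step above, building the kernel from per-image histograms, can be sketched in a few lines. The histogram intersection of two histograms is the sum of their bin-wise minima; the function names below are hypothetical, and the toy histograms stand in for visual-keyword counts that would come from the SIFT and k-means steps, which are not reproduced here.

```python
def histogram_intersection(hist_a, hist_b):
    """Histogram intersection kernel value: sum of bin-wise minima."""
    if len(hist_a) != len(hist_b):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(hist_a, hist_b))


def kernel_matrix(histograms):
    """Gram matrix of pairwise intersection values for a list of histograms."""
    return [[histogram_intersection(p, q) for q in histograms] for p in histograms]


if __name__ == "__main__":
    # Toy visual-keyword counts for three images (illustrative values only).
    images = [[3, 0, 2], [1, 4, 2], [0, 1, 5]]
    for row in kernel_matrix(images):
        print(row)
```

A Gram matrix of this kind can be handed to any SVM implementation that accepts a precomputed kernel; which implementation the authors used is not stated in the abstract.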
Targeted multimodal sentiment classification (TMSC) aims to identify the sentiment polarity of a target mentioned in a multimodal post. The majority of current studies on this task focus on mapping the image and the text to a high-dimensional space in order to obtain and fuse implicit representations, ignoring the rich semantic information contained in the images and not taking into account the contribution of the visual modality to the multimodal fusion representation, which can potentially influence the results of TMSC tasks. This paper proposes a general model for Improving Targeted Multimodal Sentiment Classification with Semantic Description of Images (ITMSC) as a way to tackle these issues and improve the accuracy of multimodal sentiment analysis. Specifically, the ITMSC model can automatically adjust the contribution of images in the fusion representation by exploiting semantic descriptions of images and text similarity relations. Further, we propose a target-based attention module to capture the target-text relevance, an image-based attention module to capture the image-text relevance, and a target-image matching module based on the former two modules to properly align the target with the image so that fine-grained semantic information can be extracted. Our experimental results demonstrate that our model achieves performance comparable with several state-of-the-art approaches on two multimodal sentiment datasets. Our findings indicate that incorporating semantic descriptions of images can enhance our understanding of multimodal content and lead to improved sentiment analysis performance.
Background: Cell replacement therapies have been evaluated in recent years as an alternative for various retinal pathologies. To evaluate the therapeutic efficacy of cell therapy, it is important to measure the severity of the disease. The aim of this study was to evaluate the effect of umbilical cord derived Mesenchymal Stem Cell (UC-MSC) implantation on the severity of Retinitis Pigmentosa (RP). Methods: This single-center clinical study included data of 138 eyes of 92 patients who had a confirmed diagnosis of RP and received stem cell implantation to the suprachoroidal area with a surgical procedure. Patients were evaluated before and 1 year after the surgery with regard to the outcome measures of Best Corrected Visual Acuity (BCVA), Optical Coherence Tomography (OCT) and Visual Field (VF) tests. BCVA, VF width and ellipsoid zone (EZ) width on OCT were recorded for each patient, and a scoring criterion was established in which each variable was scored from 0 to 5 depending on its distribution. The cumulative score (from 0 to 15) was used to classify disease severity from grade 0 to 5. Results: All of the patients completed the 12-month follow-up period. The median age of the patients was 40.8 years, 46% were female, 77% had been diagnosed within 10 years and 41% had a family history. Of the patients with a family history, 79% had an autosomal recessive inheritance pattern. There were statistically significant improvements in the mean BCVA and VF scores during the study (p < 0.05). The mean score and the mean grade of the disease also improved after the treatment (p < 0.05). There was a negative correlation between BCVA improvement and the scoring and grading of the disease. Conclusions: This study demonstrated a beneficial effect of suprachoroidally applied UC-MSCs on BCVA, VF and the severity score and grade of the disease during the 12-month follow-up period. Cell-mediated therapy based on the secretion of Growth Factors (GFs) seems to be an effective and safe option for the treatment of degenerative retinal diseases. This classification is simple, produces an objective measure of disease severity and gives the opportunity to compare the results of different treatment modalities.
Aim: To establish a useful and objective classification for retinitis pigmentosa (RP) to evaluate the disease severity. Methods: This is a retrospective cross-sectional study. Visual acuity (VA), visual field (VF) width, ellipsoid zone width on optical coherence tomography (OCT) and multifocal electroretinography (mfERG) values were obtained from medical records of patients with RP. A scoring criterion was developed wherein each variable was assigned a score from 0 to 5 depending on its distribution. The cumulative score (from 0 to 20) was used to classify disease severity from grade 0 to 5. The scores were correlated with each other and the final grade. Results: Data of 152 eyes of 92 patients who had the results of all tests were reviewed. The mean age was 41.2 years. The mean VA of the patients was 0.13 ± 0.16 Snellen lines. The majority of patients had a VA less than 20/40 (88.6%), a visual field smaller than 20° (78%), and an ellipsoid zone width smaller than 7° (84.4%). The majority of the patients (85.4%) were in an advanced stage of the disease (Grade 4 and 5). Conclusions: We present a simple, objective and easy-to-use disease severity classification for RP which can be used to categorize patients and to evaluate and compare treatment results.
COVID-19 is a growing problem worldwide with a high mortality rate. As a result, the World Health Organization (WHO) declared it a pandemic. In order to limit the spread of the disease, a fast and accurate diagnosis is required. A reverse transcription polymerase chain reaction (RT-PCR) test is often used to detect the disease. However, since this test is time-consuming, a chest computed tomography (CT) scan or plain chest X-ray (CXR) is sometimes indicated. The value of automated diagnosis is that it saves time and money by minimizing human effort. Our research makes three significant contributions. First, a basic fine-tuning methodology is used to test the behavior and efficiency of a variety of vision models, ranging from Inception to Neural Architecture Search (NAS) networks. Second, the behavior of these models is visually analyzed by plotting class activation maps (CAMs) for individual networks and assessing classification efficiency with AUC-ROC curves. Finally, stacked ensemble techniques were used to provide greater generalization by combining fine-tuned models with six ensemble neural networks. Using stacked ensembles, the generalization of the models improved. Furthermore, the ensemble model created by combining all of the fine-tuned networks obtained a state-of-the-art COVID-19 detection accuracy of 99.17%. The precision and recall rates were 99.99% and 89.79%, respectively, highlighting the robustness of stacked ensembles. According to the experimental results, the proposed ensemble approach performed well in the classification of COVID-19 lesions on CXR.
Localizing discriminative object parts (e.g., a bird's head) is crucial for fine-grained classification tasks, especially in the more challenging fine-grained few-shot scenario. Previous work always relies on the learned object parts in a unified manner, attending to the same object parts (even with common attention weights) for different few-shot episodic tasks. In this paper, we propose adaptively capturing the task-specific object parts that require attention for each few-shot task, since the parts that distinguish different tasks are naturally different. Specifically, for a few-shot task, after obtaining part-level deep features, we learn a task-specific part-based dictionary for both aligning and reweighting part features in an episode. Then, part-level categorical prototypes are generated from the part features of the support data and are later used to classify query data for evaluation by calculating distances. To retain the discriminative ability of the part-level representations (i.e., part features and part prototypes), we design an optimal transport solution that also utilizes query data in a transductive way to optimize the aforementioned distance calculation for the final predictions. Extensive experiments on five fine-grained benchmarks show the superiority of our method, especially in the 1-shot setting, with improvements of 0.12%, 8.56% and 5.87% over state-of-the-art methods on CUB, Stanford Dogs, and Stanford Cars, respectively.
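The prototype step described above, in which class prototypes are built from support features and queries are classified by distance, can be illustrated with a deliberately simplified sketch. It assumes one plain feature vector per sample, mean prototypes, and squared Euclidean distance; the task-specific part dictionary, part-level reweighting, and optimal transport refinement from the abstract are not modeled, and all names are hypothetical.

```python
from collections import defaultdict


def class_prototypes(support_features, support_labels):
    """Average the support feature vectors of each class into one prototype."""
    grouped = defaultdict(list)
    for feature, label in zip(support_features, support_labels):
        grouped[label].append(feature)
    return {
        label: [sum(column) / len(features) for column in zip(*features)]
        for label, features in grouped.items()
    }


def classify(query_feature, prototypes):
    """Return the label whose prototype is nearest in squared Euclidean distance."""
    def squared_distance(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    return min(prototypes, key=lambda label: squared_distance(query_feature, prototypes[label]))


if __name__ == "__main__":
    # Toy 2-way 2-shot episode with 2-D features (illustrative values only).
    support = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0], [10.0, 12.0]]
    labels = ["a", "a", "b", "b"]
    prototypes = class_prototypes(support, labels)
    print(classify([1.0, 1.0], prototypes))
```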
In today's interconnected world, network traffic is replete with adversarial attacks. As technology evolves, these attacks are also becoming increasingly sophisticated, making them even harder to detect. Fortunately, artificial intelligence (AI) and, specifically, machine learning (ML) have shown great success in fast and accurate detection, classification, and even analysis of such threats. Accordingly, there is a growing body of literature addressing how subfields of AI/ML (e.g., natural language processing (NLP)) are being leveraged to accurately detect evasive malicious patterns in network traffic. In this paper, we delve into the current advancements in ML-based network traffic classification using image visualization. Through a rigorous experimental methodology, we first explore the process of converting network traffic to images. Subsequently, we investigate how machine learning techniques can effectively leverage image visualization to accurately classify evasive malicious traces within network traffic. Using production-level tools and utilities in realistic experiments, our proposed solution achieves an accuracy of 99.48% in detecting fileless malware, which is widely regarded as one of the most elusive classes of malicious software.
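One common way to turn raw traffic bytes into an image is to treat each byte as a grayscale pixel and wrap the byte stream into fixed-width rows. The sketch below shows that idea only; the abstract does not specify the conversion the authors used, so the row width, the zero padding, and the function name are all assumptions.

```python
def bytes_to_image(payload: bytes, width: int = 16):
    """Wrap a byte string into rows of `width` grayscale pixels (values 0-255).

    The last row is zero-padded; both the padding and the default width are
    illustrative assumptions, not details taken from the paper.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    padding = (-len(payload)) % width
    padded = payload + bytes(padding)
    return [list(padded[start:start + width]) for start in range(0, len(padded), width)]


if __name__ == "__main__":
    sample = b"\x01\x02\x03\x04\x05"  # stand-in for captured packet bytes
    for row in bytes_to_image(sample, width=4):
        print(row)
```

The resulting 2D grid can then be fed to an image classifier in the same way as any other single-channel image.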
Funding: Supported by the Municipal Government and School (Hospital) Joint Funding Programme of Guangzhou (No. 2023A03J0174, No. 2023A03J0188) and the State Key Laboratories' Youth Program of China (No. 83000-32030003).
Abstract: ● AIM: To establish a classification for congenital cataracts that can facilitate individualized treatment and help identify individuals with a high likelihood of different visual outcomes. ● METHODS: Consecutive patients diagnosed with congenital cataracts and undergoing surgery between January 2005 and November 2021 were recruited. Data on visual outcomes and the phenotypic characteristics of ocular biometry and the anterior and posterior segments were extracted from the patients' medical records. A hierarchical cluster analysis was performed. The main outcome measure was the identification of distinct clusters of eyes with congenital cataracts. ● RESULTS: A total of 164 children (299 eyes) were divided into two clusters based on their ocular features. Cluster 1 (96 eyes) had a shorter axial length (mean ± SD, 19.44 ± 1.68 mm), a low prevalence of macular abnormalities (1.04%), and no retinal abnormalities or posterior cataracts. Cluster 2 (203 eyes) had a greater axial length (mean ± SD, 20.42 ± 2.10 mm) and a higher prevalence of macular abnormalities (8.37%), retinal abnormalities (98.52%), and posterior cataracts (4.93%). Compared with the eyes in Cluster 2 (57.14%), those in Cluster 1 (71.88%) had a 2.2 times higher chance of good best-corrected visual acuity [<0.7 logMAR; OR (95% CI), 2.20 (1.25–3.81); P = 0.006]. ● CONCLUSION: This retrospective study categorizes congenital cataracts into two distinct clusters, each associated with a different likelihood of visual outcomes. This innovative classification may enable the personalization and prioritization of early interventions for patients who may gain the greatest benefit, thereby making strides toward precision medicine in the field of congenital cataracts.
Funding: Supported by the National Natural Science Foundation of China (61571453, 61806218).
Abstract: Deep learning has achieved excellent results in various computer vision tasks, especially in fine-grained visual categorization. This task aims to distinguish the subordinate categories of the label-level categories. Because of high intra-class variance and high inter-class similarity, fine-grained visual categorization is extremely challenging. This paper first briefly introduces and analyzes the related public datasets. After that, some of the latest methods are reviewed. Based on the feature types, the feature processing methods, and the overall structure used in the model, we divide them into three types: methods based on a general convolutional neural network (CNN) and strong supervision of parts, methods based on single feature processing, and methods based on multiple feature processing. Most methods of the first type have a relatively simple structure, which reflects their origin in the initial research. The methods of the other two types include models with special structures and training processes, which help to obtain discriminative features. We conduct a specific analysis of several methods with high accuracy on public datasets. In addition, we suggest that future research should focus on reducing the demand of existing methods for large amounts of data and computing power. On the technical side, extracting subtle feature information with the burgeoning vision transformer (ViT) network is also an important research direction.
Funding: Supported by the National Natural Science Foundation of China (41872166).
Abstract: Based on reviews and summaries of the naming schemes of fine-grained sedimentary rocks, and analysis of the characteristics of fine-grained sedimentary rocks, the problems existing in the classification and naming of fine-grained sedimentary rocks are discussed. On this basis, following the principle of three-level nomenclature, a new scheme of rock classification and naming for fine-grained sedimentary rocks is determined from two perspectives. First, fine-grained sedimentary rocks are divided into 12 types in two major categories, mudstone and siltstone, according to particle size (sand, silt and mud). Second, fine-grained sedimentary rocks are divided into 18 types in four categories (carbonate rock, fine-grained felsic sedimentary rock, clay rock and mixed fine-grained sedimentary rock) according to mineral composition, with carbonate minerals, felsic detrital minerals and clay minerals as the three end elements. Considering the importance of organic matter in unconventional oil and gas generation and evaluation, organic matter is taken as the fourth element in the scheme. Taking organic matter contents of 0.5% and 2% as dividing points, fine-grained sedimentary rocks are divided into three categories: organic-poor, organic-bearing, and organic-rich. The new scheme meets the requirements of today's unconventional oil and gas exploration and development and resolves the conceptual confusion surrounding fine-grained sedimentary rocks, providing a unified basic term system for research in fine-grained sedimentology.
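As a minimal sketch of the organic-matter criterion in the abstract above (not the authors' implementation), the two dividing points can be written as a small Python function. The function name and the treatment of values falling exactly on 0.5% or 2% are assumptions; the abstract does not say which category the boundary values belong to.

```python
def classify_organic_matter(organic_matter_percent: float) -> str:
    """Assign one of the three organic-matter categories named in the abstract.

    Assumption: each dividing point belongs to the higher category
    (0.5% counts as organic-bearing, 2% counts as organic-rich).
    """
    if organic_matter_percent < 0:
        raise ValueError("organic matter content cannot be negative")
    if organic_matter_percent < 0.5:
        return "organic-poor"
    if organic_matter_percent < 2.0:
        return "organic-bearing"
    return "organic-rich"
```

For example, a sample with 1.2% organic matter would be labelled organic-bearing under these assumptions.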
Abstract: Fine-grained classification of ships in remote sensing images makes it possible to identify specific ship types, and it has broad application prospects in civil and military fields. However, current models do not account for the properties of ship targets in remote sensing images, namely mixed multi-granularity features and a complicated background, so there is still room to improve classification performance. To address the challenges brought by these characteristics, this paper proposes a Metaformer and Residual fusion network based on the Visual Attention Network (VAN-MR) for fine-grained classification tasks. For the complex background of remote sensing images, the VAN-MR model adopts a parallel structure of large kernel attention and spatial attention to enhance the model's ability to extract features of targets of interest and improve the classification performance for remote sensing ship targets. For the problem of multi-grained feature mixing in remote sensing images, the VAN-MR model uses a Metaformer structure and a parallel network of residual modules to extract ship features. The parallel network has different depths, considering both high-level and low-level semantic information. The model achieves better classification performance on remote sensing ship images with multi-granularity mixing. Finally, the model achieves 88.73% and 94.56% accuracy on the public fine-grained ship collection-23 (FGSC-23) and FGSCR-42 datasets, respectively, with only 53.47 M parameters and 9.9 G floating-point operations. The experimental results show that VAN-MR classifies better than traditional CNN models and Transformer-based vision models with the same number of parameters.
Abstract: The process of human natural scene categorization consists of two correlated stages: visual perception and visual cognition of natural scenes. Inspired by this fact, we propose a biologically plausible approach for natural scene image classification. This approach consists of one visual perception model and two visual cognition models. The visual perception model, composed of two steps, is used to extract discriminative features from natural scene images. In the first step, we mimic the oriented and bandpass properties of the human primary visual cortex with a special complex wavelet transform, which can decompose a natural scene image into a series of 2D spatial structure signals. In the second step, a hybrid statistical feature extraction method is used to generate gist features from those 2D spatial structure signals. Then we design a cognitive feedback model to realize adaptive optimization of the visual perception model. Finally, we build a multiple-semantics-based cognition model to imitate the human cognitive mode in rapid natural scene categorization. Experiments on natural scene datasets show that the proposed method achieves high efficiency and accuracy for natural scene classification.
Funding: This work is supported, in part, by the National Natural Science Foundation of China under Grant No. 62102190 and 62272236, and in part by the Natural Science Foundation of Jiangsu Province under Grant No. BK20201136 and BK20191401.
Abstract: The impact of coronavirus disease 2019 (COVID-19) has made telecommuting and remote learning the norm. The growing number of Internet-connected devices provides cyber attackers with more attack vectors. Criminals also incorporate a number of sophisticated obfuscation techniques into malware, making it difficult to classify and detect malware using conventional approaches. Therefore, this paper proposes a novel visualization-based malware classification system using transfer and ensemble learning (VMCTE). VMCTE has a strong anti-interference ability. Even if malware uses obfuscation, fuzzing, encryption, and other techniques to evade detection, it can be accurately classified into its corresponding malware family. Unlike traditional dynamic and static analysis techniques, VMCTE requires neither reverse engineering nor domain expert knowledge. The proposed classification system combines three strong deep convolutional neural networks (ResNet50, MobilenetV1, and MobilenetV2) as feature extractors, reduces the dimensionality of the extracted features using principal component analysis, and employs a support vector machine to establish the classification model. The semantic representations of malware images can be extracted using various convolutional neural network (CNN) architectures, obtaining higher-quality features than traditional methods. Integrating fine-tuned and non-fine-tuned classification models based on transfer learning can greatly enhance the capacity to classify various families of malware. The experimental findings on the Malimg dataset demonstrate that VMCTE attains 99.64%, 99.64%, 99.66%, and 99.64% accuracy, F1-score, precision, and recall, respectively.
Funding: This work was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) (Project No. 2019-0-00426, 10%), the ICT R&D Program of MSIT/IITP (Project No. 2021-0-01816, A Research on Core Technology of Autonomous Twins for Metaverse, 10%), and a National Research Foundation of Korea (NRF) grant funded by the Korean government (Project No. NRF-2020R1A2C4002737, 80%).
Abstract: A ransomware attack that interrupted the operation of Colonial Pipeline (a large U.S. oil pipeline company) showed that security threats from malware have become serious enough to affect industries and social infrastructure rather than individuals alone. To respond to such attacks, the agents and characteristics of attacks should be identified and appropriate strategies established accordingly. For this purpose, the first task that must be performed is malware classification. Malware creators are well aware of this and apply various concealment and avoidance techniques, making it difficult to classify malware. This study focuses on new features and classification techniques to overcome these difficulties. We propose a behavioral performance visualization method using utilization patterns of system resources, such as the central processing unit, memory, and input/output, that are commonly used in performance analysis or tuning of programs. We extracted the usage patterns of the system resources for ransomware to perform behavioral performance visualization. The results of the classification performance evaluation using the visualization results indicate an accuracy of at least 98.94% with a 3.69% loss rate. Furthermore, we designed and implemented a framework that performs the entire process, from data extraction to behavioral performance visualization and classification performance measurement, and that is expected to contribute to related studies in the future.
Funding: Supported by the 2023 Open Project of Key Laboratory of Ministry of Public Security for Artificial Intelligence Security (RGZNAQ-2304) and the Fundamental Research Funds for the Central Universities of PPSUC (2023JKF01ZK08).
Abstract: With the rapid development of deepfake technology, fake synthetic content of various types is becoming increasingly realistic, which brings potential security threats to people's daily life and social stability. Currently, most algorithms define deepfake detection as a binary classification problem, i.e., global features are first extracted using a backbone network and then fed into a binary classifier to discriminate real from fake. However, the differences between real and fake samples are often subtle and local, and such global feature-based detection algorithms are not optimal in efficiency and accuracy. To this end, to enhance the extraction of forgery details in deep forgery samples, we propose a multi-branch deepfake detection algorithm based on fine-grained features from the perspective of fine-grained classification. First, to address the critical problem of locating discriminative feature regions in fine-grained classification tasks, we investigate a method for locating multiple different discriminative regions and design a lightweight feature localization module to obtain crucial feature representations by augmenting the most significant parts of the feature map. Second, using information complementation, we introduce a correlation-guided fusion module to enhance the discriminative feature information of different branches. Finally, we use the global attention module in the multi-branch model to improve the cross-dimensional interaction of spatial domain and channel domain information and increase the weights of crucial feature regions and feature channels. We conduct ablation and comparative experiments. The experimental results show that the algorithm outperforms representative detection algorithms of recent years in detection accuracy and effectiveness on the FaceForensics++ and Celeb-DF-v2 datasets.
Funding: National Natural Science Foundation of China (No. 61562057); Gansu Science and Technology Plan Project (No. 18JR3RA104).
Abstract: With the development of the short video industry, videos and bullet screens have become important channels for spreading public opinion. Public attitudes can be obtained in a timely manner through emotional analysis of bullet screens, which can also reduce the difficulty of managing online public opinion. A convolutional neural network model based on multi-head attention is proposed to solve the problem of how to effectively model relations among words and identify key words in emotion classification tasks where the text is short and complete context information is lacking. First, word positions are encoded so that the order information of input sequences can be used by the model. Second, a multi-head attention mechanism is used to obtain semantic expressions in different subspaces, effectively capture internal relevance and enhance dependency relationships among words, and highlight the emotional weights of key emotional words. Then a dilated convolution is used to increase the receptive field and extract more features. On this basis, the multi-head attention mechanism is combined with a convolutional neural network to model and analyze the seven emotional categories of bullet screens. Experiments from the perspectives of both model and dataset validate the effectiveness of our approach. Finally, the emotions of bullet screens are visualized to provide data support for hot event control and other fields.
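To illustrate why a dilated convolution enlarges the receptive field, as the abstract above states, here is a minimal pure-Python sketch of a 1D dilated convolution. It is not the authors' model: the function names, the "valid" (no padding) output, and the use of cross-correlation without kernel flipping (the convention in most deep learning frameworks) are all assumptions made for illustration.

```python
def receptive_field(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """1D dilated cross-correlation with no padding (illustrative only).

    Each output sums kernel[k] * signal[i + k * dilation], so the kernel taps
    are spread `dilation` positions apart instead of being adjacent.
    """
    if dilation < 1:
        raise ValueError("dilation must be >= 1")
    span = receptive_field(len(kernel), dilation)
    out_len = len(signal) - span + 1
    return [
        sum(kernel[k] * signal[i + k * dilation] for k in range(len(kernel)))
        for i in range(max(out_len, 0))
    ]
```

With a 3-tap kernel, a dilation of 2 lets each output cover 5 input positions instead of 3 without adding any weights, which is the property the abstract relies on.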
Funding: Supported in part by the Natural Science Foundation of China (NSFC) under Grant No. 51805192 and the Major Special Science and Technology Project of Hubei Province under Grant No. 2020AEA009, and sponsored by the State Key Laboratory of Digital Manufacturing Equipment and Technology (DMET) of Huazhong University of Science and Technology (HUST) under Grant No. DMETKF2020029.
Abstract: Pneumonia is one of the main diseases causing death in children. It is generally diagnosed through chest X-ray images. With the development of Deep Learning (DL), DL-based diagnosis of pneumonia has received extensive attention. However, because the difference between pneumonia and normal images is small, the performance of DL methods could be improved. This research proposes a new fine-grained Convolutional Neural Network (CNN) for children's pneumonia diagnosis (FG-CPD). First, fine-grained CNN classification, which can handle slight differences between images, is investigated. To obtain the raw images from the real-world chest X-ray data, the YOLOv4 algorithm is trained to detect and locate the chest region in the raw images. Second, a novel attention network named SGNet is proposed, which integrates the spatial information and channel information of the images to locate the discriminative parts of the chest image and thereby widen the difference between pneumonia and normal images. Third, an automatic data augmentation method is adopted to increase the diversity of the images and avoid overfitting of FG-CPD. FG-CPD has been tested on the public Chest X-ray 2017 dataset, and the results show that it performs well. FG-CPD is then tested on real chest X-ray images of children aged 3–12 years from Tongji Hospital. The results show that FG-CPD achieves up to 96.91% accuracy, which supports its potential.
Abstract: A Histogram Intersection Kernel Support Vector Machine (SVM) was used for the image classification problem. First, each image was split into blocks, and each block was represented by Scale Invariant Feature Transform (SIFT) descriptors; second, the k-means clustering method was applied to separate the SIFT descriptors into groups, each group representing a visual keyword; third, the SIFT descriptors in each image were counted per visual keyword to construct a histogram for each image; finally, the Histogram Intersection Kernel was built from these histograms. In our experimental study, we use Corel-low images to test our method. The Histogram Intersection Kernel SVM performs better than the typical RBF kernel SVM.
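The final step described above, building a kernel from per-image histograms, can be sketched as follows. This is a minimal illustration of the standard histogram intersection kernel, K(h1, h2) = sum over bins of min(h1[i], h2[i]), not the authors' code; the function names and the toy histograms in the usage note are hypothetical.

```python
def histogram_intersection(h1, h2):
    """Histogram intersection kernel: sum of bin-wise minima."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def gram_matrix(histograms):
    """Pairwise kernel values for a list of visual-keyword histograms.

    A matrix like this could be handed to an SVM implementation that accepts
    a precomputed kernel; that integration step is not shown here.
    """
    return [
        [histogram_intersection(a, b) for b in histograms]
        for a in histograms
    ]
```

For two toy histograms `[3, 0, 2]` and `[1, 4, 2]`, the bin-wise minima are 1, 0 and 2, so the kernel value is 3.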
Abstract: Targeted multimodal sentiment classification (TMSC) aims to identify the sentiment polarity of a target mentioned in a multimodal post. The majority of current studies on this task focus on mapping the image and the text to a high-dimensional space in order to obtain and fuse implicit representations, ignoring the rich semantic information contained in the images and not taking into account the contribution of the visual modality in the multimodal fusion representation, which can potentially influence the results of TMSC tasks. This paper proposes a general model for Improving Targeted Multimodal Sentiment Classification with Semantic Description of Images (ITMSC) as a way to tackle these issues and improve the accuracy of multimodal sentiment analysis. Specifically, the ITMSC model can automatically adjust the contribution of images in the fusion representation through the exploitation of semantic descriptions of images and text similarity relations. Further, we propose a target-based attention module to capture the target-text relevance, an image-based attention module to capture the image-text relevance, and a target-image matching module based on the former two modules to properly align the target with the image so that fine-grained semantic information can be extracted. Our experimental results demonstrate that our model achieves comparable performance with several state-of-the-art approaches on two multimodal sentiment datasets. Our findings indicate that incorporating semantic descriptions of images can enhance our understanding of multimodal content and lead to improved sentiment analysis performance.
Abstract: Background: Cell replacement therapies have been evaluated in recent years as an alternative for various retinal pathologies. To evaluate the therapeutic efficacy of cell therapy, it is important to measure the severity of the disease. The aim of this study was to evaluate the effect of umbilical cord derived Mesenchymal Stem Cell (UC-MSC) implantation on the severity of Retinitis Pigmentosa (RP). Methods: This single-center clinical study included data of 138 eyes of 92 patients who had a confirmed diagnosis of RP and received stem cell implantation to the suprachoroidal area with a surgical procedure. Patients were evaluated before and 1 year after the surgery with respect to the outcome measures of Best Corrected Visual Acuity (BCVA), Optical Coherence Tomography (OCT) and Visual Field (VF) tests. BCVA, VF width and ellipsoid zone (EZ) width on OCT were recorded for each patient, and a scoring criterion was established for each variable, varying from 0 to 5 depending on its distribution. The cumulative score (from 0 to 15) was used to classify disease severity from grade 0 to 5. Results: All of the patients completed the 12-month follow-up period. The median age of the patients was 40.8 years, 46% were female, 77% had been diagnosed within 10 years and 41% had a family history. 79% of the patients with a family history had an autosomal recessive inheritance pattern. There were statistically significant improvements in the mean BCVA and VF scores during the study (p < 0.05). The mean score and the mean grade of the disease also improved after the treatment (p < 0.05). There was a negative correlation between BCVA improvement and the scoring and grading of the disease. Conclusions: This study demonstrated a beneficial effect of suprachoroidally applied UC-MSCs on BCVA, VF and the severity score and grade of the disease during the 12-month follow-up period. Cell-mediated therapy based on the secretion of Growth Factors (GFs) seems to be an effective and safe option for the treatment of degenerative retinal diseases. This classification is simple, produces an objective measure of disease severity and makes it possible to compare the results of different treatment modalities.
Abstract: Aim: To establish a useful and objective classification for retinitis pigmentosa (RP) to evaluate the disease severity. Methods: This is a retrospective cross-sectional study. Visual acuity (VA), visual field (VF) width, ellipsoid zone width on optical coherence tomography (OCT) and multifocal electroretinography (mfERG) values were obtained from medical records of patients with RP. A scoring criterion was developed wherein each variable was assigned a score from 0 to 5 depending on its distribution. The cumulative score (from 0 to 20) was used to classify disease severity from grade 0 to 5. The scores were correlated with each other and the final grade. Results: Data of 152 eyes of 92 patients who had the results of all tests were reviewed. The mean age was 41.2 years. The mean VA of the patients was 0.13 ± 0.16 Snellen lines. The majority of patients had a VA less than 20/40 (88.6%), a visual field smaller than 20° (78%), and an ellipsoid zone width smaller than 7° (84.4%). The majority of the patients (85.4%) were in an advanced stage of the disease (Grade 4 and 5). Conclusions: We present a simple, objective and easy-to-use disease severity classification for RP which can be used to categorize patients and to evaluate and compare treatment results.
Funding: The research is funded by the Researchers Supporting Project at King Saud University (Project# RSP-2021/305).
Abstract: COVID-19 is a growing problem worldwide with a high mortality rate. As a result, the World Health Organization (WHO) declared it a pandemic. In order to limit the spread of the disease, a fast and accurate diagnosis is required. A reverse transcription polymerase chain reaction (RT-PCR) test is often used to detect the disease. However, since this test is time-consuming, a chest computed tomography (CT) or plain chest X-ray (CXR) is sometimes indicated. The value of automated diagnosis is that it saves time and money by minimizing human effort. Our research makes three significant contributions. First, it uses the essential finetuning methodology to test the behavior and efficiency of a variety of vision models, ranging from Inception to Neural Architecture Search (NAS) networks. Second, the behavior of these models is visually analyzed by plotting class activation maps (CAMs) for individual networks and assessing classification efficiency with AUC-ROC curves. Finally, stacked ensemble techniques were used to provide greater generalization by combining finetuned models with six ensemble neural networks. Using stacked ensembles, the generalization of the models improved. Furthermore, the ensemble model created by combining all of the finetuned networks obtained a state-of-the-art COVID-19 detection accuracy of 99.17%. The precision and recall rates were 99.99% and 89.79%, respectively, highlighting the robustness of stacked ensembles. According to the experimental results, the proposed ensemble approach performed well in classifying COVID-19 lesions on CXR.
Funding: Supported by the National Natural Science Foundation of China (No. 62272231), the Natural Science Foundation of Jiangsu Province of China (No. BK 20210340), the National Key R&D Program of China (No. 2021YFA1001100), the Fundamental Research Funds for the Central Universities, China (No. NJ2022028), and the CAAI-Huawei MindSpore Open Fund, China.
Abstract: Localizing discriminative object parts (e.g., bird head) is crucial for fine-grained classification tasks, especially for the more challenging fine-grained few-shot scenario. Previous work always relies on the learned object parts in a unified manner, attending to the same object parts (even with common attention weights) for different few-shot episodic tasks. In this paper, we propose that a model should adaptively capture the task-specific object parts that require attention for each few-shot task, since the parts that can distinguish different tasks are naturally different. Specifically, for a few-shot task, after obtaining part-level deep features, we learn a task-specific part-based dictionary for both aligning and reweighting part features in an episode. Then, part-level categorical prototypes are generated based on the part features of support data, which are later employed by calculating distances to classify query data for evaluation. To retain the discriminative ability of the part-level representations (i.e., part features and part prototypes), we design an optimal transport solution that also utilizes query data in a transductive way to optimize the aforementioned distance calculation for the final predictions. Extensive experiments on five fine-grained benchmarks show the superiority of our method, especially for the 1-shot setting, gaining 0.12%, 8.56% and 5.87% improvements over state-of-the-art methods on CUB, Stanford Dogs, and Stanford Cars, respectively.
Funding: Supported in part by NSF Grants #2113945 and #2200538 and by generous financial and technical support from Palo Alto Networks, Inc.
Abstract: In today's interconnected world, network traffic is replete with adversarial attacks. As technology evolves, these attacks are also becoming increasingly sophisticated, making them even harder to detect. Fortunately, artificial intelligence (AI) and, specifically, machine learning (ML) have shown great success in fast and accurate detection, classification, and even analysis of such threats. Accordingly, there is a growing body of literature addressing how subfields of AI/ML (e.g., natural language processing (NLP)) are being leveraged to accurately detect evasive malicious patterns in network traffic. In this paper, we delve into the current advancements in ML-based network traffic classification using image visualization. Through a rigorous experimental methodology, we first explore the process of network traffic to image conversion. Subsequently, we investigate how machine learning techniques can effectively leverage image visualization to accurately classify evasive malicious traces within network traffic. Using production-level tools and utilities in realistic experiments, our proposed solution achieves an accuracy of 99.48% in detecting fileless malware, which is widely regarded as one of the most elusive classes of malicious software.
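The abstract above does not describe how traffic is converted to images. As a rough sketch of one common approach, and an assumption rather than the authors' method, each byte of a captured payload can be treated as one grayscale pixel (0 to 255) and the bytes laid out row by row, zero-padding the last row. The function name and the default near-square width are hypothetical choices.

```python
import math


def bytes_to_grayscale(payload: bytes, width: int = 0) -> list[list[int]]:
    """Lay out raw bytes as rows of grayscale pixel values (0-255).

    Assumptions: one byte per pixel, row-major order, zero padding for the
    final row, and a near-square image when no width is given.
    """
    if not payload:
        return []
    if width <= 0:
        # Smallest width whose square can hold every byte (ceil of sqrt).
        width = math.isqrt(len(payload) - 1) + 1
    rows = -(-len(payload) // width)  # ceiling division
    padded = payload + bytes(rows * width - len(payload))
    return [list(padded[r * width:(r + 1) * width]) for r in range(rows)]
```

The resulting nested list could then be passed to an image library or a classifier; fixed-size resizing, which image classifiers usually need, is left out of this sketch.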