Recently, there have been several attempts to apply Transformers to 3D point cloud classification. To reduce computation, most existing methods focus on local spatial attention but ignore point content and fail to establish relationships between distant but relevant points. To overcome this limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based): sampled points with similar features are clustered into the same class, and self-attention is computed within each class, enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in separate branches. Extensive experiments show that PointConT achieves remarkable performance on point cloud shape classification. In particular, our method achieves 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. Source code for this paper is available at https://github.com/yahuiliu99/PointConT.
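The idea of computing self-attention only among points that share a feature-space cluster can be sketched as follows. This is a minimal illustration, not the PointConT implementation: the cluster labels are assumed to be precomputed, the function name `clustered_self_attention` is hypothetical, and queries, keys, and values are the raw features with no learned projections.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a list of floats.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def clustered_self_attention(features, cluster_ids):
    """Scaled dot-product self-attention restricted to each cluster.

    features: list of equal-length feature vectors, one per point.
    cluster_ids: assumed precomputed cluster label for each point.
    """
    dim = len(features[0])
    scale = math.sqrt(dim)

    # Group point indices by cluster label.
    groups = {}
    for idx, cid in enumerate(cluster_ids):
        groups.setdefault(cid, []).append(idx)

    out = [None] * len(features)
    for members in groups.values():
        for i in members:
            # Attention scores of point i against members of its own cluster only.
            scores = [
                sum(a * b for a, b in zip(features[i], features[j])) / scale
                for j in members
            ]
            weights = softmax(scores)
            # Output is the attention-weighted average of the cluster's features.
            out[i] = [
                sum(w * features[j][d] for w, j in zip(weights, members))
                for d in range(dim)
            ]
    return out

# Hypothetical toy input: two clusters of two points each.
points = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [0, 0, 1, 1]
attended = clustered_self_attention(points, labels)
```

With n points split into k equally sized clusters, the number of pairwise scores drops from n² to roughly n²/k, which is the trade-off the abstract describes. Because clusters are formed in feature space, two spatially distant points can still attend to each other.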
When building a classification model, the scenario in which the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which may harm the accuracy on the minority class (usually defined as the positive class) and lead to poor overall model performance. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). MSHR measures the separability of each negative sample through its Silhouette value, calculated using the Mahalanobis distance between samples. On this basis, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (over-sampling step), and finally deleted (under-sampling step). This approach replaces pseudo-negative samples with newly generated positive samples one by one to clear up the inter-class overlap on the borderline without changing the overall scale of the dataset. FCSSVM is an improved version of the traditional CS-SVM. It considers the influence of both the imbalance in sample numbers and the class distribution on classification, and it finely tunes the class cost weights using RIME, an efficient optimization algorithm based on the physical phenomenon of rime ice, with cross-validation accuracy as the fitness function, to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments is carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced ones. The experimental results show that MSHR-FCSSVM performs better than the compared methods in most cases, and that both MSHR and FCSSVM play significant roles.
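The Silhouette value used to judge how well a negative sample separates from the positive class can be sketched as below. This is an illustration under stated assumptions rather than the MSHR procedure: Euclidean distance stands in for the Mahalanobis distance named in the abstract (any distance can be passed as `dist`), and the screening threshold of 0 for "pseudo-negative" samples is hypothetical.

```python
import math

def euclidean(u, v):
    # Stand-in distance; the abstract uses Mahalanobis distance instead.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def silhouette(i, samples, labels, dist=euclidean):
    """Silhouette value of sample i: (b - a) / max(a, b).

    a: mean distance to the other samples of its own class.
    b: smallest mean distance to the samples of any other class.
    """
    own = [
        dist(samples[i], samples[j])
        for j in range(len(samples))
        if j != i and labels[j] == labels[i]
    ]
    if not own:
        return 0.0  # Conventional value for a class with a single sample.
    a = sum(own) / len(own)

    other_means = []
    for c in set(labels) - {labels[i]}:
        d = [dist(samples[i], s) for s, y in zip(samples, labels) if y == c]
        other_means.append(sum(d) / len(d))
    b = min(other_means)
    return (b - a) / max(a, b)

# Hypothetical 1-D data: label 0 is the negative (majority) class.
samples = [[0.0], [0.2], [2.0], [2.1], [2.3]]
labels = [0, 0, 0, 1, 1]

# Negative samples with a silhouette below the assumed threshold of 0
# sit closer to the positive class than to their own class.
pseudo_negative = [
    i for i, y in enumerate(labels)
    if y == 0 and silhouette(i, samples, labels) < 0.0
]
```

In this toy data only the negative sample at 2.0 is flagged, since it lies inside the positive cluster.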
Background: Cavernous transformation of the portal vein (CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to investigate the imaging features of the intrahepatic portal vein in adult patients with CTPV and to establish the relationship between the manifestations of the intrahepatic portal vein and the progression of CTPV. Methods: We retrospectively analyzed 14 CTPV patients at Beijing Tsinghua Changgung Hospital. All patients underwent both direct portal venography (DPV) and computed tomography angiography (CTA) to reveal the manifestations of the portal venous system. The vessels measured included the left portal vein (LPV), right portal vein (RPV), main portal vein (MPV), and the portal vein bifurcation (PVB). Results: Nine males and 5 females, with a median age of 40.5 years, were included in the study. No significant difference was found in the diameters of the LPV or RPV measured by DPV and CTA. The visualization of the LPV, RPV, and PVB was higher with DPV than with CTA. There was a significant association between LPV/RPV and PVB/MPV in terms of visibility revealed with DPV (P = 0.01), while this association was not observed with CTA. According to the imaging features of the portal vein measured by DPV, CTPV was classified into three categories to facilitate diagnosis and treatment. Conclusions: DPV was more accurate than CTA for revealing the course of the intrahepatic portal vein in patients with CTPV. The classification of CTPV, which is derived from the imaging features of the portal vein revealed by DPV, may provide a new perspective for the diagnosis and treatment of CTPV.
Purpose: Many science, technology and innovation (STI) resources are attached with several different labels. To automatically assign the resulting labels to an instance of interest, many approaches with good performance on benchmark datasets have been proposed in the literature for the multi-label classification task. Furthermore, several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to comprehensively evaluate seven multi-label classification methods on real-world datasets. Research limitations: The three real-world datasets differ in the following aspects: statement, data quality, and purposes. Additionally, open-source tools designed for multi-label classification also have intrinsic differences in their approaches to data processing and feature selection, which in turn impact the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of conclusions by employing more rigorous control over variables through expanded parameter settings. Practical implications: The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, the efficacy of multi-label classification is expected to improve significantly, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical label structure and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
The tell tail is usually placed on the triangular sail to display the state of the air flow over the sail surface. Accurately judging the drift of the tell tail during sailing is of great significance for achieving the best sailing effect. Normally, it is difficult for sailors to keep an eye on the tell tail for a long time and judge its changes accurately, because of strong sunlight and visual fatigue. We therefore adopt computer vision technology in the hope of helping sailors judge the changes of the tell tail with ease. This paper proposes, for the first time, a method to classify sailboat tell tails based on deep learning and an expert guidance system, supported by a sailboat tell tail classification data set built on the expert guidance system for interpreting tell tail states in different sea wind conditions, including the feature extraction performance. Considering that expression capabilities vary with the computational features in different visual tasks, the paper focuses on five tell tail computing features, which are recoded by an autoencoder and classified by an SVM classifier. All experimental samples were randomly divided into five groups; four groups were used as the training set to train the classifier, and the remaining group was used as the test set. The highest resolution value of the ResNet network was 80.26%. Better operational results were achieved on the basis of the deep computing features obtained through the ResNet network in the experiments. The method can be used to assist sailors in making better judgements about tell tail changes during sailing.
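The five-group split described above can be sketched as follows. The abstract does not say whether each group takes a turn as the test set, so the rotation below (standard five-fold cross-validation) is an assumption, and `five_group_splits` is a hypothetical helper name.

```python
import random

def five_group_splits(n_samples, n_groups=5, seed=0):
    """Randomly partition sample indices into groups and build train/test splits.

    Returns a list of (train_indices, test_indices) pairs, one per held-out group.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # Seeded for reproducibility.

    # Deal the shuffled indices into n_groups roughly equal groups.
    groups = [indices[g::n_groups] for g in range(n_groups)]

    splits = []
    for held_out in range(n_groups):
        test = groups[held_out]
        train = [i for g, grp in enumerate(groups) if g != held_out for i in grp]
        splits.append((train, test))
    return splits

splits = five_group_splits(23)
```

Each sample appears in exactly one test set, and the training set for a split never contains its test samples.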
Lung cancer is a leading cause of mortality worldwide. Early detection of pulmonary tumors can significantly enhance the survival rate of patients. Recently, various Computer-Aided Diagnostic (CAD) methods have been developed to detect pulmonary nodules with high accuracy. Nevertheless, the existing methodologies cannot obtain a high level of specificity and sensitivity. The present study introduces a novel model for Lung Cancer Segmentation and Classification (LCSC), which incorporates two improved architectures, namely an improved U-Net architecture and an improved AlexNet architecture. The LCSC model comprises two distinct stages. The first stage uses the improved U-Net architecture to segment candidate nodules extracted from the lung lobes. Subsequently, the improved AlexNet architecture is employed to classify lung cancer. In the first stage, the proposed model achieves a Dice score of 0.855, a precision of 0.933, and a recall of 0.789 for the segmentation of candidate nodules. The improved AlexNet architecture attains 97.06% accuracy, a true positive rate of 96.36%, a true negative rate of 97.77%, a positive predictive value of 97.74%, and a negative predictive value of 96.41% for classifying pulmonary cancer as either benign or malignant. The proposed LCSC model is tested and evaluated on the publicly available dataset furnished by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI). The proposed technique exhibits remarkable performance compared with existing methods across various evaluation parameters.
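The evaluation measures quoted above follow from the entries of a binary confusion matrix. The sketch below uses the standard definitions; the function names and the example counts are hypothetical and are not taken from the paper. For binary masks the Dice score equals the harmonic mean of precision and recall, which gives a quick consistency check on the reported segmentation figures.

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard measures from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "tpr": tp / (tp + fn),  # true positive rate (sensitivity, recall)
        "tnr": tn / (tn + fp),  # true negative rate (specificity)
        "ppv": tp / (tp + fp),  # positive predictive value (precision)
        "npv": tn / (tn + fn),  # negative predictive value
    }

def dice(tp, fp, fn):
    # Dice coefficient of predicted vs. reference positives.
    return 2 * tp / (2 * tp + fp + fn)

def dice_from_precision_recall(precision, recall):
    # Algebraically identical to dice(); it is the harmonic mean (F1).
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts, for illustration only.
example = binary_metrics(tp=8, fp=2, tn=9, fn=1)

# Consistency check with the reported segmentation precision and recall.
reported_dice = dice_from_precision_recall(0.933, 0.789)
```

Plugging in the reported precision of 0.933 and recall of 0.789 gives a value that rounds to the reported Dice score of 0.855.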
Among central nervous system-associated malignancies, glioblastoma (GBM) is the most common and has the highest mortality rate. The high heterogeneity of GBM cell types and the complex tumor microenvironment frequently lead to tumor recurrence and sudden relapse in patients treated with temozolomide. In precision medicine, research on GBM treatment is increasingly focusing on molecular subtyping to precisely characterize the cellular and molecular heterogeneity, as well as the refractory nature of GBM toward therapy. A deep understanding of the different molecular expression patterns of GBM subtypes is critical. Researchers have recently proposed tetra fractional or tripartite methods for detecting GBM molecular subtypes. The various molecular subtypes of GBM show significant differences in gene expression patterns and biological behaviors. These subtypes also exhibit high plasticity in their regulatory pathways, oncogene expression, tumor microenvironment alterations, and differential responses to standard therapy. Herein, we summarize the current molecular typing scheme of GBM and the major molecular/genetic characteristics of each subtype. Furthermore, we review the mesenchymal transition mechanisms of GBM under various regulators.
The classification of functional data has drawn much attention in recent years. The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy. In this paper, we propose a mean-variance-based (MV) feature weighting method for classifying functional data or functional curves. In the feature extraction stage, each sample curve is approximated by B-splines to transfer features to the coefficients of the spline basis. After that, a feature weighting approach based on statistical principles is introduced by comprehensively considering the between-class differences and within-class variations of the coefficients. We also introduce a scaling parameter to adjust the gap between the weights of features. The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features. Algorithms for feature-weighted K-nearest neighbor and support vector machine classifiers are both provided. Moreover, the new approach can be well integrated into existing functional data classifiers, such as the generalized functional linear model and functional linear discriminant analysis, resulting in more accurate classification. The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data. The results show that the new feature weighting approach significantly improves the classification accuracy for complex functional data.
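One way to weight spline coefficients by between-class difference and within-class variation is sketched below. The abstract does not give the weighting formula, so the ratio used here (variance of the class means divided by the mean within-class variance, raised to a scaling exponent `gamma`) is an assumption chosen for illustration, and `mv_feature_weights` is a hypothetical name.

```python
def mv_feature_weights(coeffs, labels, gamma=1.0, eps=1e-12):
    """Assumed mean-variance style weights, normalized to sum to 1.

    coeffs: one row of basis coefficients per sample curve.
    labels: class label per sample.
    gamma: scaling exponent; 0 gives equal weights, larger values widen the gap.
    """
    classes = sorted(set(labels))
    n_features = len(coeffs[0])
    raw = []
    for j in range(n_features):
        class_means, class_vars = [], []
        for c in classes:
            vals = [row[j] for row, y in zip(coeffs, labels) if y == c]
            m = sum(vals) / len(vals)
            class_means.append(m)
            class_vars.append(sum((v - m) ** 2 for v in vals) / len(vals))
        grand = sum(class_means) / len(class_means)
        # Between-class difference: spread of the class means.
        between = sum((m - grand) ** 2 for m in class_means) / len(class_means)
        # Within-class variation: average variance inside each class.
        within = sum(class_vars) / len(class_vars)
        raw.append((between / (within + eps)) ** gamma)
    total = sum(raw)
    if total == 0:
        return [1.0 / n_features] * n_features
    return [w / total for w in raw]

# Hypothetical coefficients: feature 0 separates the classes, feature 1 does not.
coeffs = [[0.0, 1.0], [0.2, 3.0], [2.0, 1.1], [2.2, 2.9]]
labels = [0, 0, 1, 1]
weights = mv_feature_weights(coeffs, labels)
equal = mv_feature_weights(coeffs, labels, gamma=0.0)
```

The resulting weights could then scale each coordinate in a K-nearest neighbor distance or an SVM kernel, so that discriminative coefficients dominate and noisy ones are suppressed.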
Intrusion detection is a predominant task that monitors and protects the network infrastructure. Therefore, many datasets have been published and investigated by researchers to analyze and understand the problem of intrusion prediction and detection. In particular, the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset is an extensively used benchmark for evaluating intrusion detection systems (IDSs), as it incorporates various network traffic attacks. A large number of studies have tackled the problem of intrusion detection using machine learning models, but the performance of these models often decreases when they are evaluated on new attacks. This has led to the use of deep learning techniques, which have shown significant potential for processing large datasets and thereby improving detection accuracy. For that reason, this paper focuses on the role of stacking deep learning models, including a convolutional neural network (CNN) and a deep neural network (DNN), for improving the intrusion detection rate on the NSL-KDD dataset. Each base model is trained on the NSL-KDD dataset to extract significant features. Once the base models have been trained, the stacking process proceeds to the second stage, where a simple meta-model is trained on the predictions generated by the proposed base models. Combining the predictions allows the meta-model to distinguish different classes of attacks and increase the detection rate. Our experimental evaluations on the NSL-KDD dataset show the efficacy of stacking deep learning models for intrusion detection. The performance of the ensemble of base models, combined with the meta-model, exceeds the performance of the individual models. Our stacking model attains an accuracy of 99% and an average F1-score of 93% in the multi-class classification scenario. In addition, the training time of the proposed ensemble model is lower than that of benchmark techniques, demonstrating its efficiency and robustness.
Predicting depression intensity from microblogs and social media posts has numerous benefits and applications, including predicting early psychological disorders and stress in individuals or the general public. A major challenge is that existing studies do not focus on predicting the intensity of depression in social media texts but only perform binary classification of depression; moreover, noisy data makes it difficult to predict true depression in social media text. This study begins by collecting relevant tweets and generating a corpus of 210,000 public tweets using Twitter public application programming interfaces (APIs). A strategy is devised to keep only depression-related tweets by creating a list of relevant hashtags, reducing noise in the corpus. Furthermore, an algorithm is developed to annotate the data into three depression classes, ‘Mild,’ ‘Moderate,’ and ‘Severe,’ based on the International Classification of Diseases-10 (ICD-10) depression diagnostic criteria. Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus. A FastText-based model is then applied and fine-tuned with different preprocessing techniques and hyperparameter tuning to produce the tuned model, which significantly increases depression classification performance to an 84% F1 score and 90% accuracy compared with the baselines. Finally, a FastText-based weighted soft voting ensemble (WSVE) is proposed to boost the model’s performance by combining several other classifiers and assigning weights to individual models according to their individual performance. The proposed WSVE outperformed all baselines as well as FastText alone, with an F1 of 89%, 5% higher than FastText alone, and an accuracy of 93%, 3% higher than FastText alone. The proposed model better captures the contextual features of the relatively small sample class and aids early prediction of depression intensity from tweets.
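Weighted soft voting, as described above, averages the class probabilities of several classifiers using per-model weights. The sketch below is a minimal illustration, not the WSVE implementation: the probability vectors and the weights (imagined here as validation scores) are hypothetical.

```python
def weighted_soft_vote(prob_lists, weights):
    """Combine per-model class probabilities with a weighted average.

    prob_lists: one probability vector per model, all over the same classes.
    weights: one non-negative weight per model (e.g. a validation score).
    Returns (index of the winning class, combined probability vector).
    """
    total = sum(weights)
    n_classes = len(prob_lists[0])
    combined = [
        sum(w * probs[c] for w, probs in zip(weights, prob_lists)) / total
        for c in range(n_classes)
    ]
    best = max(range(n_classes), key=lambda c: combined[c])
    return best, combined

classes = ["Mild", "Moderate", "Severe"]

# Hypothetical outputs of three classifiers for one tweet.
model_probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.6, 0.3],
]
# Hypothetical weights, e.g. each model's validation F1 score.
model_weights = [0.9, 0.8, 0.7]

best, combined = weighted_soft_vote(model_probs, model_weights)
predicted = classes[best]
```

Here the first model alone would predict 'Mild', but the weighted average favors 'Moderate' because the other two models agree on it.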
Although disintegrated dolomite, which is widely distributed across the globe, has conventionally been a focus of research in underground engineering, slope stability issues in disintegrated dolomite strata are gaining increasing prominence. This is primarily due to their unique properties, including low strength and loose structure. Current methods for evaluating slope stability, such as basic quality (BQ) and slope stability probability classification (SSPC), do not adequately account for the poor integrity and structural fragmentation characteristic of disintegrated dolomite. To address this challenge, the applicability of the limit equilibrium method (LEM), BQ, and SSPC was analyzed on eight disintegrated dolomite slopes located in Baoshan, Southwest China. However, conflicting results were obtained. Therefore, this paper introduces a novel method, SMRDDS, to provide rapid and accurate assessment of disintegrated dolomite slope stability. This method incorporates parameters such as disintegrated grade, joint state, groundwater conditions, and excavation methods. The findings reveal that six slopes are stable, while two are considered partially unstable. Notably, the proposed method matches the actual conditions more closely and is more time-efficient than the BQ and SSPC methods. However, due to the limited research on disintegrated dolomite slopes, the results of the SMRDDS method tend to be conservative as a safety precaution. In conclusion, the SMRDDS method can quickly evaluate the current condition of disintegrated dolomite slopes in the field, which contributes significantly to disaster risk reduction for such slopes.
Few-shot image classification is the task of classifying novel classes using extremely limited labelled samples. To perform classification using the limited samples, one solution is to learn the feature alignment (FA) information between the labelled and unlabelled sample features. Most FA methods use the feature mean as the class prototype and calculate the correlation between prototype and unlabelled features to learn an alignment strategy. However, mean prototypes tend to degrade informative features because spatial features at the same position may not be equally important for the final classification, leading to inaccurate correlation calculations. Therefore, the authors propose an effective intraclass FA strategy that aggregates semantically similar spatial features from an adaptive reference prototype in a low-dimensional feature space to obtain an informative prototype feature map for precise correlation computation. Moreover, the authors develop a dual correlation module to learn the hard and soft correlations. This module combines the correlation information between the prototype and unlabelled features in both the original and learnable feature spaces, aiming to produce a comprehensive cross-correlation between the prototypes and unlabelled features. Using both FA and cross-attention modules, the model can maintain informative class features and capture important shared features for classification. Experimental results on three few-shot classification benchmarks show that the proposed method outperforms related methods and yields a 3% performance boost in the 1-shot setting when the proposed module is inserted into the related methods.
Gliomas have the highest mortality rate of all brain tumors. Correctly classifying the glioma risk period can help doctors make reasonable treatment plans and improve patients’ survival rates. This paper proposes a hierarchical multi-scale attention feature fusion medical image classification network (HMAC-Net), which effectively combines global and local features. The network framework consists of three parallel layers: the global feature extraction layer, the local feature extraction layer, and the multi-scale feature fusion layer. A linear sparse attention mechanism is designed in the global feature extraction layer to reduce information redundancy. In the local feature extraction layer, a bilateral local attention mechanism is introduced to improve the extraction of relevant information between adjacent slices. In the multi-scale feature fusion layer, a channel fusion block combining a convolutional attention mechanism and a residual inverse multi-layer perceptron is proposed to prevent gradient disappearance and network degradation and to improve feature representation capability. A double-branch iterative multi-scale classification block is used to improve classification performance. On the brain glioma risk grading dataset, the results of the ablation and comparison experiments show that the proposed HMAC-Net has the best performance in both the qualitative analysis of heat maps and the quantitative analysis of evaluation indicators. On the skin cancer classification dataset, the generalization experiment results show that the proposed HMAC-Net generalizes well.
In the era of the Internet of Things (IoT), the proliferation of connected devices has raised security concerns, increasing the risk of intrusions into diverse systems. Despite the convenience and efficiency offered by IoT technology, the growing number of IoT devices escalates the likelihood of attacks, emphasizing the need for robust security tools that automatically detect and explain threats. This paper introduces a deep learning methodology for detecting and classifying distributed denial of service (DDoS) attacks, addressing a significant security concern within IoT environments. An effective deep transfer learning procedure is applied to utilize deep learning backbones and is then evaluated on two benchmark datasets of DDoS attacks in terms of accuracy and time complexity. By leveraging several deep architectures, the study conducts thorough binary and multiclass experiments, each varying in the complexity of classifying attack types and reflecting real-world scenarios. Additionally, this study employs an explainable artificial intelligence (XAI) technique to elucidate the contribution of extracted features to attack detection. The experimental results demonstrate the effectiveness of the proposed method, with the XAI bidirectional long short-term memory (XAI-BiLSTM) model achieving a recall of 99.39%.
Automatic modulation classification (AMC) is one of the cutting-edge technologies in cognitive radio communications. AMC based on deep learning has recently attracted much attention due to its superior classification accuracy and robustness. In this paper, we propose a novel high-resolution, multi-scale feature fusion convolutional neural network model with a squeeze-excitation block, referred to as HRSENet, to classify different kinds of modulation signals. The proposed model establishes a parallel computing mechanism of multi-resolution feature maps through multi-layer convolution operations, which effectively reduces the information loss caused by downsampling convolution. Moreover, through dense skip connections at the same resolution and up-sampling or down-sampling connections across different resolutions, the low-resolution representation of the deep feature maps and the high-resolution representation of the shallow feature maps are simultaneously extracted and fully integrated, which is beneficial for mining multilevel signal features. Finally, the feature squeeze and excitation module embedded in the decoder is used to adjust the response weights between channels, further improving the classification accuracy of the proposed model. The proposed HRSENet significantly outperforms existing methods in terms of classification accuracy on the public dataset “Over the Air” at signal-to-noise ratios (SNRs) ranging from -2 dB to 20 dB. The classification accuracy of the proposed model reaches 85.36% and 97.30% at 4 dB and 10 dB, respectively, improvements of 9.71% and 5.82% over LWNet. Furthermore, the model has moderate computational complexity compared with several state-of-the-art methods.
Imbalanced data classification is the task of classifying datasets in which there is a significant disparity in the number of samples between classes. This task is prevalent in practical scenarios such as industrial fault diagnosis, network intrusion detection, and cancer detection. In imbalanced classification tasks, the focus is typically on achieving high recognition accuracy for the minority class. However, due to the challenges presented by imbalanced multi-class datasets, such as the scarcity of samples in minority classes and complex inter-class relationships with overlapping boundaries, existing methods often do not perform well in multi-class imbalanced data classification, particularly in recognizing minority classes with high accuracy. Therefore, this paper proposes a multi-class imbalanced data classification method called CSDSResNet, which is based on a cost-sensitive dual-stream residual network. First, to address the issue of limited samples in the minority class within imbalanced datasets, a dual-stream residual network backbone is designed to enhance the model’s feature extraction capability. Next, considering the complexities arising from imbalanced inter-class sample quantities and imbalanced inter-class overlapping boundaries in multi-class imbalanced datasets, a unique cost-sensitive loss function is devised. This loss function places more emphasis on the minority class and on challenging classes with high inter-class similarity, thereby improving the model’s classification ability. Finally, the effectiveness and generalization of the proposed method are evaluated on two datasets: ‘DryBeans’ and ‘Electric Motor Defects’. The experimental results demonstrate that CSDSResNet achieves the best performance on imbalanced datasets, with macro F1-scores improving by 2.9% and 1.9% on the two datasets, respectively, compared with current state-of-the-art classification methods. Furthermore, it achieves the highest precision in single-class recognition tasks for the minority class.
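A common way to make a loss cost-sensitive is to weight the cross-entropy of each sample by a per-class cost. The sketch below uses inverse-frequency class weights, which is a generic technique and not the loss proposed for CSDSResNet: it addresses only the imbalance in sample counts, not the overlapping class boundaries the abstract also mentions. The function names and toy data are hypothetical.

```python
import math

def inverse_frequency_weights(labels, n_classes):
    """Class weight n / (n_classes * count); assumes every class appears at least once."""
    n = len(labels)
    counts = [labels.count(c) for c in range(n_classes)]
    return [n / (n_classes * cnt) for cnt in counts]

def weighted_cross_entropy(probs, labels, class_weights):
    """Weighted mean of -log p(true class), normalized by the total weight."""
    total = 0.0
    weight_sum = 0.0
    for p, y in zip(probs, labels):
        total += -class_weights[y] * math.log(p[y])
        weight_sum += class_weights[y]
    return total / weight_sum

# Hypothetical toy data: four majority samples (class 0), one minority (class 1).
labels = [0, 0, 0, 0, 1]
class_weights = inverse_frequency_weights(labels, n_classes=2)

confident = [0.9, 0.1]  # predicted probabilities favoring class 0
opposite = [0.1, 0.9]   # predicted probabilities favoring class 1

# Case A: only the minority sample is misclassified.
loss_minority_wrong = weighted_cross_entropy(
    [confident, confident, confident, confident, confident], labels, class_weights
)
# Case B: only one majority sample is misclassified.
loss_majority_wrong = weighted_cross_entropy(
    [opposite, confident, confident, confident, opposite], labels, class_weights
)
```

With these weights, a single error on the minority class is penalized more heavily than a single error on the majority class, which pushes the classifier to pay attention to rare classes.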
Convolutional neural networks (CNNs) have an excellent ability to model local contextual information. However, CNNs face challenges in describing long-range semantic features, which leads to relatively low classification accuracy for hyperspectral images. To address this problem, this article proposes an algorithm based on multi-scale fusion and a transformer network for hyperspectral image classification. First, low-level spatial-spectral features are extracted by a multi-scale residual structure. Second, an attention module is introduced to focus on the more important spatial-spectral information. Finally, high-level semantic features are represented and learned by a token learner and an improved transformer encoder. The proposed algorithm is compared with six classical hyperspectral classification algorithms on real hyperspectral images. The experimental results show that the proposed algorithm effectively improves the land cover classification accuracy of hyperspectral images.
●AIM: To establish a classification for congenital cataracts that can facilitate individualized treatment and help identify individuals with a high likelihood of different visual outcomes.
●METHODS: Consecutive patients diagnosed with congenital cataracts and undergoing surgery between January 2005 and November 2021 were recruited. Data on visual outcomes and the phenotypic characteristics of ocular biometry and the anterior and posterior segments were extracted from the patients' medical records. A hierarchical cluster analysis was performed. The main outcome measure was the identification of distinct clusters of eyes with congenital cataracts.
●RESULTS: A total of 164 children (299 eyes) were divided into two clusters based on their ocular features. Cluster 1 (96 eyes) had a shorter axial length (mean±SD, 19.44±1.68 mm), a low prevalence of macular abnormalities (1.04%), and no retinal abnormalities or posterior cataracts. Cluster 2 (203 eyes) had a greater axial length (mean±SD, 20.42±2.10 mm) and a higher prevalence of macular abnormalities (8.37%), retinal abnormalities (98.52%), and posterior cataracts (4.93%). Compared with the eyes in Cluster 2 (57.14%), those in Cluster 1 (71.88%) had a 2.2 times higher chance of good best-corrected visual acuity [<0.7 logMAR; OR (95% CI), 2.20 (1.25–3.81); P=0.006].
●CONCLUSION: This retrospective study categorizes congenital cataracts into two distinct clusters, each associated with a different likelihood of visual outcomes. This innovative classification may enable the personalization and prioritization of early interventions for patients who may gain the greatest benefit, thereby making strides toward precision medicine in the field of congenital cataracts.
Bitcoin, the most classic electronic currency, is widely used for various electronic services such as exchanges, gambling, and marketplaces, and also for scams such as high-yield investment projects. Identifying the services operated by a Bitcoin address can help determine the risk level of that address and build an alert model accordingly. Feature engineering can also be used to flesh out labeled addresses and to analyze the current state of Bitcoin in a small way. In this paper, we address the problem of identifying multiple classes of Bitcoin services. To deal with the poor classification of individual addresses that lack significant features, we propose a Bitcoin address identification scheme based on joint multi-model prediction using the mapping relationship between addresses and entities. The innovations of the method are as follows: (1) As many valuable features as possible are extracted when an address is given, to facilitate the multi-class service identification task. (2) Unlike the general supervised model approach, this paper proposes a joint prediction scheme for multiple learners based on address-entity mapping relationships. Specifically, after the overall features are obtained, the address classification and entity clustering tasks are performed separately, and the results are subjected to graph-based maximization consensus. The final result is anchored to the individual address classification results while satisfying, as far as possible, the constraint that similarly behaving entities are grouped together. Testing and evaluation on over 26,000 Bitcoin addresses show that our feature extraction method captures more useful features. In addition, the combined multi-learner model exceeded the baseline classifier, reaching an accuracy of 77.4%.
We apply stochastic seismic inversion and Bayesian facies classification for porosity modeling and igneous rock identification in the presalt interval of the Santos Basin. This integration of seismic and well-derived information enhances reservoir characterization. Stochastic inversion and Bayesian classification are powerful tools because they allow the uncertainties in the model to be addressed. We used the ES-MDA algorithm to obtain the realizations equivalent to the P10, P50, and P90 percentiles of acoustic impedance, a novel method for acoustic inversion in the presalt. The facies were divided into five classes: reservoir 1, reservoir 2, tight carbonates, clayey rocks, and igneous rocks. To deal with the overlaps in the acoustic impedance values of the facies, we included geological information using an a priori probability, indicating that structural highs are reservoir-dominated. To illustrate our approach, we conducted porosity modeling using facies-related rock-physics models for rock-physics inversion in an area with a well drilled in a coquina bank, and evaluated the thickness and extension of an igneous intrusion near the carbonate-salt interface. The modeled porosity and the classified seismic facies are in good agreement with those observed in the wells. Notably, the coquina bank shows an improvement in porosity towards the top. The a priori probability model was crucial for limiting the clayey rocks to the structural lows. In Well B, the hit rate of the igneous rock in the three scenarios is higher than 60%, showing an excellent thickness-prediction capability.
Funding: supported in part by the National Natural Science Foundation of China (61876011), the National Key Research and Development Program of China (2022YFB4703700), the Key Research and Development Program 2020 of Guangzhou (202007050002), and the Key-Area Research and Development Program of Guangdong Province (2020B090921003).
Abstract: Recently, there have been some attempts to apply the Transformer to 3D point cloud classification. To reduce computation, most existing methods focus on local spatial attention, but they ignore the content of the points and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space (content-based): it clusters the sampled points with similar features into the same class and computes the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves remarkable performance on point cloud shape classification. In particular, our method achieves 90.3% Top-1 accuracy on the hardest setting of ScanObjectNN. The source code of this paper is available at https://github.com/yahuiliu99/PointConT.
Funding: supported by the Yunnan Major Scientific and Technological Projects (Grant No. 202302AD080001) and the National Natural Science Foundation, China (No. 52065033).
Abstract: When building a classification model, the scenario in which the samples of one class significantly outnumber those of the other class is called data imbalance. Data imbalance biases the trained classification model toward the majority class (usually defined as the negative class), which may harm the accuracy on the minority class (usually defined as the positive class) and in turn lead to poor overall performance of the model. This article proposes a method called MSHR-FCSSVM for imbalanced data classification, which is based on a new hybrid resampling approach (MSHR) and a new fine cost-sensitive support vector machine (CS-SVM) classifier (FCSSVM). The MSHR measures the separability of each negative sample through its Silhouette value, calculated using the Mahalanobis distance between samples. Based on this value, the so-called pseudo-negative samples are screened out, used to generate new positive samples through linear interpolation (over-sampling step), and finally deleted (under-sampling step). This approach replaces pseudo-negative samples with newly generated positive samples one by one to clear up the inter-class overlap on the borderline, without changing the overall scale of the dataset. The FCSSVM is an improved version of the traditional CS-SVM. It considers the influence of both the imbalance in sample numbers and the class distribution on classification simultaneously, and finely tunes the class cost weights using the RIME optimization algorithm, which is based on the physical phenomenon of rime-ice, with cross-validation accuracy as the fitness function, to accurately adjust the classification borderline. To verify the effectiveness of the proposed method, a series of experiments is carried out on 20 imbalanced datasets, including both mildly and extremely imbalanced datasets. The experimental results show that the MSHR-FCSSVM method performs better than the comparison methods in most cases, and that both the MSHR and the FCSSVM play significant roles.
Abstract: Background: Cavernous transformation of the portal vein (CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to investigate the imaging features of the intrahepatic portal vein in adult patients with CTPV and to establish the relationship between the manifestations of the intrahepatic portal vein and the progression of CTPV. Methods: We retrospectively analyzed 14 CTPV patients in Beijing Tsinghua Changgung Hospital. All patients underwent both direct portal venography (DPV) and computed tomography angiography (CTA) to reveal the manifestations of the portal venous system. The vessels measured included the left portal vein (LPV), right portal vein (RPV), main portal vein (MPV), and the portal vein bifurcation (PVB). Results: Nine males and 5 females, with a median age of 40.5 years, were included in the study. No significant difference was found in the diameters of the LPV or RPV measured by DPV and CTA. The visualization of the LPV, RPV, and PVB by DPV was higher than that by CTA. There was a significant association between LPV/RPV and PVB/MPV in terms of visibility revealed with DPV (P = 0.01), while this association was not observed with CTA. According to the imaging features of the portal vein measured by DPV, CTPV was classified into three categories to facilitate diagnosis and treatment. Conclusions: DPV was more accurate than CTA for revealing the course of the intrahepatic portal vein in patients with CTPV. The classification of CTPV, which originated from the imaging features of the portal vein revealed by DPV, may provide a new perspective for the diagnosis and treatment of CTPV.
Funding: the Natural Science Foundation of China (Grant Numbers 72074014 and 72004012).
Abstract: Purpose: Many science, technology and innovation (STI) resources carry several different labels. To assign the resulting labels automatically to an instance of interest, many approaches with good performance on benchmark datasets have been proposed for the multi-label classification task in the literature. Furthermore, several open-source tools implementing these approaches have also been developed. However, the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones. Therefore, the main purpose of this paper is to evaluate seven multi-label classification methods comprehensively on real-world datasets. Research limitations: The three real-world datasets differ in the following aspects: statement, data quality, and purposes. Additionally, open-source tools designed for multi-label classification also have intrinsic differences in their approaches to data processing and feature selection, which in turn affect the performance of a multi-label classification approach. In the near future, we will enhance experimental precision and reinforce the validity of the conclusions by controlling variables more rigorously through expanded parameter settings. Practical implications: The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets, underscoring the complexity of real-world multi-label classification tasks. Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels. With ongoing enhancements in deep learning algorithms and large-scale models, the efficacy of multi-label classification is expected to improve significantly, reaching a level of practical utility in the foreseeable future. Originality/value: (1) Seven multi-label classification methods are comprehensively compared on three real-world datasets. (2) The TextCNN and TextRCNN models perform better on small-scale datasets with a more complex hierarchical structure of labels and a more balanced document-label distribution. (3) The MLkNN method works better on the larger-scale dataset with a more unbalanced document-label distribution.
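For readers unfamiliar with the two scores mentioned in this abstract, the sketch below computes Macro F1 (the unweighted mean of per-label F1) and Micro F1 (F1 over counts pooled across labels) for multi-label predictions. The function names and the toy documents are illustrative assumptions, not taken from the paper.

```python
from typing import Dict, List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_micro_f1(true: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float]:
    """Return (macro_f1, micro_f1) for per-document sets of labels."""
    labels = sorted(set().union(*true, *pred))
    if not labels:
        return 0.0, 0.0
    counts: Dict[str, Tuple[int, int, int]] = {}
    for lab in labels:
        tp = sum(1 for t, p in zip(true, pred) if lab in t and lab in p)
        fp = sum(1 for t, p in zip(true, pred) if lab not in t and lab in p)
        fn = sum(1 for t, p in zip(true, pred) if lab in t and lab not in p)
        counts[lab] = (tp, fp, fn)
    macro = sum(f1(*c) for c in counts.values()) / len(labels)
    micro = f1(
        sum(c[0] for c in counts.values()),
        sum(c[1] for c in counts.values()),
        sum(c[2] for c in counts.values()),
    )
    return macro, micro


# A rare label "b" that is always missed pulls Macro F1 down far more than Micro F1.
true_labels = [{"a"}, {"a"}, {"a"}, {"b"}]
pred_labels = [{"a"}, {"a"}, {"a"}, {"a"}]
macro_f1, micro_f1 = macro_micro_f1(true_labels, pred_labels)
```

In this toy case Macro F1 is 3/7 while Micro F1 is 0.75, which is why the two scores are usually reported together on datasets with unbalanced label distributions.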
Funding: supported by the Shandong Provincial Key Research Project of Undergraduate Teaching Reform (No. Z2022218), the Fundamental Research Funds for the Central University (No. 202113028), and the Graduate Education Promotion Program of Ocean University of China (No. HDJG20006); also supported by the Sailing Laboratory of Ocean University of China.
Abstract: The tell tail is usually placed on the triangular sail to display the state of the air flow over the sail surface. Judging the drift of the tell tail accurately during sailing is of great significance for achieving the best sailing effect. It is normally difficult for sailors to keep an eye on the tell tail for a long time and judge its changes accurately, because of strong sunlight and visual fatigue. We therefore adopt computer vision technology in the hope of helping sailors judge the changes of the tell tail with ease. This paper proposes, for the first time, a method to classify sailboat tell tails based on deep learning and an expert guidance system, supported by a sailboat tell tail classification dataset built on the expert guidance system for interpreting the tell tail states in different sea wind conditions, including the feature extraction performance. Considering that the expressive capability of computed features varies across visual tasks, the paper focuses on five tell tail computing features, which are re-encoded by an autoencoder and classified by an SVM classifier. All experimental samples were randomly divided into five groups; four groups were used as the training set to train the classifier, and the remaining group was used as the test set. The highest resolution value of the ResNet network was 80.26%. The experiments aimed to achieve better operational results on the basis of the deep computing features obtained through the ResNet network. The method can be used to help sailors make better judgements about tell tail changes during sailing.
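The evaluation protocol in this abstract (five random groups, four for training and one for testing) can be sketched as a five-fold split. This is a minimal sketch that assumes the usual rotation, in which each group serves once as the test set; the abstract does not state this explicitly. The function name `five_fold_splits` and the fixed seed are illustrative assumptions.

```python
import random
from typing import List, Tuple


def five_fold_splits(
    n_samples: int, k: int = 5, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Shuffle sample indices, cut them into k groups, and rotate the test group."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k groups of near-equal size
    splits = []
    for i in range(k):
        test = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        splits.append((train, test))
    return splits


splits = five_fold_splits(23)
```

Each `(train, test)` pair holds index lists, so the same split can be reused for any feature representation of the samples.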
Funding: supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University (IMSIU) (Grant Number IMSIU-RP23044).
Abstract: Lung cancer is a leading cause of mortality worldwide. Early detection of pulmonary tumors can significantly enhance the survival rate of patients. Recently, various Computer-Aided Diagnostic (CAD) methods have been developed to detect pulmonary nodules with high accuracy. Nevertheless, the existing methodologies cannot achieve a high level of specificity and sensitivity. The present study introduces a novel model for Lung Cancer Segmentation and Classification (LCSC), which incorporates two improved architectures, namely an improved U-Net architecture and an improved AlexNet architecture. The LCSC model comprises two distinct stages. The first stage uses the improved U-Net architecture to segment candidate nodules extracted from the lung lobes. Subsequently, the improved AlexNet architecture is employed to classify lung cancer. During the first stage, the proposed model demonstrates a dice accuracy of 0.855, a precision of 0.933, and a recall of 0.789 for the segmentation of candidate nodules. The improved AlexNet architecture attains 97.06% accuracy, a true positive rate of 96.36%, a true negative rate of 97.77%, a positive predictive value of 97.74%, and a negative predictive value of 96.41% for classifying pulmonary cancer as either benign or malignant. The proposed LCSC model is tested and evaluated on the publicly available dataset provided by the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI). The proposed technique exhibits remarkable performance compared to the existing methods across various evaluation parameters.
Funding: supported by grants from the National Natural Science Foundation of China (Grant No. 82172660), the Hebei Province Graduate Student Innovation Project (Grant No. CXZZBS2023001), and the Baoding Natural Science Foundation (Grant No. H2272P015).
Abstract: Among central nervous system-associated malignancies, glioblastoma (GBM) is the most common and has the highest mortality rate. The high heterogeneity of GBM cell types and the complex tumor microenvironment frequently lead to tumor recurrence and sudden relapse in patients treated with temozolomide. In precision medicine, research on GBM treatment is increasingly focusing on molecular subtyping to precisely characterize the cellular and molecular heterogeneity, as well as the refractory nature of GBM toward therapy. A deep understanding of the different molecular expression patterns of GBM subtypes is critical. Researchers have recently proposed tetra fractional or tripartite methods for detecting GBM molecular subtypes. The various molecular subtypes of GBM show significant differences in gene expression patterns and biological behaviors. These subtypes also exhibit high plasticity in their regulatory pathways, oncogene expression, tumor microenvironment alterations, and differential responses to standard therapy. Herein, we summarize the current molecular typing scheme of GBM and the major molecular/genetic characteristics of each subtype. Furthermore, we review the mesenchymal transition mechanisms of GBM under various regulators.
Funding: the National Social Science Foundation of China (Grant No. 22BTJ035).
Abstract: The classification of functional data has drawn much attention in recent years. The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy. In this paper, we propose a mean-variance-based (MV) feature weighting method for classifying functional data or functional curves. In the feature extraction stage, each sample curve is approximated by B-splines to transfer features to the coefficients of the spline basis. After that, a feature weighting approach based on statistical principles is introduced by comprehensively considering the between-class differences and within-class variations of the coefficients. We also introduce a scaling parameter to adjust the gap between the weights of features. The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features. Algorithms for feature-weighted K-nearest neighbor and support vector machine classifiers are both provided. Moreover, the new approach can be well integrated into existing functional data classifiers, such as the generalized functional linear model and functional linear discriminant analysis, resulting in more accurate classification. The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data. The results show that the new feature weighting approach significantly improves the classification accuracy for complex functional data.
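The weighting idea in this abstract can be illustrated with a small two-class sketch: each spline coefficient is scored by its between-class mean difference relative to its within-class variance, and a scaling exponent controls the gap between weights. The abstract does not give the paper's actual formula, so the ratio, the exponent `gamma`, and the function name `mv_weights` are all assumptions; the B-spline coefficients are taken as already computed.

```python
from statistics import mean, pvariance
from typing import List


def mv_weights(
    class0: List[List[float]],
    class1: List[List[float]],
    gamma: float = 1.0,
    eps: float = 1e-12,
) -> List[float]:
    """Assumed score per coefficient: ((mean gap)^2 / within-class variance) ** gamma.

    Rows are samples, columns are spline coefficients. Weights are normalized to sum to 1.
    """
    n_features = len(class0[0])
    scores = []
    for j in range(n_features):
        a = [row[j] for row in class0]
        b = [row[j] for row in class1]
        between = (mean(a) - mean(b)) ** 2
        within = pvariance(a) + pvariance(b) + eps
        scores.append((between / within) ** gamma)
    total = sum(scores)
    if total == 0:
        return [1.0 / n_features] * n_features
    return [s / total for s in scores]


# Coefficient 0 separates the classes cleanly; coefficient 1 overlaps heavily.
c0 = [[0.0, 0.0], [0.2, 2.0]]
c1 = [[1.0, 1.5], [1.2, 3.5]]
w_gamma1 = mv_weights(c0, c1, gamma=1.0)
w_gamma2 = mv_weights(c0, c1, gamma=2.0)
```

Raising `gamma` widens the gap between the discriminative and the confusing coefficient. The resulting weights could then rescale the coefficients before a distance-based classifier such as K-nearest neighbor.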
Abstract: Intrusion detection is a predominant task that monitors and protects the network infrastructure. Therefore, many datasets have been published and investigated by researchers to analyze and understand the problem of intrusion prediction and detection. In particular, the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD) dataset is an extensively used benchmark for evaluating intrusion detection systems (IDSs), as it incorporates various network traffic attacks. A large number of studies have tackled the problem of intrusion detection using machine learning models, but the performance of these models often decreases when they are evaluated on new attacks. This has led to the use of deep learning techniques, which have shown significant potential for processing large datasets and therefore improving detection accuracy. For that reason, this paper focuses on the role of stacking deep learning models, including a convolutional neural network (CNN) and a deep neural network (DNN), in improving the intrusion detection rate on the NSL-KDD dataset. Each base model is trained on the NSL-KDD dataset to extract significant features. Once the base models have been trained, the stacking process proceeds to the second stage, where a simple meta-model is trained on the predictions generated by the base models. Combining the predictions allows the meta-model to distinguish different classes of attacks and increase the detection rate. Our experimental evaluations on the NSL-KDD dataset show the efficacy of stacking deep learning models for intrusion detection. The performance of the ensemble of base models, combined with the meta-model, exceeds the performance of the individual models. Our stacking model attains an accuracy of 99% and an average F1-score of 93% in the multi-class scenario. In addition, the training time of the proposed ensemble model is lower than that of the benchmark techniques, demonstrating its efficiency and robustness.
Abstract: Predicting depression intensity from microblogs and social media posts has numerous benefits and applications, including the early prediction of psychological disorders and stress in individuals or the general public. A major challenge is that existing studies do not predict the intensity of depression in social media texts but only perform binary classification of depression; moreover, noisy data makes it difficult to predict the true depression in social media text. This study begins by collecting relevant tweets and generating a corpus of 210000 public tweets using Twitter public application programming interfaces (APIs). A strategy is devised to filter out only depression-related tweets by creating a list of relevant hashtags to reduce noise in the corpus. Furthermore, an algorithm is developed to annotate the data into three depression classes, 'Mild', 'Moderate', and 'Severe', based on the International Classification of Diseases-10 (ICD-10) depression diagnostic criteria. Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus. Further, a FastText-based model is applied and fine-tuned with different preprocessing techniques and hyperparameter tuning to produce the tuned model, which significantly increases the depression classification performance to an 84% F1 score and 90% accuracy compared to the baselines. Finally, a FastText-based weighted soft voting ensemble (WSVE) is proposed to boost the model's performance by combining several other classifiers and assigning weights to individual models according to their individual performances. The proposed WSVE outperformed all baselines as well as FastText alone, with an F1 of 89%, 5% higher than FastText alone, and an accuracy of 93%, 3% higher than FastText alone. The proposed model better captures the contextual features of the relatively small sample class and aids in the early prediction of depression intensity from tweets with strong performance.
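The weighted soft voting step described in this abstract can be sketched as follows: each classifier outputs class probabilities, and the ensemble averages them with weights derived from each model's own validation score. This is a minimal sketch, not the paper's WSVE; normalizing the scores to sum to 1 is an assumed weighting rule, and the three toy models and their scores are made up for illustration.

```python
from typing import List, Sequence, Tuple


def weighted_soft_vote(
    probas: Sequence[Sequence[float]], scores: Sequence[float]
) -> Tuple[int, List[float]]:
    """Combine per-model class probabilities for one sample.

    probas[m][c] is model m's probability for class c; scores[m] is that
    model's validation score, normalized here into a voting weight (assumption).
    """
    total = sum(scores)
    weights = [s / total for s in scores]
    n_classes = len(probas[0])
    combined = [
        sum(w * p[c] for w, p in zip(weights, probas)) for c in range(n_classes)
    ]
    return combined.index(max(combined)), combined


# Classes: 0 = Mild, 1 = Moderate, 2 = Severe (toy probabilities from three models).
model_probas = [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3], [0.1, 0.6, 0.3]]
model_scores = [0.9, 0.6, 0.5]
label, combined = weighted_soft_vote(model_probas, model_scores)
```

Here the strongest single model votes for class 0, but the two weaker models together outweigh it and the ensemble picks class 1.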
Funding: supported by the National Natural Science Foundation of China (Grant No. 42162026) and the Applied Basic Research Foundation of Yunnan Province (Grant No. 202201AT070083).
Abstract: Although disintegrated dolomite, widely distributed across the globe, has conventionally been a focus of research in underground engineering, slope stability in disintegrated dolomite strata is gaining increasing prominence. This is primarily due to the unique properties of these strata, including low strength and loose structure. Current methods for evaluating slope stability, such as basic quality (BQ) and slope stability probability classification (SSPC), do not adequately account for the poor integrity and structural fragmentation characteristic of disintegrated dolomite. To address this challenge, the applicability of the limit equilibrium method (LEM), BQ, and SSPC was analyzed on eight disintegrated dolomite slopes located in Baoshan, Southwest China. However, conflicting results were obtained. Therefore, this paper introduces a novel method, SMRDDS, to provide rapid and accurate assessment of disintegrated dolomite slope stability. This method incorporates parameters such as disintegrated grade, joint state, groundwater conditions, and excavation methods. The findings reveal that six slopes are stable, while two are considered partially unstable. Notably, the proposed method matches the actual conditions more closely and is more time-efficient than the BQ and SSPC methods. However, due to the limited research on disintegrated dolomite slopes, the results of the SMRDDS method tend to be conservative as a safety precaution. In conclusion, the SMRDDS method can quickly evaluate the current condition of disintegrated dolomite slopes in the field. This contributes significantly to disaster risk reduction for disintegrated dolomite slopes.
Funding: Institute of Information & Communications Technology Planning & Evaluation, Grant/Award Number: 2022-0-00074.
Abstract: Few-shot image classification is the task of classifying novel classes using extremely limited labelled samples. To perform classification using the limited samples, one solution is to learn the feature alignment (FA) information between the labelled and unlabelled sample features. Most FA methods use the feature mean as the class prototype and calculate the correlation between prototype and unlabelled features to learn an alignment strategy. However, mean prototypes tend to degrade informative features, because spatial features at the same position may not be equally important for the final classification, leading to inaccurate correlation calculations. Therefore, the authors propose an effective intraclass FA strategy that aggregates semantically similar spatial features from an adaptive reference prototype in a low-dimensional feature space to obtain an informative prototype feature map for precise correlation computation. Moreover, the authors develop a dual correlation module to learn the hard and soft correlations. This module combines the correlation information between the prototype and unlabelled features in both the original and learnable feature spaces, aiming to produce a comprehensive cross-correlation between the prototypes and unlabelled features. Using both FA and cross-attention modules, the model can maintain informative class features and capture important shared features for classification. Experimental results on three few-shot classification benchmarks show that the proposed method outperformed related methods and yielded a 3% performance boost in the 1-shot setting when the proposed module was inserted into the related methods.
Funding: Major Program of National Natural Science Foundation of China (NSFC12292980, NSFC12292984), National Key R&D Program of China (2023YFA1009000, 2023YFA1009004, 2020YFA0712203, 2020YFA0712201), Major Program of National Natural Science Foundation of China (NSFC12031016), Beijing Natural Science Foundation (BNSFZ210003), Department of Science, Technology and Information of the Ministry of Education (8091B042240).
Abstract: Gliomas have the highest mortality rate of all brain tumors. Correctly classifying the glioma risk period can help doctors make reasonable treatment plans and improve patients' survival rates. This paper proposes a hierarchical multi-scale attention feature fusion medical image classification network (HMAC-Net), which effectively combines global features and local features. The network framework consists of three parallel layers: the global feature extraction layer, the local feature extraction layer, and the multi-scale feature fusion layer. A linear sparse attention mechanism is designed in the global feature extraction layer to reduce information redundancy. In the local feature extraction layer, a bilateral local attention mechanism is introduced to improve the extraction of relevant information between adjacent slices. In the multi-scale feature fusion layer, a channel fusion block combining a convolutional attention mechanism and a residual inverse multi-layer perceptron is proposed to prevent vanishing gradients and network degradation and to improve feature representation capability. A double-branch iterative multi-scale classification block is used to improve the classification performance. On the brain glioma risk grading dataset, the results of the ablation and comparison experiments show that the proposed HMAC-Net has the best performance in both the qualitative analysis of heat maps and the quantitative analysis of evaluation indicators. On the skin cancer classification dataset, the generalization experiment results show that the proposed HMAC-Net generalizes well.
Abstract: In the era of the Internet of Things (IoT), the proliferation of connected devices has raised security concerns, increasing the risk of intrusions into diverse systems. Despite the convenience and efficiency offered by IoT technology, the growing number of IoT devices escalates the likelihood of attacks, emphasizing the need for robust security tools to automatically detect and explain threats. This paper introduces a deep learning methodology for detecting and classifying distributed denial of service (DDoS) attacks, addressing a significant security concern within IoT environments. An effective deep transfer learning procedure is applied to utilize deep learning backbones, and is then evaluated on two benchmark datasets of DDoS attacks in terms of accuracy and time complexity. By leveraging several deep architectures, the study conducts thorough binary and multiclass experiments, each varying in the complexity of classifying attack types and reflecting real-world scenarios. Additionally, this study employs an explainable artificial intelligence (XAI) technique to elucidate the contribution of extracted features to the process of attack detection. The experimental results demonstrate the effectiveness of the proposed method, with the XAI bidirectional long short-term memory (XAI-BiLSTM) model achieving a recall of 99.39%.
Funding: supported by the Beijing Natural Science Foundation (L202003) and the National Natural Science Foundation of China (No. 31700479).
Abstract: Automatic modulation classification (AMC) is one of the cutting-edge technologies in cognitive radio communications. AMC based on deep learning has recently attracted much attention due to its superior classification accuracy and robustness. In this paper, we propose a novel high-resolution, multi-scale feature fusion convolutional neural network model with a squeeze-excitation block, referred to as HRSENet, to classify different kinds of modulation signals. The proposed model establishes a parallel computing mechanism of multi-resolution feature maps through multi-layer convolution operations, which effectively reduces the information loss caused by down-sampling convolution. Moreover, through dense skip connections at the same resolution and up-sampling or down-sampling connections between different resolutions, the low-resolution representation of the deep feature maps and the high-resolution representation of the shallow feature maps are simultaneously extracted and fully integrated, which is beneficial for mining multi-level signal features. Finally, the feature squeeze and excitation module embedded in the decoder is used to adjust the response weights between channels, further improving the classification accuracy of the proposed model. The proposed HRSENet significantly outperforms existing methods in terms of classification accuracy on the public dataset "Over the Air" at signal-to-noise ratios (SNR) ranging from -2 dB to 20 dB. The classification accuracy of the proposed model reaches 85.36% and 97.30% at 4 dB and 10 dB, respectively, an improvement of 9.71% and 5.82% over LWNet. Furthermore, the model also has moderate computational complexity compared with several state-of-the-art methods.
Funding: Supported by the Beijing Municipal Science and Technology Project (No. Z221100007122003).
Abstract: Imbalanced data classification is the task of classifying datasets where there is a significant disparity in the number of samples between different classes. This task is prevalent in practical scenarios such as industrial fault diagnosis, network intrusion detection, and cancer detection. In imbalanced classification tasks, the focus is typically on achieving high recognition accuracy for the minority class. However, due to the challenges presented by imbalanced multi-class datasets, such as the scarcity of samples in minority classes and complex inter-class relationships with overlapping boundaries, existing methods often perform poorly on multi-class imbalanced data, particularly in recognizing minority classes with high accuracy. Therefore, this paper proposes a multi-class imbalanced data classification method called CSDSResNet, which is based on a cost-sensitive dual-stream residual network. Firstly, to address the issue of limited samples in the minority class within imbalanced datasets, a dual-stream residual network backbone is designed to enhance the model's feature extraction capability. Next, considering the complexities arising from imbalanced inter-class sample quantities and imbalanced inter-class overlapping boundaries in multi-class imbalanced datasets, a unique cost-sensitive loss function is devised. This loss function places more emphasis on the minority class and on challenging classes with high inter-class similarity, thereby improving the model's classification ability. Finally, the effectiveness and generalization of the proposed method, CSDSResNet, are evaluated on two datasets: 'DryBeans' and 'Electric Motor Defects'. The experimental results demonstrate that CSDSResNet achieves the best performance on imbalanced datasets, with macro F1-score values improving by 2.9% and 1.9% on the two datasets, respectively, compared to current state-of-the-art classification methods. Furthermore, it achieves the highest precision in single-class recognition tasks for the minority class.
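The idea of a cost-sensitive loss can be illustrated with a class-weighted cross-entropy. The abstract does not give the CSDSResNet loss, which also accounts for inter-class similarity; the sketch below substitutes the common inverse-frequency weighting and covers only the sample-count imbalance. The function names and the toy labels are assumptions.

```python
import numpy as np

def class_weights(labels, num_classes):
    """Inverse-frequency weights: rarer classes get larger weights."""
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts.sum() / (num_classes * np.maximum(counts, 1.0))

def weighted_cross_entropy(probs, labels, weights):
    """Cross-entropy in which each sample is scaled by its class weight."""
    picked = probs[np.arange(len(labels)), labels]
    per_sample = -weights[labels] * np.log(np.clip(picked, 1e-12, 1.0))
    # Normalize by the total weight so the loss stays on a comparable scale.
    return per_sample.sum() / weights[labels].sum()

# Toy data (hypothetical): class 0 is the majority, class 2 the rarest.
labels = np.array([0, 0, 0, 0, 0, 0, 1, 1, 2])
weights = class_weights(labels, num_classes=3)

# A classifier that predicts the uniform distribution for every sample.
probs = np.full((len(labels), 3), 1.0 / 3.0)
loss = weighted_cross_entropy(probs, labels, weights)
print(weights, loss)
```

With these weights, a misclassified sample from the rarest class costs six times as much as one from the majority class, which pushes the decision boundary away from the minority samples.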
Funding: National Natural Science Foundation of China (No. 62201457); Natural Science Foundation of Shaanxi Province (Nos. 2022JQ-668, 2022JQ-588).
Abstract: Convolutional neural networks (CNNs) are excellent at modeling local contextual information. However, CNNs struggle to describe long-range semantic features, which leads to relatively low classification accuracy on hyperspectral images. To address this problem, this article proposes an algorithm based on multi-scale fusion and a transformer network for hyperspectral image classification. Firstly, low-level spatial-spectral features are extracted by a multi-scale residual structure. Secondly, an attention module is introduced to focus on the more important spatial-spectral information. Finally, high-level semantic features are represented and learned by a token learner and an improved transformer encoder. The proposed algorithm is compared with six classical hyperspectral classification algorithms on real hyperspectral images. The experimental results show that the proposed algorithm effectively improves the land cover classification accuracy of hyperspectral images.
Funding: Supported by the Municipal Government and School (Hospital) Joint Funding Programme of Guangzhou (No. 2023A03J0174, No. 2023A03J0188) and the State Key Laboratories' Youth Program of China (No. 83000-32030003).
Abstract: ● AIM: To establish a classification for congenital cataracts that can facilitate individualized treatment and help identify individuals with a high likelihood of different visual outcomes. ● METHODS: Consecutive patients diagnosed with congenital cataracts and undergoing surgery between January 2005 and November 2021 were recruited. Data on visual outcomes and the phenotypic characteristics of ocular biometry and the anterior and posterior segments were extracted from the patients' medical records. A hierarchical cluster analysis was performed. The main outcome measure was the identification of distinct clusters of eyes with congenital cataracts. ● RESULTS: A total of 164 children (299 eyes) were divided into two clusters based on their ocular features. Cluster 1 (96 eyes) had a shorter axial length (mean±SD, 19.44±1.68 mm), a low prevalence of macular abnormalities (1.04%), and no retinal abnormalities or posterior cataracts. Cluster 2 (203 eyes) had a greater axial length (mean±SD, 20.42±2.10 mm) and a higher prevalence of macular abnormalities (8.37%), retinal abnormalities (98.52%), and posterior cataracts (4.93%). Compared with the eyes in Cluster 2 (57.14%), those in Cluster 1 (71.88%) had 2.2-fold higher odds of good best-corrected visual acuity [<0.7 logMAR; OR (95% CI), 2.20 (1.25–3.81); P=0.006]. ● CONCLUSION: This retrospective study categorizes congenital cataracts into two distinct clusters, each associated with a different likelihood of visual outcomes. This classification may enable the personalization and prioritization of early interventions for the patients who stand to benefit most, a step toward precision medicine in the field of congenital cataracts.
Funding: Sponsored by the National Natural Science Foundation of China (Nos. 62172353, 62302114 and U20B2046), the Future Network Scientific Research Fund Project (No. FNSRFP-2021-YB-48), and the Innovation Fund Program of the Engineering Research Center for Integration and Application of Digital Learning Technology of Ministry of Education (No. 1221045).
Abstract: Bitcoin, the most classic electronic currency, is widely used for electronic services such as exchanges, gambling, and marketplaces, and also for scams such as high-yield investment projects. Identifying the services operated by a Bitcoin address can help determine the risk level of that address and build an alert model accordingly. Feature engineering can also be used to enrich labeled addresses and, on a small scale, to analyze the current state of Bitcoin. In this paper, we address the problem of identifying multiple classes of Bitcoin services. To handle the poor classification of individual addresses that lack significant features, we propose a Bitcoin address identification scheme based on joint multi-model prediction that uses the mapping relationship between addresses and entities. The method makes two contributions: (1) it extracts as many valuable features as possible from a given address to facilitate the multi-class service identification task; (2) unlike the general supervised model approach, it uses a joint prediction scheme for multiple learners based on address-entity mapping relationships. Specifically, after the overall features are obtained, the address classification and entity clustering tasks are performed separately, and the results are subjected to graph-based maximization consensus. The final result uses the individual address classification results as a baseline while satisfying, as far as possible, the constraint that similarly behaving entities are grouped together. In tests on over 26,000 Bitcoin addresses, our feature extraction method captures more useful features. In addition, the combined multi-learner model outperformed the baseline classifier, reaching an accuracy of 77.4%.
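The intuition behind combining address-level predictions with entity membership can be shown with a much simpler stand-in for the graph-based maximization consensus, which the abstract does not specify. In this sketch each address's class probabilities are blended with the mean probabilities of its entity; the function name, the blending factor `alpha`, and the toy data are all assumptions.

```python
import numpy as np

def entity_consensus(address_probs, entity_ids, alpha=0.5):
    """Blend each address's probabilities with its entity's mean probabilities."""
    address_probs = np.asarray(address_probs, dtype=float)
    entity_ids = np.asarray(entity_ids)
    blended = np.empty_like(address_probs)
    for entity in np.unique(entity_ids):
        members = entity_ids == entity
        entity_mean = address_probs[members].mean(axis=0)
        # alpha = 1 keeps the per-address result; alpha = 0 uses only the entity.
        blended[members] = alpha * address_probs[members] + (1 - alpha) * entity_mean
    return blended.argmax(axis=1)

# Toy data (hypothetical): two service classes, four addresses, two entities.
probs = np.array([
    [0.90, 0.10],
    [0.80, 0.20],
    [0.45, 0.55],   # weak-feature address that belongs to entity 0
    [0.20, 0.80],
])
entities = np.array([0, 0, 0, 1])

standalone = probs.argmax(axis=1)
joint = entity_consensus(probs, entities)
print(standalone, joint)
```

The third address has an ambiguous standalone prediction, and the evidence from the other addresses of the same entity moves it to the class its entity favors.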
Funding: The authors thank Equinor for financing the R&D project and the Institute of Science and Technology of Petroleum Geophysics of Brazil for supporting this research.
Abstract: We apply stochastic seismic inversion and Bayesian facies classification for porosity modeling and igneous rock identification in the presalt interval of the Santos Basin. This integration of seismic and well-derived information enhances reservoir characterization. Stochastic inversion and Bayesian classification are powerful tools because they make it possible to address the uncertainties in the model. We used the ES-MDA algorithm, a novel method for acoustic inversion in the presalt, to obtain the realizations equivalent to the P10, P50, and P90 percentiles of acoustic impedance. Five facies were defined: reservoir 1, reservoir 2, tight carbonates, clayey rocks, and igneous rocks. To deal with the overlaps in the acoustic impedance values of the facies, we included geological information through an a priori probability indicating that structural highs are reservoir-dominated. To illustrate our approach, we conducted porosity modeling using facies-related rock-physics models for rock-physics inversion in an area with a well drilled in a coquina bank, and we evaluated the thickness and extension of an igneous intrusion near the carbonate-salt interface. The modeled porosity and the classified seismic facies are in good agreement with those observed in the wells. Notably, the coquina bank shows an improvement in porosity towards the top. The a priori probability model was crucial for limiting the clayey rocks to the structural lows. In Well B, the hit rate for the igneous rock in the three scenarios is higher than 60%, showing an excellent thickness-prediction capability.
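The role of the a priori probability in separating facies with overlapping acoustic impedance can be sketched with Bayes' rule. This is an illustrative two-facies example under stated assumptions, not the study's workflow: the Gaussian likelihoods, their means and standard deviations, the impedance value (arbitrary units), and the prior values are all hypothetical.

```python
import numpy as np

def gaussian_pdf(x, mean, std):
    """Gaussian likelihood of an impedance value for each facies."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2.0 * np.pi))

def facies_posterior(impedance, means, stds, prior):
    """Posterior facies probabilities: likelihood times prior, normalized."""
    unnormalized = gaussian_pdf(impedance, means, stds) * prior
    return unnormalized / unnormalized.sum()

facies = ["reservoir", "clayey rocks"]
means = np.array([10.0, 10.5])   # hypothetical, strongly overlapping distributions
stds = np.array([1.0, 1.0])
sample = 10.3                    # impedance value that the likelihood alone barely separates

prior_high = np.array([0.8, 0.2])   # assumed prior on a structural high
prior_low = np.array([0.2, 0.8])    # assumed prior in a structural low

post_high = facies_posterior(sample, means, stds, prior_high)
post_low = facies_posterior(sample, means, stds, prior_low)
print(facies[post_high.argmax()], facies[post_low.argmax()])
```

The same impedance value is assigned to different facies depending on its structural position, which is how a spatially varying prior can confine clayey rocks to the structural lows.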