Purpose:Many science,technology and innovation(STI)resources are attached with several different labels.To assign automatically the resulting labels to an interested instance,many approaches with good performance on t...Purpose:Many science,technology and innovation(STI)resources are attached with several different labels.To assign automatically the resulting labels to an interested instance,many approaches with good performance on the benchmark datasets have been proposed for multi-label classification task in the literature.Furthermore,several open-source tools implementing these approaches have also been developed.However,the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones.Therefore,the main purpose of this paper is to evaluate comprehensively seven multi-label classification methods on real-world datasets.Research limitations:Three real-world datasets differ in the following aspects:statement,data quality,and purposes.Additionally,open-source tools designed for multi-label classification also have intrinsic differences in their approaches for data processing and feature selection,which in turn impacts the performance of a multi-label classification approach.In the near future,we will enhance experimental precision and reinforce the validity of conclusions by employing more rigorous control over variables through introducing expanded parameter settings.Practical implications:The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets,underscoring the complexity of real-world multi-label classification tasks.Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels.With ongoing enhancements in deep learning algorithms and large-scale models,it is expected that the efficacy of multi-label classification tasks will be significantly improved,reaching a level of practical utility in the foreseeable future.Originality/value:(1)Seven multi-label classification methods are comprehensively compared on three real-world datasets.(2)The TextCNN and TextRCNN models perform better on small-scale datasets with more complex hierarchical structure of labels and more balanced document-label distribution.(3)The MLkNN method works better on the larger-scale dataset with more unbalanced document-label distribution.展开更多
The research aims to improve the performance of image recognition methods based on a description in the form of a set of keypoint descriptors.The main focus is on increasing the speed of establishing the relevance of ...The research aims to improve the performance of image recognition methods based on a description in the form of a set of keypoint descriptors.The main focus is on increasing the speed of establishing the relevance of object and etalon descriptions while maintaining the required level of classification efficiency.The class to be recognized is represented by an infinite set of images obtained from the etalon by applying arbitrary geometric transformations.It is proposed to reduce the descriptions for the etalon database by selecting the most significant descriptor components according to the information content criterion.The informativeness of an etalon descriptor is estimated by the difference of the closest distances to its own and other descriptions.The developed method determines the relevance of the full description of the recognized object with the reduced description of the etalons.Several practical models of the classifier with different options for establishing the correspondence between object descriptors and etalons are considered.The results of the experimental modeling of the proposed methods for a database including images of museum jewelry are presented.The test sample is formed as a set of images from the etalon database and out of the database with the application of geometric transformations of scale and rotation in the field of view.The practical problems of determining the threshold for the number of votes,based on which a classification decision is made,have been researched.Modeling has revealed the practical possibility of tenfold reducing descriptions with full preservation of classification accuracy.Reducing the descriptions by twenty times in the experiment leads to slightly decreased accuracy.The speed of the analysis increases in proportion to the degree of reduction.The use of reduction by the informativeness criterion confirmed the possibility of obtaining the most significant subset of features for classification,which guarantees a decent level of accuracy.展开更多
Condensed and hydrolysable tannins are non-toxic natural polyphenols that are a commercial commodity industrialized for tanning hides to obtain leather and for a growing number of other industrial applications mainly ...Condensed and hydrolysable tannins are non-toxic natural polyphenols that are a commercial commodity industrialized for tanning hides to obtain leather and for a growing number of other industrial applications mainly to substitute petroleum-based products.They are a definite class of sustainable materials of the forestry industry.They have been in operation for hundreds of years to manufacture leather and now for a growing number of applications in a variety of other industries,such as wood adhesives,metal coating,pharmaceutical/medical applications and several others.This review presents the main sources,either already or potentially commercial of this forestry by-materials,their industrial and laboratory extraction systems,their systems of analysis with their advantages and drawbacks,be these methods so simple to even appear primitive but nonetheless of proven effectiveness,or very modern and instrumental.It constitutes a basic but essential summary of what is necessary to know of these sustainable materials.In doing so,the review highlights some of the main challenges that remain to be addressed to deliver the quality and economics of tannin supply necessary to fulfill the industrial production requirements for some materials-based uses.展开更多
Recently, there have been some attempts of Transformer in 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention,but ignore their content and fail to est...Recently, there have been some attempts of Transformer in 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention,but ignore their content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space(content-based), which clusters the sampled points with similar features into the same class and computes the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves a remarkable performance on point cloud shape classification. Especially, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectN N. Source code of this paper is available at https://github.com/yahuiliu99/PointC onT.展开更多
In this study,the structural characters,antioxidant activities and bile acid-binding ability of sea buckthorn polysaccharides(HRPs)obtained by the commonly used hot water(HRP-W),pressurized hot water(HRP-H),ultrasonic...In this study,the structural characters,antioxidant activities and bile acid-binding ability of sea buckthorn polysaccharides(HRPs)obtained by the commonly used hot water(HRP-W),pressurized hot water(HRP-H),ultrasonic(HRP-U),acid(HRP-C)and alkali(HRP-A)assisted extraction methods were investigated.The results demonstrated that extraction methods had significant effects on extraction yield,monosaccharide composition,molecular weight,particle size,triple-helical structure,and surface morphology of HRPs except for the major linkage bands.Thermogravimetric analysis showed that HRP-U with filamentous reticular microstructure exhibited better thermal stability.The HRP-A with the lowest molecular weight and highest arabinose content possessed the best antioxidant activities.Moreover,the rheological analysis indicated that HRPs with higher galacturonic acid content and molecular weight showed higher viscosity and stronger crosslinking network(HRP-C,HRP-W and HRP-U),which exhibited stronger bile acid binding capacity.The present findings provide scientific evidence in the preparation technology of sea buckthorn polysaccharides with good antioxidant and bile acid binding capacity which are related to the structure affected by the extraction methods.展开更多
Background: Cavernous transformation of the portal vein(CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to...Background: Cavernous transformation of the portal vein(CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to investigate the imaging features of intrahepatic portal vein in adult patients with CTPV and establish the relationship between the manifestations of intrahepatic portal vein and the progression of CTPV. Methods: We retrospectively analyzed 14 CTPV patients in Beijing Tsinghua Changgung Hospital. All patients underwent both direct portal venography(DPV) and computed tomography angiography(CTA) to reveal the manifestations of the portal venous system. The vessels measured included the left portal vein(LPV), right portal vein(RPV), main portal vein(MPV) and the portal vein bifurcation(PVB). Results: Nine males and 5 females, with a median age of 40.5 years, were included in the study. No significant difference was found in the diameters of the LPV or RPV measured by DPV and CTA. The visualization in terms of LPV, RPV and PVB measured by DPV was higher than that by CTA. There was a significant association between LPV/RPV and PVB/MPV in term of visibility revealed with DPV( P = 0.01), while this association was not observed with CTA. According to the imaging features of the portal vein measured by DPV, CTPV was classified into three categories to facilitate the diagnosis and treatment. Conclusions: DPV was more accurate than CTA for revealing the course of the intrahepatic portal vein in patients with CTPV. The classification of CTPV, that originated from the imaging features of the portal vein revealed by DPV, may provide a new perspective for the diagnosis and treatment of CTPV.展开更多
In this study, a new rain type classification algorithm for the Dual-Frequency Precipitation Radar(DPR) suitable over the Tibetan Plateau(TP) was proposed by analyzing Global Precipitation Measurement(GPM) DPR Level-2...In this study, a new rain type classification algorithm for the Dual-Frequency Precipitation Radar(DPR) suitable over the Tibetan Plateau(TP) was proposed by analyzing Global Precipitation Measurement(GPM) DPR Level-2 data in summer from 2014 to 2020. It was found that the DPR rain type classification algorithm(simply called DPR algorithm) has mis-identification problems in two aspects in summer TP. In the new algorithm of rain type classification in summer TP,four rain types are classified by using new thresholds, such as the maximum reflectivity factor, the difference between the maximum reflectivity factor and the background maximum reflectivity factor, and the echo top height. In the threshold of the maximum reflectivity factors, 30 d BZ and 18 d BZ are both thresholds to separate strong convective precipitation, weak convective precipitation and weak precipitation. The results illustrate obvious differences of radar reflectivity factor and vertical velocity among the three rain types in summer TP, such as the reflectivity factor of most strong convective precipitation distributes from 15 d BZ to near 35 d BZ from 4 km to 13 km, and increases almost linearly with the decrease in height. For most weak convective precipitation, the reflectivity factor distributes from 15 d BZ to 28 d BZ with the height from 4 km to 9 km. For weak precipitation, the reflectivity factor mainly distributes in range of 15–25 d BZ with height within 4–10 km. It is also shows that weak precipitation is the dominant rain type in summer TP, accounting for 40%–80%,followed by weak convective precipitation(25%–40%), and strong convective precipitation has the least proportion(less than 30%).展开更多
The complex sand-casting process combined with the interactions between process parameters makes it difficult to control the casting quality,resulting in a high scrap rate.A strategy based on a data-driven model was p...The complex sand-casting process combined with the interactions between process parameters makes it difficult to control the casting quality,resulting in a high scrap rate.A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency,which includes the random forest(RF)classification model,the feature importance analysis,and the process parameters optimization with Monte Carlo simulation.The collected data includes four types of defects and corresponding process parameters were used to construct the RF model.Classification results show a recall rate above 90% for all categories.The Gini Index was used to assess the importance of the process parameters in the formation of various defects in the RF model.Finally,the classification model was applied to different production conditions for quality prediction.In the case of process parameters optimization for gas porosity defects,this model serves as an experimental process in the Monte Carlo method to estimate a better temperature distribution.The prediction model,when applied to the factory,greatly improved the efficiency of defect detection.Results show that the scrap rate decreased from 10.16% to 6.68%.展开更多
Although disintegrated dolomite,widely distributed across the globe,has conventionally been a focus of research in underground engineering,the issue of slope stability issues in disintegrated dolomite strata is gainin...Although disintegrated dolomite,widely distributed across the globe,has conventionally been a focus of research in underground engineering,the issue of slope stability issues in disintegrated dolomite strata is gaining increasing prominence.This is primarily due to their unique properties,including low strength and loose structure.Current methods for evaluating slope stability,such as basic quality(BQ)and slope stability probability classification(SSPC),do not adequately account for the poor integrity and structural fragmentation characteristic of disintegrated dolomite.To address this challenge,an analysis of the applicability of the limit equilibrium method(LEM),BQ,and SSPC methods was conducted on eight disintegrated dolomite slopes located in Baoshan,Southwest China.However,conflicting results were obtained.Therefore,this paper introduces a novel method,SMRDDS,to provide rapid and accurate assessment of disintegrated dolomite slope stability.This method incorporates parameters such as disintegrated grade,joint state,groundwater conditions,and excavation methods.The findings reveal that six slopes exhibit stability,while two are considered partially unstable.Notably,the proposed method demonstrates a closer match with the actual conditions and is more time-efficient compared with the BQ and SSPC methods.However,due to the limited research on disintegrated dolomite slopes,the results of the SMRDDS method tend to be conservative as a safety precaution.In conclusion,the SMRDDS method can quickly evaluate the current situation of disintegrated dolomite slopes in the field.This contributes significantly to disaster risk reduction for disintegrated dolomite slopes.展开更多
In recent years,great breakthroughs have been made in the exploration and development of natural gas in deep coal-rock reservoirs in Junggar,Ordos and other basins in China.In view of the inconsistency between the ind...In recent years,great breakthroughs have been made in the exploration and development of natural gas in deep coal-rock reservoirs in Junggar,Ordos and other basins in China.In view of the inconsistency between the industrial and academic circles on this new type of unconventional natural gas,this paper defines the concept of"coal-rock gas"on the basis of previous studies,and systematically analyzes its characteristics of occurrence state,transport and storage form,differential accumulation,and development law.Coal-rock gas,geologically unlike coalbed methane in the traditional sense,occurs in both free and adsorbed states,with free state in abundance.It is generated and stored in the same set of rocks through short distance migration,occasionally with the accumulation from other sources.Moreover,coal rock develops cleat fractures,and the free gas accumulates differentially.The coal-rock gas reservoirs deeper than 2000 m are high in pressure,temperature,gas content,gas saturation,and free-gas content.In terms of development,similar to shale gas and tight gas,coal-rock gas can be exploited by natural formation energy after the reservoirs connectivity is improved artificially,that is,the adsorbed gas is desorbed due to pressure drop after the high-potential free gas is recovered,so that the free gas and adsorbed gas are produced in succession for a long term without water drainage for pressure drop.According to buried depth,coal rank,pressure coefficient,reserves scale,reserves abundance and gas well production,the classification criteria and reserves/resources estimation method of coal-rock gas are presented.It is preliminarily estimated that the coal-rock gas in place deeper than 2000 m in China exceeds 30×10^(12)m^(3),indicating an important strategic resource for the country.The Ordos,Sichuan,Junggar and Bohai Bay basins are favorable areas for large-scale enrichment of coal-rock gas.The paper summarizes the technical and management challenges and points out the research directions,laying a foundation for the management,exploration,and development of coal-rock gas in China.展开更多
Background:The nasal alar defect in Asians remains a challenging issue,as do clear classification and algorithm guidance,despite numerous previously described surgical techniques.The aim of this study is to propose a ...Background:The nasal alar defect in Asians remains a challenging issue,as do clear classification and algorithm guidance,despite numerous previously described surgical techniques.The aim of this study is to propose a surgical algorithm that addresses the appropriate surgical procedures for different types of nasal alar defects in Asian patients.Methods:A retrospective case note review was conducted on 32 patients with nasal alar defect who underwent reconstruction between 2008 and 2022.Based on careful analysis and our clinical experience,we proposed a classification system for nasal alar defects and presented a reconstructive algorithm.Patient data,including age,sex,diagnosis,surgical options,and complications,were assessed.The extent of surgical scar formation was evaluated using standard photography based on a 4-grade scar scale.Results:Among the 32 patients,there were 20 males and 12 females with nasal alar defects.The predominant cause of trauma in China was industrial factors.The majority of alar defects were classified as type Ⅰ C(n=8,25%),comprising 18 cases(56.2%);there were 5 cases(15.6%)of type Ⅱ defect,7(21.9%)of type Ⅲ defect,and 2(6.3%)of type Ⅳ defect.The most common surgical option was auricular composite graft(n=8,25%),followed by bilobed flap(n=6,18.8%),free auricular composite flap(n=4,12.5%),and primary closure(n=3,9.4%).Satisfactory improvements were observed postoperatively.Conclusion:Factors contributing to classifications were analyzed and defined,providing a framework for the proposed classification system.The reconstructive algorithm offers surgeons appropriate procedures for treating nasal alar defect in Asians.展开更多
The inverse and direct piezoelectric and circuit coupling are widely observed in advanced electro-mechanical systems such as piezoelectric energy harvesters.Existing strongly coupled analysis methods based on direct n...The inverse and direct piezoelectric and circuit coupling are widely observed in advanced electro-mechanical systems such as piezoelectric energy harvesters.Existing strongly coupled analysis methods based on direct numerical modeling for this phenomenon can be classified into partitioned or monolithic formulations.Each formulation has its advantages and disadvantages,and the choice depends on the characteristics of each coupled problem.This study proposes a new option:a coupled analysis strategy that combines the best features of the existing formulations,namely,the hybrid partitioned-monolithic method.The analysis of inverse piezoelectricity and the monolithic analysis of direct piezoelectric and circuit interaction are strongly coupled using a partitioned iterative hierarchical algorithm.In a typical benchmark problem of a piezoelectric energy harvester,this research compares the results from the proposed method to those from the conventional strongly coupled partitioned iterative method,discussing the accuracy,stability,and computational cost.The proposed hybrid concept is effective for coupled multi-physics problems,including various coupling conditions.展开更多
Lung cancer is a leading cause of global mortality rates.Early detection of pulmonary tumors can significantly enhance the survival rate of patients.Recently,various Computer-Aided Diagnostic(CAD)methods have been dev...Lung cancer is a leading cause of global mortality rates.Early detection of pulmonary tumors can significantly enhance the survival rate of patients.Recently,various Computer-Aided Diagnostic(CAD)methods have been developed to enhance the detection of pulmonary nodules with high accuracy.Nevertheless,the existing method-ologies cannot obtain a high level of specificity and sensitivity.The present study introduces a novel model for Lung Cancer Segmentation and Classification(LCSC),which incorporates two improved architectures,namely the improved U-Net architecture and the improved AlexNet architecture.The LCSC model comprises two distinct stages.The first stage involves the utilization of an improved U-Net architecture to segment candidate nodules extracted from the lung lobes.Subsequently,an improved AlexNet architecture is employed to classify lung cancer.During the first stage,the proposed model demonstrates a dice accuracy of 0.855,a precision of 0.933,and a recall of 0.789 for the segmentation of candidate nodules.The suggested improved AlexNet architecture attains 97.06%accuracy,a true positive rate of 96.36%,a true negative rate of 97.77%,a positive predictive value of 97.74%,and a negative predictive value of 96.41%for classifying pulmonary cancer as either benign or malignant.The proposed LCSC model is tested and evaluated employing the publically available dataset furnished by the Lung Image Database Consortium and Image Database Resource Initiative(LIDC-IDRI).This proposed technique exhibits remarkable performance compared to the existing methods by using various evaluation parameters.展开更多
Among central nervous system-associated malignancies,glioblastoma(GBM)is the most common and has the highest mortality rate.The high heterogeneity of GBM cell types and the complex tumor microenvironment frequently le...Among central nervous system-associated malignancies,glioblastoma(GBM)is the most common and has the highest mortality rate.The high heterogeneity of GBM cell types and the complex tumor microenvironment frequently lead to tumor recurrence and sudden relapse in patients treated with temozolomide.In precision medicine,research on GBM treatment is increasingly focusing on molecular subtyping to precisely characterize the cellular and molecular heterogeneity,as well as the refractory nature of GBM toward therapy.Deep understanding of the different molecular expression patterns of GBM subtypes is critical.Researchers have recently proposed tetra fractional or tripartite methods for detecting GBM molecular subtypes.The various molecular subtypes of GBM show significant differences in gene expression patterns and biological behaviors.These subtypes also exhibit high plasticity in their regulatory pathways,oncogene expression,tumor microenvironment alterations,and differential responses to standard therapy.Herein,we summarize the current molecular typing scheme of GBM and the major molecular/genetic characteristics of each subtype.Furthermore,we review the mesenchymal transition mechanisms of GBM under various regulators.展开更多
The tell tail is usually placed on the triangular sail to display the running state of the air flow on the sail surface.It is of great significance to make accurate judgement on the drift of the tell tail of the sailb...The tell tail is usually placed on the triangular sail to display the running state of the air flow on the sail surface.It is of great significance to make accurate judgement on the drift of the tell tail of the sailboat during sailing for the best sailing effect.Normally it is difficult for sailors to keep an eye for a long time on the tell sail for accurate judging its changes,affected by strong sunlight and visual fatigue.In this case,we adopt computer vision technology in hope of helping the sailors judge the changes of the tell tail in ease with ease.This paper proposes for the first time a method to classify sailboat tell tails based on deep learning and an expert guidance system,supported by a sailboat tell tail classification data set on the expert guidance system of interpreting the tell tails states in different sea wind conditions,including the feature extraction performance.Considering the expression capabilities that vary with the computational features in different visual tasks,the paper focuses on five tell tail computing features,which are recoded by an automatic encoder and classified by a SVM classifier.All experimental samples were randomly divided into five groups,and four groups were selected from each group as the training set to train the classifier.The remaining one group was used as the test set for testing.The highest resolution value of the ResNet network was 80.26%.To achieve better operational results on the basis of deep computing features obtained through the ResNet network in the experiments.The method can be used to assist the sailors in making better judgement about the tell tail changes during sailing.展开更多
The classification of functional data has drawn much attention in recent years.The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to a...The classification of functional data has drawn much attention in recent years.The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy.In this paper,we propose a mean-variance-based(MV)feature weighting method for classifying functional data or functional curves.In the feature extraction stage,each sample curve is approximated by B-splines to transfer features to the coefficients of the spline basis.After that,a feature weighting approach based on statistical principles is introduced by comprehensively considering the between-class differences and within-class variations of the coefficients.We also introduce a scaling parameter to adjust the gap between the weights of features.The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features.The algorithms for feature weighted K-nearest neighbor and support vector machine classifiers are both provided.Moreover,the new approach can be well integrated into existing functional data classifiers,such as the generalized functional linear model and functional linear discriminant analysis,resulting in a more accurate classification.The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data.The results show that the newfeatureweighting approach significantly improves the classification accuracy for complex functional data.展开更多
Intrusion detection is a predominant task that monitors and protects the network infrastructure.Therefore,many datasets have been published and investigated by researchers to analyze and understand the problem of intr...Intrusion detection is a predominant task that monitors and protects the network infrastructure.Therefore,many datasets have been published and investigated by researchers to analyze and understand the problem of intrusion prediction and detection.In particular,the Network Security Laboratory-Knowledge Discovery in Databases(NSL-KDD)is an extensively used benchmark dataset for evaluating intrusion detection systems(IDSs)as it incorporates various network traffic attacks.It is worth mentioning that a large number of studies have tackled the problem of intrusion detection using machine learning models,but the performance of these models often decreases when evaluated on new attacks.This has led to the utilization of deep learning techniques,which have showcased significant potential for processing large datasets and therefore improving detection accuracy.For that reason,this paper focuses on the role of stacking deep learning models,including convolution neural network(CNN)and deep neural network(DNN)for improving the intrusion detection rate of the NSL-KDD dataset.Each base model is trained on the NSL-KDD dataset to extract significant features.Once the base models have been trained,the stacking process proceeds to the second stage,where a simple meta-model has been trained on the predictions generated from the proposed base models.The combination of the predictions allows the meta-model to distinguish different classes of attacks and increase the detection rate.Our experimental evaluations using the NSL-KDD dataset have shown the efficacy of stacking deep learning models for intrusion detection.The performance of the ensemble of base models,combined with the meta-model,exceeds the performance of individual models.Our stacking model has attained an accuracy of 99%and an average F1-score of 93%for the multi-classification scenario.Besides,the training time of the proposed ensemble model is lower than the training time of benchmark techniques,demonstrating its efficiency and robustness.展开更多
Additive Runge-Kutta methods designed for preserving highly accurate solutions in mixed-precision computation were previously proposed and analyzed.These specially designed methods use reduced precision for the implic...Additive Runge-Kutta methods designed for preserving highly accurate solutions in mixed-precision computation were previously proposed and analyzed.These specially designed methods use reduced precision for the implicit computations and full precision for the explicit computations.In this work,we analyze the stability properties of these methods and their sensitivity to the low-precision rounding errors,and demonstrate their performance in terms of accuracy and efficiency.We develop codes in FORTRAN and Julia to solve nonlinear systems of ODEs and PDEs using the mixed-precision additive Runge-Kutta(MP-ARK)methods.The convergence,accuracy,and runtime of these methods are explored.We show that for a given level of accuracy,suitably chosen MP-ARK methods may provide significant reductions in runtime.展开更多
Predicting depression intensity from microblogs and social media posts has numerous benefits and applications,including predicting early psychological disorders and stress in individuals or the general public.A major ...Predicting depression intensity from microblogs and social media posts has numerous benefits and applications,including predicting early psychological disorders and stress in individuals or the general public.A major challenge in predicting depression using social media posts is that the existing studies do not focus on predicting the intensity of depression in social media texts but rather only perform the binary classification of depression and moreover noisy data makes it difficult to predict the true depression in the social media text.This study intends to begin by collecting relevant Tweets and generating a corpus of 210000 public tweets using Twitter public application programming interfaces(APIs).A strategy is devised to filter out only depression-related tweets by creating a list of relevant hashtags to reduce noise in the corpus.Furthermore,an algorithm is developed to annotate the data into three depression classes:‘Mild,’‘Moderate,’and‘Severe,’based on International Classification of Diseases-10(ICD-10)depression diagnostic criteria.Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus.Further FastText-based model is applied and fine-tuned with different preprocessing techniques and hyperparameter tuning to produce the tuned model,which significantly increases the depression classification performance to an 84%F1 score and 90%accuracy compared to baselines.Finally,a FastText-based weighted soft voting ensemble(WSVE)is proposed to boost the model’s performance by combining several other classifiers and assigning weights to individual models according to their individual performances.The proposed WSVE outperformed all baselines as well as FastText alone,with an F1 of 89%,5%higher than FastText alone,and an accuracy of 93%,3%higher than FastText alone.The proposed model better captures the contextual features of the relatively small sample class and aids in the detection of early depression intensity prediction from tweets with impactful performances.展开更多
Few‐shot image classification is the task of classifying novel classes using extremely limited labelled samples.To perform classification using the limited samples,one solution is to learn the feature alignment(FA)in...Few‐shot image classification is the task of classifying novel classes using extremely limited labelled samples.To perform classification using the limited samples,one solution is to learn the feature alignment(FA)information between the labelled and unlabelled sample features.Most FA methods use the feature mean as the class prototype and calculate the correlation between prototype and unlabelled features to learn an alignment strategy.However,mean prototypes tend to degenerate informative features because spatial features at the same position may not be equally important for the final classification,leading to inaccurate correlation calculations.Therefore,the authors propose an effective intraclass FA strategy that aggregates semantically similar spatial features from an adaptive reference prototype in low‐dimensional feature space to obtain an informative prototype feature map for precise correlation computation.Moreover,a dual correlation module to learn the hard and soft correlations was developed by the authors.This module combines the correlation information between the prototype and unlabelled features in both the original and learnable feature spaces,aiming to produce a comprehensive cross‐correlation between the prototypes and unlabelled features.Using both FA and cross‐attention modules,our model can maintain informative class features and capture important shared features for classification.Experimental results on three few‐shot classification benchmarks show that the proposed method outperformed related methods and resulted in a 3%performance boost in the 1‐shot setting by inserting the proposed module into the related methods.展开更多
基金the Natural Science Foundation of China(Grant Numbers 72074014 and 72004012).
文摘Purpose:Many science,technology and innovation(STI)resources are attached with several different labels.To assign automatically the resulting labels to an interested instance,many approaches with good performance on the benchmark datasets have been proposed for multi-label classification task in the literature.Furthermore,several open-source tools implementing these approaches have also been developed.However,the characteristics of real-world multi-label patent and publication datasets are not completely in line with those of benchmark ones.Therefore,the main purpose of this paper is to evaluate comprehensively seven multi-label classification methods on real-world datasets.Research limitations:Three real-world datasets differ in the following aspects:statement,data quality,and purposes.Additionally,open-source tools designed for multi-label classification also have intrinsic differences in their approaches for data processing and feature selection,which in turn impacts the performance of a multi-label classification approach.In the near future,we will enhance experimental precision and reinforce the validity of conclusions by employing more rigorous control over variables through introducing expanded parameter settings.Practical implications:The observed Macro F1 and Micro F1 scores on real-world datasets typically fall short of those achieved on benchmark datasets,underscoring the complexity of real-world multi-label classification tasks.Approaches leveraging deep learning techniques offer promising solutions by accommodating the hierarchical relationships and interdependencies among labels.With ongoing enhancements in deep learning algorithms and large-scale models,it is expected that the efficacy of multi-label classification tasks will be significantly improved,reaching a level of practical utility in the foreseeable future.Originality/value:(1)Seven multi-label classification methods are comprehensively compared on three real-world datasets.(2)The TextCNN and TextRCNN models perform better on small-scale datasets with more complex hierarchical structure of labels and more balanced document-label distribution.(3)The MLkNN method works better on the larger-scale dataset with more unbalanced document-label distribution.
基金This research was funded by Prince Sattam bin Abdulaziz University(Project Number PSAU/2023/01/25387).
文摘The research aims to improve the performance of image recognition methods based on a description in the form of a set of keypoint descriptors.The main focus is on increasing the speed of establishing the relevance of object and etalon descriptions while maintaining the required level of classification efficiency.The class to be recognized is represented by an infinite set of images obtained from the etalon by applying arbitrary geometric transformations.It is proposed to reduce the descriptions for the etalon database by selecting the most significant descriptor components according to the information content criterion.The informativeness of an etalon descriptor is estimated by the difference of the closest distances to its own and other descriptions.The developed method determines the relevance of the full description of the recognized object with the reduced description of the etalons.Several practical models of the classifier with different options for establishing the correspondence between object descriptors and etalons are considered.The results of the experimental modeling of the proposed methods for a database including images of museum jewelry are presented.The test sample is formed as a set of images from the etalon database and out of the database with the application of geometric transformations of scale and rotation in the field of view.The practical problems of determining the threshold for the number of votes,based on which a classification decision is made,have been researched.Modeling has revealed the practical possibility of tenfold reducing descriptions with full preservation of classification accuracy.Reducing the descriptions by twenty times in the experiment leads to slightly decreased accuracy.The speed of the analysis increases in proportion to the degree of reduction.The use of reduction by the informativeness criterion confirmed the possibility of obtaining the most significant subset of features for classification,which guarantees a decent level of accuracy.
文摘Condensed and hydrolysable tannins are non-toxic natural polyphenols that are a commercial commodity industrialized for tanning hides to obtain leather and for a growing number of other industrial applications mainly to substitute petroleum-based products.They are a definite class of sustainable materials of the forestry industry.They have been in operation for hundreds of years to manufacture leather and now for a growing number of applications in a variety of other industries,such as wood adhesives,metal coating,pharmaceutical/medical applications and several others.This review presents the main sources,either already or potentially commercial of this forestry by-materials,their industrial and laboratory extraction systems,their systems of analysis with their advantages and drawbacks,be these methods so simple to even appear primitive but nonetheless of proven effectiveness,or very modern and instrumental.It constitutes a basic but essential summary of what is necessary to know of these sustainable materials.In doing so,the review highlights some of the main challenges that remain to be addressed to deliver the quality and economics of tannin supply necessary to fulfill the industrial production requirements for some materials-based uses.
基金supported in part by the Nationa Natural Science Foundation of China (61876011)the National Key Research and Development Program of China (2022YFB4703700)+1 种基金the Key Research and Development Program 2020 of Guangzhou (202007050002)the Key-Area Research and Development Program of Guangdong Province (2020B090921003)。
文摘Recently, there have been some attempts of Transformer in 3D point cloud classification. In order to reduce computations, most existing methods focus on local spatial attention,but ignore their content and fail to establish relationships between distant but relevant points. To overcome the limitation of local spatial attention, we propose a point content-based Transformer architecture, called PointConT for short. It exploits the locality of points in the feature space(content-based), which clusters the sampled points with similar features into the same class and computes the self-attention within each class, thus enabling an effective trade-off between capturing long-range dependencies and computational complexity. We further introduce an inception feature aggregator for point cloud classification, which uses parallel structures to aggregate high-frequency and low-frequency information in each branch separately. Extensive experiments show that our PointConT model achieves a remarkable performance on point cloud shape classification. Especially, our method exhibits 90.3% Top-1 accuracy on the hardest setting of ScanObjectN N. Source code of this paper is available at https://github.com/yahuiliu99/PointC onT.
基金The Guangdong Basic and Applied Basic Research Foundation(2022A1515010730)National Natural Science Foundation of China(32001647)+2 种基金National Natural Science Foundation of China(31972022)Financial and moral assistance supported by the Guangdong Basic and Applied Basic Research Foundation(2019A1515011996)111 Project(B17018)。
文摘In this study,the structural characters,antioxidant activities and bile acid-binding ability of sea buckthorn polysaccharides(HRPs)obtained by the commonly used hot water(HRP-W),pressurized hot water(HRP-H),ultrasonic(HRP-U),acid(HRP-C)and alkali(HRP-A)assisted extraction methods were investigated.The results demonstrated that extraction methods had significant effects on extraction yield,monosaccharide composition,molecular weight,particle size,triple-helical structure,and surface morphology of HRPs except for the major linkage bands.Thermogravimetric analysis showed that HRP-U with filamentous reticular microstructure exhibited better thermal stability.The HRP-A with the lowest molecular weight and highest arabinose content possessed the best antioxidant activities.Moreover,the rheological analysis indicated that HRPs with higher galacturonic acid content and molecular weight showed higher viscosity and stronger crosslinking network(HRP-C,HRP-W and HRP-U),which exhibited stronger bile acid binding capacity.The present findings provide scientific evidence in the preparation technology of sea buckthorn polysaccharides with good antioxidant and bile acid binding capacity which are related to the structure affected by the extraction methods.
文摘Background: Cavernous transformation of the portal vein(CTPV) due to portal vein obstruction is a rare vascular anomaly defined as the formation of multiple collateral vessels in the hepatic hilum. This study aimed to investigate the imaging features of intrahepatic portal vein in adult patients with CTPV and establish the relationship between the manifestations of intrahepatic portal vein and the progression of CTPV. Methods: We retrospectively analyzed 14 CTPV patients in Beijing Tsinghua Changgung Hospital. All patients underwent both direct portal venography(DPV) and computed tomography angiography(CTA) to reveal the manifestations of the portal venous system. The vessels measured included the left portal vein(LPV), right portal vein(RPV), main portal vein(MPV) and the portal vein bifurcation(PVB). Results: Nine males and 5 females, with a median age of 40.5 years, were included in the study. No significant difference was found in the diameters of the LPV or RPV measured by DPV and CTA. The visualization in terms of LPV, RPV and PVB measured by DPV was higher than that by CTA. There was a significant association between LPV/RPV and PVB/MPV in term of visibility revealed with DPV( P = 0.01), while this association was not observed with CTA. According to the imaging features of the portal vein measured by DPV, CTPV was classified into three categories to facilitate the diagnosis and treatment. Conclusions: DPV was more accurate than CTA for revealing the course of the intrahepatic portal vein in patients with CTPV. The classification of CTPV, that originated from the imaging features of the portal vein revealed by DPV, may provide a new perspective for the diagnosis and treatment of CTPV.
基金funded by the National Natural Science Foundation of China project (Grant Nos.42275140, 42230612, 91837310, 92037000)the Second Tibetan Plateau Scientific Expedition and Research (STEP) program(Grant No. 2019QZKK0104)。
文摘In this study, a new rain type classification algorithm for the Dual-Frequency Precipitation Radar(DPR) suitable over the Tibetan Plateau(TP) was proposed by analyzing Global Precipitation Measurement(GPM) DPR Level-2 data in summer from 2014 to 2020. It was found that the DPR rain type classification algorithm(simply called DPR algorithm) has mis-identification problems in two aspects in summer TP. In the new algorithm of rain type classification in summer TP,four rain types are classified by using new thresholds, such as the maximum reflectivity factor, the difference between the maximum reflectivity factor and the background maximum reflectivity factor, and the echo top height. In the threshold of the maximum reflectivity factors, 30 d BZ and 18 d BZ are both thresholds to separate strong convective precipitation, weak convective precipitation and weak precipitation. The results illustrate obvious differences of radar reflectivity factor and vertical velocity among the three rain types in summer TP, such as the reflectivity factor of most strong convective precipitation distributes from 15 d BZ to near 35 d BZ from 4 km to 13 km, and increases almost linearly with the decrease in height. For most weak convective precipitation, the reflectivity factor distributes from 15 d BZ to 28 d BZ with the height from 4 km to 9 km. For weak precipitation, the reflectivity factor mainly distributes in range of 15–25 d BZ with height within 4–10 km. It is also shows that weak precipitation is the dominant rain type in summer TP, accounting for 40%–80%,followed by weak convective precipitation(25%–40%), and strong convective precipitation has the least proportion(less than 30%).
基金financially supported by the National Key Research and Development Program of China(2022YFB3706800,2020YFB1710100)the National Natural Science Foundation of China(51821001,52090042,52074183)。
文摘The complex sand-casting process combined with the interactions between process parameters makes it difficult to control the casting quality,resulting in a high scrap rate.A strategy based on a data-driven model was proposed to reduce casting defects and improve production efficiency,which includes the random forest(RF)classification model,the feature importance analysis,and the process parameters optimization with Monte Carlo simulation.The collected data includes four types of defects and corresponding process parameters were used to construct the RF model.Classification results show a recall rate above 90% for all categories.The Gini Index was used to assess the importance of the process parameters in the formation of various defects in the RF model.Finally,the classification model was applied to different production conditions for quality prediction.In the case of process parameters optimization for gas porosity defects,this model serves as an experimental process in the Monte Carlo method to estimate a better temperature distribution.The prediction model,when applied to the factory,greatly improved the efficiency of defect detection.Results show that the scrap rate decreased from 10.16% to 6.68%.
基金supported by the National Natural Science Foundation of China(Grant No.42162026)the Applied Basic Research Foundation of Yunnan Province(Grant No.202201AT070083).
文摘Although disintegrated dolomite,widely distributed across the globe,has conventionally been a focus of research in underground engineering,the issue of slope stability issues in disintegrated dolomite strata is gaining increasing prominence.This is primarily due to their unique properties,including low strength and loose structure.Current methods for evaluating slope stability,such as basic quality(BQ)and slope stability probability classification(SSPC),do not adequately account for the poor integrity and structural fragmentation characteristic of disintegrated dolomite.To address this challenge,an analysis of the applicability of the limit equilibrium method(LEM),BQ,and SSPC methods was conducted on eight disintegrated dolomite slopes located in Baoshan,Southwest China.However,conflicting results were obtained.Therefore,this paper introduces a novel method,SMRDDS,to provide rapid and accurate assessment of disintegrated dolomite slope stability.This method incorporates parameters such as disintegrated grade,joint state,groundwater conditions,and excavation methods.The findings reveal that six slopes exhibit stability,while two are considered partially unstable.Notably,the proposed method demonstrates a closer match with the actual conditions and is more time-efficient compared with the BQ and SSPC methods.However,due to the limited research on disintegrated dolomite slopes,the results of the SMRDDS method tend to be conservative as a safety precaution.In conclusion,the SMRDDS method can quickly evaluate the current situation of disintegrated dolomite slopes in the field.This contributes significantly to disaster risk reduction for disintegrated dolomite slopes.
基金Supported by the Prospective and Basic Research Project of PetroChina(2021DJ23)。
文摘In recent years,great breakthroughs have been made in the exploration and development of natural gas in deep coal-rock reservoirs in Junggar,Ordos and other basins in China.In view of the inconsistency between the industrial and academic circles on this new type of unconventional natural gas,this paper defines the concept of"coal-rock gas"on the basis of previous studies,and systematically analyzes its characteristics of occurrence state,transport and storage form,differential accumulation,and development law.Coal-rock gas,geologically unlike coalbed methane in the traditional sense,occurs in both free and adsorbed states,with free state in abundance.It is generated and stored in the same set of rocks through short distance migration,occasionally with the accumulation from other sources.Moreover,coal rock develops cleat fractures,and the free gas accumulates differentially.The coal-rock gas reservoirs deeper than 2000 m are high in pressure,temperature,gas content,gas saturation,and free-gas content.In terms of development,similar to shale gas and tight gas,coal-rock gas can be exploited by natural formation energy after the reservoirs connectivity is improved artificially,that is,the adsorbed gas is desorbed due to pressure drop after the high-potential free gas is recovered,so that the free gas and adsorbed gas are produced in succession for a long term without water drainage for pressure drop.According to buried depth,coal rank,pressure coefficient,reserves scale,reserves abundance and gas well production,the classification criteria and reserves/resources estimation method of coal-rock gas are presented.It is preliminarily estimated that the coal-rock gas in place deeper than 2000 m in China exceeds 30×10^(12)m^(3),indicating an important strategic resource for the country.The Ordos,Sichuan,Junggar and Bohai Bay basins are favorable areas for large-scale enrichment of coal-rock gas.The paper summarizes the technical and management challenges and points out the research directions,laying a foundation for the management,exploration,and development of coal-rock gas in China.
文摘Background:The nasal alar defect in Asians remains a challenging issue,as do clear classification and algorithm guidance,despite numerous previously described surgical techniques.The aim of this study is to propose a surgical algorithm that addresses the appropriate surgical procedures for different types of nasal alar defects in Asian patients.Methods:A retrospective case note review was conducted on 32 patients with nasal alar defect who underwent reconstruction between 2008 and 2022.Based on careful analysis and our clinical experience,we proposed a classification system for nasal alar defects and presented a reconstructive algorithm.Patient data,including age,sex,diagnosis,surgical options,and complications,were assessed.The extent of surgical scar formation was evaluated using standard photography based on a 4-grade scar scale.Results:Among the 32 patients,there were 20 males and 12 females with nasal alar defects.The predominant cause of trauma in China was industrial factors.The majority of alar defects were classified as type Ⅰ C(n=8,25%),comprising 18 cases(56.2%);there were 5 cases(15.6%)of type Ⅱ defect,7(21.9%)of type Ⅲ defect,and 2(6.3%)of type Ⅳ defect.The most common surgical option was auricular composite graft(n=8,25%),followed by bilobed flap(n=6,18.8%),free auricular composite flap(n=4,12.5%),and primary closure(n=3,9.4%).Satisfactory improvements were observed postoperatively.Conclusion:Factors contributing to classifications were analyzed and defined,providing a framework for the proposed classification system.The reconstructive algorithm offers surgeons appropriate procedures for treating nasal alar defect in Asians.
基金supported by the Japan Society for the Promotion of Science,KAKENHI Grant No.23H00475.
文摘The inverse and direct piezoelectric and circuit coupling are widely observed in advanced electro-mechanical systems such as piezoelectric energy harvesters.Existing strongly coupled analysis methods based on direct numerical modeling for this phenomenon can be classified into partitioned or monolithic formulations.Each formulation has its advantages and disadvantages,and the choice depends on the characteristics of each coupled problem.This study proposes a new option:a coupled analysis strategy that combines the best features of the existing formulations,namely,the hybrid partitioned-monolithic method.The analysis of inverse piezoelectricity and the monolithic analysis of direct piezoelectric and circuit interaction are strongly coupled using a partitioned iterative hierarchical algorithm.In a typical benchmark problem of a piezoelectric energy harvester,this research compares the results from the proposed method to those from the conventional strongly coupled partitioned iterative method,discussing the accuracy,stability,and computational cost.The proposed hybrid concept is effective for coupled multi-physics problems,including various coupling conditions.
基金supported and funded by the Deanship of Scientific Research at Imam Mohammad Ibn Saud Islamic University(IMSIU)(Grant Number IMSIU-RP23044).
文摘Lung cancer is a leading cause of global mortality rates.Early detection of pulmonary tumors can significantly enhance the survival rate of patients.Recently,various Computer-Aided Diagnostic(CAD)methods have been developed to enhance the detection of pulmonary nodules with high accuracy.Nevertheless,the existing method-ologies cannot obtain a high level of specificity and sensitivity.The present study introduces a novel model for Lung Cancer Segmentation and Classification(LCSC),which incorporates two improved architectures,namely the improved U-Net architecture and the improved AlexNet architecture.The LCSC model comprises two distinct stages.The first stage involves the utilization of an improved U-Net architecture to segment candidate nodules extracted from the lung lobes.Subsequently,an improved AlexNet architecture is employed to classify lung cancer.During the first stage,the proposed model demonstrates a dice accuracy of 0.855,a precision of 0.933,and a recall of 0.789 for the segmentation of candidate nodules.The suggested improved AlexNet architecture attains 97.06%accuracy,a true positive rate of 96.36%,a true negative rate of 97.77%,a positive predictive value of 97.74%,and a negative predictive value of 96.41%for classifying pulmonary cancer as either benign or malignant.The proposed LCSC model is tested and evaluated employing the publically available dataset furnished by the Lung Image Database Consortium and Image Database Resource Initiative(LIDC-IDRI).This proposed technique exhibits remarkable performance compared to the existing methods by using various evaluation parameters.
基金supported by grants from the National Natural Science Foundation of China(Grant No.82172660)Hebei Province Graduate Student Innovation Project(Grant No.CXZZBS2023001)Baoding Natural Science Foundation(Grant No.H2272P015).
文摘Among central nervous system-associated malignancies,glioblastoma(GBM)is the most common and has the highest mortality rate.The high heterogeneity of GBM cell types and the complex tumor microenvironment frequently lead to tumor recurrence and sudden relapse in patients treated with temozolomide.In precision medicine,research on GBM treatment is increasingly focusing on molecular subtyping to precisely characterize the cellular and molecular heterogeneity,as well as the refractory nature of GBM toward therapy.Deep understanding of the different molecular expression patterns of GBM subtypes is critical.Researchers have recently proposed tetra fractional or tripartite methods for detecting GBM molecular subtypes.The various molecular subtypes of GBM show significant differences in gene expression patterns and biological behaviors.These subtypes also exhibit high plasticity in their regulatory pathways,oncogene expression,tumor microenvironment alterations,and differential responses to standard therapy.Herein,we summarize the current molecular typing scheme of GBM and the major molecular/genetic characteristics of each subtype.Furthermore,we review the mesenchymal transition mechanisms of GBM under various regulators.
基金supported by the Shandong Provin-cial Key Research Project of Undergraduate Teaching Reform(No.Z2022218)the Fundamental Research Funds for the Central University(No.202113028)+1 种基金the Graduate Education Promotion Program of Ocean University of China(No.HDJG20006)supported by the Sailing Laboratory of Ocean University of China.
文摘The tell tail is usually placed on the triangular sail to display the running state of the air flow on the sail surface.It is of great significance to make accurate judgement on the drift of the tell tail of the sailboat during sailing for the best sailing effect.Normally it is difficult for sailors to keep an eye for a long time on the tell sail for accurate judging its changes,affected by strong sunlight and visual fatigue.In this case,we adopt computer vision technology in hope of helping the sailors judge the changes of the tell tail in ease with ease.This paper proposes for the first time a method to classify sailboat tell tails based on deep learning and an expert guidance system,supported by a sailboat tell tail classification data set on the expert guidance system of interpreting the tell tails states in different sea wind conditions,including the feature extraction performance.Considering the expression capabilities that vary with the computational features in different visual tasks,the paper focuses on five tell tail computing features,which are recoded by an automatic encoder and classified by a SVM classifier.All experimental samples were randomly divided into five groups,and four groups were selected from each group as the training set to train the classifier.The remaining one group was used as the test set for testing.The highest resolution value of the ResNet network was 80.26%.To achieve better operational results on the basis of deep computing features obtained through the ResNet network in the experiments.The method can be used to assist the sailors in making better judgement about the tell tail changes during sailing.
基金the National Social Science Foundation of China(Grant No.22BTJ035).
文摘The classification of functional data has drawn much attention in recent years.The main challenge is representing infinite-dimensional functional data by finite-dimensional features while utilizing those features to achieve better classification accuracy.In this paper,we propose a mean-variance-based(MV)feature weighting method for classifying functional data or functional curves.In the feature extraction stage,each sample curve is approximated by B-splines to transfer features to the coefficients of the spline basis.After that,a feature weighting approach based on statistical principles is introduced by comprehensively considering the between-class differences and within-class variations of the coefficients.We also introduce a scaling parameter to adjust the gap between the weights of features.The new feature weighting approach can adaptively enhance noteworthy local features while mitigating the impact of confusing features.The algorithms for feature weighted K-nearest neighbor and support vector machine classifiers are both provided.Moreover,the new approach can be well integrated into existing functional data classifiers,such as the generalized functional linear model and functional linear discriminant analysis,resulting in a more accurate classification.The performance of the mean-variance-based classifiers is evaluated by simulation studies and real data.The results show that the newfeatureweighting approach significantly improves the classification accuracy for complex functional data.
文摘Intrusion detection is a predominant task that monitors and protects the network infrastructure.Therefore,many datasets have been published and investigated by researchers to analyze and understand the problem of intrusion prediction and detection.In particular,the Network Security Laboratory-Knowledge Discovery in Databases(NSL-KDD)is an extensively used benchmark dataset for evaluating intrusion detection systems(IDSs)as it incorporates various network traffic attacks.It is worth mentioning that a large number of studies have tackled the problem of intrusion detection using machine learning models,but the performance of these models often decreases when evaluated on new attacks.This has led to the utilization of deep learning techniques,which have showcased significant potential for processing large datasets and therefore improving detection accuracy.For that reason,this paper focuses on the role of stacking deep learning models,including convolution neural network(CNN)and deep neural network(DNN)for improving the intrusion detection rate of the NSL-KDD dataset.Each base model is trained on the NSL-KDD dataset to extract significant features.Once the base models have been trained,the stacking process proceeds to the second stage,where a simple meta-model has been trained on the predictions generated from the proposed base models.The combination of the predictions allows the meta-model to distinguish different classes of attacks and increase the detection rate.Our experimental evaluations using the NSL-KDD dataset have shown the efficacy of stacking deep learning models for intrusion detection.The performance of the ensemble of base models,combined with the meta-model,exceeds the performance of individual models.Our stacking model has attained an accuracy of 99%and an average F1-score of 93%for the multi-classification scenario.Besides,the training time of the proposed ensemble model is lower than the training time of benchmark techniques,demonstrating its efficiency and robustness.
基金supported by ONR UMass Dartmouth Marine and UnderSea Technology(MUST)grant N00014-20-1-2849 under the project S31320000049160by DOE grant DE-SC0023164 sub-award RC114586-UMD+2 种基金by AFOSR grants FA9550-18-1-0383 and FA9550-23-1-0037supported by Michigan State University,by AFOSR grants FA9550-19-1-0281 and FA9550-18-1-0383by DOE grant DE-SC0023164.
文摘Additive Runge-Kutta methods designed for preserving highly accurate solutions in mixed-precision computation were previously proposed and analyzed.These specially designed methods use reduced precision for the implicit computations and full precision for the explicit computations.In this work,we analyze the stability properties of these methods and their sensitivity to the low-precision rounding errors,and demonstrate their performance in terms of accuracy and efficiency.We develop codes in FORTRAN and Julia to solve nonlinear systems of ODEs and PDEs using the mixed-precision additive Runge-Kutta(MP-ARK)methods.The convergence,accuracy,and runtime of these methods are explored.We show that for a given level of accuracy,suitably chosen MP-ARK methods may provide significant reductions in runtime.
文摘Predicting depression intensity from microblogs and social media posts has numerous benefits and applications,including predicting early psychological disorders and stress in individuals or the general public.A major challenge in predicting depression using social media posts is that the existing studies do not focus on predicting the intensity of depression in social media texts but rather only perform the binary classification of depression and moreover noisy data makes it difficult to predict the true depression in the social media text.This study intends to begin by collecting relevant Tweets and generating a corpus of 210000 public tweets using Twitter public application programming interfaces(APIs).A strategy is devised to filter out only depression-related tweets by creating a list of relevant hashtags to reduce noise in the corpus.Furthermore,an algorithm is developed to annotate the data into three depression classes:‘Mild,’‘Moderate,’and‘Severe,’based on International Classification of Diseases-10(ICD-10)depression diagnostic criteria.Different baseline classifiers are applied to the annotated dataset to get a preliminary idea of classification performance on the corpus.Further FastText-based model is applied and fine-tuned with different preprocessing techniques and hyperparameter tuning to produce the tuned model,which significantly increases the depression classification performance to an 84%F1 score and 90%accuracy compared to baselines.Finally,a FastText-based weighted soft voting ensemble(WSVE)is proposed to boost the model’s performance by combining several other classifiers and assigning weights to individual models according to their individual performances.The proposed WSVE outperformed all baselines as well as FastText alone,with an F1 of 89%,5%higher than FastText alone,and an accuracy of 93%,3%higher than FastText alone.The proposed model better captures the contextual features of the relatively small sample class and aids in the detection of early depression intensity prediction from tweets with impactful performances.
基金Institute of Information&Communications Technology Planning&Evaluation,Grant/Award Number:2022-0-00074。
文摘Few‐shot image classification is the task of classifying novel classes using extremely limited labelled samples.To perform classification using the limited samples,one solution is to learn the feature alignment(FA)information between the labelled and unlabelled sample features.Most FA methods use the feature mean as the class prototype and calculate the correlation between prototype and unlabelled features to learn an alignment strategy.However,mean prototypes tend to degenerate informative features because spatial features at the same position may not be equally important for the final classification,leading to inaccurate correlation calculations.Therefore,the authors propose an effective intraclass FA strategy that aggregates semantically similar spatial features from an adaptive reference prototype in low‐dimensional feature space to obtain an informative prototype feature map for precise correlation computation.Moreover,a dual correlation module to learn the hard and soft correlations was developed by the authors.This module combines the correlation information between the prototype and unlabelled features in both the original and learnable feature spaces,aiming to produce a comprehensive cross‐correlation between the prototypes and unlabelled features.Using both FA and cross‐attention modules,our model can maintain informative class features and capture important shared features for classification.Experimental results on three few‐shot classification benchmarks show that the proposed method outperformed related methods and resulted in a 3%performance boost in the 1‐shot setting by inserting the proposed module into the related methods.