9 articles found
Fusion of Feature Ranking Methods for an Effective Intrusion Detection System
1
Authors: Seshu Bhavani Mallampati, Seetha Hari. Computers, Materials & Continua (SCIE, EI), 2023, No. 8, pp. 1721-1744 (24 pages)
Expanding internet-connected services has increased cyberattacks, many of which have grave and disastrous repercussions. An Intrusion Detection System (IDS) plays an essential role in network security since it helps to protect the network from vulnerabilities and attacks. Although extensive research has been reported on IDS, detecting novel intrusions with optimal features and reducing false alarm rates are still challenging. Therefore, we developed a novel fusion-based feature importance method to reduce the high-dimensional feature space, which helps to identify attacks accurately with a lower false alarm rate. Initially, to improve training data quality, various preprocessing techniques are utilized. The Adaptive Synthetic oversampling technique generates synthetic samples for minority classes. In the proposed fusion-based feature importance, we use different approaches from the filter, wrapper, and embedded methods, such as mutual information, random forest importance, permutation importance, and Shapley Additive exPlanations (SHAP)-based feature importance, together with statistical feature importance methods such as the difference of mean and median and the standard deviation, to rank each feature. Then, by simple plurality voting, the most optimal features are retrieved. The optimal features are then fed to various models: Extra Tree (ET), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), and Extreme Gradient Boosting Machine (XGBM). The hyperparameters of the classification models are tuned with Halving Random Search cross-validation to enhance performance. The experiments were carried out on the original imbalanced data and on balanced data. The outcomes demonstrate that the balanced-data scenario outperformed the imbalanced one. Finally, the experimental analysis showed that our proposed fusion-based feature importance performed well with XGBM, giving accuracies of 99.86%, 99.68%, and 92.4% with 9, 7, and 8 features and training times of 1.5, 4.5, and 5.5 s on the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD), Canadian Institute for Cybersecurity (CIC-IDS 2017), and UNSW-NB15 datasets, respectively. In addition, the suggested technique has been examined and contrasted with state-of-the-art methods on the three datasets.
Keywords: cyber security; feature ranking; imbalance; preprocessing; IDS; SHAP
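The voting step described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `fuse_rankings`, the `top_k` and `n_select` parameters, the alphabetical tie-break, and the example feature names are all assumptions. Each ranking method casts one vote for every feature in its own top `top_k`, and the features with the most votes are kept.

```python
from collections import Counter

def fuse_rankings(rankings, top_k, n_select):
    """Fuse several feature rankings by counting top-k votes.

    rankings: dict mapping a ranking-method name to a list of feature
              names ordered from most to least important.
    top_k:    how many leading features each method votes for (assumed).
    n_select: how many features to keep after voting (assumed).
    """
    votes = Counter()
    for ranked in rankings.values():
        votes.update(ranked[:top_k])  # one vote per top-k feature per method
    # Most votes first; ties are broken alphabetically for determinism.
    ordered = sorted(votes, key=lambda f: (-votes[f], f))
    return ordered[:n_select]

# Hypothetical rankings from three methods (feature names are illustrative).
rankings = {
    "mutual_info": ["duration", "src_bytes", "flag", "service"],
    "random_forest": ["src_bytes", "flag", "duration", "count"],
    "permutation": ["src_bytes", "service", "flag", "duration"],
}
selected = fuse_rankings(rankings, top_k=2, n_select=2)
# selected == ["src_bytes", "duration"]
```

A fixed `n_select` is only one way to cut the voted list; a vote-count threshold would be an equally plausible reading of "plurality voting".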
Exploiting Consumer Reviews for Product Feature Ranking (cited by: 1)
2
Authors: 李素科, 关志, 唐礼勇, 陈钟. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2012, No. 3, pp. 635-649 (15 pages)
Web 2.0 technology leads Web users to publish a large number of consumer reviews about products and services on various websites. Major product features extracted from consumer reviews may let product providers find which features consumers care about most, and may also help potential consumers make purchasing decisions. In this work, we propose a linear regression with rules-based approach to ranking product features according to their importance. Empirical experiments show that our approach is effective and promising. We also demonstrate two applications of the proposed approach. The first application decomposes overall product ratings into product feature ratings. The second application seeks to generate consumer surveys automatically.
Keywords: product feature ranking; product review; opinion mining
Listwise approaches based on feature ranking discovery
3
Authors: Yongqing WANG, Wenji MAO, Daniel ZENG, Fen XIA. Frontiers of Computer Science (SCIE, EI, CSCD), 2012, No. 6, pp. 647-659 (13 pages)
Listwise approaches are an important class of learning-to-rank methods, which use automatic learning techniques to discover useful information. Most previous research on listwise approaches has focused on optimizing ranking models using weights and has used imprecisely labeled training data; optimizing ranking models using features was largely ignored, which hindered the continuous performance improvement of these approaches. To address the limitations of previous listwise work, we propose a quasi-KNN model to discover the ranking of features and employ a rank addition rule to calculate the weight of the combination. On this basis, we propose three listwise algorithms: FeatureRank, BL-FeatureRank, and DiffRank. The experimental results show that our proposed algorithms can be applied to a strictly ordered ranking training set and achieve better performance than state-of-the-art listwise algorithms.
Keywords: learning to rank; listwise approach; feature's ranking discovery
Expert ranking method based on ListNet with multiple features
4
Authors: 陈方琼, 余正涛, 毛存礼, 吴则键, 张优敏. Journal of Beijing Institute of Technology (EI, CAS), 2014, No. 2, pp. 240-247 (8 pages)
The quality of expert ranking directly affects expert retrieval precision. According to the characteristics of the expert entity, an expert ranking model based on ListNet with multiple features was proposed. Firstly, multiple features were selected through the analysis of expert pages; secondly, in order to learn parameters through gradient descent and construct the expert ranking model, all features were integrated into the ListNet ranking model; finally, an expert ranking comparison experiment was performed using the trained model. The experimental results show that the proposed method is effective, and the value of NDCG@1 increased by 14.2% compared with the pairwise method for expert ranking.
Keywords: expert retrieval; expert ranking; ListNet; multiple features
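For orientation, the loss commonly associated with ListNet is the cross-entropy between "top-one" probability distributions derived from the labeled scores and the predicted scores of the items in one query list. The sketch below shows that standard formulation only; the abstract does not give the paper's exact features or training setup, and the function names `softmax` and `listnet_loss` are chosen here for illustration.

```python
import math

def softmax(scores):
    """Top-one probabilities: exp(score) normalized over the list."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listnet_loss(true_scores, predicted_scores):
    """Cross-entropy between the top-one distributions of the two lists."""
    p_true = softmax(true_scores)
    p_pred = softmax(predicted_scores)
    return -sum(p * math.log(q) for p, q in zip(p_true, p_pred))

# Predictions that preserve the labeled order incur a lower loss than
# predictions that reverse it.
good = listnet_loss([3.0, 2.0, 1.0], [3.0, 2.0, 1.0])
bad = listnet_loss([3.0, 2.0, 1.0], [1.0, 2.0, 3.0])
```

In a gradient-descent setting, the predicted scores would come from a parameterized scoring function over each expert's feature vector, and this loss would be minimized with respect to those parameters.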
Development of Data Mining Models Based on Features Ranks Voting (FRV)
5
Author: Mofreh A. Hogo. Computers, Materials & Continua (SCIE, EI), 2022, No. 11, pp. 2947-2966 (20 pages)
Data size plays a significant role in the design and performance of data mining models. A good feature selection algorithm reduces the problems of large data size and noise due to data redundancy. Feature selection algorithms aim at selecting the best features and eliminating unnecessary ones, which in turn simplifies the structure of the data mining model and increases its performance. This paper introduces a robust feature selection algorithm named the Features Ranking Voting (FRV) algorithm. It merges the benefits of different feature selection algorithms to specify the feature ranks in the dataset correctly and robustly, based on the feature ranks and a voting algorithm. The FRV comprises three proposed techniques for selecting the minimum best feature set: the forward voting technique, which selects the best high-rank features; the backward voting technique, which drops the low-rank (low-importance) features; and a third technique, which merges the outputs of the forward and backward techniques to maximize the robustness of the selected feature set. Different data mining models were built using the selected feature sets obtained by applying the proposed FRV to different datasets, in order to evaluate the proposed FRV. The high performance of these data mining models reflects the success of the proposed FRV algorithm. The FRV performance is compared with other feature selection algorithms. It succeeds in developing data mining models with an accuracy of 96.8% for the Hungarian CAD dataset and 96% for the Z-Alizadeh Sani CAD dataset, compared with 83.94% and 92.56%, respectively, in [48].
Keywords: evaluator; features selection; data mining; forward; backward; voting; feature rank
Unsupervised spectral feature selection algorithms for high dimensional data
6
Authors: Mingzhao WANG, Henry HAN, Zhao HUANG, Juanying XIE. Frontiers of Computer Science (SCIE, EI, CSCD), 2023, No. 5, pp. 27-40 (14 pages)
Detecting informative features to carry out explainable analysis of high-dimensional data is a significant and challenging task, especially for data with a very small number of samples. Feature selection methods, especially unsupervised ones, are a suitable way to address this challenge. Therefore, two unsupervised spectral feature selection algorithms are proposed in this paper. They group features using an advanced Self-Tuning spectral clustering algorithm based on local standard deviation, so as to detect the globally optimal feature clusters as far as possible. Then two feature ranking techniques, cosine-similarity-based feature ranking and entropy-based feature ranking, are proposed, so that the representative feature of each cluster can be detected to form the feature subset on which the explainable classification system will be built. The effectiveness of the proposed algorithms is tested on high-dimensional benchmark omics datasets and compared with peer methods, and statistical tests are conducted to determine whether the proposed spectral feature selection algorithms differ significantly from the peer methods. The extensive experiments demonstrate that the proposed unsupervised spectral feature selection algorithms outperform the peer ones in comparison, especially the one based on the cosine similarity feature ranking technique. The statistical test results show that the entropy feature ranking based spectral feature selection algorithm performs best. The detected features demonstrate strong discriminative capabilities in downstream classifiers for omics data, such that an AI system built on them would be reliable and explainable. This is especially significant for building transparent and trustworthy medical diagnostic systems from an interpretable AI perspective.
Keywords: feature selection; spectral clustering; feature ranking techniques; entropy; cosine similarity
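The abstract does not state how the cosine-similarity ranking scores features inside a cluster. One plausible reading, shown below purely as an assumption, is to score each feature by its mean cosine similarity to the other features in the same cluster and keep the top-scoring one as the cluster representative. The names `cosine` and `representative` and the toy data are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length, nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def representative(features, cluster):
    """Pick the feature most similar, on average, to its cluster mates.

    features: dict mapping feature name -> list of values across samples.
    cluster:  list of feature names assigned to one cluster.
    """
    if len(cluster) == 1:
        return cluster[0]

    def score(name):
        others = [o for o in cluster if o != name]
        sims = [cosine(features[name], features[o]) for o in others]
        return sum(sims) / len(sims)

    # sorted() makes the tie-break deterministic (alphabetical).
    return max(sorted(cluster), key=score)

# Toy example: g2 lies "between" g1 and g3, so it is chosen.
toy = {"g1": [1.0, 0.0, 1.0], "g2": [1.0, 0.1, 1.0], "g3": [0.0, 1.0, 0.0]}
chosen = representative(toy, ["g1", "g2", "g3"])
```

Repeating this for every cluster yields one feature per cluster, which matches the abstract's description of a subset built from cluster representatives.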
Recognition of 3-D Aircrafts by Fourier Descriptors with Fast and Efficient Library Search
7
Authors: Zhao Hengzhuo, Wang Yanping (College of Electronic Information, Wuhan University, Wuhan 430072, China). Wuhan University Journal of Natural Sciences (EI, CAS), 1998, No. 2, pp. 169-174 (6 pages)
Fourier descriptors are used as features for 3-D aircraft classification and pose determination from a 2-D image recorded at an arbitrary viewing angle. By ranking the Fourier descriptor features, a classification procedure based on the fast nearest neighbour rule is proposed to reduce the time needed to match an unknown aircraft, using a partial library search. The testing results of some typical examples indicate that this method is generally applicable and efficient in 3-D aircraft recognition.
Keywords: pattern recognition; Fourier descriptors; nearest neighbour rule; feature rank; weighting factor; distance bound
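As background, Fourier descriptors are usually obtained by treating an ordered closed contour as a sequence of complex numbers and taking its discrete Fourier transform. The sketch below follows that common construction and is not taken from the paper: the normalization (dropping the zero-frequency term, taking magnitudes, dividing by the first harmonic) is one standard choice, and it assumes a counterclockwise contour whose first harmonic is nonzero.

```python
import cmath

def fourier_descriptors(contour, n_coeffs):
    """Translation- and scale-normalized Fourier descriptor magnitudes.

    contour:  list of (x, y) boundary points in traversal order.
    n_coeffs: number of harmonics k = 1..n_coeffs to return.
    """
    z = [complex(x, y) for x, y in contour]
    n = len(z)
    mags = []
    for k in range(1, n_coeffs + 1):  # k = 0 is skipped: it only encodes position
        c = sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
        mags.append(abs(c))  # the magnitude discards rotation and start point
    scale = mags[0]  # assumed nonzero (counterclockwise contour)
    return [m / scale for m in mags]

# A square outline and a shifted, enlarged copy give the same descriptors.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
moved = [(3 * x + 5, 3 * y - 2) for x, y in square]
d_square = fourier_descriptors(square, 3)
d_moved = fourier_descriptors(moved, 3)
```

A nearest-neighbour library search would then compare such descriptor vectors, with the ranking deciding which descriptors are compared first.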
Ranking and tagging bursty features in text streams with context language models
8
Authors: Wayne Xin ZHAO, Chen LIU, Ji-Rong WEN, Xiaoming LI. Frontiers of Computer Science (SCIE, EI, CSCD), 2017, No. 5, pp. 852-862 (11 pages)
Detecting and using bursty patterns to analyze text streams has been one of the fundamental approaches in many temporal text mining applications. So far, most existing studies have focused on developing methods to detect bursty features based purely on term frequency changes. Few have taken the semantic contexts of bursty features into consideration, and as a result the detected bursty features may not always be interesting and can be hard to interpret. In this article, we propose to model the contexts of bursty features using a language modeling approach. We propose two methods to estimate the context language models, based on sentence-level context and document-level context. We then propose a novel topic diversity-based metric that uses the context models to find newsworthy bursty features. We also propose to use the context models to automatically assign meaningful tags to bursty features. Using a large corpus of news articles, we quantitatively show that the proposed context language models for bursty features can effectively help rank bursty features by their newsworthiness and assign meaningful tags to annotate them. We also use two example text mining applications to qualitatively demonstrate the usefulness of bursty feature ranking and tagging.
Keywords: bursty features; bursty features ranking; bursty feature tagging; context modeling
Gini Correlation for Feature Screening
9
Authors: Jun-ying ZHANG, Xiao-feng LIU, Ri-quan ZHANG, Hang WANG. Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2021, No. 3, pp. 590-601 (12 pages)
In this paper we propose the Gini correlation screening (GCS) method to select the important variables in ultrahigh-dimensional data. The new procedure is based on the Gini correlation coefficient, computed via the covariance between the response and the rank of the predictor variables, rather than on the Pearson correlation or the Kendall τ correlation coefficient. The new method does not require imposing a specific model structure on the regression functions and only requires that the predictors and the response have continuous distribution functions. We demonstrate that, with the number of predictors growing at an exponential rate of the sample size, the proposed procedure possesses consistency in ranking, which is both useful in its own right and can lead to consistency in selection. The procedure is computationally efficient and simple, and exhibits competent empirical performance in our intensive simulations and real data analysis.
Keywords: ultrahigh dimension; Gini correlation coefficient; variable screening; feature ranking
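A rough sketch of marginal screening with a sample Gini correlation follows. It uses the textbook form cov(Y, rank(X)) / cov(Y, rank(Y)), which matches the abstract's description of a covariance between the response and the predictor ranks; the paper's exact estimator, normalization, and threshold are not given in the abstract, so the function names, the `d` cutoff, and the absence of tie handling are all assumptions.

```python
def ranks(values):
    """Ranks 1..n of the values; assumes no ties (continuous data)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def cov(a, b):
    """Population covariance of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / n

def gini_correlation(y, x):
    """Sample Gini correlation of response y with predictor x."""
    return cov(y, ranks(x)) / cov(y, ranks(y))

def screen(y, predictors, d):
    """Keep the d predictors with the largest absolute Gini correlation.

    predictors: dict mapping predictor name -> list of observed values.
    """
    scores = {name: abs(gini_correlation(y, col)) for name, col in predictors.items()}
    return sorted(scores, key=lambda name: (-scores[name], name))[:d]

# Toy data: x1 is monotone in y, noise is not.
y = [1.0, 2.5, 3.1, 4.7, 6.0]
predictors = {"x1": [0.2, 0.9, 1.4, 2.2, 3.5], "noise": [0.3, 0.1, 0.5, 0.2, 0.4]}
kept = screen(y, predictors, d=1)
```

Because only the ranks of each predictor enter the statistic, any strictly increasing transformation of a predictor leaves its score unchanged.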