13 articles found
Development of Data Mining Models Based on Features Ranks Voting (FRV)
1
Author: Mofreh A. Hogo. Source: Computers, Materials & Continua (SCIE, EI), 2022, Issue 11, pp. 2947-2966, 20 pages.
Data size plays a significant role in the design and performance of data mining models. A good feature selection algorithm reduces the problems of large data size and of noise due to data redundancy. Feature selection algorithms aim to select the best features and eliminate unnecessary ones, which in turn simplifies the structure of the data mining model and improves its performance. This paper introduces a robust feature selection algorithm named the Features Ranking Voting Algorithm (FRV). It merges the benefits of different feature selection algorithms to determine the feature ranks in a dataset correctly and robustly, based on feature ranks and a voting algorithm. FRV comprises three proposed techniques for selecting the minimum best feature set: a forward voting technique, which selects the high-ranked features; a backward voting technique, which drops the low-ranked (low-importance) features; and a third technique, which merges the outputs of the forward and backward techniques to maximize the robustness of the selected feature set. To evaluate FRV, data mining models were built using the feature sets obtained by applying it to different datasets. The high performance of these models reflects the success of the proposed algorithm. FRV's performance is also compared with that of other feature selection algorithms. It yields data mining models with an accuracy of 96.8% on the Hungarian CAD dataset and 96% on the Z-Alizadeh Sani CAD dataset, compared with 83.94% and 92.56%, respectively, in [48].
Keywords: evaluator; features selection; data mining; forward; backward; voting; feature rank
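The forward, backward and merged voting steps described in this abstract can be illustrated with a minimal sketch. The abstract does not give the exact voting rule, so the majority threshold, the intersection used for merging, and all feature names below are assumptions made for illustration, not the paper's method.

```python
from collections import Counter

def _majority(votes, n_rankers):
    # Assumed rule: a feature "wins" if more than half of the rankers vote for it.
    return {f for f, v in votes.items() if v > n_rankers / 2}

def forward_vote(rankings, k):
    """Keep features that a majority of rankers place in their top k."""
    votes = Counter(f for r in rankings for f in r[:k])
    return _majority(votes, len(rankings))

def backward_vote(rankings, m):
    """Drop features that a majority of rankers place in their bottom m."""
    votes = Counter(f for r in rankings for f in r[len(r) - m:])
    return set(rankings[0]) - _majority(votes, len(rankings))

def merged_vote(rankings, k, m):
    """Assumed merge: keep only features selected by both techniques."""
    return forward_vote(rankings, k) & backward_vote(rankings, m)

# Hypothetical rankings (best feature first) from three feature rankers.
rankings = [
    ["age", "chol", "bp", "sex"],
    ["chol", "age", "sex", "bp"],
    ["age", "bp", "chol", "sex"],
]
print(sorted(merged_vote(rankings, k=3, m=2)))
```

With these inputs, forward voting keeps `age`, `chol` and `bp`, backward voting drops `bp` and `sex`, and the merged set is the part on which both agree.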
Performance prediction for Grid workflow activities based on features-ranked RBF network
2
Authors: 王洁, Duan Rubing, Farrukh Nadeem. Source: High Technology Letters (EI, CAS), 2009, Issue 2, pp. 203-207, 5 pages.
Accurate performance prediction of Grid workflow activities can help Grid schedulers map activities to appropriate Grid sites. This paper describes an approach based on a features-ranked RBF neural network to predict the performance of Grid workflow activities. Experimental results for two kinds of real-world Grid workflow activities are presented to show the effectiveness of the approach.
Keywords: performance prediction; radial basis function (RBF) neural network; features rank; Grid workflow activities
Expert ranking method based on ListNet with multiple features
3
Authors: 陈方琼, 余正涛, 毛存礼, 吴则键, 张优敏. Source: Journal of Beijing Institute of Technology (EI, CAS), 2014, Issue 2, pp. 240-247, 8 pages.
The quality of expert ranking directly affects the precision of expert retrieval. Based on the characteristics of the expert entity, a listwise expert ranking model with multiple features is proposed. First, multiple features are selected through the analysis of expert pages. Second, all features are integrated into the ListNet ranking model, whose parameters are learned by gradient descent to construct the expert ranking model. Finally, a comparative expert ranking experiment is performed using the trained model. The experimental results show that the proposed method is effective: NDCG@1 increased by 14.2% compared with the pairwise expert ranking method.
Keywords: expert retrieval; expert ranking; ListNet; multiple features
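As a rough illustration of how ListNet parameters can be learned by gradient descent, the sketch below implements the top-one-probability cross-entropy loss and one gradient step. The linear scoring function, the learning rate and the toy feature vectors are assumptions; the abstract does not specify the expert features or the model form.

```python
import math

def softmax(scores):
    # Top-one probabilities: chance that each item is ranked first.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def listnet_loss(true_scores, predicted_scores):
    """Cross entropy between the true and predicted top-one distributions."""
    p_true, p_pred = softmax(true_scores), softmax(predicted_scores)
    return -sum(t * math.log(p) for t, p in zip(p_true, p_pred))

def score(weights, features):
    # Assumed linear scorer: one score per candidate in the list.
    return [sum(w * x for w, x in zip(weights, row)) for row in features]

def gradient_step(weights, features, true_scores, lr=0.1):
    """One gradient-descent update; d(loss)/d(score_j) = p_pred_j - p_true_j."""
    p_true = softmax(true_scores)
    p_pred = softmax(score(weights, features))
    grad = [
        sum((pp - pt) * row[k] for pp, pt, row in zip(p_pred, p_true, features))
        for k in range(len(weights))
    ]
    return [w - lr * g for w, g in zip(weights, grad)]

# Hypothetical list of three candidates, two features each, with relevance labels.
features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relevance = [2.0, 0.0, 1.0]
weights = [0.0, 0.0]
before = listnet_loss(relevance, score(weights, features))
weights = gradient_step(weights, features, relevance)
after = listnet_loss(relevance, score(weights, features))
print(before, after)
```

Because the loss compares whole score lists rather than pairs of items, each update uses every candidate in the list at once, which is the listwise property the abstract contrasts with the pairwise method.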
Fusion of Feature Ranking Methods for an Effective Intrusion Detection System
4
Authors: Seshu Bhavani Mallampati, Seetha Hari. Source: Computers, Materials & Continua (SCIE, EI), 2023, Issue 8, pp. 1721-1744, 24 pages.
Expanding internet-connected services has increased cyberattacks, many of which have grave and disastrous repercussions. An Intrusion Detection System (IDS) plays an essential role in network security, since it helps to protect the network from vulnerabilities and attacks. Although extensive research on IDS has been reported, detecting novel intrusions with optimal features and reducing false alarm rates are still challenging. Therefore, we developed a novel fusion-based feature importance method to reduce the high dimensional feature space, which helps to identify attacks accurately with a lower false alarm rate. Initially, various preprocessing techniques are used to improve training data quality. The Adaptive Synthetic oversampling technique generates synthetic samples for minority classes. In the proposed fusion-based feature importance, we rank each feature using different approaches from the filter, wrapper, and embedded methods, such as mutual information, random forest importance, permutation importance, and Shapley Additive exPlanations (SHAP)-based feature importance, together with statistical feature importance methods such as the difference of mean and median and the standard deviation. The most optimal features are then retrieved by simple plurality voting and fed to various models: Extra Tree (ET), Logistic Regression (LR), Support Vector Machine (SVM), Decision Tree (DT), and Extreme Gradient Boosting Machine (XGBM). The hyperparameters of the classification models are tuned with Halving Random Search cross-validation to enhance performance. The experiments were carried out on the original imbalanced data and on balanced data. The outcomes demonstrate that the balanced data scenario outperformed the imbalanced one. Finally, the experimental analysis showed that the proposed fusion-based feature importance performed well with XGBM, giving accuracies of 99.86%, 99.68%, and 92.4% with 9, 7, and 8 features and training times of 1.5, 4.5, and 5.5 s on the Network Security Laboratory-Knowledge Discovery in Databases (NSL-KDD), Canadian Institute for Cybersecurity (CIC-IDS 2017), and UNSW-NB15 datasets, respectively. In addition, the suggested technique has been examined and contrasted with state-of-the-art methods on the three datasets.
Keywords: cyber security; feature ranking; imbalance; preprocessing; IDS; SHAP
FAST FEATURE RANKING AND ITS APPLICATION TO FACE RECOGNITION (cited by: 1)
5
Authors: 潘锋, 王建东, 宋广为, 牛奔, 顾其威. Source: Transactions of Nanjing University of Aeronautics and Astronautics (EI), 2013, Issue 4, pp. 389-396, 8 pages.
A fast feature ranking algorithm is proposed for classification in the presence of high dimensionality and small sample size. The basic idea is that important features force data points of the same class to maintain their intrinsic neighbor relations, whereas neighboring points of different classes no longer stick to one another. Applying this assumption, an optimization problem that weights each feature is derived. The algorithm does not involve dense matrix eigen-decomposition, which can be computationally expensive. Extensive experiments are conducted to validate the significance of the selected features using the Yale, Extended YaleB and PIE datasets. The evaluation shows that, with a one-nearest-neighbor classifier, the recognition rates using the 100-500 leading features selected by the algorithm distinctly outperform those obtained with features selected by the baseline feature selection algorithms, whereas with a support vector machine the features selected by the algorithm show a less prominent improvement. Moreover, the experiments demonstrate that the proposed algorithm is particularly efficient for multi-class face recognition problems.
Keywords: feature selection; feature ranking; manifold learning; Laplacian matrix
Ranking and tagging bursty features in text streams with context language models
6
Authors: Wayne Xin ZHAO, Chen LIU, Ji-Rong WEN, Xiaoming LI. Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2017, Issue 5, pp. 852-862, 11 pages.
Detecting and using bursty patterns to analyze text streams has been one of the fundamental approaches in many temporal text mining applications. So far, most existing studies have focused on developing methods to detect bursty features based purely on term frequency changes. Few have taken the semantic contexts of bursty features into consideration, and as a result the detected bursty features may not always be interesting and can be hard to interpret. In this article, we propose to model the contexts of bursty features using a language modeling approach. We propose two methods to estimate the context language models, based on sentence-level context and document-level context. We then propose a novel topic diversity-based metric that uses the context models to find newsworthy bursty features. We also propose to use the context models to automatically assign meaningful tags to bursty features. Using a large corpus of news articles, we quantitatively show that the proposed context language models can effectively help rank bursty features by their newsworthiness and assign meaningful tags to annotate them. We also use two example text mining applications to qualitatively demonstrate the usefulness of bursty feature ranking and tagging.
Keywords: bursty features; bursty features ranking; bursty feature tagging; context modeling
Recognition of 3-D Aircrafts by Fourier Descriptors with Fast and Efficient Library Search
7
Authors: Zhao Hengzhuo, Wang Yanping (College of Electronic Information, Wuhan University, Wuhan 430072, China). Source: Wuhan University Journal of Natural Sciences (EI, CAS), 1998, Issue 2, pp. 169-174, 6 pages.
Fourier descriptors are used as features for 3-D aircraft classification and pose determination from a 2-D image recorded at an arbitrary viewing angle. Using a feature ranking of the Fourier descriptors, a classification procedure based on the fast nearest neighbour rule is proposed, which reduces the time needed to match an unknown aircraft by searching only part of the library. Test results on some typical examples indicate that this method is generally applicable and efficient in 3-D aircraft recognition.
Keywords: pattern recognition; Fourier descriptors; nearest neighbour rule; feature rank; weighting factor; distance bound
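For readers unfamiliar with Fourier descriptors, the following sketch shows one common way to compute them from a closed 2-D contour. The normalization chosen here (dropping the zeroth coefficient, dividing by the magnitude of the first, keeping magnitudes only) is a textbook convention and an assumption; the abstract does not state which normalization or ranking the authors use.

```python
import cmath

def fourier_descriptors(contour, n_coeffs):
    """Translation-, scale- and rotation-invariant descriptors of a closed contour.

    contour: list of (x, y) boundary points in traversal order.
    Assumes the first harmonic is non-zero, as it is for a simple closed outline.
    """
    n = len(contour)
    z = [complex(x, y) for x, y in contour]
    # Discrete Fourier transform of the boundary, written out for clarity.
    coeffs = [
        sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
        for k in range(n)
    ]
    scale = abs(coeffs[1])  # dividing by this removes the effect of object size
    # Skipping coeffs[0] removes position; taking magnitudes removes rotation.
    return [abs(c) / scale for c in coeffs[2:2 + n_coeffs]]

# Hypothetical outline: a square sampled at eight boundary points.
square = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]
print(fourier_descriptors(square, 3))
```

Descriptors computed this way are unchanged when the outline is shifted, resized or rotated in the image plane, which is why they are convenient features for matching against a library of stored views.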
Applying machine learning approaches to improving the accuracy of breast-tumour diagnosis via fine needle aspiration
8
Authors: 袁前飞, CAI Cong-zhong, XIAO Han-guang, LIU Xing-hua. Source: Journal of Chongqing University (CAS), 2007, Issue 1, pp. 1-7, 7 pages.
Diagnosis and treatment of breast cancer have improved during the last decade; however, breast cancer is still a leading cause of death among women worldwide. Early detection and accurate diagnosis of this disease have been shown to contribute to long survival of patients. In an attempt to develop a reliable diagnostic method for breast cancer, we integrated support vector machine (SVM), k-nearest neighbor and probabilistic neural network into a composite machine learning approach to detect malignant breast tumours. The approach uses a set of indicators consisting of age and ten cellular features of fine-needle aspirates of the breast, which were ranked according to signal-to-noise ratio to identify the determinants distinguishing benign breast tumours from malignant ones. The method significantly improved the diagnosis, with a sensitivity of 94.04%, a specificity of 97.37%, and an overall accuracy of up to 96.24% when SVM was adopted with the sigmoid kernel function under 5-fold cross validation. The results suggest that SVM is a promising methodology that could be further developed into a practical adjunct tool to help distinguish benign from malignant breast tumours and thus reduce the incidence of misdiagnosis.
Keywords: breast cancer; diagnosis; machine learning approach; fine needle aspirate; feature ranking/filtering
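Ranking features by signal-to-noise ratio can be sketched as below. The formula used, the absolute difference of class means divided by the sum of class standard deviations, is a widely used definition and is assumed here because the abstract does not state the one the authors applied. The feature names and values are invented for illustration.

```python
from statistics import mean, stdev

def signal_to_noise(class_a, class_b):
    """Assumed SNR: |mean_a - mean_b| / (std_a + std_b)."""
    return abs(mean(class_a) - mean(class_b)) / (stdev(class_a) + stdev(class_b))

def rank_by_snr(benign, malignant):
    """Return feature names sorted from most to least discriminative."""
    scores = {f: signal_to_noise(benign[f], malignant[f]) for f in benign}
    return sorted(scores, key=lambda f: (-scores[f], f))

# Hypothetical measurements for two cellular features in each class.
benign = {"clump_thickness": [1, 2, 1, 3], "mitoses": [1, 1, 2, 1]}
malignant = {"clump_thickness": [7, 8, 9, 6], "mitoses": [1, 2, 1, 3]}
print(rank_by_snr(benign, malignant))
```

A feature scores highly when the two classes have well-separated means and little spread within each class, so it serves as a simple filter before a classifier such as an SVM is trained.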
Exploiting Consumer Reviews for Product Feature Ranking (cited by: 1)
9
Authors: 李素科, 关志, 唐礼勇, 陈钟. Source: Journal of Computer Science & Technology (SCIE, EI, CSCD), 2012, Issue 3, pp. 635-649, 15 pages.
Web 2.0 technology has led Web users to publish a large number of consumer reviews about products and services on various websites. Major product features extracted from consumer reviews can show product providers which features consumers care about most, and can also help potential consumers make purchasing decisions. In this work, we propose a linear regression with rules-based approach to ranking product features according to their importance. Empirical experiments show that our approach is effective and promising. We also demonstrate two applications of the proposed approach. The first decomposes overall product ratings into product feature ratings. The second seeks to generate consumer surveys automatically.
Keywords: product feature ranking; product review; opinion mining
Listwise approaches based on feature ranking discovery
10
Authors: Yongqing WANG, Wenji MAO, Daniel ZENG, Fen XIA. Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2012, Issue 6, pp. 647-659, 13 pages.
Listwise approaches are an important class of learning-to-rank methods, which use automatic learning techniques to discover useful information. Most previous research on listwise approaches has focused on optimizing ranking models using weights and has used imprecisely labeled training data; optimizing ranking models using features has been largely ignored, which has hindered the continued performance improvement of these approaches. To address the limitations of previous listwise work, we propose a quasi-KNN model to discover the ranking of features and employ a rank addition rule to calculate the combination weights. On this basis, we propose three listwise algorithms: FeatureRank, BL-FeatureRank, and DiffRank. The experimental results show that our proposed algorithms can be applied to a strictly ordered ranking training set and achieve better performance than state-of-the-art listwise algorithms.
Keywords: learning to rank; listwise approach; feature's ranking discovery
Unsupervised spectral feature selection algorithms for high dimensional data
11
Authors: Mingzhao WANG, Henry HAN, Zhao HUANG, Juanying XIE. Source: Frontiers of Computer Science (SCIE, EI, CSCD), 2023, Issue 5, pp. 27-40, 14 pages.
Detecting informative features to enable explainable analysis of high dimensional data is a significant and challenging task, especially for data with a very small number of samples. Feature selection, especially unsupervised feature selection, is the right way to deal with this challenge. Therefore, two unsupervised spectral feature selection algorithms are proposed in this paper. They group features using an advanced Self-Tuning spectral clustering algorithm based on local standard deviation, so as to detect the globally optimal feature clusters as far as possible. Then two feature ranking techniques, cosine-similarity-based feature ranking and entropy-based feature ranking, are proposed, so that the representative feature of each cluster can be detected to form the feature subset on which the explainable classification system will be built. The effectiveness of the proposed algorithms is tested on high dimensional benchmark omics datasets and compared with peer methods, and statistical tests are conducted to determine whether the proposed spectral feature selection algorithms differ significantly from the peer methods. The extensive experiments demonstrate that the proposed unsupervised spectral feature selection algorithms outperform the peer ones, especially the one based on the cosine similarity feature ranking technique. The statistical test results show that the entropy feature ranking based spectral feature selection algorithm performs best. The detected features demonstrate strong discriminative capabilities in downstream classifiers for omics data, such that an AI system built on them would be reliable and explainable. This is especially significant for building transparent and trustworthy medical diagnostic systems from an interpretable AI perspective.
Keywords: feature selection; spectral clustering; feature ranking techniques; entropy; cosine similarity
A feature selection approach based on a similarity measure for software defect prediction (cited by: 3)
12
Authors: Qiao YU, Shu-juan JIANG, Rong-cun WANG, Hong-yang WANG. Source: Frontiers of Information Technology & Electronic Engineering (SCIE, EI, CSCD), 2017, Issue 11, pp. 1744-1753, 10 pages.
Software defect prediction aims to find potential defects based on historical data and software features. Software features can reflect the characteristics of software modules. However, some of these features may be more relevant to the class (defective or non-defective), whereas others may be redundant or irrelevant. To fully measure the correlation between different features and the class, we present a feature selection approach based on a similarity measure (SM) for software defect prediction. First, the feature weights are updated according to the similarity of samples in different classes. Second, a feature ranking list is generated by sorting the feature weights in descending order, and all feature subsets are selected from the feature ranking list in sequence. Finally, all feature subsets are evaluated on a k-nearest neighbor (KNN) model and measured by the area under the curve (AUC) metric for classification performance. The experiments are conducted on 11 National Aeronautics and Space Administration (NASA) datasets, and the results show that our approach performs better than, or is comparable to, the compared feature selection approaches in terms of classification performance.
Keywords: software defect prediction; feature selection; similarity measure; feature weights; feature ranking list
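The second and third steps of this abstract (sort the weights, take subsets from the ranking list in sequence, evaluate each) can be sketched as follows. The weights, metric names and the toy scoring function are hypothetical; in the paper the subsets are scored with a KNN model and AUC, which this sketch leaves as a caller-supplied function.

```python
def ranking_list(weights):
    """Feature names sorted by weight, highest first (ties broken by name)."""
    return sorted(weights, key=lambda f: (-weights[f], f))

def nested_subsets(ranked):
    """Subsets taken in sequence from the ranking list: top 1, top 2, ..."""
    return [ranked[:i] for i in range(1, len(ranked) + 1)]

def best_subset(weights, evaluate):
    """Return the highest-scoring subset; on ties the smaller subset wins."""
    return max(nested_subsets(ranking_list(weights)), key=evaluate)

# Hypothetical feature weights for three software metrics.
weights = {"loc": 0.9, "cyclomatic": 0.7, "comments": 0.1}

def toy_score(subset):
    # Stand-in for "train KNN on this subset and measure AUC".
    useful = {"loc", "cyclomatic"}
    return len(useful & set(subset)) - 0.1 * len(subset)

print(best_subset(weights, toy_score))
```

Evaluating only the nested prefixes of the ranking list keeps the search linear in the number of features, instead of exponential as it would be for all possible subsets.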
Gini Correlation for Feature Screening
13
Authors: Jun-ying ZHANG, Xiao-feng LIU, Ri-quan ZHANG, Hang WANG. Source: Acta Mathematicae Applicatae Sinica (SCIE, CSCD), 2021, Issue 3, pp. 590-601, 12 pages.
In this paper we propose the Gini correlation screening (GCS) method to select the important variables in ultrahigh dimensional data. The new procedure is based on the Gini correlation coefficient, computed via the covariance between the response and the rank of the predictor variables, rather than on the Pearson correlation or the Kendall τ correlation coefficient. The new method does not require imposing a specific model structure on the regression functions and only needs the condition that the predictors and the response have continuous distribution functions. We demonstrate that, with the number of predictors growing at an exponential rate of the sample size, the proposed procedure possesses consistency in ranking, which is both useful in its own right and can lead to consistency in selection. The procedure is computationally efficient and simple, and exhibits competitive empirical performance in our intensive simulations and real data analysis.
Keywords: ultrahigh dimension; Gini correlation coefficient; variable screening; feature ranking
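A minimal sketch of screening by Gini correlation follows. It uses the standard sample form cov(y, rank(x)) / cov(y, rank(y)); whether the paper's screening statistic matches this exactly is an assumption, and the data and predictor names are invented. Ties are not handled, in line with the abstract's continuity condition.

```python
def ranks(values):
    """Rank positions 1..n of each value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def covariance(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def gini_correlation(y, x):
    """Covariance of y with the ranks of x, scaled so that it lies in [-1, 1]."""
    return covariance(y, ranks(x)) / covariance(y, ranks(y))

def screen(y, predictors, keep):
    """Keep the `keep` predictors with the largest absolute Gini correlation."""
    scores = {name: abs(gini_correlation(y, col)) for name, col in predictors.items()}
    return sorted(scores, key=lambda name: (-scores[name], name))[:keep]

# Hypothetical response and two predictors.
y = [1.0, 2.5, 3.1, 4.8, 6.0]
predictors = {
    "signal": [v ** 3 for v in y],          # monotone in y, but not linear
    "noise": [0.3, -1.2, 0.8, 0.1, -0.5],
}
print(screen(y, predictors, keep=1))
```

Because only the ranks of the predictor enter the statistic, a monotone but non-linear relationship such as the cubic one above still receives the maximum score, which a Pearson correlation would not give.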