476 articles found
Accuracies and Training Times of Data Mining Classification Algorithms: An Empirical Comparative Study (cited by: 2)
1
Authors: S. Olalekan Akinola, O. Jephthar Oyabugbe. 《Journal of Software Engineering and Applications》, 2015, No. 9, pp. 470-477 (8 pages)
Two important performance indicators for data mining algorithms are the accuracy of classification/prediction and the time taken for training. These indicators are useful for selecting the best algorithms for classification/prediction tasks in data mining. Empirical studies of these performance indicators in data mining are few. This study was therefore designed to determine how data mining classification algorithms perform as input data size increases. Three data mining classification algorithms (Decision Tree, Multi-Layer Perceptron (MLP) Neural Network, and Naïve Bayes) were subjected to varying simulated data sizes. The time taken by the algorithms for training and the accuracy of their classifications were analyzed for the different data sizes. Results show that Naïve Bayes takes the least time to train but has the lowest accuracy compared with the MLP and Decision Tree algorithms.
Keywords: Artificial Neural Network; classification; data mining; decision tree; Naive Bayesian; Performance Evaluation
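A Data Mining Model Based on an Improved SPRINT Classification Algorithm
2
Authors: 林敏, 王李杰. 《信息技术》, 2024, No. 3, pp. 170-174, 187 (6 pages)
To address the long classification times and low mining accuracy of current data mining models, a data mining model based on an improved decision tree classification algorithm (SPRINT) is proposed. The raw data are first linearly transformed with the max-min normalization formula. The improved SPRINT classification algorithm then classifies the input data according to their characteristics. Collaborative filtering is used to generate attribute sets similar to the data, attribute similarity is computed, and a set of semantic rules is generated to provide users with better data services. A company's marketing dataset was used as the sample for comparative experiments. The results show that, compared with the baseline models, the proposed model has a shorter classification time and higher mining accuracy, and can provide users with higher-quality data services.
Keywords: decision tree classification algorithm; collaborative filtering; semantic rule set; data mining model; neural network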
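The max-min normalization step named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `min_max_normalize` and the default target range [0, 1] are hypothetical, and a constant input is mapped to the lower bound to avoid division by zero.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale values into [new_min, new_max] (hypothetical helper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant input has no spread to rescale; return the lower bound.
        return [new_min for _ in values]
    scale = (new_max - new_min) / (hi - lo)
    # Shift each value by the minimum, then stretch it to the target range.
    return [new_min + (v - lo) * scale for v in values]
```

For example, `min_max_normalize([0, 5, 10])` maps the smallest value to the lower bound, the largest to the upper bound, and the midpoint halfway between.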
Research on Scholarship Evaluation System based on Decision Tree Algorithm (cited by: 1)
3
Authors: YIN Xiao, WANG Ming-yu. 《电脑知识与技术》, 2015, No. 3X, pp. 11-13 (3 pages)
Under China's modern education system, the annual scholarship evaluation is vital for many college students. This paper adopts the C4.5 decision tree classification algorithm, an improvement on the ID3 algorithm, and constructs a data set for the scholarship evaluation system by analyzing the related attributes in scholarship evaluation information. Through analysis and research on moral education, intellectual education, and culture & PE, it also identifies some factors that play a significant role in the development of college students.
Keywords: data mining; scholarship evaluation system; decision tree algorithm; C4.5 algorithm
Data mining and well logging interpretation: application to a conglomerate reservoir (cited by: 8)
4
Authors: 石宁, 李洪奇, 罗伟平. 《Applied Geophysics》 (SCIE, CSCD), 2015, No. 2, pp. 263-272, 276 (11 pages)
Data mining is the process of extracting implicit but potentially useful information from incomplete, noisy, and fuzzy data. Data mining offers excellent nonlinear modeling and self-organized learning, and it can play a vital role in the interpretation of well logging data of complex reservoirs. We used data mining to identify the lithologies in a complex reservoir. The reservoir lithologies served as the classification task target and were identified using feature extraction, feature selection, and modeling of data streams. We used independent component analysis to extract information from well curves. We then used the branch-and-bound algorithm to look for the optimal feature subsets and eliminate redundant information. Finally, we used the C5.0 decision-tree algorithm to set up disaggregated models of the well logging curves. The modeling and actual logging data were in good agreement, showing the usefulness of data mining methods in complex reservoirs.
Keywords: data mining; well logging interpretation; independent component analysis; branch-and-bound algorithm; C5.0 decision tree
Forecasting Model of Agro-meteorological Disaster Grade Based on Decision Tree (cited by: 2)
5
Author: 司巧梅. 《Meteorological and Environmental Research》 (CAS), 2010, No. 2, pp. 85-87, 90 (4 pages)
Based on a discussion of the basic concepts of data mining technology and the decision tree method, and using data samples of wind and hailstorm disasters in some counties of the Mudanjiang region, a forecasting model of agro-meteorological disaster grade was established with the C4.5 decision tree classification algorithm. The model can forecast the degree of direct economic loss, providing a rational data mining model and effective analysis results.
Keywords: data mining; Agro-meteorology; decision tree; C4.5 algorithm; classification mining; China
Developing a prediction model for customer churn from electronic banking services using data mining (cited by: 5)
6
Authors: Abbas Keramati, Hajar Ghaneei, Seyed Mohammad Mirmohammadi. 《Financial Innovation》, 2016, No. 1, pp. 122-134 (13 pages)
Background: Given the importance of customers as the most valuable assets of organizations, customer retention seems to be an essential, basic requirement for any organization. Banks are no exception to this rule. The competitive atmosphere within which electronic banking services are provided by different banks increases the necessity of customer retention. Methods: Built on existing information technologies that allow data to be collected from organizations' databases, data mining provides a powerful tool for extracting knowledge from huge amounts of data. In this research, the decision tree technique was applied to build a model incorporating this knowledge. Results: The results represent the characteristics of churned customers. Conclusions: Bank managers can use the decision tree results to identify future churners. They should provide strategies for customers whose features are becoming more similar to those of churners.
Keywords: Customer churn; data mining; Electronic banking services; decision tree; classification
Improving Decision Tree Performance by Exception Handling (cited by: 1)
7
Authors: Appavu Alias Balamurugan Subramanian, S. Pramala, B. Rajalakshmi, Ramasamy Rajaram. 《International Journal of Automation and Computing》 (EI), 2010, No. 3, pp. 372-380 (9 pages)
This paper focuses on improving decision tree induction algorithms when a kind of tie appears during the rule generation procedure for specific training datasets. The tie occurs when the target class outcomes appear in equal proportions in a leaf node's records, so that majority voting cannot be applied. To handle this exception, we propose to base the prediction on the naive Bayes (NB) estimate, k-nearest neighbour (k-NN), and association rule mining (ARM). The other features used for splitting the parent nodes are also taken into consideration.
Keywords: data mining; classification; decision tree; majority voting; naive Bayes (NB); k-nearest neighbour (k-NN); association rule mining (ARM)
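The tie situation described in the abstract above can be illustrated with a small sketch: a leaf first tries majority voting, and only when the top classes tie does it fall back to a k-NN vote over the leaf's own records. This is an assumption-laden illustration, not the paper's method: the function names, the `(features, label)` record layout, the Euclidean distance, and the choice of k-NN alone (the paper also mentions NB and ARM) are all hypothetical, and the leaf is assumed to be non-empty.

```python
from collections import Counter
import math

def majority_or_tie(labels):
    """Return the majority class of a leaf, or None when the top classes tie."""
    counts = Counter(labels).most_common()
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # equal proportions: majority voting cannot decide
    return counts[0][0]

def knn_vote(query, records, k=3):
    """Fallback: vote among the k records closest to query (Euclidean distance)."""
    nearest = sorted(records, key=lambda r: math.dist(query, r[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def predict_leaf(query, leaf_records, k=3):
    """Predict with majority voting; use the k-NN fallback only on a tie."""
    label = majority_or_tie([label for _, label in leaf_records])
    return label if label is not None else knn_vote(query, leaf_records, k)
```

With two classes, an odd `k` keeps the fallback vote itself from tying.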
Study on the Grouping of Patients with Chronic Infectious Diseases Based on Data Mining
8
Author: Min Li. 《Journal of Biosciences and Medicines》, 2019, No. 11, pp. 119-135 (17 pages)
Objective: Following the RFM model theory of customer relationship management, data mining technology was used to group patients with chronic infectious diseases and to explore the effect of customer segmentation on the management of patients with different characteristics. Methods: 170,246 outpatient records from January 2016 to July 2016 were extracted from the hospital management information system (HIS); 43,448 records remained after data cleaning. The K-Means clustering algorithm was used to classify patients with chronic infectious diseases, and the C5.0 decision tree algorithm was then used to predict the situation of these patients. Results: Male patients accounted for 58.7%, and patients living in Shanghai accounted for 85.6%. The average patient age was 45.88 years, and the high-incidence age range was 25 to 65 years. Patients were grouped into three clusters: 1) Cluster 1, important patients (4786 people, 11.72%, R = 2.89, F = 11.72, M = 84,302.95); 2) Cluster 2, major patients (23,103 people, 53.2%, R = 5.22, F = 3.45, M = 9146.39); 3) Cluster 3, potential patients (15,559 people, 35.8%, R = 19.77, F = 1.55, M = 1739.09). The C5.0 decision tree algorithm was used to predict the treatment situation of patients with chronic infectious diseases; the final treatment time (weeks) is an important predictor, and the accuracy rate, verified by the confusion model, is 99.94%. Conclusion: Medical institutions should strengthen adherence education for patients with chronic infectious diseases, establish a chronic infectious disease and customer relationship management database, and take the initiative to help patients improve treatment adherence. Chinese governments at all levels should speed up the construction of hospital information systems, establish a chronic infectious disease database, and strengthen the blocking of mother-to-child transmission, in order to curb chronic infectious diseases effectively and reduce disease burden and mortality.
Keywords: data mining; K-Means clustering algorithm; C5.0 decision tree algorithm; customer relationship management; patients with chronic infectious disease
Innovative data mining approaches for outcome prediction of trauma patients
9
Authors: Eleni-Maria Theodoraki, Stylianos Katsaragakis, Christos Koukouvinos, Christina Parpoula. 《Journal of Biomedical Science and Engineering》, 2010, No. 8, pp. 791-798 (8 pages)
Trauma is the most common cause of death among young people, and many of these deaths are preventable [1]. Predicting the outcome of trauma patients has until now been a difficult problem to investigate. In this study, prediction models are built and their ability to predict mortality accurately is assessed. The analysis includes a comparison of data mining techniques using classification, clustering, and association algorithms. Data were collected by the Hellenic Trauma and Emergency Surgery Society from 30 Greek hospitals. The dataset contains records of 8544 patients suffering from severe injuries, collected from 2005 to 2006. Factors include patients' demographic elements and several other variables registered from the time and place of the accident until hospital treatment and final outcome. The results are compared in terms of sensitivity, specificity, positive predictive value, and negative predictive value, and the ROC curve depicts the performance of these methods.
Keywords: data mining; medical data; decision trees; classification rules; association rules; clusters; confusion matrix; ROC
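The four comparison measures named in the abstract above follow directly from the cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the dictionary layout are assumptions made for illustration, and the counts in the usage note are invented, not taken from the study.

```python
def binary_metrics(tp, fp, tn, fn):
    """Compute sensitivity, specificity, PPV and NPV from confusion-matrix counts."""
    def ratio(num, den):
        # An empty denominator means the measure is undefined for this data.
        return num / den if den else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }
```

Calling `binary_metrics(tp=40, fp=10, tn=45, fn=5)` with these made-up counts gives a PPV of 40/50 and an NPV of 45/50.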
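Simulation of Dual Mining of Multi-Source Heterogeneous Data Based on Pruning
10
Authors: 刘诗瑾, 杨知玲. 《计算机仿真》, 2024, No. 8, pp. 513-516, 534 (5 pages)
Multi-source heterogeneous data may come from sources in different domains, in different formats, and of different quality, which makes them difficult to process. To address the difficulty of mining such data accurately, a multi-source heterogeneous data mining algorithm based on decision tree classification is proposed. A decision tree is built to partition the data attributes, and the initial tree is pruned to obtain the attribute set of the multi-source heterogeneous data; the data factors are then extracted to obtain a coarse mining result. A deep learning algorithm is then used to mine the multi-source heterogeneous data remaining in the rest of the data, and a second mining pass is performed on the original dataset. The coarse and fine mining results are merged to complete the mining. Experimental results show that the proposed algorithm achieves a high F1 score, a low generalization error, and strong data mining performance.
Keywords: decision tree; data classification; multi-source heterogeneous data; data mining; deep learning algorithm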
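Simulation of Decision Tree Model Construction for Classifying Uncertain Big Data Streams
11
Authors: 杨知玲, 谭树杰. 《计算机仿真》, 2024, No. 5, pp. 532-535, 542 (5 pages)
When uncertain big data streams are classified, interference from noise and outliers prevents the processing quality and classification accuracy from meeting expectations. To solve this problem, a classification algorithm for uncertain big data streams based on a decision tree model is proposed. An online dictionary learning algorithm is used to denoise the stream, removing the interference of noise with the classification process. A decision tree is then built, and during pruning a feature filtering algorithm removes the outliers mixed into the stream. The denoised stream is fed into the decision tree model to complete the classification. Experimental results show that the amplitude of the stream processed by the proposed algorithm is markedly reduced and the classification accuracy is high, so the method has some practical value.
Keywords: decision tree model; online dictionary learning algorithm; feature filtering; uncertain big data stream; data classification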
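Research on a Scheduling Algorithm Based on the CART Decision Tree
12
Authors: 杨松, 王艳红. 《工业控制计算机》, 2024, No. 11, pp. 152-154 (3 pages)
Job shops have accumulated large amounts of offline processing data. Drawing on data mining, scheduling rules, and algorithm optimization, an improved CART algorithm based on Bayesian optimization is proposed to mine and exploit this shop-floor data: nodes are split step by step according to the attributes of the data, a tree structure is generated and pruned, and processing rules are finally produced. Experiments on different data instances show that, compared with the traditional CART algorithm, the Bayesian-optimized CART algorithm partitions the data better and yields more accurate decision trees.
Keywords: data mining; Bayesian optimization; CART algorithm; processing rules; decision tree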
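The SPRINT Algorithm and Methods for Improving It (cited by: 3)
13
Authors: 罗可, 张学茂. 《计算机工程与应用》 (CSCD, PKU Core), 2005, No. 32, pp. 178-180, 189 (4 pages)
Classification is an important research topic in data mining. This article introduces the SPRINT classification algorithm. To improve the algorithm's overall classification efficiency on massive databases, the authors propose two new methods for handling discrete attributes. These methods markedly reduce the computation needed to find the best split point and speed up the algorithm.
Keywords: data mining; classification; decision tree; SPRINT algorithm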
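SPRINT scores candidate split points with the gini index, so the cost that the abstract above aims to reduce is the repeated evaluation of a measure like the one sketched here. This is a plain illustration of the gini computation for one binary split, not the article's improved method for discrete attributes; the function names are hypothetical, and `gini_split` assumes at least one partition is non-empty.

```python
from collections import Counter

def gini(labels):
    """Gini index of a set of class labels: 1 minus the sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def gini_split(left, right):
    """Size-weighted gini index of a binary split; lower values are better splits."""
    n = len(left) + len(right)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
```

A split that separates the classes perfectly scores 0, while a split that leaves both partitions evenly mixed between two classes scores 0.5.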
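Research on Parallel Decision Tree Classification Based on the SPRINT Method (cited by: 18)
14
Author: 魏红宁. 《计算机应用》 (CSCD, PKU Core), 2005, No. 1, pp. 39-41 (3 pages)
One of the biggest problems with decision tree techniques is that their computational complexity is proportional to the size of the training data, so building a decision tree on a large dataset takes too long. Building the tree in parallel is an effective way to solve this problem. Based on the idea of synchronous tree construction, the paper analyzes the parallelism of the SPRINT method in detail and proposes directions for further research.
Keywords: data mining; SPRINT decision tree classification; parallelism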
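Analysis and Research of the SPRINT Algorithm on the Hadoop Platform (cited by: 2)
15
Authors: 黄刚, 孙媛. 《南京师大学报(自然科学版)》 (CAS, CSCD, PKU Core), 2016, No. 4, pp. 25-30 (6 pages)
When traditional decision tree algorithms mine massive data on a single machine, they are easily limited by computing and storage capacity, and therefore suffer from long running times, poor fault tolerance, and small storage capacity. The emergence of the Hadoop platform, with its high reliability and fault tolerance, offers a new approach to parallelizing decision tree algorithms. This paper designs and implements a parallel SPRINT classification algorithm on the Hadoop platform. Experimental results show that, compared with the non-parallel SPRINT algorithm, the Hadoop-based SPRINT classification algorithm has better classification accuracy, lower time complexity, and better parallel performance, and markedly speeds up the search for the best split point.
Keywords: Hadoop; MapReduce; data mining; decision tree; SPRINT algorithm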
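Application of the Decision Tree Algorithm to Redundancy Elimination in Ship Autonomous Cruise Data
16
Authors: 生力军, 陈施奇. 《舰船科学技术》 (PKU Core), 2024, No. 12, pp. 157-161 (5 pages)
Intelligent ship management and navigation depend on reliable autonomous cruise data: large volumes of sensor data and monitoring information serve as input so that the system can make correct decisions. These data may, however, contain redundant information that undermines the reliability of the intelligent decision system, so the application of the decision tree algorithm to redundancy elimination in ship autonomous cruise data is studied. Filtering, interpolation, and hybrid time-series data generation are used to process the cruise data into standardized time series. A decision tree is built from the processed data to divide the cruise data into categories. Similarities between data within the same category are computed and an eliminator is designed to remove redundancy, yielding cruise data free of redundancy. Test results show that the algorithm processes the time series well, separates the different data categories, computes similarity within each category, and achieves a maximum space reduction ratio of 27.8%.
Keywords: decision tree algorithm; ship autonomous cruise; data redundancy elimination; time-series data; data similarity; data classification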
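Mining Risk Rules in Insurance Business Data with the SPRINT Classification Algorithm (cited by: 1)
17
Author: 宾宁. 《广东工业大学学报》 (CAS), 2007, No. 2, pp. 99-102 (4 pages)
The SPRINT algorithm is applied to risk analysis of insurance business data. For medical insurance business, the design and implementation of the algorithm's preprocessing, best-split computation, and split execution are described in detail, and some practical risk rules are derived.
Keywords: SPRINT algorithm; classification algorithm; data mining; insurance business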
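A Big Data Classification Optimization Method Using an Improved Decision Tree Algorithm
18
Authors: 唐灵逸, 唐怡雯, 李蓓蓓. 《吉林大学学报(信息科学版)》 (CAS), 2024, No. 5, pp. 959-965 (7 pages)
Because the structure and features of today's massive data are complex, it is hard to classify them with both high accuracy and high efficiency. To address this, a big data classification optimization method using an improved decision tree algorithm is proposed. A fuzzy decision function is constructed to detect the sequence features of the big data, which are fed into a decision tree model to mine and train rules. The grey wolf optimization algorithm is used to improve the decision tree model; the improved model simplifies and coarsely classifies the data, and an objective function for classifier accuracy is then established to classify the data precisely. Experimental results show that the proposed method achieves the highest classification accuracy and the lowest false positive rate, keeps overall throughput high, and improves classification efficiency.
Keywords: decision tree model; grey wolf optimization algorithm; objective function; big data classification; fuzzy decision function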
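Research and Application of Medical Prognosis Analysis Based on the SPRINT Classification Algorithm (cited by: 2)
19
Author: 雷炜. 《现代计算机》, 2008, No. 10, pp. 67-69 (3 pages)
SPRINT is a scalable data classification method that supports parallel processing, and rules can be conveniently extracted from the decision trees it generates. It is a recommended approach for prognosis analysis on massive medical databases. The algorithm is studied in depth and applied to prognosis analysis, which offers insights for similar medical information processing.
Keywords: data mining; decision tree; SPRINT algorithm; prognosis analysis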
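An Automated Fault Diagnosis System for Electric Energy Metering Devices Based on an Improved Decision Tree Algorithm
20
Authors: 张驰, 王栋, 赵书函. 《自动化与仪表》, 2024, No. 2, pp. 1-4, 10 (5 pages)
Electric energy metering devices in the power grid are numerous and involve a heavy periodic-inspection workload, high inspection cost, low inspection efficiency, and long fault-location times. This paper proposes an automated fault diagnosis system for such devices based on an improved decision tree algorithm. An online monitoring workflow for the metering devices is designed, and a case-study model is built by combining data mining with the decision tree algorithm. Experiments verify that the proposed system diagnoses faults more efficiently and that its automated diagnosis is more accurate than manual interpretation.
Keywords: decision tree algorithm; electric energy metering device; fault diagnosis; decision tree depth; data mining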