1,508 articles found
Predicting Tuberculosis Treatment Relapse: A Decision Tree Analysis of J48 for Data Mining (cited by: 1)
1
Authors: Arnold P. Dela Cruz, Gilbert M. Tumibay 《Journal of Computer and Communications》 2019, Issue 7, pp. 243-251 (9 pages)
Tuberculosis remains an important public health problem that threatens the world, including the Philippines. Treatment relapse continues to place a severe burden on patients and TB programs worldwide. A significant reason for relapse is poor compliance with medical treatment. The objectives of this research are to generate a predictive data mining model to classify the treatment relapse of TB patients and to identify the features influencing the category of treatment relapse. The TB patient dataset was applied and tested with the J48 decision tree algorithm in WEKA. The J48 model identified three significant independent variables (DSSM Result, Age, and Sex) as predictors of the treatment relapse category.
Keywords: data mining; decision tree; J48; Tuberculosis; WEKA
Accuracies and Training Times of Data Mining Classification Algorithms: An Empirical Comparative Study (cited by: 2)
2
Authors: S. Olalekan Akinola, O. Jephthar Oyabugbe 《Journal of Software Engineering and Applications》 2015, Issue 9, pp. 470-477 (8 pages)
Two important performance indicators for data mining algorithms are accuracy of classification/prediction and time taken for training. These indicators are useful for selecting the best algorithms for classification/prediction tasks in data mining. Empirical studies on these performance indicators in data mining are few. Therefore, this study was designed to determine how data mining classification algorithms perform as input data size increases. Three data mining classification algorithms (Decision Tree, Multi-Layer Perceptron (MLP) Neural Network, and Naïve Bayes) were subjected to varying simulated data sizes. The training times and classification accuracies of the algorithms were analyzed for the different data sizes. Results show that Naïve Bayes takes the least time to train but has the lowest accuracy compared with the MLP and Decision Tree algorithms.
Keywords: Artificial Neural Network; Classification; data mining; decision tree; Naive Bayesian; Performance Evaluation
Research on Scholarship Evaluation System based on Decision Tree Algorithm (cited by: 1)
3
Authors: YIN Xiao, WANG Ming-yu 《电脑知识与技术》 2015, Issue 3X, pp. 11-13 (3 pages)
Under China's modern education system, the annual scholarship evaluation is vital for many college students. This paper adopts the C4.5 decision tree classification algorithm, an improvement on the ID3 algorithm, and constructs a data set for the scholarship evaluation system by analyzing the related attributes in scholarship evaluation information. It also identifies factors that play a significant role in the development of college students through analysis of moral education, intellectual education, and culture and PE.
Keywords: data mining; scholarship evaluation system; decision tree algorithm; C4.5 algorithm
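Note on entry 3: the abstract describes C4.5 as an improvement on ID3. One well-known difference is that C4.5 ranks candidate splits by gain ratio rather than raw information gain. The following is a minimal sketch of that criterion for categorical attributes only; the function names are illustrative and are not taken from the paper.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of one categorical attribute with respect to the labels."""
    n = len(labels)
    # Group the class labels by the attribute value of each record.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    gain = entropy(labels) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(feature_values)
    return gain / split_info if split_info > 0 else 0.0
```

An attribute that separates the classes perfectly scores 1.0 on a balanced two-class sample, while an attribute that is unrelated to the class scores 0.0.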
Data mining and well logging interpretation: application to a conglomerate reservoir (cited by: 8)
4
Authors: 石宁, 李洪奇, 罗伟平 《Applied Geophysics》 SCIE CSCD 2015, Issue 2, pp. 263-272, 276 (11 pages)
Data mining is the process of extracting implicit but potentially useful information from incomplete, noisy, and fuzzy data. Data mining offers excellent nonlinear modeling and self-organized learning, and it can play a vital role in the interpretation of well logging data of complex reservoirs. We used data mining to identify the lithologies in a complex reservoir. The reservoir lithologies served as the classification task target and were identified using feature extraction, feature selection, and modeling of data streams. We used independent component analysis to extract information from well curves. We then used the branch-and-bound algorithm to look for the optimal feature subsets and eliminate redundant information. Finally, we used the C5.0 decision-tree algorithm to set up disaggregated models of the well logging curves. The modeling and actual logging data were in good agreement, showing the usefulness of data mining methods in complex reservoirs.
Keywords: data mining; well logging interpretation; independent component analysis; branch-and-bound algorithm; C5.0 decision tree
Forecasting Model of Agro-meteorological Disaster Grade Based on Decision Tree (cited by: 2)
5
Authors: 司巧梅 《Meteorological and Environmental Research》 CAS 2010, Issue 2, pp. 85-87, 90 (4 pages)
Based on a discussion of the basic concepts of data mining technology and the decision tree method, and using data samples of wind and hailstorm disasters in some counties of the Mudanjiang region, a forecasting model of agro-meteorological disaster grade was established with the C4.5 decision tree classification algorithm. The model can forecast the degree of direct economic loss, providing a rational data mining model and effective analysis results.
Keywords: data mining; Agro-meteorology; decision tree; C4.5 algorithm; Classification mining; China
Developing a prediction model for customer churn from electronic banking services using data mining (cited by: 5)
6
Authors: Abbas Keramati, Hajar Ghaneei, Seyed Mohammad Mirmohammadi 《Financial Innovation》 2016, Issue 1, pp. 122-134 (13 pages)
Background: Given the importance of customers as the most valuable assets of organizations, customer retention seems to be an essential, basic requirement for any organization. Banks are no exception to this rule. The competitive atmosphere within which electronic banking services are provided by different banks increases the necessity of customer retention. Methods: Being based on existing information technologies which allow one to collect data from organizations' databases, data mining introduces a powerful tool for the extraction of knowledge from huge amounts of data. In this research, the decision tree technique was applied to build a model incorporating this knowledge. Results: The results represent the characteristics of churned customers. Conclusions: Bank managers can identify churners in future using the results of the decision tree. They should provide strategies for customers whose features are becoming more similar to churners' features.
Keywords: Customer churn; data mining; Electronic banking services; decision tree; Classification
Application of Data Mining and Process Knowledge Discovery in Sheet Metal Assembly Dimensional Variation Diagnostic (cited by: 1)
7
Authors: LIAN Jun, LAI Xin-min, LIN Zhong-qin, YAO Fu-sheng (School of Mechanical Engineering, Shanghai Jiaotong University, Shanghai 200030, China) 《厦门大学学报(自然科学版)》 CAS CSCD PKU Core 2002, Issue S1, p. 37- (1 page)
Sheet metal is widely used in auto bodies, plane bodies, metal furniture, and so on. For instance, a typical auto body commonly consists of hundreds of stamped sheet metal parts. Because of the complexity of their structure and manufacturing process, auto bodies inevitably have geometrical variation that results from a number of different sources, such as the geometrical variation of stamping parts, changes in assembly process parameters, and even improper design concepts. As more than 30% of the quality defects of an auto body arise from dimensional deviation of the Body-In-White originating in the manufacturing process, effective diagnosis and control of dimensional faults are essential to the continuous improvement of vehicle quality, especially during a new car launch or model change, when the assembly process is changed and adjusted frequently. For continuously improving the quality of modern cars, rapid identification of the causes of dimensional variation is a challenging but essential task. In this paper, the main causes of auto-body variation were first cataloged and analyzed; then a dimensional variation diagnostic reasoning and decision approach was developed by combining data mining and knowledge discovery techniques. This approach is driven by variation pattern identification, with patterns discovered from dispersed, isolated, massive measurement data: Correlation Analysis (CA) and Maximal Tree (MT) methods were applied to extract the large-variation group from massive multidimensional measurement data, while a multivariate statistical analysis (MSA) approach was used to discover the principal variation pattern.
A Decision Tree (DT) approach based on knowledge of the product and the assembly process was developed to carry out the "Hypothesis and Validation" reasoning procedure for variation causes. A practical application case with sudden and severe dimensional variation of the rear end panel in the up/down direction was analyzed and successfully solved with the aid of the developed variation diagnostic method, which proved that the approach is effective and efficient.
Keywords: auto-body; variation diagnosis; data mining; decision tree
Improving Decision Tree Performance by Exception Handling (cited by: 1)
8
Authors: Appavu Alias Balamurugan Subramanian, S. Pramala, B. Rajalakshmi, Ramasamy Rajaram 《International Journal of Automation and Computing》 EI 2010, Issue 3, pp. 372-380 (9 pages)
This paper focuses on improving decision tree induction algorithms when a kind of tie appears during the rule generation procedure for specific training datasets. The tie occurs when the target class outcomes occur in equal proportions in a leaf node's records, so that majority voting cannot be applied. To resolve this exception, we propose basing the prediction on the naive Bayes (NB) estimate, k-nearest neighbour (k-NN), and association rule mining (ARM). The other features used for splitting the parent nodes are also taken into consideration.
Keywords: data mining; classification; decision tree; majority voting; naive Bayes (NB); k nearest neighbour (k NN); association rule mining (ARM)
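Note on entry 8: the abstract describes falling back to another estimator when a leaf's class counts are tied and majority voting cannot decide. The sketch below illustrates only the naive Bayes fallback, using a small categorical model with add-one smoothing. It is an assumption-laden illustration, not the authors' implementation: `train_nb` and `leaf_predict` are hypothetical names, and the k-NN and ARM variants are not shown.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Fit a categorical naive Bayes model and return a predict(x) function."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (feature index, class) -> value counts
    distinct = defaultdict(set)          # feature index -> values seen in training
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            value_counts[(i, label)][value] += 1
            distinct[i].add(value)

    def predict(x):
        def log_score(c):
            score = math.log(class_counts[c] / len(labels))
            for i, value in enumerate(x):
                # Add-one smoothing keeps unseen values from zeroing the product.
                num = value_counts[(i, c)][value] + 1
                den = class_counts[c] + len(distinct[i])
                score += math.log(num / den)
            return score
        return max(sorted(class_counts), key=log_score)

    return predict


def leaf_predict(leaf_labels, fallback, x):
    """Majority vote over a leaf's records; defer to `fallback` on a tie."""
    ranked = Counter(leaf_labels).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return fallback(x)
    return ranked[0][0]
```

With this arrangement the tree behaves exactly as before on untied leaves, and the fallback model is consulted only for the records that reach a tied leaf.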
Analyzing the Factors Affecting the Users' Success in Web Based Education: A Data Mining Approach
9
Authors: Sona Mardikyan, Cigdem Karakaya 《Computer Technology and Application》 2011, Issue 5, pp. 396-400 (5 pages)
Corporations focus on web based education to train their employees more than ever before. Unlike traditional learning environments, web based education applications store large amounts of data. This growing availability of data stimulated the emergence of a new field called educational data mining. In this study, the classification method is applied to data obtained from a company that uses web based education to train its employees. The authors' aim is to find the most critical factors that influence the users' success. For the classification of the data, two decision tree algorithms, Classification and Regression Tree (CART) and Quick, Unbiased and Efficient Statistical Tree (QUEST), are applied. According to the results, assurance of a certificate at the end of the training is the most critical factor influencing the users' success. The user's position, number of years worked, and education level are also found to be important factors.
Keywords: Web based education; data mining; decision trees; users' success
A study of the employment of higher institutions based on the decision tree model (cited by: 1)
10
Authors: SHEN Shi-kai, WANG Wu, HONG Sun-yan 《通讯和计算机(中英文版)》 2008, Issue 10, pp. 28-32 (5 pages)
Keywords: decision tree; e-commerce; computer technology; enterprise management
Study on the Grouping of Patients with Chronic Infectious Diseases Based on Data Mining
11
Authors: Min Li 《Journal of Biosciences and Medicines》 2019, Issue 11, pp. 119-135 (17 pages)
Objective: Following the RFM model theory of customer relationship management, data mining technology was used to group patients with chronic infectious diseases and to explore the effect of customer segmentation on the management of patients with different characteristics. Methods: A total of 170,246 outpatient records from January 2016 to July 2016 were extracted from the hospital management information system (HIS); 43,448 records remained after data cleaning. The K-Means clustering algorithm was used to group patients with chronic infectious diseases, and the C5.0 decision tree algorithm was then used to predict the situation of these patients. Results: Male patients accounted for 58.7%, and patients living in Shanghai accounted for 85.6%. The average age of patients was 45.88 years, and the high-incidence age range was 25 to 65 years. Patients were grouped into three clusters: 1) Cluster 1, important patients (4786 people, 11.72%, R = 2.89, F = 11.72, M = 84,302.95); 2) Cluster 2, major patients (23,103 people, 53.2%, R = 5.22, F = 3.45, M = 9146.39); 3) Cluster 3, potential patients (15,559 people, 35.8%, R = 19.77, F = 1.55, M = 1739.09). When the C5.0 decision tree algorithm was used to predict the treatment situation of patients with chronic infectious diseases, the final treatment time (weeks) was an important predictor, and the accuracy was 99.94% as verified by the confusion matrix. Conclusion: Medical institutions should strengthen adherence education for patients with chronic infectious diseases, establish a chronic infectious disease and customer relationship management database, and take the initiative to help patients improve treatment adherence.
Chinese governments at all levels should speed up the construction of hospital information systems, establish a chronic infectious disease database, and strengthen the blocking of mother-to-child transmission, in order to effectively curb chronic infectious diseases and reduce disease burden and mortality.
Keywords: data mining; K-Means Clustering algorithm; C5.0 decision tree algorithm; Customer Relationship Management; Patients with Chronic Infectious Disease
Innovative data mining approaches for outcome prediction of trauma patients
12
Authors: Eleni-Maria Theodoraki, Stylianos Katsaragakis, Christos Koukouvinos, Christina Parpoula 《Journal of Biomedical Science and Engineering》 2010, Issue 8, pp. 791-798 (8 pages)
Trauma is the most common cause of death among young people, and many of these deaths are preventable [1]. Predicting the outcome of trauma patients has been a difficult problem to investigate until now. In this study, prediction models are built and their ability to accurately predict mortality is assessed. The analysis includes a comparison of data mining techniques using classification, clustering, and association algorithms. Data were collected by the Hellenic Trauma and Emergency Surgery Society from 30 Greek hospitals. The dataset contains records of 8544 patients suffering from severe injuries, collected from 2005 to 2006. Factors include patients' demographic elements and several other variables registered from the time and place of the accident until the hospital treatment and final outcome. The results obtained are compared in terms of sensitivity, specificity, positive predictive value, and negative predictive value, and ROC curves depict the performance of the methods.
Keywords: data mining; Medical data; decision trees; Classification Rules; Association Rules; Clusters; Confusion Matrix; ROC
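Note on entry 12: the abstract compares models by sensitivity, specificity, positive predictive value, and negative predictive value. As a reminder of how these four figures follow from a binary confusion matrix, here is a minimal sketch; the function name and the choice of `1` as the positive class are illustrative assumptions, and the example data in the usage note is made up.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Return sensitivity, specificity, PPV and NPV for binary predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Undefined ratios (empty denominator) are reported as None.
        return num / den if den else None

    return {
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
        "ppv": ratio(tp, tp + fp),          # positive predictive value
        "npv": ratio(tn, tn + fn),          # negative predictive value
    }
```

For example, with true labels `[1, 1, 1, 0, 0, 0, 0, 0]` and predictions `[1, 1, 0, 0, 0, 0, 1, 0]` the counts are TP = 2, FN = 1, FP = 1, TN = 4, so sensitivity and PPV are both 2/3 and specificity and NPV are both 4/5.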
Data Mining for Flooding Episode in the States of Alagoas and Pernambuco, Brazil
13
Authors: Heloisa Musetti Ruivo, Haroldo F. de Campos Velho, Fernando M. Ramos, Saulo R. Freitas 《American Journal of Climate Change》 2018, Issue 3, pp. 420-430 (11 pages)
The increasing volume of data in the environmental sciences needs analysis and interpretation. Among the challenges generated by this "data deluge", the development of efficient strategies for knowledge discovery is an important issue. Here, statistical tools and tools from computational intelligence are applied to analyze large data sets from meteorology and climate sciences. Our approach allows a geographical mapping of the statistical property to be easily interpreted by meteorologists. Our data analysis comprises two main steps of knowledge extraction, applied successively in order to reduce the complexity of the original data set. The goal is to identify a much smaller subset of climatic variables that might still be able to describe or even predict the probability of occurrence of an extreme event. The first step applies a class comparison technique: p-value estimation. The second step consists of a decision tree (DT) configured from the data available and the p-value analysis. The DT is used as a predictive model, identifying the most statistically significant climate variables for precipitation intensity. The methodology is applied to study the climatic causes of an extreme precipitation event that occurred in Alagoas and Pernambuco States (Brazil) in June 2010.
Keywords: data mining; Statistical Analysis; t-test; p-value; Artificial Intelligence; decision tree
Development of a monitoring system for grain loss of paddy rice based on a decision tree algorithm (cited by: 1)
14
Authors: Yi Lian, Jin Chen, Zhuohuai Guan, Jie Song 《International Journal of Agricultural and Biological Engineering》 SCIE EI CAS 2021, Issue 1, pp. 224-229 (6 pages)
China has the world's largest planting area of paddy rice, but large quantities of paddy rice fall to the ground and are lost during harvesting with a combine harvester. Reducing grain loss is an effective way to increase production and revenue. In this study, a system was developed to monitor the grain loss of paddy rice, and its precision was verified on a test bench. Development of the monitoring system included two stages. The first stage was to collect impact signals using a piezoelectric film, extract four features (root mean square, peak number, frequency, and amplitude of the fundamental component), and identify the kernel impact signals using the J48 (C4.5) decision tree algorithm. In the second stage, the precision of the monitoring system was tested for paddy rice at three different moisture contents (10.4%, 19.6%, and 30.4%) and five different grain/impurity ratios (1/0.5, 1/1, 1/1.5, 1/2, and 1/2.5). According to the results, the highest monitoring accuracy was 99.3% (moisture content 30.8% and grain/impurity ratio 1/2.5), the average accuracy of the monitoring tests was 92.6%, and monitoring at grain/impurity ratios between 1/1 and 1/1.5 (>95.4%) was more accurate than at the other grain/impurity ratios. Monitoring accuracy decreased as impurities increased. The lowest accuracy for grain loss monitoring was obtained at a grain/impurity ratio of 1/2.5, with monitoring accuracies of 88.2%, 75.7%, and 78.8% at moisture contents of 10.4%, 19.6%, and 30.4%.
Keywords: monitoring system; combine harvester; paddy rice; grain loss; sensor; data mining; decision tree; development
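Data Mining Model Based on an Improved SPRINT Classification Algorithm
15
Authors: 林敏, 王李杰 《信息技术》 2024, Issue 3, pp. 170-174, 187 (6 pages)
To address the long classification time and low mining accuracy of current data mining models, a data mining model based on an improved decision tree classification algorithm (SPRINT) is proposed. First, the max-min normalization formula is used to apply a linear transformation to the raw data. The improved SPRINT classification algorithm then classifies the input data according to its characteristics. Collaborative filtering is used to generate attribute sets similar to the data, the similarity of data attributes is computed, and a semantic rule set is generated to provide users with better data services. A company's marketing data set was used as the sample for comparative experiments. The results show that, compared with the baseline models, the proposed data mining model has a shorter classification time and higher mining accuracy, and can provide users with higher-quality data services.
Keywords: decision tree classification algorithm; collaborative filtering; semantic rule set; data mining model; neural network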
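Note on entry 15: the abstract's first preprocessing step is a max-min (min-max) normalization, which linearly maps each value x to (x - min) / (max - min), optionally rescaled to a target range. A minimal sketch follows; the function name, the default target range of [0, 1], and the handling of a constant column are assumptions for illustration, not details from the paper.

```python
def min_max_normalize(values, new_min=0.0, new_max=1.0):
    """Linearly rescale `values` so the smallest maps to new_min and the largest to new_max."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant column carries no spread; map every value to new_min.
        return [new_min for _ in values]
    span = new_max - new_min
    return [new_min + (v - lo) / (hi - lo) * span for v in values]
```

Applied to the values 10, 20, 30, the function maps the minimum to 0, the midpoint to 0.5, and the maximum to 1, preserving the relative spacing of the inputs.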
An Experimental Analysis of the Applications of Datamining Methods on Bigdata
16
Authors: CH. Naga Santhosh Kumar, K. S. Reddy 《Journal of Autonomous Intelligence》 2019, Issue 3, pp. 31-39 (9 pages)
Data mining is a procedure of separating covered up, obscure, however possibly valuable data from gigantic data. Big Data significantly affects logical disclosures and worth creation. Data mining (DM) with Big Data has been broadly utilized in the lifecycle of electronic items that range from the structure and generation stages to the administration organize. A far reaching examination of DM with Big Data and a survey of its application in the phases of its lifecycle will not only help researchers develop solid research. As of late big data have turned into a trendy expression, which constrained the analysts to extend the current data mining methods to adapt to the advanced idea of data and to grow new scientific procedures. In this paper, we build up an exact assessment technique dependent on the standard of Design of Experiment. We apply this technique to assess data mining instruments and AI calculations towards structure big data examination for media transmission checking data. Two contextual investigations are directed to give bits of knowledge of relations between the necessities of data examination and the decision of an instrument or calculation with regards to data investigation work processes.
Keywords: data mining; Big data; Knowledge Discovery databases; decision tree; Cloud data mining; K-Closest Neighbor; Artificial Intelligence; Cluster
Data mining for classification of power quality problems using WEKA and the effect of attributes on classification accuracy (cited by: 17)
17
Authors: S. Asha Kiranmai, A. Jaya Laxmi 《Protection and Control of Modern Power Systems》 2018, Issue 1, pp. 312-323 (12 pages)
There is growing interest in power quality issues due to wider developments in power delivery engineering. In order to maintain good power quality, it is necessary to detect and monitor power quality problems. Power quality monitoring requires storing large amounts of data for analysis. This rapid increase in the size of databases has demanded new techniques such as data mining to assist in the analysis and understanding of the data. This paper presents the classification of power quality problems such as voltage sag, swell, interruption, and unbalance using data mining algorithms: J48, Random Tree, and Random Forest decision trees. These algorithms are implemented on two sets of voltage data using WEKA software. The numeric attributes in the first data set include 3-phase RMS voltages at the point of common coupling. In the second data set, three more numeric attributes (minimum, maximum, and average voltages) are added along with the 3-phase RMS voltages. The performance of the algorithms is evaluated in both cases to determine the best classification algorithm, and the effect of adding the three attributes in the second case is studied, which shows the advantages in terms of classification accuracy and training time of the decision trees.
Keywords: Power quality problems; Classification; data mining; decision trees; J48; Random tree; Random forest; WEKA
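Online Mining Method for Enterprise Risk Information Data Based on the Association Analysis FP-Tree Algorithm
18
Authors: 庞泰, 翁巍, 孟灿, 赵蕾, 牛红伟 《无线互联科技》 2024, Issue 11, pp. 75-77 (3 pages)
Current data mining methods lack a data association analysis step and mine poorly, so this paper proposes an online mining method for enterprise risk information data based on the association analysis Frequent Pattern Tree (FP-Tree) algorithm. Information indicators related to enterprise risk are selected, and the relevant data are collected and preprocessed. An FP-Tree algorithm that takes association analysis into account is then designed: conditional pattern trees of the FP-Tree nodes are generated to mine frequent itemsets, and the frequent itemsets that satisfy the minimum confidence are computed, achieving online mining of enterprise risk information data. Experimental results show that the method achieves a high mining volume and high mining efficiency.
Keywords: association analysis FP-Tree algorithm; enterprise risk information data; online mining method; data mining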
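Note on entry 18: the abstract relies on two standard association-analysis quantities, the support of an itemset and the confidence of a rule. The sketch below computes both by brute-force counting over item pairs. It is deliberately not an FP-Tree implementation (no tree or conditional pattern bases are built); it only illustrates the thresholds that such an algorithm applies. Function names and the risk-indicator item names in the usage note are hypothetical.

```python
from itertools import combinations


def support(itemset, transactions):
    """Fraction of transactions (sets of items) that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def pair_rules(transactions, min_support, min_confidence):
    """Return rules (lhs, rhs, support, confidence) over item pairs that pass both thresholds."""
    transactions = [frozenset(t) for t in transactions]
    items = sorted(set().union(*transactions))
    rules = []
    for a, b in combinations(items, 2):
        pair_support = support({a, b}, transactions)
        if pair_support < min_support:
            continue  # not a frequent itemset
        for lhs, rhs in ((a, b), (b, a)):
            # confidence(lhs -> rhs) = support(lhs and rhs) / support(lhs)
            confidence = pair_support / support({lhs}, transactions)
            if confidence >= min_confidence:
                rules.append((lhs, rhs, pair_support, confidence))
    return rules
```

For four made-up transactions `{late_payment, high_debt}`, `{late_payment, high_debt}`, `{late_payment}`, and `{lawsuit}`, with a minimum support of 0.5 and a minimum confidence of 0.9, only the rule `high_debt -> late_payment` survives: the pair has support 0.5, and every transaction with `high_debt` also contains `late_payment`. An FP-Tree reaches the same frequent itemsets without enumerating candidate pairs, which is what makes it attractive for larger data.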
Heuristic solution using decision tree model for enhanced XML schema matching of bridge structural calculation documents
19
Authors: Sang I. PARK, Sang-Ho LEE 《Frontiers of Structural and Civil Engineering》 SCIE EI CSCD 2020, Issue 6, pp. 1403-1417 (15 pages)
Research on the quality of data in a structural calculation document (SCD) is lacking, although the SCD of a bridge is used as an essential reference during the entire lifecycle of the facility. XML Schema matching enables qualitative improvement of the stored data. This study aimed to enhance the applicability of XML Schema matching, which improves the speed and quality of information stored in bridge SCDs. First, the authors proposed a method of reducing the computing time for the schema matching of bridge SCDs. The computing speed of schema matching was increased by 13 to 1800 times by reducing the checking process of the correlations. Second, the authors developed a heuristic solution for selecting the optimal weight factors used in the matching process to maintain a high accuracy by introducing a decision tree. The decision tree model was built using the content elements stored in the SCD, design companies, bridge types, and weight factors as input variables, and the matching accuracy as the target variable. The inverse-calculation method was applied to extract the weight factors from the decision tree model for high-accuracy schema matching results.
Keywords: structural calculation document; bridge structure; XML Schema matching; weight factor; data mining; decision tree model
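Simulation of Dual Mining of Multi-Source Heterogeneous Data Based on Pruning
20
Authors: 刘诗瑾, 杨知玲 《计算机仿真》 2024, Issue 8, pp. 513-516, 534 (5 pages)
Multi-source heterogeneous data may come from data sources in different domains, with different formats and of different quality, which makes it difficult to process. To address the difficulty of mining multi-source heterogeneous data accurately, a multi-source heterogeneous data mining algorithm based on decision tree classification is proposed. A decision tree is built to partition the data attributes, and the initial decision tree is pruned to obtain the attribute set of the multi-source heterogeneous data; the data factors are then extracted to obtain a coarse mining result. A deep learning algorithm is used to further mine the multi-source heterogeneous data remaining in the rest of the data, and a second round of mining is performed on the original data set. The coarse and fine mining results are then combined to complete the mining of multi-source heterogeneous data. Experimental results show that the proposed algorithm achieves a high F1 score, low generalization error, and strong data mining performance.
Keywords: decision tree; data classification; multi-source heterogeneous data; data mining; deep learning algorithm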