Funding: Supported by the Ministry of Science and Technology under the "China-Singapore Joint Research Program (003/101/04)" (NSTB-MOST Joint Research Program). NSTB: National Science and Technology Bureau of Singapore; MOST: Ministry of Science and Technology of China.
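Abstract: This paper introduces a new instrument for studying liquid properties, the Fiber-Capacitive Drop Analyzer (FCDA). The instrument combines fiber-optic drop analysis and capacitive drop analysis in a special drop sensor. It records how the intensity of light passing through a drop changes as the drop grows, and from this obtains a "drop fingerprint" that reflects the overall properties of the liquid. Tests on a set of samples show that the drop fingerprint can serve as a basis for identifying liquids, and that it is also capable of measuring liquid physical...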
Funding: Supported by the Ministry of Science and Technology under the "China-Singapore Joint Research Program (003/101/04)" (NSTB-MOST Joint Research Program). NSTB: National Science and Technology Bureau of Singapore; MOST: Ministry of Science and Technology of China.
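Abstract: This paper describes in detail the system design of the Fiber-Capacitive Drop Analyzer (FCDA) and how each of its components is implemented, including the design of the drop sensor and of the micro-volume liquid supply system.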
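Abstract: This paper proposes an incremental decision tree classification algorithm based on statistical association rules, called SARMT (Statistic Association Rules Miner Tree), which builds on the Very Fast Decision Tree (VFDT) technique to mine medical data. Unlike VFDT, the improved SARMT algorithm does not depend on the number of samples available for splitting a node. In medical big data, large numbers of usable samples are often unavailable, so SARMT is better suited to medical settings. SARMT and VFDT were applied to three different medical data sets; the experimental results show that, at comparable execution time, SARMT achieves higher accuracy on medical data.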
Abstract: The Very Fast Decision Tree (VFDT) algorithm is a classification algorithm for data streams. When processing large amounts of data, VFDT requires less time than traditional decision tree algorithms. However, when training samples are scarce, the labels assigned to VFDT leaf nodes contain more errors, and the classification ability of a single VFDT decision tree is limited. The Random Forest algorithm is an ensemble classifier with high prediction accuracy and good noise tolerance. It consists of multiple decision trees and can compensate for the shortcomings of a single decision tree. In this paper, in order to improve classification accuracy on data streams, the Random Forest algorithm is integrated into the tree-building process of the VFDT algorithm, and a new Random Forest Based Very Fast Decision Tree algorithm, named RFVFDT, is designed. The RFVFDT algorithm adopts the tree-building criterion of a Random Forest classifier and extends the Random Forest algorithm with a sliding window to cope with the unboundedness of data streams and to avoid processing delay and data loss. Experimental results on the KDD CUP data sets show that the classification accuracy of RFVFDT is higher than that of VFDT. The fewer the samples, the more pronounced the advantage. RFVFDT is fast when run in multithreaded mode.
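The link between sample count and VFDT's reliability can be illustrated with the Hoeffding bound, which VFDT uses to decide when a leaf has seen enough examples to split. The following is a minimal sketch, not code from either paper: the function names `hoeffding_bound` and `can_split`, the confidence parameter `delta = 1e-7`, and the gain values are illustrative assumptions, and the range `R = 1.0` assumes information gain on a two-class problem.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """Return epsilon = sqrt(R^2 * ln(1/delta) / (2n)).

    With probability 1 - delta, the true mean of a variable with range R
    lies within epsilon of the mean observed over n independent samples.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be in (0, 1)")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def can_split(best_gain: float, second_gain: float,
              value_range: float, delta: float, n: int) -> bool:
    """Hypothetical helper: split only if the best attribute beats the
    runner-up by more than the Hoeffding bound for n samples."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


if __name__ == "__main__":
    # Illustrative values only: R = 1.0 (two-class information gain),
    # delta = 1e-7, and an observed gain difference of 0.20.
    for n in (50, 500, 5000):
        eps = hoeffding_bound(1.0, 1e-7, n)
        print(n, round(eps, 4), can_split(0.30, 0.10, 1.0, 1e-7, n))
```

Because epsilon shrinks only as the square root of n, a leaf that has seen few samples cannot yet justify a split, which is consistent with the abstracts' observation that plain VFDT is at a disadvantage when samples are scarce.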