Abstract
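Traditional associative classification algorithms must scan the data set many times while constructing a classifier, which consumes substantial computing and storage resources. To address this problem, this paper proposes a method for constructing classification rules based on a knowledge evolution algorithm. The method first encodes the data in the data set. It then applies conjecture and refutation operators to extract conjectured knowledge and negative knowledge from the encoded data. Next, it computes the coverage and accuracy of the extracted conjectured knowledge and, as these statistics change, uses an extraction operator to convert knowledge between the conjectured and negative sets as appropriate. Once the coverage of the knowledge in the resulting knowledge set reaches a preset threshold, that knowledge is used to generate a classifier. Because the method reads the data set to be classified block by block, it greatly reduces the number of passes over the data set, markedly reduces the storage space the system requires, and improves the efficiency of classifier construction. Experimental results show that the method is feasible and effective: while preserving classification accuracy, it largely resolves the inefficiency and long running time of associative classifier construction.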
Abstract. To avoid the repeated exhaustive scans of the data required by classical associative classification approaches, a knowledge evolution algorithm based on evolutionary epistemology is proposed. First, the data in the data set are encoded. Second, hypothesis knowledge and inaccurate knowledge are obtained with conjecture and refutation operators. Third, the coverage and accuracy of the hypothesis and inaccurate knowledge are calculated. Then, an extraction operator is used to extract rules from the library of inaccurate knowledge and move them into the hypothesis library. Finally, the knowledge obtained in this way is used to build a classifier. With this approach, the data set can be read into memory in parts, and the total number of read operations is greatly reduced. The results show that the knowledge evolution algorithm speeds up computation while maintaining similar classification accuracy.
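The pipeline outlined in the abstract (encode, conjecture and refute, track coverage and accuracy, extract, stop at a coverage threshold) can be illustrated with a minimal sketch. The abstract does not define the operators, so everything here is an assumption made for illustration: rules are single attribute-value pairs predicting a class, a rule counts as refuted when its running accuracy falls below `min_accuracy`, extraction is modeled as re-sorting rules between the two libraries after each block, and coverage is measured on the block just read. The names `evolve_rules`, `chunks`, `min_accuracy`, and `min_coverage` are hypothetical and do not come from the paper.

```python
from collections import defaultdict
from itertools import islice


def chunks(records, size):
    """Yield successive blocks so the whole data set is never loaded at once."""
    it = iter(records)
    while block := list(islice(it, size)):
        yield block


def evolve_rules(records, chunk_size=2, min_accuracy=0.8, min_coverage=0.9):
    """Illustrative stand-in for the described method, not the paper's algorithm.

    Each record is a tuple (attr_0, ..., attr_k, label). A rule is encoded as
    (attribute index, value, label).
    """
    matched = defaultdict(int)   # (attr, value) -> times the antecedent was seen
    correct = defaultdict(int)   # (attr, value, label) -> times the rule held
    hypotheses, refuted = set(), set()
    seen = 0

    for block in chunks(records, chunk_size):
        # Conjecture: every observed (attribute value -> label) pair is a candidate.
        for *attrs, label in block:
            seen += 1
            for i, v in enumerate(attrs):
                matched[(i, v)] += 1
                correct[(i, v, label)] += 1

        # Refutation / extraction (assumed form): move rules between the two
        # libraries according to their updated running accuracy.
        for rule, hits in correct.items():
            i, v, _ = rule
            if hits / matched[(i, v)] >= min_accuracy:
                hypotheses.add(rule)
                refuted.discard(rule)
            else:
                refuted.add(rule)
                hypotheses.discard(rule)

        # Coverage (assumed form): share of this block matched by some hypothesis.
        antecedents = {(i, v) for i, v, _ in hypotheses}
        covered = sum(
            any((i, v) in antecedents for i, v in enumerate(attrs))
            for *attrs, _ in block
        )
        if covered / len(block) >= min_coverage:
            break  # threshold reached: stop reading further blocks

    return hypotheses, refuted, seen
```

Under these assumptions the function returns the hypothesis library, the refuted library, and the number of records actually read, so a caller can see how much of the data set was consumed before the coverage threshold was met.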
Source
《中文信息学报》
CSCD
PKU Core Journal (北大核心)
2015, No. 4, pp. 126-133 (8 pages)
Journal of Chinese Information Processing
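Funding
China Postdoctoral Science Foundation (2013M530534)
Scientific Research Program of the Hebei Provincial Department of Education (Z2014181)
National Natural Science Foundation of China (61272283, 60873035)
Project of the Hebei Provincial Department of Human Resources and Social Security (JRS-2014-1103)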
Keywords
knowledge evolution
conjecture
refutation
associative classification