Abstract
Attribute construction techniques, including the count operator, often improve the predictive accuracy of data mining models, but applying them unconditionally can cause inconsistent attribute relationships. To address this problem, the paper proposes three attribute construction rules, among them the attribute relationship consistency rule, and then presents the time-serial count operator, a new algorithm that avoids inconsistent attribute relationships in time-serial correlative models. When its assumption holds, the time-serial increment count operator significantly reduces the high computing cost of the time-serial count operator at little additional expense. Experimental results support these conclusions.
Source
Journal of Software (《软件学报》)
EI
CSCD
PKU Core (北大核心)
2008, No. 2, pp. 351-357 (7 pages)
Funding
Supported by the National High-Tech Research and Development Plan of China (863 Program) under Grant No. 2002AA113020
Keywords
attribute construction
count operator
attribute relationship consistency rule
preprocessing of time-serial data