Abstract
For datasets with continuous attributes, adding new samples may change the optimal attribute reduct. To address this problem, an incremental method for updating the feature subset is proposed, based on neighborhood rough set theory. The effect of each newly added sample on the positive region is analyzed, and the attribute reduct of the original dataset is then updated dynamically, case by case, to obtain the optimal attribute reduct of the enlarged dataset. This selective updating of the original reduct avoids repeated operations and reduces the complexity of the feature subset selection algorithm; only in the worst case must the whole dataset be reduced again. A worked example illustrates the method and shows that analyzing the new sample first, and then selectively reducing the new dataset, avoids repeated operations while still yielding the optimal attribute reduct of the new dataset.
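The abstract does not list the individual update cases, so the following is only a minimal sketch of the "analyze the new sample first" idea, not the authors' algorithm. It assumes Euclidean distance, a fixed neighborhood radius `delta`, and two hypothetical helpers, `positive_region` and `reduct_still_valid`. A sample is taken to lie in the positive region when every sample in its neighborhood shares its class label; the existing reduct is considered still sufficient if, after the new sample is appended, it yields the same positive region as the full attribute set.

```python
import numpy as np


def positive_region(X, y, attrs, delta):
    """Return indices of samples whose delta-neighborhood on `attrs` is class-pure.

    Assumption: Euclidean distance and a fixed radius `delta`.
    """
    sub = X[:, attrs]
    # Pairwise distances restricted to the chosen attributes.
    dist = np.linalg.norm(sub[:, None, :] - sub[None, :, :], axis=2)
    neighbors = dist <= delta
    same_class = y[:, None] == y[None, :]
    # A sample is consistent if every neighbor carries the same label.
    pure = np.all(~neighbors | same_class, axis=1)
    return set(np.flatnonzero(pure).tolist())


def reduct_still_valid(X, y, x_new, y_new, reduct, delta):
    """Check whether `reduct` preserves the positive region after adding one sample.

    Hypothetical helper: returns False when the reduct needs to be updated.
    """
    X2 = np.vstack([X, x_new])
    y2 = np.append(y, y_new)
    all_attrs = list(range(X2.shape[1]))
    return (positive_region(X2, y2, reduct, delta)
            == positive_region(X2, y2, all_attrs, delta))
```

When the check returns `True`, the old reduct can be kept without any further attribute search, which is where the saving comes from. When it returns `False`, the reduct has to be extended, and in the worst case recomputed from scratch. This sketch does not test whether the kept reduct has become redundant.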
Source
Computer Technology and Development (《计算机技术与发展》)
2011, No. 11, pp. 149-152, 155 (5 pages)
Funding
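Fundamental Research Funds for the Central Universities, Key Project (GK200901006)
Fundamental Research Funds for the Central Universities (GK201001003)
Natural Science Basic Research Plan of Shaanxi Province (2010JM3004)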
Keywords
neighborhood rough set
incremental updating
feature subset selection
positive region