Abstract
To provide a normalized, compact preprocessed dataset for email network analysis, this paper proposes a set of rough-set-based definitions for mail systems. Under these definitions, the set of mail objects is classified by the time attribute (the time at which mail is sent and received), and an attribute value reduction method based on the support of mail attribute values is described. The method was successfully applied to preprocessing part of the mail data in the Enron email package. Experiments show that the processed mail analysis data are more normalized and that the size of the mail object set is greatly reduced.
Source
Computer Applications and Software (《计算机应用与软件》)
CSCD
2011, Issue 1, pp. 112-113, 132 (3 pages in total)
Funding
Liaoning Provincial Education Plan Project (2008093)
Keywords
Mail system
Rough sets
Time attribute
Attribute value reduction
Attribute value support