Abstract
Existing mail systems lack functions for analyzing and mining massive volumes of mail data, and the traditional approach of classifying mail one message at a time is inefficient. To address this problem, this paper studies the characteristics of text mining and proposes an efficient algorithm for mining the content of massive mail data, implemented directly as stored procedures in a large-capacity Relational Database Management System (RDBMS). The algorithm's performance is then optimized at several levels. Experimental results show that the algorithm is efficient, stable, and broadly applicable.
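The contrast the abstract draws, between classifying one message at a time and classifying inside the database, can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not specify. It assumes a simple keyword-count classifier, and it uses Python's built-in sqlite3 module with a single set-based SQL statement as a stand-in for an RDBMS stored procedure, since SQLite has no stored procedures. The table names, column names, and the function classify_in_database are all hypothetical.

```python
import sqlite3


def classify_in_database(mails, keywords):
    """Assign each mail the category whose keywords it matches most often.

    mails:    list of (mail_id, body) tuples
    keywords: list of (category, word) tuples
    Returns {mail_id: category}; mails matching no keyword are omitted.
    Hypothetical schema and scoring rule, for illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE mail (id INTEGER PRIMARY KEY, body TEXT)")
    conn.execute("CREATE TABLE keyword (category TEXT, word TEXT)")
    conn.executemany("INSERT INTO mail VALUES (?, ?)", mails)
    conn.executemany("INSERT INTO keyword VALUES (?, ?)", keywords)

    # One set-based query scores every (mail, category) pair in the
    # database, instead of looping over mails in application code.
    rows = conn.execute(
        """
        SELECT m.id, k.category, COUNT(*) AS hits
        FROM mail AS m
        JOIN keyword AS k
          ON instr(lower(m.body), lower(k.word)) > 0
        GROUP BY m.id, k.category
        ORDER BY m.id, hits DESC, k.category
        """
    ).fetchall()
    conn.close()

    result = {}
    for mail_id, category, _hits in rows:
        # Rows are sorted by hits descending, so the first row seen
        # for each mail carries its best-scoring category.
        result.setdefault(mail_id, category)
    return result


sample_mails = [
    (1, "Cheap loans, claim your prize now"),
    (2, "Meeting agenda for the project review"),
    (3, "Hello there"),
]
sample_keywords = [
    ("spam", "prize"),
    ("spam", "loans"),
    ("work", "meeting"),
    ("work", "project"),
]
labels = classify_in_database(sample_mails, sample_keywords)
```

In a production RDBMS the same join-and-aggregate logic would live in a stored procedure, so the mail bodies never leave the database server.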
Source
Computer Engineering (《计算机工程》)
CAS
CSCD
PKU Core Journals (北大核心)
2010, Issue 1, pp. 40-42 (3 pages)
Funding
National High Technology Research and Development Program of China ("863" Program), Grant No. 2007AA01Z146
Keywords
mail classifier
data mining
stored procedure