Abstract
Advances in database technology and the widespread use of database management systems have led to rapid growth in both data volume and analysis requirements, and a steady stream of data mining systems and business intelligence software has been developed in response. This paper reviews the development history of data mining systems, analyzes the characteristics of typical data mining systems developed in China and abroad, and designs a multi-strategy data mining system. To address the processing of large-scale data, the design introduces an algorithm plug-in approach, buffer processing techniques, configuration files based on XML (Extensible Markup Language), and corresponding parallel processing techniques. Finally, the paper discusses algorithm updating and evaluation, issues that need attention in the system's future development.
Source
《吉林大学学报(信息科学版)》
CAS
2006, No. 6, pp. 610-617 (8 pages)
Journal of Jilin University (Information Science Edition)
Funding
Supported by the National Natural Science Foundation of China (60275026)