Abstract
The Web is a highly dynamic information source. Accessing and analyzing its information requires studying how to integrate heterogeneous data and choosing suitable techniques for data analysis, integration, and processing. How to make deeper use of the massive volume of data on the Web has become a research focus in data mining. This article introduces the application of XML (Extensible Markup Language) in Web data mining and discusses the data heterogeneity problem that arises there. A data extraction model built with XML technology is proposed to solve most of the Web data mining problems caused by heterogeneous and unstructured data on the Internet.
Source
《中国科技资源导刊》
2012, No. 4, pp. 85-90 (6 pages)
China Science & Technology Resources Review
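Funding
National International Science and Technology Cooperation Program project "Research on Key Technologies for Knowledge Mining and Visualization of Heterogeneous Information" (2010DFA14390).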