
AIS: An approach to Web information processing based on Web text mining

Cited by: 3
Abstract: Web text mining (WTM) is an information-support technology and one component of the machine system of HWMSE. Because current search engines fall short for retrieval on the WWW, improvements are needed. This paper first briefly reviews recent WTM developments in Chinese and English language environments, identifying the field's current characteristics and technical bottlenecks. It then proposes AIS (augmented information support), a Web content mining technology built on WTM to cope with "information explosion", and introduces the underlying technologies and functions involved in its implementation. Finally, AIS is applied to the Xiangshan Science Conference (XSSC) website through the development of AIS4XSSC (AIS for Xiangshan Science Conference), a text mining system customized for information retrieval and knowledge discovery from the XSSC website, and the system's main current functions are presented. The application shows that AIS can effectively extract information from large volumes of Web text, improve users' retrieval efficiency, and push valuable information to users.
Source: Systems Engineering-Theory & Practice (《系统工程理论与实践》), 2010, No. 1, pp. 96-104 (9 pages). Indexed in EI, CSSCI, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (70571078)
Keywords: Web text mining; knowledge discovery; AIS; HWMSE; Xiangshan Science Conference











