
基于多种规则的课程元数据自动抽取 (Automatic Extraction of Course Metadata Based on Multiple Rules). Cited by: 7

A Rule-based Metadata Extractor for Learning Materials
Abstract: An online course organization and management system is an integrated platform of educational resources intended to make learning more convenient. Its metadata extraction module, a key component of the system, must extract accurately from semi-structured web pages and must also cope with loosely structured documents. This paper designs and implements a method that extracts metadata automatically according to user-specified rules. The method matches web page metadata against several rules ordered by priority, then refines the results in a second extraction step. Different problem domains are handled by supplying different rules, with no changes to the program itself. Experiments show that the method handles semi-structured web pages well, with an F-measure above 85%, which indicates that it is practical to use.
Source: 《计算机科学》 (Computer Science), CSCD, Peking University Core Journals, 2008, No. 3, pp. 94-96 (3 pages)
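Funding: Supported by the National Natural Science Foundation of China project "Comprehensive Experimental Platform for Network Computing Environments" (No. 90412010); the HP university collaboration fund project "Organization and Management of Online Courses"; the National Natural Science Foundation of China (No. 60573166); and the Guangdong Provincial Key Laboratory of Networks fund.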
Keywords: Metadata extraction, Regular expression, Information refinement
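
The abstract describes matching rules in priority order and then refining the matched text in a second step. The following is a minimal sketch of that idea, not the authors' implementation: the rule table `RULES`, the field names, the patterns, and the helper names `extract_raw`, `refine`, `extract`, and `f_measure` are all assumptions made for illustration. The F-measure helper assumes the balanced form (F1), which the record does not confirm.

```python
import re
from typing import Dict, Iterable, Optional

# Hypothetical rule table: (priority, field, pattern).
# A lower priority number means the rule is tried earlier.
RULES = [
    (1, "title", re.compile(r"<title>(.*?)</title>", re.I | re.S)),
    (2, "title", re.compile(r"<h1[^>]*>(.*?)</h1>", re.I | re.S)),
    (1, "instructor", re.compile(r"Instructor:\s*([^<\n]+)", re.I)),
    (2, "instructor", re.compile(r"Taught by\s+([^<\n]+)", re.I)),
]


def extract_raw(page: str, field: str) -> Optional[str]:
    """Step 1: return the capture of the highest-priority rule that matches."""
    candidates = sorted((r for r in RULES if r[1] == field), key=lambda r: r[0])
    for _, _, pattern in candidates:
        match = pattern.search(page)
        if match:
            return match.group(1)
    return None


def refine(value: str) -> str:
    """Step 2: strip leftover tags and collapse runs of whitespace."""
    without_tags = re.sub(r"<[^>]+>", "", value)
    return re.sub(r"\s+", " ", without_tags).strip()


def extract(page: str, fields: Iterable[str]) -> Dict[str, str]:
    """Run both steps for each requested field; skip fields with no match."""
    result = {}
    for field in fields:
        raw = extract_raw(page, field)
        if raw is not None:
            result[field] = refine(raw)
    return result


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (F1); assumed here, not stated in the record."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

Switching to another domain in this sketch means replacing the entries in `RULES`; the functions stay unchanged, which mirrors the abstract's claim that new domains need only new rules.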

