Abstract
In recent years, a number of meta-studies have been carried out in the field of educational technology. What these studies share is that researchers manually build a classification scheme and then assign the sampled documents to its categories, which is in essence a text categorization task. Mature techniques for automatic text categorization already exist in natural language processing, yet these meta-studies have not adopted them, partly because the field of educational technology lacks a term lexicon. Taking distance education as an example, this study uses five years (2002 to 2006) of bibliographic records from the journal Open Education Research as its sample. After summarizing the formation rules of some terms in educational technology, we develop a term extraction algorithm that combines rule-based and statistical methods. Testing shows that the algorithm identifies terms with a precision of 66.7% and a recall of 76.7%, comparable to the results of some existing term extraction algorithms. The algorithm can help researchers extract terms and makes it possible to detect new terms in educational technology in a timely manner.
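For readers unfamiliar with the two reported measures, the following is a minimal sketch of how precision and recall are conventionally computed for a term extractor. The term sets below are hypothetical and chosen only for illustration; they are not the paper's data, and the function name `precision_recall` is an assumption, not something taken from the study.

```python
def precision_recall(extracted, gold):
    """Return (precision, recall) for extracted terms against a gold list."""
    extracted, gold = set(extracted), set(gold)
    correct = extracted & gold  # terms the extractor got right
    precision = len(correct) / len(extracted) if extracted else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical example data (NOT from the paper).
extracted_terms = {"distance education", "open university",
                   "learning support", "in recent years"}
gold_terms = {"distance education", "open university",
              "learning support", "online course", "tutor"}

p, r = precision_recall(extracted_terms, gold_terms)
print(f"precision={p:.3f}, recall={r:.3f}")  # precision=0.750, recall=0.600
```

Precision is the share of extracted candidates that are genuine terms, and recall is the share of genuine terms that the extractor found, so the reported 66.7% and 76.7% describe those two ratios on the study's own test sample.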
Source
Modern Educational Technology (《现代教育技术》)
CSSCI
2008, No. 5, pp. 60-65 (6 pages)
Keywords
Automatic term extraction
Distance education
Natural language processing (NLP)
Knowledge engineering