Similarity Checking Algorithm in Item Bank Based on Vector Space Model
Original title: 基于向量空间模型的题库相似度检查算法 (cited 12 times)
Abstract As item bank systems become widely used and the number of items they hold keeps growing, avoiding duplicate items has become an important research problem. Using the vector space model (VSM), the algorithm first derives a text weight vector for each item with the TF-IDF formula, then computes the similarity between items with the cosine measure and compares it against a preset similarity threshold to obtain the similarity checking result. Experiments on an existing item bank show that the algorithm's similarity judgments reach 94% accuracy when compared with manual judgments by experts, which the authors consider a good result.
Authors Wang Zhongguo (汪忠国), Wu Min (吴敏)
Source Computer Systems & Applications (《计算机系统应用》), 2010, No. 3, pp. 213-216 (4 pages)
Keywords vector space model; similarity checking; term frequency (TF); inverse document frequency (IDF); cosine theory
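
The pipeline summarized in the abstract (TF-IDF weight vectors, cosine similarity, threshold comparison) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact TF-IDF variant, the word segmentation method, or the threshold value, so the smoothed IDF formula, the whitespace tokenization, the default threshold of 0.8, and the function names `tfidf_vectors`, `cosine`, and `find_similar_pairs` are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Build one sparse {term: weight} TF-IDF vector per tokenized item."""
    n = len(docs)
    # Document frequency: number of items that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        vectors.append({
            # Assumed weighting: relative term frequency times a smoothed IDF.
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1.0)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # an empty item is treated as similar to nothing
    return dot / (norm_u * norm_v)


def find_similar_pairs(items, threshold=0.8):
    """Return (i, j, similarity) for item pairs at or above the threshold."""
    # Assumption: whitespace tokenization. Chinese items would need a
    # word segmenter before this step.
    vectors = tfidf_vectors([item.lower().split() for item in items])
    pairs = []
    for i, j in combinations(range(len(items)), 2):
        sim = cosine(vectors[i], vectors[j])
        if sim >= threshold:
            pairs.append((i, j, sim))
    return pairs


if __name__ == "__main__":
    bank = [
        "what is the capital of france",
        "what is the capital of france",
        "solve the quadratic equation x squared",
    ]
    for i, j, sim in find_similar_pairs(bank):
        print(f"items {i} and {j} look like duplicates (similarity {sim:.3f})")
```

Comparing every pair of items is quadratic in the size of the bank, which is acceptable for a sketch; checking a single new item against an existing bank only needs one pass over the stored vectors.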