Abstract
As item bank systems become widely used and the number of items they hold keeps growing, avoiding duplicate items has become an important research problem. Using the vector space model (VSM), this paper first computes a text weight vector for each item with the TF-IDF formula, then computes the similarity between items with the cosine measure, and compares that similarity against a preset threshold to obtain the similarity check result. Experiments on an existing item bank show that the item similarity computed by the algorithm reaches an accuracy of 94% when compared with manual judgment by experts, which the authors consider a good result.
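The pipeline described in the abstract (TF-IDF weight vectors, cosine similarity, threshold comparison) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the abstract does not give the exact TF-IDF variant, the tokenizer, or the threshold value, so the smoothed IDF formula, the whitespace tokenizer, the function names, and the threshold of 0.8 are all assumptions made here. Chinese item text would need a word segmenter in place of `str.split`.

```python
import math
from collections import Counter


def tfidf_vectors(token_lists):
    """Build one sparse TF-IDF vector (dict: term -> weight) per item.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, so that terms
    occurring in every item still get a positive weight.
    """
    n_items = len(token_lists)
    doc_freq = Counter()
    for tokens in token_lists:
        doc_freq.update(set(tokens))  # count each term once per item

    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        total = len(tokens)
        vec = {}
        for term, count in counts.items():
            tf = count / total
            idf = math.log((1 + n_items) / (1 + doc_freq[term])) + 1
            vec[term] = tf * idf
        vectors.append(vec)
    return vectors


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def find_similar_items(items, threshold=0.8):
    """Return (i, j, similarity) for every item pair at or above the threshold.

    Assumptions: whitespace tokenization and a hypothetical threshold of 0.8.
    """
    token_lists = [text.lower().split() for text in items]
    vectors = tfidf_vectors(token_lists)
    pairs = []
    for i in range(len(vectors)):
        for j in range(i + 1, len(vectors)):
            sim = cosine_similarity(vectors[i], vectors[j])
            if sim >= threshold:
                pairs.append((i, j, sim))
    return pairs


if __name__ == "__main__":
    bank = [
        "what is the capital of france",
        "what is the capital of france",
        "compute the derivative of x squared",
    ]
    for i, j, sim in find_similar_items(bank):
        print(f"items {i} and {j} look like duplicates (similarity {sim:.2f})")
```

The pairwise loop is quadratic in the number of items, which is acceptable for a sketch; a real item bank would likely compare each new item only against existing ones at insertion time.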
Source
《计算机系统应用》
2010, Issue 3, pp. 213-216 (4 pages)
Computer Systems & Applications