Research and Application of an Information Retrieval Model Based on Classification and an Extended Vector Space
Abstract: To improve the retrieval performance of information retrieval systems, and to address the inherent shortcomings of the widely used vector space model (VSM), the paper proposes CE-VSM (Classifier Expand-Vector Space Model), a new model based on classification and an extended vector space. The model improves on the traditional vector space method by introducing word segmentation, a naive Bayes classifier, and a professional lexicon. It redefines the contents of the resource feature vector and the query index terms, and computes the weights of the feature index terms and resource vectors from factors such as how often a keyword occurs and the role it plays in the resource it describes. On this basis, the query index terms are further expanded with a strategy based on the professional lexicon. Experiments show that the model confines retrieval to a relatively precise scope, raises both precision and recall, and improves the performance of the information retrieval system.
Source: Science Technology and Engineering (《科学技术与工程》), 2010, No. 33, pp. 8164-8167 (4 pages)
Keywords: CE-VSM; naive Bayes classifier; professional dictionary; synonym expansion
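The pipeline outlined in the abstract (expand the query from a lexicon, classify it with naive Bayes, then rank only the documents in the predicted class by vector similarity) can be illustrated with a minimal sketch. This is not the paper's implementation: the toy corpus `DOCS`, the lexicon `LEXICON`, and all function names are hypothetical. Plain TF-IDF with cosine similarity stands in for the paper's weighting scheme, which the abstract does not specify, and tokens are assumed to be already segmented.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy corpus: (category, tokens); tokens are assumed pre-segmented.
DOCS = [
    ("db", ["sql", "index", "query", "table"]),
    ("db", ["transaction", "sql", "lock"]),
    ("net", ["tcp", "packet", "router"]),
    ("net", ["router", "protocol", "packet", "latency"]),
]
# Hypothetical professional lexicon mapping a term to its synonyms.
LEXICON = {"database": ["sql"], "network": ["tcp", "protocol"]}


def expand(query_tokens, lexicon):
    """Append lexicon synonyms to the query, keeping order and dropping duplicates."""
    out = list(query_tokens)
    for t in query_tokens:
        out.extend(lexicon.get(t, []))
    return list(dict.fromkeys(out))


def train_nb(docs):
    """Collect class counts, per-class term counts, and the vocabulary."""
    class_counts = Counter(c for c, _ in docs)
    term_counts = defaultdict(Counter)
    for c, toks in docs:
        term_counts[c].update(toks)
    vocab = {t for _, toks in docs for t in toks}
    return class_counts, term_counts, vocab


def classify(tokens, model):
    """Multinomial naive Bayes with add-one smoothing; unknown terms are skipped."""
    class_counts, term_counts, vocab = model
    total = sum(class_counts.values())
    best, best_lp = None, -math.inf
    for c, n in class_counts.items():
        denom = sum(term_counts[c].values()) + len(vocab)
        lp = math.log(n / total)
        for t in tokens:
            if t in vocab:
                lp += math.log((term_counts[c][t] + 1) / denom)
        if lp > best_lp:
            best, best_lp = c, lp
    return best


def tfidf(tokens, idf):
    """TF-IDF vector as a dict; terms missing from idf get weight 0."""
    tf = Counter(tokens)
    return {t: tf[t] * idf.get(t, 0.0) for t in tf}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def search(query_tokens, docs=DOCS, lexicon=LEXICON):
    """Return the predicted category and (doc_index, score) pairs, best first."""
    q = expand(query_tokens, lexicon)
    category = classify(q, train_nb(docs))
    df = Counter(t for _, toks in docs for t in set(toks))
    idf = {t: math.log(len(docs) / d) + 1.0 for t, d in df.items()}
    qv = tfidf(q, idf)
    scored = [
        (cosine(qv, tfidf(toks, idf)), i)
        for i, (c, toks) in enumerate(docs)
        if c == category  # rank only within the predicted class
    ]
    return category, [(i, round(s, 3)) for s, i in sorted(scored, reverse=True)]
```

Calling `search(["database", "query"])` expands the query with the synonym `sql` before classification, so a query term absent from the corpus can still reach relevant documents. Restricting the ranking to one class is what narrows the search scope in this sketch; a misclassified query would miss relevant documents in other classes.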

