Abstract
As Web 2.0 and blogging technologies mature, the number of blog pages is growing exponentially, and blog search engines that rely solely on keyword matching can no longer meet users' needs. Because traditional blog search engines cannot satisfy users' requirements for personalized retrieval, and inspired by research on Probabilistic Latent Semantic Analysis (PLSA), this paper applies the PLSA model to blog post queries. Retrieval is performed according to the user's interests and personal characteristics, and results relevant to the user's needs are returned. Experimental results show that, compared with the traditional Vector Space Model (VSM) and the Latent Semantic Analysis (LSA) model, blog post queries based on PLSA achieve significantly higher average precision and recall.
Source
Journal of Anyang Normal University
2013, No. 2, pp. 39-42 (4 pages)
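Funding
Anhui Provincial Outstanding Young Talent Fund for Colleges and Universities (2010SQRL192, 2011SQRL157)
General Natural Science Research Project of the Anhui Provincial Department of Education (KJ2013B283)
Suzhou University 2012 National Undergraduate Innovation and Entrepreneurship Training Program (201210379004, 201210379003)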
Keywords
Probabilistic Latent Semantic Analysis
Blog
Query Expansion
Vector Space Model
Latent Semantic Analysis