Abstract
This paper analyzes the query log of a large-scale Chinese search engine from the perspective of the differences in search habits between Chinese and English users, applying Chinese word segmentation in the experiments. It focuses on regularities in the queries that users submit, including the language chosen, the length and frequency of query terms, the use of advanced search techniques, and the modification of queries. It also proposes a model of how users submit and modify queries, and gives an algorithm for computing the impact factor of each historical query within a session on the search results.
Source
《计算机应用研究》
CSCD
PKU Core Journal (北大核心)
2008, No. 6, pp. 1663-1665 (3 pages)
Application Research of Computers
Funding
Key project funded by the National "973" Program (2003CB314806)
Keywords
search engine
data mining
query log
word segmentation