Abstract
To improve the accuracy of Web session clustering, a session similarity measure called IPB (interest-point based) is proposed that exploits the similarity of page paths. The method uses the classification information implied by the site hierarchy encoded in page paths and defines the pages in the same directory as one interest point. To compare two sessions, it first extracts the interest points of each session, computes the similarity between interest points from the similarity of their page paths, and then derives the similarity between the sessions from their interest points. Experimental results show that the proposed method measures the similarity of Web sessions more accurately than previous methods.
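The pipeline described in the abstract (pages to interest points, path similarity between interest points, then session similarity) can be illustrated with a minimal Python sketch. The abstract gives no formulas, so everything specific here is an assumption made for illustration, not the paper's definition: the function names, the shared-prefix ratio used as path similarity, and the symmetric best-match average used as session similarity are all hypothetical choices.

```python
from posixpath import dirname
from urllib.parse import urlparse


def interest_point(url):
    """Map a page URL to its directory, e.g. '/a/b/c.html' -> '/a/b'."""
    return dirname(urlparse(url).path) or "/"


def interest_points(session):
    """Collect the distinct interest points (directories) of a session."""
    return {interest_point(url) for url in session}


def point_similarity(p, q):
    """Assumed measure: shared leading path components / deeper path depth."""
    a = [part for part in p.split("/") if part]
    b = [part for part in q.split("/") if part]
    if not a and not b:
        return 1.0  # both are the site root
    common = 0
    for x, y in zip(a, b):
        if x != y:
            break
        common += 1
    return common / max(len(a), len(b))


def session_similarity(s1, s2):
    """Assumed measure: symmetric average of best-match point similarities."""
    p1, p2 = interest_points(s1), interest_points(s2)
    if not p1 or not p2:
        return 0.0

    def directed(src, dst):
        # For each point in src, take its closest point in dst, then average.
        return sum(max(point_similarity(p, q) for q in dst) for p in src) / len(src)

    return (directed(p1, p2) + directed(p2, p1)) / 2


sports = ["/news/sports/a.html", "/news/sports/b.html"]
tech = ["/news/tech/x.html"]
print(session_similarity(sports, tech))
```

Under these assumptions, two sessions that visit different pages in the same directory are treated as fully similar, while sessions in sibling directories receive partial credit for the shared parent path.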
Source
Transactions of Beijing Institute of Technology (《北京理工大学学报》)
EI
CAS
CSCD
PKU Core (北大核心)
2006, No. 4, pp. 330-333 (4 pages)
Funding
Supported by the Basic Research Foundation of Beijing Institute of Technology (0301F18)
Keywords
Web usage mining
session clustering
interest point
session similarity