根据专业搜索引擎的特点,提出了一种新颖的基于词语共现与HITS算法的查询推荐算法QR-CH(Query Recom-mendation algorithm based on word Co-occurrence and HITS algorithm)。该算法一方面利用HITS算法对基于词语共现筛选出的关联词按...根据专业搜索引擎的特点,提出了一种新颖的基于词语共现与HITS算法的查询推荐算法QR-CH(Query Recom-mendation algorithm based on word Co-occurrence and HITS algorithm)。该算法一方面利用HITS算法对基于词语共现筛选出的关联词按语义关联性进行排序,选取排序靠前的关联词作为推荐词,提高了推荐词与原查询词的相关性;另一方面使用HITS算法排序关联文档,从查询结果文档集的角度来判断推荐是否冗余,降低了推荐词的冗余性。该算法将推荐相关的信息存储到知识树中,利用知识树实现查询推荐。实验结果表明QR-CH算法在推荐词的相关性和冗余词的判断方面均优于文献中已有的类似算法。展开更多
Microblogging, a popular social media service platform, has become a new information channel for users to receive and exchange the most up-to-date information on current events. Consequently, it is a crucial platform ...Microblogging, a popular social media service platform, has become a new information channel for users to receive and exchange the most up-to-date information on current events. Consequently, it is a crucial platform for detecting newly emerging events and for identifying influential spreaders who have the potential to actively disseminate knowledge about events through microblogs. However, traditional event detection models require human intervention to detect the number of topics to be explored, which significantly reduces the efficiency and accuracy of event detection. In addition, most existing methods focus only on event detection and are unable to identify either influential spreaders or key event-related posts, thus making it challenging to track momentous events in a timely manner. To address these problems, we propose a Hypertext-Induced Topic Search(HITS) based Topic-Decision method(TD-HITS), and a Latent Dirichlet Allocation(LDA) based Three-Step model(TS-LDA). TDHITS can automatically detect the number of topics as well as identify associated key posts in a large number of posts. TS-LDA can identify influential spreaders of hot event topics based on both post and user information.The experimental results, using a Twitter dataset, demonstrate the effectiveness of our proposed methods for both detecting events and identifying influential spreaders.展开更多
文摘根据专业搜索引擎的特点,提出了一种新颖的基于词语共现与HITS算法的查询推荐算法QR-CH(Query Recom-mendation algorithm based on word Co-occurrence and HITS algorithm)。该算法一方面利用HITS算法对基于词语共现筛选出的关联词按语义关联性进行排序,选取排序靠前的关联词作为推荐词,提高了推荐词与原查询词的相关性;另一方面使用HITS算法排序关联文档,从查询结果文档集的角度来判断推荐是否冗余,降低了推荐词的冗余性。该算法将推荐相关的信息存储到知识树中,利用知识树实现查询推荐。实验结果表明QR-CH算法在推荐词的相关性和冗余词的判断方面均优于文献中已有的类似算法。
基金supported by the National Natural Science Foundation of China(Nos.61502209 and 61502207)the Natural Science Foundation of Jiangsu Province of China(No.BK20130528)Visiting Research Fellow Program of Tongji University(No.8105142504)
文摘Microblogging, a popular social media service platform, has become a new information channel for users to receive and exchange the most up-to-date information on current events. Consequently, it is a crucial platform for detecting newly emerging events and for identifying influential spreaders who have the potential to actively disseminate knowledge about events through microblogs. However, traditional event detection models require human intervention to detect the number of topics to be explored, which significantly reduces the efficiency and accuracy of event detection. In addition, most existing methods focus only on event detection and are unable to identify either influential spreaders or key event-related posts, thus making it challenging to track momentous events in a timely manner. To address these problems, we propose a Hypertext-Induced Topic Search(HITS) based Topic-Decision method(TD-HITS), and a Latent Dirichlet Allocation(LDA) based Three-Step model(TS-LDA). TDHITS can automatically detect the number of topics as well as identify associated key posts in a large number of posts. TS-LDA can identify influential spreaders of hot event topics based on both post and user information.The experimental results, using a Twitter dataset, demonstrate the effectiveness of our proposed methods for both detecting events and identifying influential spreaders.