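Abstract: Community search returns a densely connected subgraph that contains a given query node and satisfies the query conditions. Most existing community search methods focus mainly on community structure; they do not consider the resource constraints of specific applications and ignore the attribute features of communities, so they cannot meet users' personalized requirements for community search. To address this problem, this paper proposes Size-Constrained Influential Community search (SCIC) and designs a basic algorithm based on depth-first search. Building on it, an optimized algorithm based on node preprocessing, pruning rules, and a greedy strategy is further proposed to reduce redundant computation and accelerate the enumeration process. Experiments on 10 datasets of different scales show that the basic algorithm outperforms existing algorithms in both the size and the influence of the communities found, and that the optimized algorithm significantly improves search efficiency, reducing the response time to 1% of that of the basic algorithm.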
Funding: Supported by the National Natural Science Foundation of China (Nos. 61502209 and 61502207), the Natural Science Foundation of Jiangsu Province of China (No. BK20130528), and the Visiting Research Fellow Program of Tongji University (No. 8105142504).
Abstract: Microblogging, a popular social media service platform, has become a new information channel for users to receive and exchange the most up-to-date information on current events. Consequently, it is a crucial platform for detecting newly emerging events and for identifying influential spreaders who have the potential to actively disseminate knowledge about events through microblogs. However, traditional event detection models require human intervention to determine the number of topics to be explored, which significantly reduces the efficiency and accuracy of event detection. In addition, most existing methods focus only on event detection and are unable to identify either influential spreaders or key event-related posts, making it challenging to track momentous events in a timely manner. To address these problems, we propose a Hypertext-Induced Topic Search (HITS) based Topic-Decision method (TD-HITS) and a Latent Dirichlet Allocation (LDA) based Three-Step model (TS-LDA). TD-HITS can automatically detect the number of topics as well as identify associated key posts in a large number of posts. TS-LDA can identify influential spreaders of hot event topics based on both post and user information. The experimental results, using a Twitter dataset, demonstrate the effectiveness of our proposed methods for both detecting events and identifying influential spreaders.
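For readers unfamiliar with HITS, the link-analysis algorithm that TD-HITS builds on, the following is a minimal sketch of the standard hub/authority power iteration. It is not the authors' TD-HITS method: the abstract does not describe how topics are decided, and the user-to-post edge list, the function name `hits`, and the iteration count are all assumptions made for illustration.

```python
import math


def hits(edges, iterations=50):
    """Plain HITS power iteration over a directed edge list of (src, dst) pairs.

    Returns two dicts (hub, authority), each L2-normalised.
    """
    nodes = sorted({n for edge in edges for n in edge})
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}

    for _ in range(iterations):
        # Authority update: sum the hub scores of nodes that link to you.
        new_auth = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_auth[dst] += hub[src]
        norm = math.sqrt(sum(v * v for v in new_auth.values())) or 1.0
        auth = {n: v / norm for n, v in new_auth.items()}

        # Hub update: sum the authority scores of nodes you link to.
        new_hub = {n: 0.0 for n in nodes}
        for src, dst in edges:
            new_hub[src] += auth[dst]
        norm = math.sqrt(sum(v * v for v in new_hub.values())) or 1.0
        hub = {n: v / norm for n, v in new_hub.items()}

    return hub, auth


# Hypothetical toy graph: an edge (user, post) means the user reposted the post.
toy_edges = [("u1", "p1"), ("u2", "p1"), ("u3", "p1"), ("u3", "p2")]
hub_scores, auth_scores = hits(toy_edges)
print(max(auth_scores, key=auth_scores.get))  # post with the highest authority
print(max(hub_scores, key=hub_scores.get))    # user with the highest hub score
```

In this toy graph, post `p1` receives the most links and so gets the highest authority score, while user `u3`, who links to both posts, gets the highest hub score. This mirrors, in a very simplified way, the idea of ranking key posts and active spreaders by link structure.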