With the number of social media users ramping up, microblogs are generated and shared at record levels. The high momentum and large volume of short texts bring redundancy and noise, making it difficult for users and analysts to elicit useful information of interest. In this paper, we study query-focused summarization as a solution to this issue and propose a novel summarization framework that generates personalized online summaries and historical summaries of arbitrary time durations. Our framework can deal with dynamic, perpetual, and large-scale microblogging streams. Specifically, we propose an online microblogging stream clustering algorithm to cluster microblogs and maintain distilled statistics called Microblog Cluster Vectors (MCV). We then develop a ranking method to extract the most representative sentences relative to the query from the MCVs and generate a query-focused summary of arbitrary time durations. Our experiments on large-scale real microblogs demonstrate the efficiency and effectiveness of our approach.
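As a rough illustration of the cluster-then-rank idea described in the abstract above, the following minimal Python sketch clusters a stream of posts online and then answers a query over a time window. Everything in it is an assumption made for illustration: the `ClusterSummary` fields, the cosine-similarity threshold, the whitespace tokenizer, and the `StreamClusterer` class are hypothetical and are not the paper's MCV definition or its ranking method. In particular, the time window here only filters clusters by overlap, whereas summaries of truly arbitrary historical durations would likely need stored snapshots of the statistics.

```python
import math
from collections import Counter, deque
from dataclasses import dataclass, field


def vectorize(text):
    """Term-count vector from lowercase whitespace tokens (stand-in preprocessing)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


@dataclass
class ClusterSummary:
    """Hypothetical per-cluster statistics; NOT the paper's MCV definition."""
    term_sum: Counter = field(default_factory=Counter)  # summed term counts
    count: int = 0                                      # posts absorbed so far
    first_ts: float = 0.0                               # earliest timestamp
    last_ts: float = 0.0                                # latest timestamp
    samples: deque = field(default_factory=lambda: deque(maxlen=5))  # recent posts

    def add(self, text, vec, ts):
        if self.count == 0:
            self.first_ts = ts
        self.term_sum.update(vec)
        self.count += 1
        self.last_ts = ts
        self.samples.append(text)


class StreamClusterer:
    def __init__(self, threshold=0.3):
        self.threshold = threshold  # assumed similarity cutoff for joining a cluster
        self.clusters = []

    def process(self, ts, text):
        """Assign one post to its nearest cluster, or open a new cluster."""
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for cluster in self.clusters:
            sim = cosine(vec, cluster.term_sum)
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is None or best_sim < self.threshold:
            best = ClusterSummary()
            self.clusters.append(best)
        best.add(text, vec, ts)

    def summarize(self, query, start, end, k=2):
        """Return one stored post from each of the top-k clusters that
        overlap [start, end], with clusters ranked by similarity to the query."""
        qvec = vectorize(query)
        scored = []
        for cluster in self.clusters:
            if cluster.last_ts < start or cluster.first_ts > end:
                continue  # cluster lies entirely outside the time window
            score = cosine(qvec, cluster.term_sum)
            if score > 0:
                scored.append((score, cluster))
        scored.sort(key=lambda pair: pair[0], reverse=True)
        # From each selected cluster, pick the stored post closest to the query.
        return [
            max(c.samples, key=lambda s: cosine(vectorize(s), qvec))
            for _, c in scored[:k]
        ]


# Toy stream of (timestamp, text) pairs; the data is invented for illustration.
clusterer = StreamClusterer()
posts = [
    (1.0, "earthquake hits city center"),
    (2.0, "earthquake damage in city"),
    (3.0, "new phone released today"),
    (4.0, "phone review released"),
]
for ts, text in posts:
    clusterer.process(ts, text)

summary = clusterer.summarize("earthquake damage", start=0.0, end=5.0, k=1)
print(summary)
```

The sketch keeps only summed term counts, timestamps, and a small bounded buffer of recent posts per cluster, so memory grows with the number of clusters rather than with the number of posts. That is the general motivation for maintaining distilled statistics over a perpetual stream.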
The microblog is a new type of Internet product that has developed rapidly in recent years. Researchers from different countries are conducting various technical analyses of microblogging applications. In this study, using natural language processing (NLP) and data mining, we analyzed the information content transmitted via a microblog, users' social networks and their interactions, and carried out an empirical analysis of the dissemination process of one particular piece of information via Sina Weibo. Based on the results of these analyses, we attempt to develop a better understanding of the rules and mechanisms of informal information flow in microblogging.
Funding: This work was supported by the Chongqing Research Program of Basic Research and Frontier Technology (cstc2017jcyjAX0071), the Basic and Advanced Research Projects of CSTC (cstc2019jcyjzdxm0102), the Chongqing Science and Technology Innovation Leading Talent Support Program (CSTCCXLJRC201908), and the Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-K201900605).