Funding: Supported in part by the China Scholarship Council Program under grant No. 201906380135.
Abstract: Rapid advances in machine learning, combined with the wide availability of online social media, have created considerable research activity in predicting what might be the news of tomorrow based on an analysis of the past. In this work, we present a deep learning forecasting framework that is capable of predicting tomorrow’s news topics on Twitter and news feeds based on yesterday’s content and topic-interaction features. The proposed framework starts by generating topics from words using word embeddings and K-means clustering. Temporal topic-networks are then constructed, in which two topics are linked if the same user has worked on both topics. Structural and dynamic metrics calculated from the networks, along with content features and past activity, are used as input to a long short-term memory (LSTM) model, which predicts the number of mentions of a specific topic on the subsequent day. Utilizing dependencies among topics, our experiments on two Twitter datasets and the HuffPost news dataset demonstrate that selecting a topic’s historical local neighbors in the topic-network as extra features greatly improves the prediction accuracy and outperforms existing baselines.
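The topic-network step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `(user, topic)` input format, the function names `build_topic_network` and `neighbors`, and the choice to weight each edge by the number of shared users are all assumptions made for illustration. The abstract only states that two topics are linked when the same user has worked on both.

```python
from collections import defaultdict
from itertools import combinations


def build_topic_network(posts):
    """Build a topic-network for one time window.

    posts: iterable of (user, topic) pairs (hypothetical input format).
    Returns a dict mapping a sorted (topic_a, topic_b) tuple to the number
    of users active on both topics. The weighting is an assumption; the
    abstract only says such topics are linked.
    """
    topics_by_user = defaultdict(set)
    for user, topic in posts:
        topics_by_user[user].add(topic)

    edge_weights = defaultdict(int)
    for topics in topics_by_user.values():
        # Every pair of topics touched by the same user gets an edge.
        for a, b in combinations(sorted(topics), 2):
            edge_weights[(a, b)] += 1
    return dict(edge_weights)


def neighbors(edge_weights, topic):
    """Return the set of topics directly linked to `topic`."""
    linked = set()
    for a, b in edge_weights:
        if a == topic:
            linked.add(b)
        elif b == topic:
            linked.add(a)
    return linked


# Toy data for a single day (invented for illustration only).
day_posts = [
    ("alice", "sports"), ("alice", "politics"),
    ("bob", "politics"), ("bob", "tech"),
    ("carol", "sports"), ("carol", "politics"),
    ("dave", "tech"),
]
network = build_topic_network(day_posts)
politics_neighbors = neighbors(network, "politics")
```

Building one such network per day gives the temporal sequence of topic-networks. The local neighbors of a topic, here `politics_neighbors`, are the topics whose past activity the abstract reports as useful extra features for the LSTM.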