Funding: Supported by the National Basic Research Program of China under Grant No. 2010CB731405 and the National Natural Science Foundation of China under Grant No. 71171187.
Abstract: Nearly 30 years of miraculous economic growth have produced a large gap between rich and poor, which creates strong pressure for social transformation in China today. Over the past 10 years in particular, China's top leaders have made raising people's incomes, improving quality of life, and constructing a "harmonious society" key missions. How to measure a harmonious society is an important topic, since different measures may lead to different development policies. This paper outlines more than 10 indices relevant to measuring a harmonious society. Some are global indicators, while others were contributed by domestic researchers and have aroused debate. Most of these indicators require micro-level surveys of social attitudes, which are time-consuming and subject to data quality problems. As advances in Internet technology make it convenient to record and disseminate fresh community ideas and thoughts, detecting topics or emotions in on-line public opinions is becoming a trend, or at least a supplementary way to overcome those data acquisition problems. This paper discusses one approach to on-line societal risk perception using hot search words and BBS posts. The trial aims to offer a route to societal risk perception different from those of traditional social psychology studies. Challenges are also indicated.
Funding: Supported by the National Basic Research Program of China under Grant No. 2010CB731405 and the National Natural Science Foundation of China under Grant Nos. 71171187 and 71371107.
Abstract: Many social events spread quickly through the Internet and arouse wide community discussion. The resulting on-line public opinions form diverse topics over time, and the strength of each topic fluctuates. Capturing both the primary topics and their trends across shifting on-line discussions is of theoretical importance for scientific research and of practical importance for societal management, especially in China today. It is worthwhile to apply cutting-edge text analytics to unstructured on-line public opinions in support of social problem-solving in the big data era. This paper applies the dynamic topic model (DTM) to explore the changing topics of new posts collected from the Tianya Zatan Board of Tianya Club, the most influential Chinese BBS in mainland China. By analyzing the trends of hot and cold terms, we capture shifts in the main on-line concerns, illustrated with the school bus and environment topics in December 2011. An algorithm is proposed to compute the strength fluctuation of each topic. Through visual analysis of the main topics in several months of 2012, some patterns of topic fluctuation on the board are summarized.
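The abstract does not describe the proposed strength algorithm, so the following is only a minimal sketch of one common way to track topic strength over time: averaging per-document topic proportions within each time slice. The function name `topic_strength`, the input layout, and the toy numbers are all assumptions for illustration, not the paper's method or data.

```python
from collections import defaultdict

def topic_strength(doc_topics):
    """Average topic proportions per time slice.

    doc_topics: iterable of (time_slice, [proportion of each topic]).
    Returns {time_slice: [mean proportion of each topic]}.
    """
    sums = {}
    counts = defaultdict(int)
    for period, proportions in doc_topics:
        acc = sums.setdefault(period, [0.0] * len(proportions))
        for k, p in enumerate(proportions):
            acc[k] += p
        counts[period] += 1
    return {period: [s / counts[period] for s in acc] for period, acc in sums.items()}

# Hypothetical document-topic proportions (e.g. as inferred by a topic model).
toy_docs = [
    ("2012-01", [0.8, 0.2]),
    ("2012-01", [0.6, 0.4]),
    ("2012-02", [0.3, 0.7]),
]
strength = topic_strength(toy_docs)
```

Plotting each topic's entry in `strength` across consecutive slices gives a simple fluctuation curve per topic.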
Abstract: Societal risk classification is the fundamental issue in online societal risk monitoring. To show both the challenge and the feasibility of societal risk classification of BBS posts, this paper presents an empirical analysis. Following an effectiveness analysis, a Support Vector Machine based on Bag-of-Words (BOW-SVM) is adopted to validate the challenge, and distributed document embeddings of BBS posts generated by Paragraph Vector are applied to the feasibility study. Based on BOW-SVM, cross-validations are conducted on BBS posts labeled by different groups and annotators. The large fluctuation in the cross-validation results indicates differences in individual risk perception, which make societal risk classification more challenging. Furthermore, based on the distributed document embeddings, the pairwise similarities of more than 300 thousand BBS posts from different societal risk categories are compared. Posts in the same societal risk category are more similar to one another than to posts in different categories, which indicates that they share more features. This demonstrates the feasibility of societal risk classification of BBS posts and suggests that the performance of societal risk monitoring can be improved.
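The pairwise-similarity comparison described above can be sketched as follows. This is an illustration under stated assumptions: cosine similarity is assumed as the similarity measure, the helper names are hypothetical, and the two-dimensional vectors are hand-made stand-ins for Paragraph Vector embeddings rather than real data.

```python
import math
from itertools import combinations

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def mean_similarities(embeddings, labels):
    """Return (mean within-category, mean between-category) pairwise similarity."""
    within, between = [], []
    for i, j in combinations(range(len(embeddings)), 2):
        sim = cosine(embeddings[i], embeddings[j])
        (within if labels[i] == labels[j] else between).append(sim)

    def average(values):
        return sum(values) / len(values) if values else float("nan")

    return average(within), average(between)

# Toy vectors standing in for document embeddings of labeled posts.
toy_embeddings = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
toy_labels = ["daily life", "daily life", "public morals", "public morals"]
within_mean, between_mean = mean_similarities(toy_embeddings, toy_labels)
```

A within-category mean clearly above the between-category mean is the pattern the abstract reports as evidence of feasibility.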
Funding: This research is supported by the National Key Research and Development Program of China (2016YFB1000902) and the National Natural Science Foundation of China (61473284 & 71731002).
Abstract: Modern China is experiencing a variety of social conflicts as a new era arrives with the transformation of the principal contradiction, so monitoring social stability is a heavy workload. Online societal risk perception is acquired by mapping on-line public concerns into societal risk events, including national security, economy & finance, public morals, daily life, social stability, government management, and resources & environment, and thus provides one kind of measurement of the state of society. Stable and harmonious social conditions are a basic guarantee for the healthy development of the stock market. We therefore ask whether variations in societal risk are related to stock market volatility. We study the relationships in two steps: first between search trends and societal risk perception, and then between societal risk perception and stock volatility. The weekend and holiday effects in the Chinese stock market are taken into consideration. Three different econometric methods are explored to observe the impact of variations in societal risk on the Shanghai Composite Index and the Shenzhen Composite Index. Three major findings are reported. First, there are causal relations between the Baidu Index and societal risk perception. Second, the perception of finance & economy, social stability, and government management has distinct effects on the volatility of both the Shanghai Composite Index and the Shenzhen Composite Index. Third, the weekend and holiday effects of societal risk perception on the stock market are verified. The research demonstrates that capturing societal risk from on-line public concerns is feasible and meaningful.
Abstract: Complex problem solving requires diverse expertise and multiple techniques. To solve such problems, many application domains require complex multi-agent systems that include both human experts and autonomous agents. Most complex multi-agent systems work in open domains and include various heterogeneous agents. Because of the heterogeneity of agents and the dynamic nature of their working environments, the expertise and capabilities of agents may not be well estimated or represented in these systems. Therefore, how to discover useful knowledge from human and autonomous experts, estimate experts' capabilities more accurately, and find suitable expert(s) to solve incoming problems ("Expert Mining") are important research issues in the area of multi-agent systems. In this paper, we introduce an ontology-based approach to knowledge and expert mining in hybrid multi-agent systems. Ontologies are used to describe the knowledge of the system. Knowledge and expert mining processes are executed as the system handles incoming problems. The approach embeds more self-learning and self-adjusting abilities in multi-agent systems to help discover the knowledge of their heterogeneous experts.
Funding: This work is supported by the Ministry of Education, Culture, Sports, Science and Technology of Japan under the "Kanazawa Region, Ishikawa High-Tech Sensing Cluster of Knowledge-Based Cluster Creation Project" and by the National Natural Science Foundation of China under Grant Nos. 70571078 and 70221001.
Abstract: Text mining, also known as knowledge discovery from text, has emerged as a possible solution to the current information explosion. It refers to the process of extracting non-trivial and useful patterns from unstructured text. Among the general tasks of text mining, such as text clustering and summarization, text classification is a subtask of intelligent information processing that employs supervised learning to construct a classifier from training text and then uses it to predict the class of unlabeled text. Because its performance evaluation is simple and objective, text classification is often used as a standard tool for determining the strengths and weaknesses of a text processing method, such as text representation or text feature selection. In this paper, text classification is carried out on Web documents collected from the XSSC Website (http://www.xssc.ac.cn). The performance of the support vector machine (SVM) and the back propagation neural network (BPNN) is compared on this task. Specifically, binary and multi-class text classification were conducted on the XSSC documents. Moreover, the classification results of both methods are combined to improve classification accuracy. The experiment shows that BPNN can compete with SVM in binary text classification, but that SVM performs much better in multi-class text classification. Furthermore, the combined method improves classification in both the binary and the multi-class settings.
Funding: Supported by the National Key Research and Development Program of China (2016YFB1000902) and the National Natural Science Foundation of China (71731002 & 71971190).
Abstract: Online media have brought tremendous changes to civic life, public opinion, and government administration. Compared with traditional media, online media not only allow individuals to browse news and express their views more freely, but also accelerate the transmission of opinions and expand their influence. As public opinions may arouse societal unrest, detecting the primary topics and uncovering the evolution trends of public opinion is worthwhile for societal administration. Various algorithms have been developed to deal with the huge volume of unstructured online media data. In this study, the dynamic topic model is employed to explore the evolution of topic content and topic prevalence using the original posts published from 2013 to 2017 on the Tianya Zatan Board of Tianya Club, one of the most popular BBS in China. Based on semantic similarities, topics are grouped into three themes: family life, societal affairs, and government administration. The evolution of topic prevalence and content is affected by emergent incidents. Topics on family life become popular, while the themes "societal affairs" and "government administration", which have larger standard deviations, are more likely to be influenced by emergent hot events. Content evolution, represented by a monthly pairwise distance matrix, makes change points in topic content easy to find.
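A monthly pairwise distance matrix of the kind mentioned above can be sketched as follows. The abstract does not name the distance measure, so Hellinger distance between a topic's monthly word distributions is an assumption here, as are the helper names and the toy three-word distributions.

```python
import math

def hellinger(p, q):
    """Hellinger distance between two discrete distributions over the same vocabulary."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

def pairwise_matrix(monthly):
    """Distance between every pair of monthly word distributions of one topic."""
    return [[hellinger(p, q) for q in monthly] for p in monthly]

def largest_shift(matrix):
    """Index t of the month that differs most from the month before it (t >= 1)."""
    steps = [matrix[t - 1][t] for t in range(1, len(matrix))]
    return 1 + max(range(len(steps)), key=steps.__getitem__)

# Toy word distributions of a single topic over four consecutive months.
toy_topic = [
    [0.70, 0.20, 0.10],
    [0.68, 0.22, 0.10],
    [0.20, 0.30, 0.50],
    [0.22, 0.28, 0.50],
]
matrix = pairwise_matrix(toy_topic)
change_month = largest_shift(matrix)
```

In a heat map of `matrix`, a change point shows up as a boundary between two blocks of small distances; `largest_shift` only reads the entries next to the diagonal.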
Funding: This study is supported by the National Key Research and Development Program of China under Grant No. 2016YFB1000902 and the National Natural Science Foundation of China under Grant Nos. 61473284, 71601023 and 71371107.
Abstract: Societal risk classification is a fundamental and complex issue in societal risk perception. To conduct societal risk classification, Tianya Forum posts are selected as the data source, and four representations of BBS posts are applied: string representation, term-frequency representation, TF-IDF representation, and distributed representation. Using edit distance or cosine similarity as the distance metric, four k-Nearest Neighbor (kNN) classifiers based on the different representations are developed and compared. Owing to the advantages of the neural network model Paragraph Vector in capturing word order and semantics, kNN based on the distributed representation generated by Paragraph Vector (kNN-PV) is effective for societal risk classification. Furthermore, to improve classification performance, kNN-PV is combined with the other three kNN classifiers, using different weights, into an ensemble model. The optimal weights are assigned to the kNN classifiers through a brute-force grid search. The experimental results reveal that, compared with kNN-PV, the ensemble method significantly improves Macro-F for societal risk classification.
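The weighted ensemble with brute-force grid search can be sketched as follows. This is a reduced illustration, not the paper's implementation: it combines only two of the four kNN classifiers (edit distance on strings, cosine distance on term frequencies), scores weights by accuracy rather than Macro-F, and uses made-up English posts. All function names are hypothetical.

```python
from collections import Counter

def edit_distance(a, b):
    """Levenshtein distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def tf_cosine_distance(a, b):
    """1 - cosine similarity of whitespace-token term-frequency vectors."""
    ta, tb = Counter(a.split()), Counter(b.split())
    dot = sum(count * tb[word] for word, count in ta.items())
    norm = (sum(c * c for c in ta.values()) * sum(c * c for c in tb.values())) ** 0.5
    return 1.0 - dot / norm if norm else 1.0

DISTANCES = (edit_distance, tf_cosine_distance)

def knn_votes(query, train, distance, k):
    """Share of votes per label among the k nearest training posts."""
    nearest = sorted(train, key=lambda item: distance(query, item[0]))[:k]
    counts = Counter(label for _, label in nearest)
    return {label: n / len(nearest) for label, n in counts.items()}

def ensemble_predict(query, train, weights, k=1):
    """Weighted sum of the vote shares of each kNN classifier."""
    scores = Counter()
    for distance, weight in zip(DISTANCES, weights):
        for label, share in knn_votes(query, train, distance, k).items():
            scores[label] += weight * share
    return max(sorted(scores), key=scores.get)  # sorted() makes ties deterministic

def grid_search(train, valid, step=0.25, k=1):
    """Try every weight pair on a grid that sums to 1; keep the most accurate."""
    best_weights, best_accuracy = None, -1.0
    for i in range(int(round(1 / step)) + 1):
        weights = (i * step, 1.0 - i * step)
        hits = sum(ensemble_predict(x, train, weights, k) == y for x, y in valid)
        accuracy = hits / len(valid)
        if accuracy > best_accuracy:
            best_weights, best_accuracy = weights, accuracy
    return best_weights, best_accuracy

# Made-up posts; real inputs would be segmented Chinese BBS posts.
train = [
    ("bus crash near school", "daily life"),
    ("school bus safety check", "daily life"),
    ("river pollution factory", "resources & environment"),
    ("factory waste pollution river", "resources & environment"),
]
valid = [
    ("school bus crash", "daily life"),
    ("pollution in river", "resources & environment"),
]
best_weights, best_accuracy = grid_search(train, valid, k=1)
```

With four classifiers the grid grows to every weight combination that sums to 1, which is why the search is described as brute force.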
Funding: Supported by the National Key Research and Development Program of China (2016YFB1000902) and the National Natural Science Foundation of China (61473284, 71371107).
Abstract: Major societal problems affect social stability, and understanding public opinion on those issues is necessary to avoid social conflicts. Social media have become the major platform for tracking what the public is concerned about and which concerns may pose societal risk. However, the huge flow of user-generated content makes it very hard to capture public attention in a short time. In this paper, we approach this problem by extending a storyline generation method and displaying the result as a multi-view graph. A real-world example is illustrated, and an evaluation is given to show the effectiveness of the proposed method.
Funding: This work has been supported in part by the National Natural Science Foundation of China (NSFC) under Grant Nos. 71731002 and 71971190. The main contents were presented at the 21st International Symposium on Knowledge and Systems Sciences (KSS2022), held in Beijing during June 11-12, 2022. The referees are greatly appreciated for their help in improving the quality of the extended paper.
Abstract: As the main channel through which people obtain information and express their opinions, online media generate a huge amount of unstructured news documents every day, which makes it difficult for people to perceive major societal events and grasp how events evolve. Previous studies on storyline generation are generally based on document clustering and do not consider event arguments or relations between events. Event-centric knowledge graphs have been used to organize news documents into structured event representations. Although some studies have attempted to construct timelines based on event-centric knowledge graphs, timelines have difficulty depicting the complex structures of event evolution. In this paper, we represent news documents as an event-centric knowledge graph and compress the whole knowledge graph into salient complex events in temporal order to generate storylines, which we call narrative graphs. We first collect news documents from news platforms, construct an event ontology, and build an event-centric knowledge graph with temporal relations. A graph neural network is used to detect events, while BERT fine-tuning is used to identify temporal relations between events. Then, a novel framework for generating narrative graphs under coherence and coverage constraints is proposed. In addition, a case study demonstrates how narrative graphs can be used to analyze real-world events. The experimental results show that our approach significantly outperforms the baseline approaches.
Funding: Supported by the National Natural Science Foundation of China (71731002, 71971190).
Abstract: The goal of sentiment analysis is to detect the opinion polarities of people towards specific targets. For fine-grained analysis, aspect-based sentiment analysis (ABSA) is a challenging subtask of sentiment analysis. Most of the literature aims to judge the sentiment orientation of a single aspect, but ignores the entities to which the aspects belong. Sequence-based methods, such as LSTM, and tagging schemas, such as BIO, rely on relative distances to target words or on the exact positions of targets in sentences, and require more detailed annotations if the target words do not appear in the sentences. In this paper, we discuss a scenario with multiple entities and shared aspects across multiple sentences. The task is to predict the sentiment polarities of different pairs, i.e., (entity, aspect), in each sample, where the target entities or aspects are not guaranteed to appear in the text. After the long sequences are converted to graphs connected by dependency relations, the dependency distances are embedded automatically to generate contextual representations during iterations. We adopt partly densely connected graph convolutional networks with multi-head attention mechanisms to judge the sentiment polarities of entity-aspect pairs. Experiments conducted on a Chinese dataset demonstrate the effectiveness of the method. We also explore the influence of different attention mechanisms and of the ways sentences are connected on the task.
Abstract: The annual International Symposium on Knowledge and Systems Sciences aims to promote the exchange and interaction of knowledge across disciplines and borders in order to explore new territories and new frontiers. The 18 years of continuous endeavor since 2000 illustrate that knowledge science and systems science can complement and benefit each other methodologically in studying and solving a variety of problems.