Funding: This research is supported by the Ministry of Science and Technology of China under Grant No. 2006CB503905. Some authors are also supported by the National Natural Science Foundation of China under Grant Nos. 10631070 and 10701080, and by the JSPS (Japan Society for the Promotion of Science)-NSFC (National Natural Science Foundation of China) collaboration project under Grant No. 10711140116.
Abstract: This paper discusses a popular community definition in complex network research in terms of the conditions under which a community is minimal, that is, the community can neither be split into several smaller communities nor be split and reorganized with other network elements into new communities. The result provides a basis for further optimization of the quantitative measure used for community identification.
Funding: Supported by the National Natural Science Foundation of China under Grant Nos. 71171187, 71371107, and 61473284.
Abstract: The risk classification of BBS posts is important for evaluating the societal risk level within a given period. Using posts collected from the Tianya forum as the data source, the authors adopt societal risk indicators from social psychology and conduct document-level, multi-class societal risk classification of BBS posts. To capture the semantics and word order of documents effectively, Paragraph Vector, a shallow neural network model, is applied to obtain distributed vector representations of the posts. Based on these document vectors, the authors apply the KNN classification method to identify the societal risk category of each post. The experimental results show that, for document-level societal risk classification, Paragraph Vector trains much faster than Bag-of-Words and improves F-measures by at least 10%. Furthermore, Paragraph Vector also outperforms edit distance and a Lucene-based search method. The present work is the first attempt to combine a document embedding method with social psychology research results in the area of public opinion.
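The classification step described in the abstract above (assigning a risk category to a post from its document vector using KNN) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three-dimensional vectors, the labels `risk_a`, `risk_b`, and `risk_c`, the use of cosine similarity, and `k=3`. The abstract does not state the paper's similarity measure, value of k, or vector dimensionality, and in practice the vectors would come from a trained Paragraph Vector model rather than being written by hand.

```python
import math
from collections import Counter


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_predict(query, labeled_vectors, k=3):
    """Return the majority label among the k vectors most similar to `query`."""
    ranked = sorted(
        labeled_vectors,
        key=lambda item: cosine(query, item[0]),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical document vectors with hypothetical risk labels.
# Real vectors would be inferred by a Paragraph Vector model.
train = [
    ([0.9, 0.1, 0.0], "risk_a"),
    ([0.8, 0.2, 0.1], "risk_a"),
    ([0.1, 0.9, 0.2], "risk_b"),
    ([0.0, 0.8, 0.3], "risk_b"),
    ([0.2, 0.1, 0.9], "risk_c"),
]

if __name__ == "__main__":
    # Classify an unseen (hypothetical) post vector by majority vote.
    print(knn_predict([0.85, 0.15, 0.05], train, k=3))
```

Majority voting over the nearest neighbours keeps the classifier free of any training step beyond producing the vectors, which is one reason KNN pairs naturally with a document embedding model.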