Funding: This work was supported by the Ministry of Public Security Technology Research Program [Grant No. 2020JSYJC22ok]; the Fundamental Research Funds for the Central Universities (No. 2021JKF215); the Open Research Fund of the Public Security Behavioral Science Laboratory, People's Public Security University of China (2020SYS03); and the Police and People Build/Share a Smart Community project (PJ13-201912-0525).
Abstract: With the explosive growth of text on the Internet, text classification has become increasingly important, and Chinese news text classification is a significant part of that task. In public security work, classifying public opinion news effectively and accurately is a prerequisite for the relevant departments to grasp the state of public opinion and control its trend in time. This paper introduces a combined-convolutional neural network text classification model based on word2vec and an improved TF-IDF. First, word vectors are trained with a word2vec model. Next, the weight of each word is calculated with an improved TF-IDF algorithm based on class frequency variance, and the word vectors and weights are combined to construct the text vector representation. Finally, the combined-convolutional neural network is trained and tested on the THUCNews data set. The results show that this model classifies better than the traditional Text-RNN model, the traditional Text-CNN model, and the word2vec-CNN model: the test accuracy is 97.56%, the accuracy rate is 97%, the recall rate is 97%, and the F1-score is 97%.
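The weighting step described in this abstract can be illustrated with a minimal sketch. The abstract does not give the improved, class-frequency-variance formula, so the code below substitutes standard smoothed TF-IDF; the toy documents, the three-dimensional `toy_vectors` (standing in for trained word2vec embeddings), and the function names are all hypothetical. It also assumes that "combining" means scaling each word vector by its weight, which the abstract does not confirm.

```python
import math
from collections import Counter


def tfidf_weights(docs):
    """Return one {word: weight} dict per tokenized document.

    Uses standard smoothed TF-IDF, NOT the paper's improved variant:
    tf = count / doc length, idf = ln((1 + N) / (1 + df)) + 1.
    """
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            word: (count / len(doc))
            * (math.log((1 + n_docs) / (1 + doc_freq[word])) + 1.0)
            for word, count in counts.items()
        })
    return weights


def weighted_text_matrix(doc, word_vectors, weights):
    """Scale each token's vector by its weight; rows follow token order.

    Tokens without a vector are skipped (an assumption of this sketch).
    """
    return [
        [weights[word] * x for x in word_vectors[word]]
        for word in doc
        if word in word_vectors
    ]


# Hypothetical toy data; real input would be segmented news text.
docs = [["police", "news", "news"], ["sports", "news"]]
toy_vectors = {
    "police": [1.0, 0.0, 0.0],
    "news": [0.0, 1.0, 0.0],
    "sports": [0.0, 0.0, 1.0],
}

weights = tfidf_weights(docs)
matrix = weighted_text_matrix(docs[0], toy_vectors, weights[0])
```

The resulting `matrix` has one weighted row per token and is the kind of input a convolutional text classifier could consume; swapping in a different weighting scheme only requires replacing `tfidf_weights`.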
Funding: This work was supported by the National Key Research and Development Program of China (2018YFC0830105, 2018YFC0830101, 2018YFC0830100); the National Natural Science Foundation of China (Grant Nos. 61972186, 61762056, 61472168); the Yunnan Provincial Major Science and Technology Special Plan Projects (202002AD080001); and the General Projects of Basic Research in Yunnan Province (202001AT070046, 202001AT070047).
Abstract: Automatically generating a brief summary of legal-related public opinion news (LPO-news, which contains legal words or phrases) plays an important role in the rapid and effective handling of public opinion. For LPO-news, the critical case elements, which are significant parts of the summary, may be mentioned several times in the reader comments. Consequently, we investigate the task of comment-aware abstractive text summarization for LPO-news, which can generate a salient summary by learning pivotal case elements from the reader comments. In this paper, we present a hierarchical comment-aware encoder (HCAE), which contains four components: 1) a traditional sequence-to-sequence framework as our baseline; 2) a selective denoising module that filters noise from the comments and distinguishes the case elements; 3) a merge module that couples the source article and the comments to yield a comment-aware context representation; 4) a recoding module that captures the interaction among the source article words conditioned on the comments. Extensive experiments are conducted on a large dataset of legal public opinion news collected from micro-blogs, and the results show that the proposed model outperforms several existing state-of-the-art baseline models under the ROUGE metrics.
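For readers unfamiliar with the evaluation metric named in this abstract, the following is a minimal sketch of ROUGE-N computed as clipped n-gram overlap between a candidate summary and a single reference. The abstract does not say which ROUGE variants or tooling the authors used, so the function names and example sentences here are hypothetical, and published scores normally come from an established ROUGE package rather than code like this.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return (precision, recall, f1) for ROUGE-N against one reference."""
    cand, ref = ngrams(candidate, n), ngrams(reference, n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example sentences, already tokenized on whitespace.
candidate = "the court fined the company".split()
reference = "the court fined the firm heavily".split()
precision, recall, f1 = rouge_n(candidate, reference, n=1)
```

Here four of the candidate's five unigrams also occur in the six-word reference, so unigram precision is 4/5 and recall is 4/6.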