Abstract
Speech act analysis is an important step in the analysis of online public opinion and matters for the large-scale automated analysis, identification, and characterization of that opinion. Conflict, in turn, is a class of content that online public opinion supervision must watch closely. Taking conflict speech acts in online public opinion as a case, this research makes a first attempt at a conflict speech act taxonomy suited to Chinese online forums. It also proposes a corresponding automated classification algorithm and compares the effect of different feature sets (cue phrases, n-grams, syntactic features, and structural features) on classification performance. Experimental results show that the proposed method achieves fairly satisfactory classification performance and confirm that the newly introduced syntactic and structural features contribute positively to speech act classification.
Source
《情报科学》
CSSCI
PKU Core Journals
2012, No. 7, pp. 1076-1083 (8 pages)
Information Science
Funding
Young Scientists Fund of the National Natural Science Foundation of China (71001038)
Keywords
speech act
automated classification
online public opinion
conflict