Abstract
Uncertainty identification is an important semantic processing task. It is crucial to the factual quality of information in many applications, such as topic detection and question answering. Factuality has become a primary concern in social media, where texts are written informally. However, existing approaches that rely on lexical cues suffer greatly from the casual, word-of-mouth character of social media, in which cue phrases are often expressed in nonstandard forms or omitted from sentences altogether. To tackle these problems, this paper proposes an Attention-based Neural Framework for Uncertainty identification on social media texts, named ANFU. ANFU incorporates attention-based Long Short-Term Memory (LSTM) networks to represent the semantics of words and Convolutional Neural Networks (CNNs) to capture the most important semantics. Experiments were conducted on four datasets: two English benchmark datasets used in the CoNLL-2010 task on uncertainty identification, and two Chinese datasets consisting of Weibo and Chinese news texts. Experimental results showed that the proposed ANFU approach outperformed the state of the art on all the datasets in terms of F1 measure. More importantly, improvements of 41.37% and 13.10% over the baselines were achieved on the English and Chinese social media datasets, respectively, showing the particular effectiveness of ANFU on social media texts.
Funding
This work was supported by the National Natural Science Foundation of China (Nos. 61502115, 61602326, U1636103, U1536207, and 61672361), the Fundamental Research Fund for the Central Universities (No. 3262019T29), and the Joint Funding for Capital Universities (No. SKX182010023).