Social media platforms such as Twitter serve as a novel news medium and have become increasingly popular since their establishment. Large-scale, first-hand, user-generated tweets motivate automatic event detection on Twitter. Previous unsupervised approaches detected events by clustering words. These methods detect events using burstiness, which measures surging frequencies of words in certain time windows. However, event clusters represented by a set of individual words are difficult to understand. This issue is addressed by building a document-level event detection model that directly calculates the burstiness of tweets, leveraging distributed word representations to model semantic information and thereby avoid sparsity. Results show that the document-level model not only offers event summaries that are directly human-readable, but also gives significantly improved accuracy on unsupervised tweet event detection compared with previous methods, which are based on words/segments.
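As a rough illustration of word-level burstiness, the sketch below scores a term by how far its count in the most recent time window departs from its counts in earlier windows. The z-score form, the +1 smoothing term, and the names `burstiness` and `windows` are assumptions made for this example only; the abstract does not state the formula the paper actually uses.

```python
from collections import Counter
from statistics import mean, pstdev


def burstiness(windows, term):
    """Score how strongly `term` surges in the last window.

    `windows` is a list of time windows, each a list of tokens.
    The score is a smoothed z-score (an assumed formulation, not the
    paper's): (current count - historical mean) / (historical std + 1).
    """
    if len(windows) < 2:
        raise ValueError("need at least one historical window and one current window")
    counts = [Counter(window)[term] for window in windows]
    history, current = counts[:-1], counts[-1]
    mu = mean(history)
    sigma = pstdev(history)
    # The +1 keeps the score finite when the term never varied before.
    return (current - mu) / (sigma + 1.0)


# Toy data: three consecutive time windows of tokens.
windows = [
    ["rain", "game", "rain"],
    ["game", "rain"],
    ["quake", "quake", "quake", "rain"],
]

quake_score = burstiness(windows, "quake")
rain_score = burstiness(windows, "rain")
```

In this toy data, "quake" is absent from the earlier windows and appears three times in the last one, so it receives a positive score, while "rain" falls slightly below its historical average and scores below zero. A document-level variant, as the abstract describes, would score whole tweets rather than single words.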
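Event extraction aims to detect event types and extract event arguments from unstructured text. Existing methods still have limitations when handling document-level text, because a document may contain multiple events and the arguments that make up a given event are usually scattered across different sentences. To address these challenges, a reverse inference model for document-level event extraction (RIDEE) is proposed. Based on a trigger-free design, document-level event extraction is converted into two subtasks, candidate event argument extraction and event trigger inference, so that event arguments are extracted and event types are detected in parallel. In addition, an event dependency pool that stores historical events is designed, allowing the model to make full use of the dependencies between events when processing multi-event text. Experimental results on public datasets show that RIDEE outperforms existing event extraction models on document-level event extraction.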
Funding: Supported by the National High Technology Research and Development Programme of China (No. 2015AA015405)