7 articles found
A Hybrid Method of Coreference Resolution in Information Security (Cited by: 1)
1
Authors: Yongjin Hu, Yuanbo Guo, Junxiu Liu, Han Zhang. Computers, Materials & Continua (SCIE, EI), 2020, Issue 8, pp. 1297-1315 (19 pages)
In the field of information security, a gap exists in the study of coreference resolution of entities. A hybrid method is proposed to solve the problem of coreference resolution in information security. The work consists of two parts: the first extracts all candidates (including noun phrases, pronouns, entities, and nested phrases) from a given document and classifies them; the second is coreference resolution of the selected candidates. In the first part, a method combining rules with a deep learning model (Dictionary BiLSTM-Attention-CRF, or DBAC) is proposed to extract all candidates in the text and classify them. In the DBAC model, a domain dictionary matching mechanism is introduced, and new features of words and their contexts are obtained from the domain dictionary. In this way, full use can be made of the entities and entity-type information contained in the domain dictionary, which can help solve the recognition problem of both rare and long entities. In the second part, candidates are divided into pronoun candidates and noun phrase candidates according to part of speech; coreference resolution of pronoun candidates is handled by rules, and that of noun phrase candidates by machine learning. Finally, a dataset is created from information security data to evaluate our methods. The experimental results show that the proposed model performs better than the baseline models.
Keywords: coreference resolution; hybrid method; rules; BiLSTM-Attention-CRF; information security
Augmenting Trigger Semantics to Improve Event Coreference Resolution
2
Authors: 宦敏, 徐昇, 李培峰. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2023, Issue 3, pp. 600-611 (12 pages)
Due to the small size of the annotated corpora and the sparsity of the event trigger words, the event coreference resolver cannot capture enough event semantics, especially the trigger semantics, to identify coreferential event mentions. To address the above issues, this paper proposes a trigger semantics augmentation mechanism to boost event coreference resolution. First, this mechanism performs a trigger-oriented masking strategy to pre-train a BERT (Bidirectional Encoder Representations from Transformers)-based encoder (Trigger-BERT), which is fine-tuned on a large-scale unlabeled dataset Gigaword. Second, it combines the event semantic relations from the Trigger-BERT encoder with the event interactions from the soft-attention mechanism to resolve event coreference. Experimental results on both the KBP2016 and KBP2017 datasets show that our proposed model outperforms several state-of-the-art baselines.
Keywords: event coreference resolution; trigger semantics augmentation; information interaction
CDCAT: A Multi-Language Cross-Document Entity and Event Coreference Annotation Tool
3
Authors: Yang Xu, Boming Xia, Yueliang Wan, Fan Zhang, Jiabo Xu, Huansheng Ning. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2022, Issue 3, pp. 589-598 (10 pages)
A tool for the manual annotation of cross-document entity and event coreferences that helps annotators to label mention coreference relations in text is essential for the annotation of coreference corpora. To the best of our knowledge, CROss-document Main Events and entities Recognition (CROMER) is the only open-source manual annotation tool available for cross-document entity and event coreferences. However, CROMER lacks multi-language support and extensibility. Moreover, to label cross-document mention coreference relations, CROMER requires the support of another intra-document coreference annotation tool known as Content Annotation Tool, which is now unavailable. To address these problems, we introduce Cross-Document Coreference Annotation Tool (CDCAT), a new multi-language open-source manual annotation tool for cross-document entity and event coreference, which can handle different input/output formats, preprocessing functions, languages, and annotation systems. Using this new tool, annotators can label a reference relation with only two mouse clicks. Best practice analyses reveal that annotators can reach an annotation speed of 0.025 coreference relations per second on a corpus with a coreference density of 0.076 coreference relations per word. As the first multi-language open-source cross-document entity and event coreference annotation tool, CDCAT can theoretically achieve higher annotation efficiency than CROMER.
Keywords: event coreference; entity coreference; manual annotation tool; natural language processing
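The two figures quoted in the CDCAT abstract (0.025 coreference relations per second, 0.076 coreference relations per word) can be turned into more intuitive rates. The short Python sketch below is only arithmetic on those two numbers; the function names are hypothetical and nothing here comes from the CDCAT tool itself.

```python
# Figures quoted in the abstract.
RELATIONS_PER_SECOND = 0.025
RELATIONS_PER_WORD = 0.076


def seconds_per_relation(relations_per_second: float) -> float:
    """Average time spent labelling one coreference relation."""
    return 1.0 / relations_per_second


def words_per_second(relations_per_second: float, relations_per_word: float) -> float:
    """Text covered per second: (relations/second) / (relations/word)."""
    return relations_per_second / relations_per_word


if __name__ == "__main__":
    print(f"{seconds_per_relation(RELATIONS_PER_SECOND):.0f} s per relation")
    wps = words_per_second(RELATIONS_PER_SECOND, RELATIONS_PER_WORD)
    print(f"{wps:.3f} words covered per second")
```

Under these figures an annotator spends about 40 seconds per relation and covers roughly a third of a word of text per second.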
Learning Noun Phrase Anaphoricity in Coreference Resolution via Label Propagation (Cited by: 1)
4
Authors: 周国栋, 孔芳. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2011, Issue 1, pp. 34-44 (11 pages)
Knowledge of noun phrase anaphoricity might be profitably exploited in coreference resolution to bypass the resolution of non-anaphoric noun phrases. However, it is surprising to notice that recent attempts to incorporate automatically acquired anaphoricity information into coreference resolution systems have fallen far short of expectations. This paper proposes a global learning method for determining the anaphoricity of noun phrases via a label propagation algorithm to improve learning-based coreference resolution. In order to eliminate the huge computational burden in the label propagation algorithm, we employ the weighted support vectors as the critical instances to represent all the anaphoricity-labeled NP instances in the training texts. In addition, two kinds of kernels, i.e., the feature-based RBF (Radial Basis Function) kernel and the convolution tree kernel with approximate matching, are explored to compute the anaphoricity similarity between two noun phrases. Experiments on the ACE2003 corpus demonstrate the great effectiveness of our method in anaphoricity determination of noun phrases and its application in learning-based coreference resolution.
Keywords: coreference resolution; anaphoricity determination; label propagation; RBF kernel; convolution tree kernel
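As a rough illustration of the general idea behind label propagation with an RBF kernel, the following Python sketch spreads anaphoricity scores from a few labeled points to unlabeled ones. It is a toy, assuming numeric feature vectors and a generic clamped-propagation scheme; it does not reproduce the paper's weighted support vectors or its convolution tree kernel, and all names (`rbf`, `propagate`, `gamma`) are hypothetical.

```python
import math


def rbf(x, y, gamma=1.0):
    """RBF similarity exp(-gamma * ||x - y||^2) between two feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def propagate(points, labels, gamma=1.0, iterations=100):
    """Propagate scores over a fully connected RBF graph.

    labels[i] is 1.0 (anaphoric), 0.0 (non-anaphoric) or None (unlabeled).
    Labeled points are clamped to their label on every iteration.
    """
    n = len(points)
    weights = [
        [rbf(points[i], points[j], gamma) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    scores = [0.5 if lab is None else float(lab) for lab in labels]
    for _ in range(iterations):
        updated = []
        for i in range(n):
            if labels[i] is not None:
                updated.append(float(labels[i]))  # clamp known labels
                continue
            total = sum(weights[i])
            if total == 0.0:
                updated.append(scores[i])  # isolated point: keep its score
            else:
                weighted = sum(w * s for w, s in zip(weights[i], scores))
                updated.append(weighted / total)
        scores = updated
    return scores


if __name__ == "__main__":
    # Toy one-dimensional features, purely illustrative.
    toy_points = [(0.0,), (0.1,), (1.0,), (0.9,)]
    toy_labels = [1.0, None, 0.0, None]
    print(propagate(toy_points, toy_labels, gamma=10.0))
```

In this toy run the unlabeled point near the anaphoric example ends with a score above 0.5, and the one near the non-anaphoric example ends below 0.5.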
Bootstrapping Object Coreferencing on the Semantic Web
5
Authors: 胡伟, 瞿裕忠, 孙行智. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2011, Issue 4, pp. 663-675 (13 pages)
An object on the Semantic Web is likely to be denoted with several URIs by different parties. Object coreferencing is a process to identify "equivalent" URIs of objects for achieving a better Data Web. In this paper, we propose a bootstrapping approach for object coreferencing on the Semantic Web. For an object URI, we first establish a kernel that consists of semantically equivalent URIs from the same-as, (inverse) functional properties and (max-)cardinalities, and then extend the kernel with respect to the textual descriptions (e.g., labels and local names) of URIs. We also propose a trustworthiness-based method to rank the coreferent URIs in the kernel as well as a similarity-based method for ranking the URIs in the extension of the kernel. We implement the proposed approach, called ObjectCoref, on a large-scale dataset that contains 76 million URIs collected by the Falcons search engine until 2008. The evaluation on precision, relative recall and response time demonstrates the feasibility of our approach. Additionally, we apply the proposed approach to investigate the popularity of the URI alias phenomenon on the current Semantic Web.
Keywords: object coreference; entity identification; URI alias; data fusion; Semantic Web
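One simple way to picture the "kernel" of semantically equivalent URIs is as the transitive closure of pairwise equivalence statements. The Python sketch below groups URIs with a union-find structure, assuming the equivalences are already available as plain pairs. It is not ObjectCoref's implementation: it ignores functional properties, cardinalities, ranking, and the textual extension step, and the function name and example URIs are hypothetical.

```python
from collections import defaultdict


def coreferent_sets(equivalent_pairs):
    """Group URIs into sets connected by pairwise equivalence statements."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in equivalent_pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right  # merge the two groups

    groups = defaultdict(set)
    for uri in list(parent):
        groups[find(uri)].add(uri)
    return list(groups.values())


if __name__ == "__main__":
    # Hypothetical URIs standing in for owl:sameAs-style statements.
    pairs = [
        ("http://example.org/a#Alice", "http://example.com/people/alice"),
        ("http://example.com/people/alice", "http://example.net/id/42"),
        ("http://example.org/b#Bob", "http://example.com/people/bob"),
    ]
    for group in coreferent_sets(pairs):
        print(sorted(group))
```

Union-find keeps each merge close to constant time, which matters when the input is on the order of millions of URIs.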
Importance of retrieving noun phrases and named entities from digital library content
6
Authors: Ratna SANYAL, Kushal KESHRI, Vidya NAND. Journal of Zhejiang University-Science C (Computers and Electronics) (SCIE, EI), 2010, Issue 11, pp. 844-849 (6 pages)
We present a novel approach for extracting noun phrases in general and named entities in particular from a digital repository of text documents. The problem of coreference resolution has been divided into two subproblems: pronoun resolution and non-pronominal resolution. A rule-based technique was used for pronoun resolution, while a learning approach was used for non-pronominal resolution. For named entity resolution, disambiguation arises mainly due to polysemy and synonymy. The proposed approach fixes both problems with the help of WordNet and the Word Sense Disambiguation tool. The proposed approach, to our knowledge, outperforms several baseline techniques with a higher balanced F-measure, which is the harmonic mean of recall and precision. The improvements in the system performance are due to the filtering of antecedents for the anaphor based on several linguistic disagreements, use of a hybrid approach, and increment in the feature vector to include more linguistic details in the learning technique.
Keywords: coreference resolution; hybrid approach; filtering; rule based and J48 algorithm
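The balanced F-measure mentioned in the abstract is the harmonic mean of precision and recall, F = 2PR / (P + R). A minimal Python sketch follows; the function name and the example values are illustrative assumptions, not results from the paper.

```python
def f_measure(precision: float, recall: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the balanced F-measure (harmonic mean)."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0  # both precision and recall are zero
    return (1 + beta ** 2) * precision * recall / denominator


if __name__ == "__main__":
    # Illustrative values only.
    print(round(f_measure(0.5, 1.0), 4))
```

Because it is a harmonic mean, the score is pulled toward the lower of the two values, so a system cannot score well by maximizing recall alone.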
Cross-Context News Corpus for Protest Event-Related Knowledge Base Construction
7
Authors: Ali Hürriyetoglu, Erdem Yörük, Osman Mutlu, Fırat Durusan, Çagrı Yoltar, Deniz Yüret, Burak Gürel. Data Intelligence, 2021, Issue 2, pp. 308-335 (28 pages)
We describe a gold standard corpus of protest events that comprises various local and international English language sources from various countries. The corpus contains document-, sentence-, and token-level annotations. This corpus facilitates creating machine learning models that automatically classify news articles and extract protest event-related information, constructing knowledge bases that enable comparative social and political science studies. For each news source, the annotation starts with random samples of news articles and continues with samples drawn using active learning. Each batch of samples is annotated by two social and political scientists, adjudicated by an annotation supervisor, and improved by identifying annotation errors semi-automatically. We found that the corpus possesses the variety and quality that are necessary to develop and benchmark text classification and event extraction systems in a cross-context setting, contributing to the generalizability and robustness of automated text processing systems. This corpus and the reported results will establish a common foundation in automated protest event collection studies, which is currently lacking in the literature.
Keywords: event extraction; text classification; political science; social science; news; contentious politics; protests; event coreference resolution