2 articles found
A Novel Visualization Tool for Manual Annotation when Building Large Speech Corpora
1
Authors: SHE Kun, CHEN Shuzhen, YANG Shen, ZOU Lian. Wuhan University Journal of Natural Sciences (CAS), 2006, Issue 2, pp. 381-384 (4 pages)
A novel visualized sound description, called the sound dendrogram, is proposed to make manual annotation easier when building large speech corpora. It is a lattice structure built from a group of "seed regions" through an iterative merging procedure. A simple but reliable method for extracting the "seed regions" and an advanced distance metric are adopted to construct the sound dendrogram, so that it presents the structure of speech from coarse to fine in a visual way. Tests show that all phonemic boundaries are contained in the lattice structure of the sound dendrogram and are easy to identify. The sound dendrogram can be a powerful aid during the manual annotation of speech corpora.
Keywords: sound dendrogram; speech corpora; manual annotation; computer-aided tool
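The iterative merging described in the abstract can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not specify the seed-region extraction or the "advanced distance metric", so the sketch assumes each seed region is already given as a frame range with a mean feature vector, and it substitutes plain Euclidean distance between means. It also builds a simple binary merge hierarchy rather than the lattice described in the paper. The names `Region`, `build_dendrogram`, and `boundaries` are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    start: int          # first frame index (inclusive)
    end: int            # last frame index (exclusive)
    mean: List[float]   # mean feature vector of the frames in the region


def distance(a: Region, b: Region) -> float:
    # Assumption: Euclidean distance stands in for the paper's unspecified metric.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a.mean, b.mean)))


def merge(a: Region, b: Region) -> Region:
    # Merge two adjacent regions; the new mean is weighted by region length.
    na, nb = a.end - a.start, b.end - b.start
    mean = [(na * x + nb * y) / (na + nb) for x, y in zip(a.mean, b.mean)]
    return Region(a.start, b.end, mean)


def build_dendrogram(seeds: List[Region]) -> List[List[Region]]:
    # Return every level of the hierarchy, from finest (seeds) to coarsest.
    current = list(seeds)
    levels = [list(current)]
    while len(current) > 1:
        # Only adjacent regions may merge, so time order is preserved.
        i = min(range(len(current) - 1),
                key=lambda k: distance(current[k], current[k + 1]))
        current = current[:i] + [merge(current[i], current[i + 1])] + current[i + 2:]
        levels.append(list(current))
    return levels


def boundaries(level: List[Region]) -> List[int]:
    # Candidate boundary positions at one level of the hierarchy.
    return [r.start for r in level[1:]]
```

Under these assumptions, an annotator would read candidate boundaries off successive levels: the finest level contains every seed boundary, and coarser levels keep only the boundaries between the most dissimilar neighbours.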
CDCAT: A Multi-Language Cross-Document Entity and Event Coreference Annotation Tool
2
Authors: Yang Xu, Boming Xia, Yueliang Wan, Fan Zhang, Jiabo Xu, Huansheng Ning. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2022, Issue 3, pp. 589-598 (10 pages)
A tool for the manual annotation of cross-document entity and event coreferences that helps annotators label mention coreference relations in text is essential for the annotation of coreference corpora. To the best of our knowledge, CROss-document Main Events and entities Recognition (CROMER) is the only open-source manual annotation tool available for cross-document entity and event coreferences. However, CROMER lacks multi-language support and extensibility. Moreover, to label cross-document mention coreference relations, CROMER requires the support of another intra-document coreference annotation tool, known as Content Annotation Tool, which is now unavailable. To address these problems, we introduce the Cross-Document Coreference Annotation Tool (CDCAT), a new multi-language open-source manual annotation tool for cross-document entity and event coreference, which can handle different input/output formats, preprocessing functions, languages, and annotation systems. Using this new tool, annotators can label a reference relation with only two mouse clicks. Best practice analyses reveal that annotators can reach an annotation speed of 0.025 coreference relations per second on a corpus with a coreference density of 0.076 coreference relations per word. As the first multi-language open-source cross-document entity and event coreference annotation tool, CDCAT can theoretically achieve higher annotation efficiency than CROMER.
Keywords: event coreference; entity coreference; manual annotation tool; natural language processing
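The two figures quoted in the abstract (0.025 coreference relations per second, 0.076 coreference relations per word) can be combined into a rough reading-throughput estimate. The sketch below only rearranges those two numbers; it assumes both refer to the same corpus, as the abstract states, and that relations are spread evenly over the text. The function names are hypothetical.

```python
def seconds_per_relation(relations_per_second: float) -> float:
    # Average time an annotator spends per labelled coreference relation.
    return 1.0 / relations_per_second


def words_per_second(relations_per_second: float, relations_per_word: float) -> float:
    # Words of corpus text covered per second at the stated speed and density.
    return relations_per_second / relations_per_word


speed = 0.025     # coreference relations per second (from the abstract)
density = 0.076   # coreference relations per word (from the abstract)

print(f"seconds per relation: {seconds_per_relation(speed):.1f}")
print(f"words per minute: {60 * words_per_second(speed, density):.1f}")
```

Read this way, the stated speed corresponds to roughly 40 seconds per relation, or on the order of 20 words of text per minute.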