
Research on Crowdsourcing-Based Uyghur Event Annotation (基于众包的维吾尔语事件标注研究). Cited by: 1

Building Uyghur Language Event Annotation Corpus with Crowdsourcing
Abstract: Large-scale annotated corpora play an important role in natural language processing (NLP) research, particularly in semantic understanding and algorithm development, and they encourage new ideas, tasks, and algorithms. No event-annotated corpus yet exists for Uyghur, and event annotation requires only simple human judgment, so this paper proposes a crowdsourcing-based method for annotating Uyghur events. The authors first formulate a Uyghur event annotation specification, then build an annotation platform with a three-layer architecture, and finally propose error correction and quality control mechanisms to ensure annotation quality. The resulting Uyghur event annotation corpus provides an important resource for research on Uyghur events.
Source: Journal of Xinjiang University (Natural Science Edition) (《新疆大学学报(自然科学版)》), indexed in CAS and the PKU Core Journals list, 2015, No. 2, pp. 209-214, 220 (7 pages in total)
Funding: National Natural Science Foundation of China (61331011, 61262060); National Key Basic Research Program of China (973 Program) (2014cb340506)
Keywords: event; Uyghur; annotation; corpus; crowdsourcing
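
The abstract mentions a quality control mechanism for crowdsourced annotations but does not describe it. As a purely illustrative sketch, and not the paper's actual method, the following assumes a common crowdsourcing technique: majority voting over several annotators' labels, with an agreement threshold below which an item is sent back for review. The function name `aggregate_labels`, the threshold `min_agreement`, and the example labels are all hypothetical.

```python
from collections import Counter


def aggregate_labels(labels, min_agreement=0.6):
    """Aggregate one item's crowd labels by majority vote.

    Assumption (not from the paper): an item is accepted only when the
    share of annotators choosing the most frequent label reaches
    `min_agreement`; otherwise it is flagged for re-annotation.

    Returns (label, agreement); label is None when agreement is too low.
    """
    if not labels:
        raise ValueError("labels must not be empty")
    counts = Counter(labels)
    top_label, top_count = counts.most_common(1)[0]
    agreement = top_count / len(labels)
    if agreement >= min_agreement:
        return top_label, agreement
    return None, agreement


# Hypothetical event-type labels from three annotators for one sentence.
label, agreement = aggregate_labels(["Attack", "Attack", "Meet"])
```

In a setup like this, items that come back with `None` would typically be routed to additional annotators or to an expert reviewer; whether the paper's platform works this way is not stated in the abstract.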

