Abstract
This paper addresses the automatic extraction of synaesthetic sentences in modern Chinese and the mapping tendencies between sensory domains. Sentences are extracted from a corpus using word lists built for five sensory domains (touch, taste, smell, hearing, and vision) together with part-of-speech (POS) matching. Two methods are compared. The first relies only on sense-word matching and extracts sentences that contain perception-related words from two or more domains; its accuracy is 20.78%. The second adds a check on POS distribution patterns, which raises the accuracy to 46.37%. The main difficulties lie in selecting and collecting the words for the five sense word lists and in summarizing the POS distribution rules of the perception-related words. Finally, the mappings from source domain to target domain in the extracted sentences are counted, and their directionality is checked against that reported for other languages.
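The two extraction methods described above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration: the tiny word lists, the POS tags (`a` for adjective, `n` for noun, in the style of common Chinese tag sets), and the single rule "sense adjective, optionally followed by 的, modifying a sense noun from another domain". The paper's actual word lists and POS rules are not given in this abstract. Input is assumed to be already segmented and tagged.

```python
from collections import Counter

# Hypothetical miniature word lists; the paper's real lists are much larger.
SENSE_WORDS = {
    "touch": {"温暖", "冰冷", "柔软"},
    "taste": {"甜", "苦", "酸"},
    "smell": {"香", "臭"},
    "hearing": {"声音", "歌声", "响亮"},
    "vision": {"明亮", "颜色", "目光"},
}
# Reverse lookup: word -> sense domain (words are unique across lists here).
WORD_TO_SENSE = {w: s for s, words in SENSE_WORDS.items() for w in words}


def is_candidate(tagged):
    """Method 1 (word matching only): sense words from two or more domains."""
    senses = {WORD_TO_SENSE[w] for w, _ in tagged if w in WORD_TO_SENSE}
    return len(senses) >= 2


def pos_pairs(tagged):
    """Method 2 (assumed POS rule): return (source, target) domain pairs where
    a sense adjective directly modifies a sense noun of a different domain."""
    pairs = []
    for i, (word, pos) in enumerate(tagged):
        if pos != "a" or word not in WORD_TO_SENSE:
            continue
        j = i + 1
        if j < len(tagged) and tagged[j][0] == "的":
            j += 1  # skip the attributive particle
        if j < len(tagged):
            head, head_pos = tagged[j]
            if (head_pos == "n" and head in WORD_TO_SENSE
                    and WORD_TO_SENSE[head] != WORD_TO_SENSE[word]):
                pairs.append((WORD_TO_SENSE[word], WORD_TO_SENSE[head]))
    return pairs


def mapping_counts(tagged_sentences):
    """Count source-to-target domain mappings over all extracted sentences."""
    counts = Counter()
    for tagged in tagged_sentences:
        counts.update(pos_pairs(tagged))
    return counts


# "sweet singing": a taste adjective modifying a hearing noun.
synaesthetic = [("甜", "a"), ("的", "u"), ("歌声", "n")]
# Two sense domains co-occur, but neither sense word modifies the other.
literal = [("他", "r"), ("吃", "v"), ("了", "u"), ("甜", "a"), ("的", "u"),
           ("苹果", "n"), (",", "w"), ("听到", "v"), ("声音", "n")]

for sentence in (synaesthetic, literal):
    print(is_candidate(sentence), pos_pairs(sentence))
```

In this toy example both sentences pass the word-matching filter, but only the first yields a source-to-target pair under the POS rule. That is one plausible reason why a POS check can improve on word matching alone, in line with the accuracy difference reported in the abstract.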
Source
《计算机工程与科学》
CSCD
Peking University Core Journals (北大核心)
2015, Issue 12, pp. 2294-2299 (6 pages)
Computer Engineering & Science
Funding
Word Chinese and Their Grammatical Variations: Empirical Studies based on Comparable Corpora (GRF project 543512)
Keywords
modern Chinese
synaesthesia
perception-related word
automatic extraction