Journal literature: 215 articles found
1. Word Sense Disambiguation Based Sentiment Classification Using Linear Kernel Learning Scheme
Authors: P. Ramya, B. Karthik. Intelligent Automation & Soft Computing, SCIE, 2023, Issue 5, pp. 2379-2391 (13 pages)
Word Sense Disambiguation has been a trending research topic in Natural Language Processing and Machine Learning. Mining core features and performing text classification remain challenging tasks. Contextual features, such as neighboring adjectives, provide the evidence for classification with a machine learning approach. This paper presents text document classification, which has wide applications in information retrieval, using movie review datasets. Document indexing is based on controlled vocabulary, adjectives, word sense disambiguation, hierarchical categorization of web pages, spam detection, topic labeling, web search, document summarization, etc. A kernel support vector machine learning algorithm classifies the text, and feature extraction is performed by cuckoo search optimization. Positive and negative movie reviews are used to obtain better classification accuracy. Experimental results focus on context mining, feature analysis, and classification; compared with previous work, the proposed approach achieves more efficient results. The overall design is implemented with the MATLAB 2020a tool.
Keywords: text classification; word sense disambiguation; kernel support vector machine learning algorithm; cuckoo search optimization; feature extraction
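The linear-kernel SVM classification step this abstract describes can be illustrated with a minimal scikit-learn sketch; the inline toy reviews and the TF-IDF features are assumptions standing in for the paper's movie-review data and cuckoo-search-selected features, not the authors' MATLAB pipeline.

```python
# Minimal sketch of linear-kernel SVM sentiment classification (assumed setup).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

reviews = [
    "a brilliant, moving film with superb acting",
    "dull plot and terrible pacing, a waste of time",
    "an enjoyable story that keeps you engaged",
    "poorly written characters and a boring script",
]
labels = ["pos", "neg", "pos", "neg"]

# TF-IDF features feed a linear-kernel SVM; no metaheuristic feature selection here.
clf = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), SVC(kernel="linear"))
clf.fit(reviews, labels)
print(clf.predict(["a superb and moving script"]))
```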
2. Word sense disambiguation using semantic relatedness measurement (cited 7 times)
Author: YANG Che-Yu. Journal of Zhejiang University-Science A (Applied Physics & Engineering), SCIE EI CAS CSCD, 2006, Issue 10, pp. 1609-1625 (17 pages)
All human languages have words that can mean different things in different contexts; such words with multiple meanings are potentially “ambiguous”. The process of “deciding which of several meanings of a term is intended in a given context” is known as “word sense disambiguation (WSD)”. This paper presents a method of WSD that assigns a target word the sense that is most related to the senses of its neighbor words. We explore the use of measures of relatedness between word senses based on a novel hybrid approach. First, we investigate how to “literally” and “regularly” express a “concept”. We apply set algebra to WordNet’s synsets in cooperation with WordNet’s word ontology. In this way we establish regular rules for constructing various representations (lexical notations) of a concept using Boolean operators and word forms in the synset(s) defined in WordNet. We then establish a formal mechanism for quantifying and estimating the semantic relatedness between concepts: we use “concept distribution statistics” to determine the degree of semantic relatedness between two lexically expressed concepts. The experimental results showed good performance on SemCor, a subset of the Brown corpus. We observe that measures of semantic relatedness are useful sources of information for WSD.
Keywords: WSD; semantics; WordNet; natural language processing
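A hedged sketch of the core idea, choosing the WordNet sense of a target word that is most related to the senses of its neighbors; it uses NLTK's path similarity in place of the paper's set-algebra and concept-distribution measure, so the scoring function is an assumption.

```python
# Sketch: assign the target word the WordNet sense most related to its neighbors.
# Requires NLTK with the WordNet corpus installed (nltk.download('wordnet')).
from nltk.corpus import wordnet as wn

def disambiguate(target, neighbors):
    best_sense, best_score = None, -1.0
    for sense in wn.synsets(target):
        score = 0.0
        for word in neighbors:
            sims = [sense.path_similarity(ns) or 0.0 for ns in wn.synsets(word)]
            score += max(sims, default=0.0)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

print(disambiguate("bank", ["money", "deposit", "loan"]))
```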
3. Graph-Based Chinese Word Sense Disambiguation with Multi-Knowledge Integration (cited 1 time)
Authors: Wenpeng Lu, Fanqing Meng, Shoujin Wang, Guoqiang Zhang, Xu Zhang, Antai Ouyang, Xiaodong Zhang. Computers, Materials & Continua, SCIE EI, 2019, Issue 7, pp. 197-212 (16 pages)
Word sense disambiguation (WSD) is a fundamental but significant task in natural language processing, which directly affects the performance of downstream applications. However, WSD is very challenging due to the knowledge bottleneck problem, i.e., it is hard to acquire abundant disambiguation knowledge, especially in Chinese. To solve this problem, this paper proposes a graph-based Chinese WSD method with multi-knowledge integration. In particular, a graph model combining various Chinese and English knowledge resources through word sense mapping is designed. First, the content words in a Chinese ambiguous sentence are extracted and mapped to English words with BabelNet. Then, English word similarity is computed based on English word embeddings and a knowledge base, while Chinese word similarity is evaluated with Chinese word embeddings and HowNet, respectively. The weights of the three kinds of word similarity are optimized with a simulated annealing algorithm to obtain their overall similarities, which are used to construct a disambiguation graph. A graph scoring algorithm evaluates the importance of each word sense node and judges the right senses of the ambiguous words. Extensive experimental results on the SemEval dataset show that the proposed WSD method significantly outperforms the baselines.
Keywords: word sense disambiguation; graph model; multi-knowledge integration; word similarity
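As a rough illustration of the graph construction and scoring described above, the sketch below combines three similarity sources with fixed weights and ranks sense nodes with PageRank via networkx; the toy similarity values and hand-set weights are assumptions (the paper tunes the weights with simulated annealing and draws the similarities from BabelNet, embeddings, and HowNet).

```python
# Sketch: edge weights combine several word-similarity sources; sense nodes are
# then scored with PageRank. Toy scores and fixed weights are stand-ins.
import networkx as nx

def combined_similarity(sim_en, sim_zh_embed, sim_hownet, w=(0.4, 0.3, 0.3)):
    return w[0] * sim_en + w[1] * sim_zh_embed + w[2] * sim_hownet

G = nx.Graph()
G.add_edge("w1_sense_a", "w2_sense_x", weight=combined_similarity(0.8, 0.7, 0.9))
G.add_edge("w1_sense_a", "w2_sense_y", weight=combined_similarity(0.2, 0.3, 0.1))
G.add_edge("w1_sense_b", "w2_sense_x", weight=combined_similarity(0.3, 0.2, 0.4))
G.add_edge("w1_sense_b", "w2_sense_y", weight=combined_similarity(0.1, 0.2, 0.2))

scores = nx.pagerank(G, weight="weight")
print(max(scores, key=scores.get))  # highest-ranked sense node
```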
4. WORD SENSE DISAMBIGUATION BASED ON IMPROVED BAYESIAN CLASSIFIERS (cited 1 time)
Authors: Liu Ting, Lu Zhimao, Li Sheng. Journal of Electronics (China), 2006, Issue 3, pp. 394-398 (5 pages)
Word Sense Disambiguation (WSD) is the task of deciding the sense of an ambiguous word in a particular context. Most current studies on WSD use only a few ambiguous words as test samples, which limits practical application. In this paper, we perform a WSD study on a large-scale real-world corpus using two unsupervised learning algorithms: a ±n-improved Bayesian model and a Dependency Grammar (DG)-improved Bayesian model. The ±n-improved classifiers reduce the context window size of ambiguous words with a close-distance feature extraction method and decrease the interference of useless features, thereby clearly improving accuracy, reaching 83.18% in the open test. The DG-improved classifier more effectively overcomes the noise present in the Naive Bayesian classifier. Experimental results show that this approach performs well on Chinese WSD, with the open test achieving an accuracy of 86.27%.
Keywords: Bayesian classifier; natural language processing; NLP; learning algorithm; dependency
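A minimal sketch of the ±n idea: a Naive Bayes sense classifier whose features come only from a small window around the ambiguous word. The toy sense-tagged examples and the window size are assumptions, and the dependency-grammar variant is not shown.

```python
# Sketch: Naive-Bayes WSD restricted to a +/-n context window (toy data).
from collections import Counter, defaultdict
import math

def window(tokens, idx, n):
    return tokens[max(0, idx - n):idx] + tokens[idx + 1:idx + 1 + n]

train = [  # (tokens, index of ambiguous word, sense label)
    (["river", "bank", "flood"], 1, "shore"),
    (["deposit", "bank", "loan"], 1, "finance"),
    (["bank", "account", "money"], 0, "finance"),
]

prior, cond, vocab = Counter(), defaultdict(Counter), set()
for tokens, idx, sense in train:
    feats = window(tokens, idx, n=2)
    prior[sense] += 1
    cond[sense].update(feats)
    vocab.update(feats)

def classify(tokens, idx, n=2):
    feats = window(tokens, idx, n)
    def log_post(sense):  # log prior + add-one smoothed log likelihoods
        total = sum(cond[sense].values()) + len(vocab)
        return math.log(prior[sense]) + sum(
            math.log((cond[sense][f] + 1) / total) for f in feats)
    return max(prior, key=log_post)

print(classify(["the", "bank", "charged", "interest"], 1))
```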
5. Word sense disambiguation based on rough set
Authors: 陈清才, 王晓龙, 赵健, 陈滨, 王长风. Journal of Harbin Institute of Technology (New Series), EI CAS, 2002, Issue 2, pp. 201-204 (4 pages)
A sense feature system (SFS) is first automatically constructed from the text corpora to structure the textual information. WSD rules are then extracted from the SFS according to their certainty factors and applied to disambiguate the senses of polysemous words. The entropy of a deterministic rough prediction is used to measure the decision quality of a rule set. Finally, a back-off rule smoothing method is designed to further improve the performance of the WSD model. In the experiments, the mean correction rate achieved for WSD with rule smoothing is 0.92.
Keywords: word sense disambiguation; rough set; sense feature system
6. Word Sense Disambiguation in Information Retrieval
Authors: Francis de la C. Fernández REYES, Exiquio C. Pérez LEYVA, Rogelio Lau FERNÁNDEZ. Intelligent Information Management, 2009, Issue 2, pp. 122-127 (6 pages)
Natural language processing has a set of phases that evolve from lexical text analysis to pragmatic analysis, in which the author’s intentions are revealed. The ambiguity problem appears in all of these tasks. Previous work addresses word sense disambiguation, the process of assigning a sense to a word within a specific context, by creating algorithms under a supervised or unsupervised approach, which indicates whether or not those algorithms use an external lexical resource. This paper presents an approximate approach that combines unsupervised algorithms through a set of classifiers; the result is a learning algorithm based on unsupervised methods for the word sense disambiguation process. It begins with an introduction to word sense disambiguation concepts, then analyzes several unsupervised algorithms in order to extract the best of them, and combines them under a supervised approach making use of some classifiers.
Keywords: disambiguation; algorithms; natural language processing; word sense disambiguation
7. Word Sense Disambiguation Model with a Cache-Like Memory Module
Authors: 林倩, 刘鑫, 辛春蕾, 张海英, 曾华琳, 张同辉, 苏劲松. Journal of Donghua University (English Edition), CAS, 2021, Issue 4, pp. 333-340 (8 pages)
Word sense disambiguation (WSD), identifying the specific sense of the target word given its context, is a fundamental task in natural language processing. Recently, researchers have shown promising results using long short-term memory (LSTM), which is able to better capture sequential and syntactic features of text. However, this method neglects the dependencies among instances, such as their contextual semantic similarities. To solve this problem, we propose a novel WSD model that introduces a cache-like memory module to capture the semantic dependencies among instances for WSD. Extensive evaluations on standard datasets demonstrate the superiority of the proposed model over various baselines.
Keywords: word sense disambiguation (WSD); memory module; semantic dependencies
8. Improving the Collocation Extraction Method Using an Untagged Corpus for Persian Word Sense Disambiguation
Authors: Noushin Riahi, Fatemeh Sedghi. Journal of Computer and Communications, 2016, Issue 4, pp. 109-124 (16 pages)
Word sense disambiguation is used in many natural language processing fields. One way to disambiguate is the decision list algorithm, a supervised method. Supervised methods are considered the most accurate machine learning algorithms, but they are strongly affected by the knowledge acquisition bottleneck: their efficiency depends on the size of the tagged training set, whose preparation is difficult, time-consuming, and costly. The method proposed in this article improves the efficiency of this algorithm when only a small tagged training set is available. It uses a statistical method for collocation extraction from a big untagged corpus, so the more important collocations, which are the features used to create the learning hypotheses, are identified. Weighting the features improves the efficiency and accuracy of a decision list algorithm trained with a small training corpus.
Keywords: collocation extraction; word sense disambiguation; untagged corpus; decision list
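A small sketch of the decision-list core that this work builds on: each collocation feature gets a smoothed log-likelihood score between two senses, and the highest-ranked feature present in the context decides. The toy counts are assumptions, and the untagged-corpus re-weighting step is not reproduced.

```python
# Sketch: a Yarowsky-style decision list for WSD. Each collocation feature is
# scored by a smoothed log-likelihood ratio between two senses; at test time
# the single highest-ranked matching feature decides. Counts are toy data.
import math

# (feature, count with sense A, count with sense B) from a small tagged set.
counts = [
    ("river",   8, 0),
    ("water",   5, 1),
    ("money",   0, 9),
    ("account", 1, 7),
]

decision_list = sorted(
    ((abs(math.log((a + 0.5) / (b + 0.5))), feat, "A" if a > b else "B")
     for feat, a, b in counts),
    reverse=True,
)

def classify(context_words, default="A"):
    for _, feat, sense in decision_list:
        if feat in context_words:
            return sense
    return default

print(classify({"opened", "an", "account"}))  # 'B'
```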
9. Chinese word sense disambiguation based on neural networks
Authors: 刘挺, 卢志茂, 郎君, 李生. Journal of Harbin Institute of Technology (New Series), EI CAS, 2005, Issue 4, pp. 408-414 (7 pages)
The input of the network is the key problem for Chinese word sense disambiguation with a neural network. This paper presents an input model for the neural network that calculates the mutual information between contextual words and the ambiguous word using statistical methodology, taking a certain number of contextual words on either side of the ambiguous word according to (-M, +N). The experiment adopts a three-layer BP neural network model and shows how the size of the training set and the values of M and N affect the performance of the neural network model. The experimental objects are six pseudowords with three word senses each, constructed according to certain principles. The tested accuracy of our approach reaches 90.31% on a closed corpus and 89.62% on an open corpus. The experiment shows that the neural network model performs well on word sense disambiguation.
Keywords: artificial neural network; Chinese; word; word sense disambiguation
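The input statistic this abstract describes can be sketched as a pointwise mutual information between the ambiguous word and words found in its (-M, +N) windows; the tiny corpus and window sizes below are assumptions, and the BP network itself is omitted.

```python
# Sketch: pointwise mutual information between an ambiguous word and the words
# in its (-M, +N) context windows, the kind of statistic fed to the network.
import math
from collections import Counter

corpus = "the bank approved the loan the river bank flooded after rain".split()
target, M, N = "bank", 2, 2

word_freq = Counter(corpus)
total = len(corpus)

# Count co-occurrences of the target with words in its (-M, +N) windows.
cooc = Counter()
for i, w in enumerate(corpus):
    if w == target:
        cooc.update(corpus[max(0, i - M):i] + corpus[i + 1:i + 1 + N])

def pmi(word):
    if not cooc[word]:
        return float("-inf")
    p_xy = cooc[word] / total
    p_x = word_freq[target] / total
    p_y = word_freq[word] / total
    return math.log2(p_xy / (p_x * p_y))

for w in ["loan", "river", "the"]:
    print(w, round(pmi(w), 3))
```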
10. System Combination Based on WordNet Word Sense Disambiguation (cited 12 times)
Authors: 刘宇鹏, 李生, 赵铁军. 《自动化学报》 (Acta Automatica Sinica), EI CSCD, PKU Core, 2010, Issue 11, pp. 1575-1580 (6 pages)
Confusion networks have recently shown good performance in combining the outputs of multiple machine translation systems. However, to overcome the different word orders produced by different translation systems, hypothesis alignment remains an important problem in confusion network construction, and previous alignment methods have not taken semantic information into account. To better improve system combination performance, this paper proposes using word sense disambiguation (WSD) to guide the alignment in the confusion network. Meanwhile, the skeleton translation is selected by computing inter-sentence similarity, where sentence similarity is computed with a maximum matching algorithm on a bipartite graph. To integrate the WordNet-based WSD method into the system, the translation error rate (TER) algorithm is improved. Experimental results show that the proposed method outperforms the classical TER algorithm.
Keywords: system combination; translation error rate; word sense disambiguation; confusion network
11. An Unsupervised Word Sense Disambiguation Method Based on WordNet (cited 6 times)
Authors: 王瑞琴, 孔繁胜, 潘俊. 《浙江大学学报(工学版)》 (Journal of Zhejiang University, Engineering Science), EI CAS CSCD, PKU Core, 2010, Issue 4, pp. 732-737 (6 pages)
Supervised machine learning methods for word sense disambiguation require a large amount of manual sense annotation, making them difficult to apply to large-scale disambiguation tasks. This paper proposes an unsupervised disambiguation method that avoids manual sense annotation. The method uses multiple knowledge sources in the WordNet knowledge base (including sense definitions, usage examples, structured semantic relations, and domain attributes) to describe the sense information of ambiguous words, generates a "representative word set" and a "domain representative word set" for each sense, and determines the sense by combining word frequency distribution information with the surrounding context. Using the general test set Senseval-3, open experiments were conducted against six typical unsupervised WSD methods; the proposed method achieved an average disambiguation accuracy of 49.93%.
Keywords: word sense disambiguation; WordNet knowledge base; structured semantic relations
12. Ontology Clarification Based on WordNet (cited 4 times)
Authors: 郭雷, 方俊, 王晓东. 《计算机科学》 (Computer Science), CSCD, PKU Core, 2008, Issue 10, pp. 145-147, 185 (4 pages)
Because ontologies can eliminate conceptual confusion and enable knowledge reuse, their quality is very important for applications of Semantic Web technology. To improve ontology quality, much work has focused on concept modeling, but ontology representation, a very important aspect, has long been neglected. Ontologies are currently represented with terms, yet the same term may have many different meanings, which leads to unclear or incorrect understanding in ontology-based applications. To solve this problem, senses defined in WordNet, rather than terms, are used to represent the ontology, since a sense has only one meaning. Ontology clarification is defined as the process of automatically disambiguating a target term using the ontology elements around it and the words near it in the documents it annotates. By computing the semantic similarity between the target term's candidate senses and its neighbor words, the sense with the greatest semantic relatedness is selected as the correct one. Experiments show that the algorithm performs well: compared with the best existing disambiguation algorithms, concept precision is roughly twice the noun precision and property precision is roughly three times the verb precision. The experiments also demonstrate that the algorithm is very effective in a semi-automatic ontology cleaning process.
Keywords: ontology clarification; semantic relatedness; disambiguation
13. Research on Semantic Retrieval Based on WordNet Word Sense Disambiguation (cited 8 times)
Authors: 高雪霞, 炎士涛. 《湘潭大学自然科学学报》 (Natural Science Journal of Xiangtan University), PKU Core, 2017, Issue 2, pp. 118-121 (4 pages)
Addressing supervised and knowledge-base-based word sense disambiguation, a new WSD algorithm based on the Jaccard coefficient is proposed to solve the problem of incorrect sense pairing. Knowledge sources in the WordNet knowledge base are used to represent the sense information of ambiguous words and to generate a sense resource library, and information retrieval is then performed with the proposed Jaccard-coefficient-based WSD algorithm. Test results show that, with the new WSD algorithm, the precision of the information retrieval system improves by 10% over a traditional information retrieval system.
Keywords: word sense disambiguation; information retrieval; Jaccard coefficient; WordNet; precision; recall
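A hedged sketch of Jaccard-coefficient sense scoring in the spirit of this abstract: each WordNet sense's gloss and example words are compared with the query context, and the highest-overlap sense wins. It queries NLTK's WordNet directly; the paper's sense resource library and retrieval integration are not reproduced.

```python
# Sketch: pick the WordNet sense whose gloss/example words have the largest
# Jaccard overlap with the context words. Requires NLTK's WordNet corpus.
from nltk.corpus import wordnet as wn

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

def best_sense(target, context):
    context = set(context)
    scored = []
    for sense in wn.synsets(target):
        words = set(sense.definition().lower().split())
        for example in sense.examples():
            words |= set(example.lower().split())
        scored.append((jaccard(context, words), sense))
    best = max(scored, key=lambda pair: pair[0], default=(0.0, None))
    return best[1]

print(best_sense("bank", {"money", "deposit", "interest", "loan"}))
```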
14. A WordNet-Based Heteronym Disambiguation Algorithm for English Speech Synthesis (cited 1 time)
Authors: 王永生, 李梅. 《计算机工程与应用》 (Computer Engineering and Applications), CSCD, PKU Core, 2008, Issue 26, pp. 138-140 (3 pages)
English heteronyms fall into two classes: words whose pronunciation differs with part of speech, and words whose pronunciation differs with word sense. For the former, part-of-speech tagging is sufficient, and the correct pronunciation can be determined from the POS tag. The latter is much more complex; this paper adopts a heteronym disambiguation algorithm based on WordNet semantic information, which matches the semantic information of the heteronym against the semantic information of the words in its context and determines the pronunciation from the matching result.
Keywords: heteronym disambiguation; word sense disambiguation; speech synthesis
15. A WordNet-Based General Service Classification Method
Authors: 何佳, 赵海燕, 陈庆奎, 席丽娜, 曹健. 《计算机工程与科学》 (Computer Engineering & Science), CSCD, PKU Core, 2013, Issue 9, pp. 157-161 (5 pages)
With the development of service technology, more and more organizations publish their business functions as services over the network. The growing number of services makes manual classification increasingly costly. Combining text mining, semantic technology, and machine learning, this paper proposes an automatic WordNet-based service classification method. First, text mining and semantic disambiguation techniques are used to obtain, from service description documents and social annotations, a set of semantically precise Sense vectors describing each service; the Sense vectors chosen in this paper are the sets of Tags socially annotated on each API. Then, K-means clustering is used to complete the classification. Finally, experiments with services from ProgrammableWeb as test data show that the method achieves good classification results.
Keywords: service; semantic disambiguation; social annotation; similarity; WordNet
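The final clustering step can be sketched as K-means over bag-of-tags vectors; the toy service tags below are assumptions, and the WordNet sense-mapping of tags that precedes clustering in the paper is omitted.

```python
# Sketch: cluster services by their tag sets with K-means (toy tags; the
# WordNet-based disambiguation of tags is not shown).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.cluster import KMeans

service_tags = {
    "api_weather": "weather forecast temperature climate",
    "api_storms": "weather storm warning climate",
    "api_maps": "map route navigation location",
    "api_transit": "route bus navigation schedule",
}

X = CountVectorizer().fit_transform(service_tags.values())
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)
print(dict(zip(service_tags, labels)))
```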
16. A WordNet-Based Web Page Context Parsing Algorithm
Authors: 蔡劲松, 邹汪平. 《咸阳师范学院学报》 (Journal of Xianyang Normal University), 2015, Issue 4, pp. 56-60 (5 pages)
A WordNet-based web page context parsing algorithm is proposed. A set of web pages is collected and parsed with a DOM-tree-based parser; the page body text, generation time, and update time are extracted; WordNet-based part-of-speech tagging and word sense disambiguation are applied to the page set; and named entity recognition is used to extract the time and place information in the page body as the page's context representation. Comparative experiments show that the proposed method and theory can automatically parse web page context information, providing substantial help for search.
Keywords: WordNet; word sense disambiguation; context representation; context parsing
17. Improving Entity Linking in Chinese Domain by Sense Embedding Based on Graph Clustering (cited 1 time)
Authors: 张照博, 钟芷漫, 袁平鹏, 金海. Journal of Computer Science & Technology, SCIE EI CSCD, 2023, Issue 1, pp. 196-210 (15 pages)
Entity linking refers to linking a string in a text to the corresponding entities in a knowledge base through candidate entity generation and candidate entity ranking. It is of great significance to some NLP (natural language processing) tasks, such as question answering. Unlike English entity linking, Chinese entity linking requires more consideration due to the lack of spacing and capitalization in text sequences and the ambiguity of characters and words, which is more evident in certain scenarios. In Chinese domains, such as industry, the generated candidate entities are usually composed of long strings and are heavily nested. In addition, the meanings of the words that make up industrial entities are sometimes ambiguous. Their semantic space is a subspace of the general word embedding space, and thus each entity word needs to get its exact meanings. Therefore, we propose two schemes to achieve better Chinese entity linking. First, we implement an n-gram-based candidate entity generation method to increase the recall rate and reduce the nesting noise. Then, we enhance the corresponding candidate entity ranking mechanism by introducing sense embedding. Considering the contradiction between the ambiguity of word vectors and the single sense of the industrial domain, we design a sense embedding model based on graph clustering, which adopts an unsupervised approach for word sense induction and learns sense representations in conjunction with context. We test the embedding quality of our approach on classical datasets and demonstrate its disambiguation ability in general scenarios. Experiments confirm that our method can better learn the fundamental laws of candidate entities in the industrial domain and achieve better performance on entity linking.
Keywords: natural language processing (NLP); domain entity linking; computational linguistics; word sense disambiguation; knowledge graph
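The n-gram-based candidate entity generation step can be sketched as enumerating character n-grams of a mention and keeping those found in the knowledge base, which is what raises recall for long, nested industrial entities; the tiny knowledge base below is an assumption, and the sense-embedding ranking is not shown.

```python
# Sketch: n-gram based candidate entity generation for Chinese entity linking.
# Enumerate character n-grams of the mention and keep those found in the
# knowledge base. The tiny KB below is a stand-in.
def candidate_entities(mention, kb, max_n=None):
    max_n = max_n or len(mention)
    candidates = set()
    for n in range(2, max_n + 1):            # n-gram length
        for i in range(len(mention) - n + 1):
            gram = mention[i:i + n]
            if gram in kb:
                candidates.add(gram)
    return candidates

knowledge_base = {"轴承", "滚动轴承", "轴承钢", "钢厂"}
print(candidate_entities("滚动轴承钢厂", knowledge_base))
# {'轴承', '滚动轴承', '轴承钢', '钢厂'}
```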
18. An Event Detection Method Based on a Pre-trained Language Model and Multi-task Learning
Authors: 韩如雪, 杨苗, 宫小泽, 胡镑, 王永利, 熊伟, 赵显伟, 徐琳. 《南京理工大学学报》 (Journal of Nanjing University of Science and Technology), CAS CSCD, PKU Core, 2023, Issue 6, pp. 748-755 (8 pages)
To address the inaccurate trigger word extraction and incorrect type judgment caused by sparse corpora and trigger word polysemy in existing event detection methods, this paper combines the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model with a conditional random field (CRF) and joint multi-task learning, and proposes an event detection method based on the BERT-CRF model and multi-task learning (MBCED). The method performs the event detection task and the word sense disambiguation task simultaneously, transferring the knowledge learned in the WSD task to event detection, which both supplements the corpus and alleviates the trigger classification errors caused by polysemy. Comparative experiments against traditional event detection models on the ACE2005 dataset show that, compared with the dynamic multi-pooling convolutional neural network (DMCNN), the recurrent-neural-network-based joint model (JRNN), the joint model based on bidirectional long short-term memory and conditional random fields (BiLSTM-CRF), and the BERT-CRF method, MBCED improves the F-score of trigger word identification by 1.2%. Comparative experiments against multi-task learning models show that, compared with the multi-task deep learning model for joint entity and event extraction (MDL-J3E), the shared-BERT multi-task learning model (MSBERT), and the CRF-based multi-task learning event extraction model (MTL-CRF), MBCED achieves better accuracy on both the trigger identification and trigger classification subtasks.
Keywords: word sense disambiguation; pre-trained model; multi-task learning; event detection; corpus sparsity; trigger word identification; conditional random field; trigger word classification
19. Word Sense Disambiguation of Fabric Finishing Terminology and Research on Hotspot Trends
Authors: 李启正, 胡崴琳, 王成龙. 《染整技术》, CAS, 2023, Issue 5, pp. 6-10 (5 pages)
Fabric finishing involves multiple chemical and mechanical processing methods and is a typical interdisciplinary field; however, its terminology is expressed in diverse and non-standard ways, which makes literature retrieval and academic hotspot analysis inconvenient and difficult. This study collected more than 200,000 papers from 38 textile-field journals indexed by CNKI, selected the 25,691 papers with the Chinese Library Classification code TS19 (dyeing and finishing industry), and, drawing on the latest editions of 《纺织词典》, 《纺织汉语叙词表》, and 《纺织大百科》 as well as the terms appearing in the papers' original keywords, titles, and abstracts, compiled 122 terms related to fabric finishing. Combining big data analysis and expert comparison, word sense disambiguation was performed on the "one meaning, multiple terms" phenomenon, yielding 89 fabric finishing terms with time attribute labels. This makes further study of hotspot changes and development trends in fabric finishing technology possible. Bibliometric burst detection and visual analysis show that hydrophobic finishing, flame-retardant finishing, water-repellent finishing, antibacterial finishing, and anti-UV finishing are emerging hotspots in fabric finishing in recent years, and the related papers are more likely to come from national-level research projects. The results can help researchers, specialized journals, and databases standardize terminology and enable semantic retrieval and academic hotspot analysis.
Keywords: fabric finishing; word sense disambiguation; one meaning with multiple terms; burst detection; academic hotspots
20. Unsupervised WSD by Finding the Predominant Sense Using Context as a Dynamic Thesaurus (cited 1 time)
Authors: Javier Tejada-Carcamo, Hiram Calvo, Alexander Gelbukh, Kazuo Hara. Journal of Computer Science & Technology, SCIE EI CSCD, 2010, Issue 5, pp. 1030-1039 (10 pages)
We present and analyze an unsupervised method for Word Sense Disambiguation (WSD). Our work is based on the method presented by McCarthy et al. in 2004 for finding the predominant sense of each word in the entire corpus. Their maximization algorithm allows weighted terms (similar words) from a distributional thesaurus to accumulate a score for each ambiguous word sense, i.e., the sense with the highest score is chosen based on votes from a weighted list of terms related to the ambiguous word. This list is obtained with the distributional similarity method proposed by Lin Dekang to build a thesaurus. In the method of McCarthy et al., every occurrence of the ambiguous word uses the same thesaurus, regardless of the context in which the ambiguous word occurs. Our method accounts for the context of a word when determining the sense of an ambiguous word by building the list of distributionally similar words based on the syntactic context of the ambiguous word. We obtain a top precision of 77.54% versus 67.10% for the original method when tested on SemCor. We also analyze the effect of the number of weighted terms on the tasks of finding the Most Frequent Sense (MFS) and WSD, and experiment with several corpora for building the Word Space Model.
Keywords: word sense disambiguation; word space model; semantic similarity; text corpus; thesaurus
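A hedged sketch of the McCarthy-style scoring this work extends: each distributionally similar neighbor votes for the target's WordNet senses, weighted by its distributional similarity times its best WordNet similarity to the candidate sense. The neighbor list, the weights, and the use of path similarity are assumptions standing in for a Lin-style thesaurus and the paper's context-dependent neighbor selection.

```python
# Sketch: predominant-sense scoring. Each thesaurus neighbour votes for the
# target word's senses, weighted by distributional similarity times its best
# WordNet path similarity to the candidate sense. Neighbours are toy data.
from nltk.corpus import wordnet as wn

def predominant_sense(target, neighbours):
    scores = {}
    for sense in wn.synsets(target):
        score = 0.0
        for word, dist_sim in neighbours:
            wn_sims = [sense.path_similarity(ns) or 0.0
                       for ns in wn.synsets(word)]
            score += dist_sim * max(wn_sims, default=0.0)
        scores[sense] = score
    return max(scores, key=scores.get) if scores else None

neighbours = [("institution", 0.28), ("firm", 0.21), ("river", 0.05)]
print(predominant_sense("bank", neighbours))
```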