11 articles found
Word sense disambiguation using semantic relatedness measurement (Cited by: 7)
1
Authors: YANG Che-Yu. Journal of Zhejiang University-Science A (Applied Physics & Engineering), SCIE EI CAS CSCD, 2006, No. 10, pp. 1609-1625 (17 pages)
All human languages have words that can mean different things in different contexts; such words with multiple meanings are potentially “ambiguous”. The process of “deciding which of several meanings of a term is intended in a given context” is known as “word sense disambiguation (WSD)”. This paper presents a method of WSD that assigns a target word the sense that is most related to the senses of its neighbor words. We explore the use of measures of relatedness between word senses based on a novel hybrid approach. First, we investigate how to “literally” and “regularly” express a “concept”. We apply set algebra to WordNet’s synsets in cooperation with WordNet’s word ontology. In this way we establish regular rules for constructing various representations (lexical notations) of a concept using Boolean operators and word forms in various synset(s) defined in WordNet. Then we establish a formal mechanism for quantifying and estimating the semantic relatedness between concepts: we use “concept distribution statistics” to determine the degree of semantic relatedness between two lexically expressed concepts. The experimental results showed good performance on SemCor, a subset of the Brown corpus. We observe that measures of semantic relatedness are useful sources of information for WSD.
Keywords: word sense disambiguation (WSD); semantic relatedness; WordNet; natural language processing
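Illustrative sketch (not from the paper): the core idea in the abstract above, assigning the target word the sense most related to the senses of its neighbor words, can be shown with a toy relatedness table. The sense labels, the scores, and the function names are all hypothetical; the paper's own relatedness measure, built on WordNet synsets and concept distribution statistics, is not reproduced here.

```python
# Toy relatedness scores between sense labels (hypothetical values, not from the paper).
RELATEDNESS = {
    frozenset(("bank#finance", "money#currency")): 0.9,
    frozenset(("bank#river", "money#currency")): 0.1,
    frozenset(("bank#finance", "deposit#payment")): 0.8,
    frozenset(("bank#river", "deposit#payment")): 0.2,
    frozenset(("bank#finance", "deposit#sediment")): 0.1,
    frozenset(("bank#river", "deposit#sediment")): 0.7,
}


def relatedness(sense_a, sense_b):
    """Symmetric lookup; pairs missing from the table count as unrelated."""
    return RELATEDNESS.get(frozenset((sense_a, sense_b)), 0.0)


def disambiguate(target_senses, neighbor_sense_lists):
    """Return the target sense with the highest total relatedness.

    For each neighbor word, only its best-matching sense contributes.
    """
    def score(sense):
        return sum(
            max(relatedness(sense, ns) for ns in neighbor_senses)
            for neighbor_senses in neighbor_sense_lists
        )
    return max(target_senses, key=score)


best = disambiguate(
    ["bank#finance", "bank#river"],
    [["money#currency"], ["deposit#payment", "deposit#sediment"]],
)
print(best)
```

Taking the maximum over each neighbor's senses is one possible design choice; summing over all neighbor senses would be another.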
Graph-Based Chinese Word Sense Disambiguation with Multi-Knowledge Integration (Cited by: 1)
2
Authors: Wenpeng Lu, Fanqing Meng, Shoujin Wang, Guoqiang Zhang, Xu Zhang, Antai Ouyang, Xiaodong Zhang. Computers, Materials & Continua, SCIE EI, 2019, No. 7, pp. 197-212 (16 pages)
Word sense disambiguation (WSD) is a fundamental but significant task in natural language processing, which directly affects the performance of upper-level applications. However, WSD is very challenging due to the problem of knowledge bottleneck, i.e., it is hard to acquire abundant disambiguation knowledge, especially in Chinese. To solve this problem, this paper proposes a graph-based Chinese WSD method with multi-knowledge integration. Particularly, a graph model combining various Chinese and English knowledge resources by word sense mapping is designed. Firstly, the content words in a Chinese ambiguous sentence are extracted and mapped to English words with BabelNet. Then, English word similarity is computed based on English word embeddings and a knowledge base. Chinese word similarity is evaluated with Chinese word embeddings and HowNet, respectively. The weights of the three kinds of word similarity are optimized with a simulated annealing algorithm so as to obtain their overall similarities, which are utilized to construct a disambiguation graph. The graph scoring algorithm evaluates the importance of each word sense node and judges the right senses of the ambiguous words. Extensive experimental results on the SemEval dataset show that our proposed WSD method significantly outperforms the baselines.
Keywords: word sense disambiguation; graph model; multi-knowledge integration; word similarity
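Illustrative sketch (not from the paper): the abstract above combines three kinds of word similarity with learned weights and then scores sense nodes in a graph. The snippet below shows that pipeline in miniature under loud assumptions: the weights are fixed by hand instead of being tuned by simulated annealing, weighted degree stands in for the unspecified graph scoring algorithm, and every sense label and similarity value is made up.

```python
def combine(sims, weights):
    """Weighted sum of several similarity scores for one pair of senses."""
    return sum(w * s for w, s in zip(weights, sims))


def pick_senses(candidates, edge_sims, weights):
    """Choose one sense per word by weighted degree in the sense graph.

    candidates: {word: [sense, ...]}
    edge_sims:  {(sense_a, sense_b): (sim_1, sim_2, sim_3)}
    weights:    one weight per similarity source (hand-set here)
    """
    degree = {s: 0.0 for senses in candidates.values() for s in senses}
    for (a, b), sims in edge_sims.items():
        w = combine(sims, weights)
        degree[a] += w
        degree[b] += w
    return {word: max(senses, key=degree.get) for word, senses in candidates.items()}


# Hypothetical toy data: two candidate senses for "apple", one for "phone".
candidates = {
    "apple": ["apple#fruit", "apple#company"],
    "phone": ["phone#device"],
}
edge_sims = {
    ("apple#fruit", "phone#device"): (0.1, 0.2, 0.1),
    ("apple#company", "phone#device"): (0.8, 0.7, 0.9),
}
chosen = pick_senses(candidates, edge_sims, weights=(0.5, 0.3, 0.2))
print(chosen)
```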
WORD SENSE DISAMBIGUATION BASED ON IMPROVED BAYESIAN CLASSIFIERS (Cited by: 1)
3
Authors: Liu Ting, Lu Zhimao, Li Sheng. Journal of Electronics (China), 2006, No. 3, pp. 394-398 (5 pages)
Word Sense Disambiguation (WSD) is the task of deciding the sense of an ambiguous word in a particular context. Most current studies on WSD use only a few ambiguous words as test samples, which limits their practical application. In this paper, we study WSD on a large-scale real-world corpus using two unsupervised learning algorithms based on a ±n-improved Bayesian model and a Dependency Grammar (DG)-improved Bayesian model. The ±n-improved classifiers reduce the context window of ambiguous words with a close-distance feature extraction method and decrease the interference of useless features, thus clearly improving the accuracy, which reaches 83.18% (in the open test). The DG-improved classifier can more effectively overcome the noise effect in the Naive Bayesian classifier. Experimental results show that this approach does better on Chinese WSD, and the open test achieved an accuracy of 86.27%.
Keywords: Word Sense Disambiguation (WSD); Natural Language Processing (NLP); unsupervised learning algorithm; Dependency Grammar (DG); Bayesian classifier
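Illustrative sketch (not from the paper): the ±n idea in the abstract above restricts the Bayesian classifier's features to words within n positions of the ambiguous word. The snippet below pairs such a window with a plain Naive Bayes classifier using add-one smoothing. It is a supervised toy stand-in trained on two hand-labeled English sentences, which is an assumption; the paper's unsupervised training procedure and its DG-improved variant are not reproduced.

```python
import math
from collections import Counter, defaultdict


def window(tokens, i, n):
    """Context words within n positions of tokens[i], excluding the target itself."""
    return tokens[max(0, i - n):i] + tokens[i + 1:i + 1 + n]


def train(examples, n):
    """examples: list of (tokens, target_index, sense_label)."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, i, sense in examples:
        sense_counts[sense] += 1
        for w in window(tokens, i, n):
            word_counts[sense][w] += 1
            vocab.add(w)
    return sense_counts, word_counts, vocab


def classify(model, tokens, i, n):
    """Return the sense with the highest Naive Bayes log-probability."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())

    def log_prob(sense):
        lp = math.log(sense_counts[sense] / total)
        denom = sum(word_counts[sense].values()) + len(vocab)
        for w in window(tokens, i, n):
            lp += math.log((word_counts[sense][w] + 1) / denom)  # add-one smoothing
        return lp

    return max(sense_counts, key=log_prob)


# Hypothetical labeled examples for the ambiguous word "bank".
examples = [
    ("he sat on the bank of the river".split(), 4, "river"),
    ("she opened a bank account today".split(), 3, "finance"),
]
model = train(examples, n=2)
predicted = classify(model, "open a bank account now".split(), 2, n=2)
print(predicted)
```

A smaller n drops distant words from the feature set, which is the noise-reduction effect the abstract attributes to the ±n window.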
Chinese word sense disambiguation based on neural networks
4
Authors: 刘挺, 卢志茂, 郎君, 李生. Journal of Harbin Institute of Technology (New Series), EI CAS, 2005, No. 4, pp. 408-414 (7 pages)
The input of a network is the key problem for Chinese word sense disambiguation utilizing the neural network. This paper presents an input model of the neural network that calculates the mutual information between contextual words and the ambiguous word by using statistical methodology and taking a certain number of contextual words around the ambiguous word according to (-M, +N). The experiment adopts a three-layer BP neural network model and shows how the size of the training set and the values of M and N affect the performance of the neural network model. The experimental objects are six pseudowords with three word senses, constructed according to certain principles. The tested accuracy of our approach reaches 90.31% on a closed corpus and 89.62% on an open corpus. The experiment shows that the neural network model performs well on word sense disambiguation.
Keywords: word sense disambiguation; artificial neural network; mutual information; pseudowords
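Illustrative sketch (not from the paper): the abstract above feeds a neural network with mutual information between the ambiguous word and the words in a (-M, +N) window. The snippet below computes pointwise mutual information from sentence-level counts on a four-sentence toy corpus. The corpus, the sentence-level probability estimates, and the function names are assumptions; the paper's exact statistic and its BP network are not reproduced.

```python
import math
from collections import Counter


def context(tokens, i, m, n):
    """Up to m words before and n words after position i."""
    return tokens[max(0, i - m):i] + tokens[i + 1:i + 1 + n]


def pmi_table(sentences, target, m, n):
    """PMI between the target and each word seen in its (-m, +n) window.

    Probabilities are estimated as fractions of sentences.
    """
    word_freq = Counter()  # number of sentences containing each word
    co_freq = Counter()    # number of sentences with the word in the target's window
    target_freq = 0
    for tokens in sentences:
        word_freq.update(set(tokens))
        if target in tokens:
            target_freq += 1
            i = tokens.index(target)
            co_freq.update(set(context(tokens, i, m, n)))
    total = len(sentences)
    return {
        w: math.log2((c / total) / ((word_freq[w] / total) * (target_freq / total)))
        for w, c in co_freq.items()
    }


# Hypothetical toy corpus.
sentences = [s.split() for s in (
    "river bank flooded",
    "bank loan approved",
    "loan approved quickly",
    "fields flooded again",
)]
pmi = pmi_table(sentences, "bank", m=1, n=1)

# Input vector for one occurrence: PMI of each window word (0.0 if unseen).
tokens = "river bank flooded".split()
features = [pmi.get(w, 0.0) for w in context(tokens, 1, 1, 1)]
print(features)
```

Widening M and N lengthens the feature vector, which is the trade-off the abstract says the experiments examine.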
Word Sense Disambiguation Based Sentiment Classification Using Linear Kernel Learning Scheme
5
Authors: P. Ramya, B. Karthik. Intelligent Automation & Soft Computing, SCIE, 2023, No. 5, pp. 2379-2391 (13 pages)
Word Sense Disambiguation has been a trending topic of research in Natural Language Processing and Machine Learning. Mining core features and performing the text classification still exist as a challenging task. Here the features of the context, such as neighboring words like adjectives, provide the evidence for classification using a machine learning approach. This paper presents text document classification, which has wide applications in information retrieval, using movie review datasets. Here the document indexing based on controlled vocabulary, adjective, word sense disambiguation, generating hierarchical categorization of web pages, spam detection, topic labeling, web search, document summarization, etc. Here the kernel support vector machine learning algorithm helps to classify the text, and feature extraction is performed by cuckoo search optimization. Positive and negative reviews from the movie dataset are presented to get better classification accuracy. Experimental results focus on context mining, feature analysis and classification. Compared with previous work, the proposed work is designed to achieve efficient results. The overall design is performed with the MATLAB 2020a tool.
Keywords: text classification; word sense disambiguation; kernel support vector machine learning algorithm; cuckoo search optimization; feature extraction
Word Sense Disambiguation Model with a Cache-Like Memory Module
6
Authors: LIN Qian, LIU Xin, XIN Chunlei, ZHANG Haiying, ZENG Hualin, ZHANG Tonghui, SU Jinsong. Journal of Donghua University (English Edition), CAS, 2021, No. 4, pp. 333-340 (8 pages)
Word sense disambiguation (WSD), identifying the specific sense of the target word given its context, is a fundamental task in natural language processing. Recently, researchers have shown promising results using long short-term memory (LSTM), which is able to better capture sequential and syntactic features of text. However, this method neglects the dependencies among instances, such as their context semantic similarities. To solve this problem, we propose a novel WSD model that introduces a cache-like memory module to capture the semantic dependencies among instances for WSD. Extensive evaluations on standard datasets demonstrate the superiority of the proposed model over various baselines.
Keywords: word sense disambiguation (WSD); memory module; semantic dependencies
Improving the Collocation Extraction Method Using an Untagged Corpus for Persian Word Sense Disambiguation
7
Authors: Noushin Riahi, Fatemeh Sedghi. Journal of Computer and Communications, 2016, No. 4, pp. 109-124 (16 pages)
Word sense disambiguation is used in many natural language processing fields. One approach to disambiguation is the decision list algorithm, which is a supervised method. Supervised methods are considered the most accurate machine learning algorithms, but they are strongly influenced by the knowledge acquisition bottleneck, which means that their efficiency depends on the size of the tagged training set, whose preparation is difficult, time-consuming and costly. The method proposed in this article improves the efficiency of this algorithm when only a small tagged training set is available. This method uses a statistical method for collocation extraction from a big untagged corpus. Thus, the more important collocations, which are the features used for creating learning hypotheses, will be identified. Weighting the features improves the efficiency and accuracy of a decision list algorithm that has been trained with a small training corpus.
Keywords: collocation extraction; word sense disambiguation; untagged corpus; decision list
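Illustrative sketch (not from the paper): a decision list, as mentioned in the abstract above, ranks collocation features by how strongly they point to one sense and classifies with the first matching rule. The snippet below builds such a list for a two-sense word using a smoothed log-likelihood ratio. The toy examples, the smoothing constant `alpha`, and the function names are assumptions; the paper's feature weighting from an untagged corpus is not reproduced.

```python
import math
from collections import Counter, defaultdict


def build_decision_list(examples, alpha=0.1):
    """examples: list of (set_of_features, sense) for a word with exactly two senses.

    Returns rules (strength, feature, sense) sorted from strongest to weakest.
    """
    counts = defaultdict(Counter)
    senses = sorted({sense for _, sense in examples})
    a, b = senses  # raises ValueError unless there are exactly two senses
    for feats, sense in examples:
        for f in feats:
            counts[f][sense] += 1
    rules = []
    for f, c in counts.items():
        llr = math.log((c[a] + alpha) / (c[b] + alpha))  # smoothed log-likelihood ratio
        rules.append((abs(llr), f, a if llr > 0 else b))
    rules.sort(reverse=True)
    return rules


def classify(rules, feats, default):
    """Apply the first (strongest) rule whose feature is present."""
    for _, f, sense in rules:
        if f in feats:
            return sense
    return default


# Hypothetical tagged examples for a two-sense word.
examples = [
    ({"river", "water"}, "shore"),
    ({"river", "mud"}, "shore"),
    ({"loan", "money"}, "finance"),
    ({"money", "river"}, "finance"),
]
rules = build_decision_list(examples)
print(classify(rules, {"river", "money"}, default="finance"))
```

Because only the strongest matching rule fires, a reliable feature such as "money" overrides a weaker conflicting one such as "river".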
Song Ci Style Automatic Identification
8
Authors: 郑旭玲, 周昌乐, 曾华琳. Journal of Donghua University (English Edition), EI CAS, 2010, No. 2, pp. 181-184 (4 pages)
To identify Song Ci style automatically, we put forward a novel stylistic text categorization approach based on words and their semantics in this paper. A modified special word segmentation method, a new semantic relativity computing method based on HowNet, and the corresponding word sense disambiguation method are proposed to extract words and semantic features from Song Ci. Experiments are carried out and the results show that these methods are effective.
Keywords: stylistic text categorization; word sense disambiguation (WSD); word segmentation; HowNet; Song Ci
Application of Word Embedding to Drug Repositioning
9
Authors: Duc Luu Ngo, Naoki Yamamoto, Vu Anh Tran, Ngoc Giang Nguyen, Dau Phan, Favorisen Rosyking Lumbanraja, Mamoru Kubo, Kenji Satou. Journal of Biomedical Science and Engineering, 2016, No. 1, pp. 7-16 (10 pages)
As a key technology for rapid and low-cost drug development, drug repositioning is becoming popular. In this study, a text mining approach to the discovery of unknown drug-disease relations was tested. Using a word embedding algorithm, the senses of over 1.7 million words were well represented in sufficiently short feature vectors. Through various analyses including clustering and classification, the feasibility of our approach was tested. Finally, our trained classification model achieved 87.6% accuracy in the prediction of drug-disease relations in cancer treatment and succeeded in discovering novel drug-disease relations that were actually reported in recent studies.
Keywords: distributed representation of word sense; discovery of drug-disease relation; word analogy
Unsupervised WSD by Finding the Predominant Sense Using Context as a Dynamic Thesaurus (Cited by: 1)
10
Authors: Javier Tejada-Carcamo, Hiram Calvo, Alexander Gelbukh, Kazuo Hara. Journal of Computer Science & Technology, SCIE EI CSCD, 2010, No. 5, pp. 1030-1039 (10 pages)
We present and analyze an unsupervised method for Word Sense Disambiguation (WSD). Our work is based on the method presented by McCarthy et al. in 2004 for finding the predominant sense of each word in the entire corpus. Their maximization algorithm allows weighted terms (similar words) from a distributional thesaurus to accumulate a score for each ambiguous word sense, i.e., the sense with the highest score is chosen based on votes from a weighted list of terms related to the ambiguous word. This list is obtained using the distributional similarity method proposed by Lin Dekang to obtain a thesaurus. In the method of McCarthy et al., every occurrence of the ambiguous word uses the same thesaurus, regardless of the context where the ambiguous word occurs. Our method accounts for the context of a word when determining the sense of an ambiguous word by building the list of distributionally similar words based on the syntactic context of the ambiguous word. We obtain a top precision of 77.54% versus 67.10% for the original method, tested on SemCor. We also analyze the effect of the number of weighted terms in the tasks of finding the Most Frequent Sense (MFS) and WSD, and experiment with several corpora for building the Word Space Model.
Keywords: word sense disambiguation; word space model; semantic similarity; text corpus; thesaurus
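Illustrative sketch (not from the paper): the voting scheme described in the abstract above lets each weighted related term add to the score of the sense it is most similar to, and the paper's variation is to build that term list from the context of each occurrence. The snippet below is a simplified version under stated assumptions: the similarity table, term weights, and sense labels are invented, and each term's vote is normalized across senses, which is one common reading of the scoring rule.

```python
def vote(senses, neighbors, sim):
    """Pick the sense with the highest accumulated weighted vote.

    senses:    candidate sense labels
    neighbors: {related_term: weight} from a (context-dependent) thesaurus
    sim:       function (sense, term) -> non-negative similarity
    """
    scores = {}
    for s in senses:
        total = 0.0
        for term, weight in neighbors.items():
            norm = sum(sim(s2, term) for s2 in senses)
            if norm > 0:
                total += weight * sim(s, term) / norm  # share of this term's vote
        scores[s] = total
    return max(scores, key=scores.get)


# Hypothetical sense-to-term similarities.
SIM = {
    ("star#celestial", "planet"): 0.9,
    ("star#celebrity", "planet"): 0.1,
    ("star#celestial", "actor"): 0.1,
    ("star#celebrity", "actor"): 0.9,
}


def toy_sim(sense, term):
    return SIM.get((sense, term), 0.0)


senses = ["star#celestial", "star#celebrity"]

# A different neighbor list per context, standing in for a "dynamic thesaurus".
astronomy_context = {"planet": 0.8, "actor": 0.2}
film_context = {"planet": 0.1, "actor": 0.7}

print(vote(senses, astronomy_context, toy_sim))
print(vote(senses, film_context, toy_sim))
```

With a single static neighbor list both occurrences would receive the same sense; swapping the list per context is what lets the two calls disagree.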
Improving Entity Linking in Chinese Domain by Sense Embedding Based on Graph Clustering (Cited by: 1)
11
Authors: 张照博, 钟芷漫, 袁平鹏, 金海. Journal of Computer Science & Technology, SCIE EI CSCD, 2023, No. 1, pp. 196-210 (15 pages)
Entity linking refers to linking a string in a text to corresponding entities in a knowledge base through candidate entity generation and candidate entity ranking. It is of great significance to some NLP (natural language processing) tasks, such as question answering. Unlike English entity linking, Chinese entity linking requires more consideration due to the lack of spacing and capitalization in text sequences and the ambiguity of characters and words, which is more evident in certain scenarios. In Chinese domains, such as industry, the generated candidate entities are usually composed of long strings and are heavily nested. In addition, the meanings of the words that make up industrial entities are sometimes ambiguous. Their semantic space is a subspace of the general word embedding space, and thus each entity word needs to get its exact meanings. Therefore, we propose two schemes to achieve better Chinese entity linking. First, we implement an n-gram-based candidate entity generation method to increase the recall rate and reduce the nesting noise. Then, we enhance the corresponding candidate entity ranking mechanism by introducing sense embedding. Considering the contradiction between the ambiguity of word vectors and the single sense of the industrial domain, we design a sense embedding model based on graph clustering, which adopts an unsupervised approach for word sense induction and learns sense representation in conjunction with context. We test the embedding quality of our approach on classical datasets and demonstrate its disambiguation ability in general scenarios. We confirm through experiments that our method can better learn candidate entities’ fundamental laws in the industrial domain and achieve better performance on entity linking.
Keywords: natural language processing (NLP); domain entity linking; computational linguistics; word sense disambiguation; knowledge graph