摘要
All human languages have words that can mean different things in different contexts, such words with multiple meanings are potentially “ambiguous”. The process of “deciding which of several meanings of a term is intended in a given context” is known as “word sense disambiguation (WSD)”. This paper presents a method of WSD that assigns a target word the sense that is most related to the senses of its neighbor words. We explore the use of measures of relatedness between word senses based on a novel hybrid approach. First, we investigate how to “literally” and “regularly” express a “concept”. We apply set algebra to WordNet’s synsets cooperating with WordNet’s word ontology. In this way we establish regular rules for constructing various representations (lexical notations) of a concept using Boolean operators and word forms in various synset(s) defined in WordNet. Then we establish a formal mechanism for quantifying and estimating the semantic relatedness between concepts—we facilitate “concept distribution statistics” to determine the degree of semantic relatedness between two lexically expressed con- cepts. The experimental results showed good performance on Semcor, a subset of Brown corpus. We observe that measures of semantic relatedness are useful sources of information for WSD.
All human languages have words that can mean different things in different contexts, such words with multiple meanings are potentially "ambiguous". The process of "deciding which of several meanings of a term is intended in a given context" is known as "word sense disambiguation (WSD)". This paper presents a method of WSD that assigns a target word the sense that is most related to the senses of its neighbor words. We explore the use of measures of relatedness between word senses based on a novel hybrid approach. First, we investigate how to "literally" and "regularly" express a "concept". We apply set algebra to WordNet's synsets cooperating with WordNet's word ontology. In this way we establish regular rules for constructing various representations (lexical notations) of a concept using Boolean operators and word forms in various synset(s) defined in WordNet. Then we establish a formal mechanism for quantifying and estimating the semantic relatedness between concepts--we facilitate "concept distribution statistics" to determine the degree of semantic relatedness between two lexically expressed concepts. The experimental results showed good performance on Semcor, a subset of Brown corpus. We observe that measures of semantic relatedness are useful sources of information for WSD.