Funding: Supported by the National Natural Science Foundation of China (No. 60435020).
Abstract: Word Sense Disambiguation (WSD) is the task of determining the sense of an ambiguous word in a particular context. Most current WSD studies use only a few ambiguous words as test samples, which limits their practical applicability. In this paper, we study WSD on a large-scale real-world corpus using two unsupervised learning algorithms, one based on a ±n-improved Bayesian model and the other on a Dependency Grammar (DG)-improved Bayesian model. The ±n-improved classifier shrinks the context window around the ambiguous word through close-distance feature extraction, which reduces interference from uninformative features and clearly improves accuracy, reaching 83.18% in the open test. The DG-improved classifier more effectively suppresses the noise present in the Naive Bayesian classifier. Experimental results show that this approach performs better on Chinese WSD, achieving an accuracy of 86.27% in the open test.
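The ±n window idea can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the model details, so the sketch assumes a plain Naive Bayes classifier with add-one smoothing, trained on a few hand-labeled toy English sentences (the paper's training is unsupervised and targets Chinese). The names `window_features` and `WindowNaiveBayes` and all of the data are hypothetical.

```python
import math
from collections import Counter, defaultdict


def window_features(tokens, index, n):
    """Return the tokens within n positions of tokens[index], excluding the target."""
    left = tokens[max(0, index - n):index]
    right = tokens[index + 1:index + 1 + n]
    return left + right


class WindowNaiveBayes:
    """Illustrative Naive Bayes sense classifier over a +/-n context window."""

    def __init__(self, n=2, alpha=1.0):
        self.n = n              # window half-width (assumed parameter)
        self.alpha = alpha      # add-alpha smoothing constant (assumed)
        self.sense_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def fit(self, examples):
        # examples: iterable of (tokens, target_index, sense_label)
        for tokens, index, sense in examples:
            self.sense_counts[sense] += 1
            for feat in window_features(tokens, index, self.n):
                self.feature_counts[sense][feat] += 1
                self.vocab.add(feat)
        return self

    def predict(self, tokens, index):
        feats = window_features(tokens, index, self.n)
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            # log prior plus smoothed log likelihood of each window token
            score = math.log(count / total)
            denom = sum(self.feature_counts[sense].values()) + self.alpha * vocab_size
            for feat in feats:
                score += math.log((self.feature_counts[sense][feat] + self.alpha) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Hypothetical toy training data for the ambiguous word "bank".
train = [
    (["he", "sat", "on", "the", "river", "bank", "and", "fished"], 5, "shore"),
    (["the", "bank", "approved", "the", "loan"], 1, "finance"),
    (["she", "deposited", "money", "at", "the", "bank", "today"], 5, "finance"),
    (["the", "muddy", "bank", "of", "the", "river"], 2, "shore"),
]
clf = WindowNaiveBayes(n=2).fit(train)
```

With a small `n`, only the words closest to the target contribute to the score, so distant tokens that carry little sense information never enter the product of likelihoods.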
Funding: Natural Science Foundation of Jiangsu Province under Grant No. BK2005027 and the 211 Foundation of Soochow University.
Abstract: This paper uses a geometric method to describe Lie group machine learning (LML) within the theoretical framework of LML, and gives geometric algorithms for Dynkin diagrams in LML. It covers the basic concepts of Dynkin diagrams in LML, the classification theorems of Dynkin diagrams in LML, the classification algorithm for Dynkin diagrams in LML, and the verification of the classification algorithm with experimental results.