Funding: Supported by the National Natural Science Foundation of China (No. 60435020).
Abstract: Word sense disambiguation (WSD) is the task of deciding the sense of an ambiguous word in a particular context. Most current WSD studies use only a few ambiguous words as test samples, which limits their practical application. In this paper, we study WSD on a large-scale real-world corpus using two unsupervised learning algorithms, one based on a ±n-improved Bayesian model and one based on a Dependency Grammar (DG)-improved Bayesian model. The ±n-improved classifier narrows the context window of the ambiguous word through close-distance feature extraction and reduces the interference of uninformative features, which markedly improves accuracy, reaching 83.18% in the open test. The DG-improved classifier more effectively overcomes the noise inherent in the Naive Bayesian classifier. Experimental results show that this approach performs better on Chinese WSD, achieving an accuracy of 86.27% in the open test.
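The idea of restricting a Bayesian classifier to a ±n context window can be sketched as follows. This is only an illustrative sketch, not the paper's method: the abstract does not give the training procedure, so the code assumes a small sense-labelled sample, bag-of-words features, and add-one smoothing, and the names `window_features` and `WindowNaiveBayes` are hypothetical.

```python
import math
from collections import Counter, defaultdict


def window_features(tokens, index, n):
    """Return the tokens within n positions of tokens[index], excluding the target itself."""
    left = tokens[max(0, index - n):index]
    right = tokens[index + 1:index + 1 + n]
    return left + right


class WindowNaiveBayes:
    """Naive Bayes sense classifier over a +/-n context window (assumed setup)."""

    def __init__(self, n=2):
        self.n = n
        self.sense_counts = Counter()
        self.feature_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples):
        # examples: iterable of (tokens, target_index, sense) triples
        for tokens, index, sense in examples:
            self.sense_counts[sense] += 1
            for feat in window_features(tokens, index, self.n):
                self.feature_counts[sense][feat] += 1
                self.vocab.add(feat)

    def predict(self, tokens, index):
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, -math.inf
        for sense, count in self.sense_counts.items():
            # log prior plus add-one smoothed log likelihood of each window token
            score = math.log(count / total)
            denom = sum(self.feature_counts[sense].values()) + vocab_size
            for feat in window_features(tokens, index, self.n):
                score += math.log((self.feature_counts[sense][feat] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Toy English data, purely for illustration (the paper works on a Chinese corpus).
TRAIN = [
    ("he sat on the bank of the river".split(), 4, "river"),
    ("she fished from the bank near the water".split(), 4, "river"),
    ("he deposited money at the bank this morning".split(), 5, "finance"),
    ("the bank approved the loan quickly".split(), 1, "finance"),
]

model = WindowNaiveBayes(n=2)
model.train(TRAIN)
```

A smaller `n` keeps only the words nearest the ambiguous word, which is the mechanism the abstract credits with reducing the influence of uninformative features; the best value of `n` is not stated in the abstract and would have to be tuned.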
Abstract: In the standard statistical literature, the stationarity of a time-dependent process is generally defined as the invariance in time of the distribution of the variable, for example a sound pressure level (SPL) fluctuating in time. In reality, however, no distribution, and hence no characteristic derived from it, can be constant in time in the strict mathematical sense, because observation intervals are necessarily finite for practical reasons. Every distribution and every characteristic based on it therefore carries a certain but quantifiable uncertainty. Online measurement techniques for monitoring these uncertainties, primarily in the form of appropriate software, are already available, including to customers. Given this state of the art, the following expanded definition of stationarity is proposed: stationarity during a quality-controlled measurement process is established when (1) the upper confidence limit of the specific characteristic of interest has no positive slope in time, (2) the lower confidence limit of that characteristic correspondingly has no negative slope, and (3) as a common third condition, the characteristic itself has settled at a constant position in time. From this, a systematic scheme of criteria is derived and applied in examples to different indoor and outdoor situations of sound impact.
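The three conditions of the expanded definition can be sketched in code. This is a rough sketch under several assumptions not found in the abstract: the characteristic of interest is taken to be the running arithmetic mean of the samples, its confidence limits use a normal approximation (`z = 1.96`), slopes are least-squares fits over the last half of the record, and "no slope" is interpreted through a tolerance `tol` in units per sample. All function names are hypothetical.

```python
import math


def running_mean_with_limits(samples, z=1.96):
    """Cumulative mean and its lower/upper confidence limits after each sample (from the 2nd on)."""
    means, lowers, uppers = [], [], []
    total = total_sq = 0.0
    for k, x in enumerate(samples, start=1):
        total += x
        total_sq += x * x
        if k < 2:
            continue  # a variance estimate needs at least two samples
        mean = total / k
        var = max((total_sq - k * mean * mean) / (k - 1), 0.0)
        half = z * math.sqrt(var / k)  # half-width of the confidence interval
        means.append(mean)
        lowers.append(mean - half)
        uppers.append(mean + half)
    return means, lowers, uppers


def slope(values):
    """Least-squares slope of values against their index (needs at least two values)."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    num = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    den = sum((t - t_mean) ** 2 for t in range(n))
    return num / den


def is_stationary(samples, tail=0.5, tol=1e-3):
    """Check the three conditions on the last `tail` fraction of the record."""
    means, lowers, uppers = running_mean_with_limits(samples)
    start = int(len(means) * (1 - tail))
    upper_ok = slope(uppers[start:]) <= tol    # (1) upper limit: no positive slope
    lower_ok = slope(lowers[start:]) >= -tol   # (2) lower limit: no negative slope
    level_ok = abs(slope(means[start:])) <= tol  # (3) characteristic itself is constant
    return upper_ok and lower_ok and level_ok


# Synthetic levels in dB, for illustration only.
steady = [60 + (1 if i % 2 else -1) for i in range(400)]
drifting = [60 + 0.05 * i + (1 if i % 2 else -1) for i in range(400)]
```

With these synthetic series, `is_stationary(steady)` passes all three conditions, while `is_stationary(drifting)` fails because the running mean and its upper limit keep rising. The tolerance and the evaluation window are design choices of this sketch; the paper's own criteria scheme may define them differently.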