To solve the complicated feature extraction and long distance dependency problem in Word Segmentation Disambiguation (WSD), this paper proposes to apply rough sets ill WSD based on the Maximum Entropy model. Firstly...To solve the complicated feature extraction and long distance dependency problem in Word Segmentation Disambiguation (WSD), this paper proposes to apply rough sets ill WSD based on the Maximum Entropy model. Firstly, rough set theory is applied to extract the complicated features and long distance features, even frnm noise or inconsistent corpus. Secondly, these features are added into the Maximum Entropy model, and consequently, the feature weights can be assigned according to the performance of the whole disambiguation mnltel. Finally, tile semantic lexicou is adopted to build class-hased rough set teatures to overcome data spareness. The experiment indicated that our method performed better than previous models, which got top rank in WSD in 863 Evaluation in 2003. This system ranked first and second respcetively in MSR and PKU open test in the Second International Chinese Word Segmentation Bankeoff held in 2005.展开更多
文摘To solve the complicated feature extraction and long distance dependency problem in Word Segmentation Disambiguation (WSD), this paper proposes to apply rough sets ill WSD based on the Maximum Entropy model. Firstly, rough set theory is applied to extract the complicated features and long distance features, even frnm noise or inconsistent corpus. Secondly, these features are added into the Maximum Entropy model, and consequently, the feature weights can be assigned according to the performance of the whole disambiguation mnltel. Finally, tile semantic lexicou is adopted to build class-hased rough set teatures to overcome data spareness. The experiment indicated that our method performed better than previous models, which got top rank in WSD in 863 Evaluation in 2003. This system ranked first and second respcetively in MSR and PKU open test in the Second International Chinese Word Segmentation Bankeoff held in 2005.