2 articles found
Feature study for improving Chinese overlapping ambiguity resolution based on SVM (cited by: 1)
1
Authors: Xiong Ying, Zhu Jie. Journal of Southeast University (English Edition), EI, CAS, 2007, No. 2, pp. 179-184 (6 pages)
To improve Chinese overlapping ambiguity resolution based on a support vector machine, statistical features are studied for representing the feature vectors. First, four statistical parameters (mutual information, accessor variety, two-character word frequency and single-character word frequency) are each used to describe the feature vectors. Then, other parameters are added as complementary features to the parameters that obtain the best results, to further improve classification performance. Experimental results show that features represented by mutual information, single-character word frequency and accessor variety achieve the best result of 94.39%. Compared with a commonly used word probability model, the accuracy is improved by 6.62%. These comparative results confirm that classification performance can be improved by feature selection and representation.
Keywords: support vector machine; Chinese overlapping ambiguity; Chinese word segmentation; word probability model
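As an illustration of the first statistical parameter named in the abstract, the following is a minimal sketch of pointwise mutual information between two adjacent characters. It assumes the standard definition log2(p(xy) / (p(x) p(y))) with probabilities estimated from raw counts; the abstract does not state the exact formula or smoothing the authors used, and the function name `pointwise_mutual_information` is hypothetical.

```python
import math
from collections import Counter


def pointwise_mutual_information(corpus, x, y):
    """Estimate PMI of the adjacent character pair (x, y) from raw counts.

    Assumption: standard PMI without smoothing; the paper's exact
    estimator is not given in the abstract.
    """
    unigrams = Counter(corpus)
    bigrams = Counter(zip(corpus, corpus[1:]))
    # An unseen pair has zero estimated probability, so PMI is -infinity.
    if bigrams[(x, y)] == 0:
        return float("-inf")
    p_xy = bigrams[(x, y)] / sum(bigrams.values())
    p_x = unigrams[x] / sum(unigrams.values())
    p_y = unigrams[y] / sum(unigrams.values())
    return math.log2(p_xy / (p_x * p_y))


# Toy usage on an ASCII string standing in for a character corpus.
score = pointwise_mutual_information("abab", "a", "b")
```

A higher score suggests the two characters co-occur more often than chance, which is why such a value can serve as one component of a feature vector for a classifier.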
Resolution of overlapping ambiguity strings based on maximum entropy model (cited by: 1)
2
Authors: ZHANG Feng, FAN Xiao-zhong. Frontiers of Electrical and Electronic Engineering in China, CSCD, 2006, No. 3, pp. 273-276 (4 pages)
The resolution of overlapping ambiguity strings (OAS) is studied based on the maximum entropy model. There are two model outputs: either the first two characters form a word or the last two characters form a word. The features of the model include one word in the context of the OAS, the current OAS, and the word probability relation of the two kinds of segmentation results. OAS in the training text are found by combining the FMM and BMM segmentation methods. After feature tagging, they are used to train the maximum entropy model. The People's Daily corpus of January 1998 is used in training and testing. Experimental results show a closed test precision of 98.64% and an open test precision of 95.01%. The open test precision is 3.76% better than that of the common word probability method.
Keywords: Chinese information processing; Chinese automatic word segmentation; overlapping ambiguity strings; maximum entropy model
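The abstract's step of finding OAS by combining FMM and BMM can be sketched as follows. This is a minimal illustration, assuming FMM and BMM mean forward and backward maximum matching (the usual expansion, not spelled out in the abstract) and assuming a span is flagged when the two segmentations disagree. The function names, the toy lexicon and the `max_len` parameter are hypothetical, not taken from the paper.

```python
def forward_max_match(text, lexicon, max_len=4):
    """Greedy left-to-right segmentation, preferring the longest lexicon word."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is always accepted as a fallback.
            if size == 1 or piece in lexicon:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, lexicon, max_len=4):
    """Greedy right-to-left segmentation, preferring the longest lexicon word."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in lexicon:
                words.append(piece)
                j -= size
                break
    return words[::-1]


def is_overlapping_ambiguity(text, lexicon):
    """Flag a string when forward and backward matching disagree (assumed criterion)."""
    return forward_max_match(text, lexicon) != backward_max_match(text, lexicon)


# Toy three-character example: either the first two or the last two characters form a word.
toy_lexicon = {"从小", "小学"}
fmm = forward_max_match("从小学", toy_lexicon)   # first two characters grouped
bmm = backward_max_match("从小学", toy_lexicon)  # last two characters grouped
```

The two disagreeing segmentations correspond to the two model outputs described in the abstract; a classifier would then choose between them using context features.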