To improve support vector machine (SVM) based resolution of Chinese overlapping ambiguity, statistical features for representing the feature vectors are studied. First, four statistical parameters (mutual information, accessor variety, two-character word frequency and single-character word frequency) are each used on their own to describe the feature vectors. Then the other parameters are added as complementary features to the parameters that give the best results, in order to further improve classification performance. Experimental results show that features represented by mutual information, single-character word frequency and accessor variety achieve the best result, 94.39%. Compared with a commonly used word probability model, the accuracy improves by 6.62%. These comparative results confirm that classification performance can be improved by feature selection and representation.
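Two of the statistics named above can be sketched in a few lines. This is a minimal illustration and not the authors' implementation: the function names, the toy corpus, the base-2 logarithm and the treatment of sentence boundaries as single `<S>`/`<E>` markers are all assumptions made here. Mutual information is computed as pointwise mutual information of an adjacent character pair, and accessor variety as the smaller of the numbers of distinct left and right neighbours of a string.

```python
import math
from collections import Counter


def pointwise_mi(corpus, a, b):
    """Pointwise mutual information (log base 2) of the adjacent pair (a, b).

    `corpus` is a list of unsegmented sentences (plain strings).
    Returns -inf when the pair never occurs.
    """
    chars = Counter()
    pairs = Counter()
    for sent in corpus:
        chars.update(sent)                 # single-character counts
        pairs.update(zip(sent, sent[1:]))  # adjacent-pair counts
    n_chars = sum(chars.values())
    n_pairs = sum(pairs.values())
    if pairs[(a, b)] == 0:
        return float("-inf")
    p_ab = pairs[(a, b)] / n_pairs
    p_a = chars[a] / n_chars
    p_b = chars[b] / n_chars
    return math.log2(p_ab / (p_a * p_b))


def accessor_variety(corpus, s):
    """min(#distinct left neighbours, #distinct right neighbours) of string s.

    Simplification (assumed here): a sentence start or end counts as one
    shared marker, "<S>" or "<E>".
    """
    left, right = set(), set()
    for sent in corpus:
        start = sent.find(s)
        while start != -1:
            end = start + len(s)
            left.add(sent[start - 1] if start > 0 else "<S>")
            right.add(sent[end] if end < len(sent) else "<E>")
            start = sent.find(s, start + 1)
    return min(len(left), len(right))


# Toy corpus with Latin letters standing in for Chinese characters.
toy_corpus = ["abcab", "cabd"]
mi_ab = pointwise_mi(toy_corpus, "a", "b")
av_ab = accessor_variety(toy_corpus, "ab")
```

For an overlapping string ABC in which both AB and BC are dictionary words, values such as these for the two candidate words could be placed in a feature vector and passed to a classifier; how the paper actually assembles its vectors is not stated in the abstract.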
Predicate-argument (PA) structure analysis is often divided into three subtasks: predicate sense disambiguation, argument identification and argument classification. To date, they have mostly been modeled in isolation. However, this approach neglects the logical constraints between them. We therefore explore integrating predicate sense disambiguation with each of the latter two subtasks, which verifies that automatic predicate sense disambiguation can help the semantic role labeling task. In addition, a dual decomposition algorithm is used to alleviate error propagation between the argument identification and argument classification subtasks, which greatly benefits argument identification. Experimental results show that our approach performs better on PA analysis than other pipeline approaches.
A general, domain-independent, bottom-up multi-level model and an algorithm for establishing the taxonomic relation of a Chinese ontology are proposed. The model consists of extracting domain vocabularies and establishing the taxonomic relation, taking into account characteristics unique to the Chinese language. By building semantic forests of the domain vocabularies and then using an existing semantic dictionary or machine-readable dictionary (MRD), the proposed algorithm integrates these semantic forests to establish the taxonomic relation. Experimental results show that the proposed algorithm is feasible and effective in establishing an integrated taxonomic relation among domain vocabularies and concepts.
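The idea of merging separate semantic trees through a dictionary can be illustrated with a small sketch. It is not the algorithm from the paper: the hypernym dictionary `hypernym_of`, the function names and the English toy terms are hypothetical stand-ins for a semantic dictionary or MRD. Each domain term is linked upward through its dictionary hypernyms, so terms that share an ancestor end up in the same tree.

```python
from collections import defaultdict


def build_taxonomy(domain_terms, hypernym_of):
    """Return a parent -> set(children) map linking terms via dictionary hypernyms.

    `hypernym_of` is a hypothetical dict mapping a term to its direct hypernym.
    """
    children = defaultdict(set)
    for term in domain_terms:
        node = term
        seen = {node}
        while node in hypernym_of:
            parent = hypernym_of[node]
            if parent in seen:  # guard against cycles in the dictionary
                break
            children[parent].add(node)
            seen.add(parent)
            node = parent
    return children


def roots(children):
    """Parents that are nobody's child, i.e. the tops of the merged trees."""
    all_children = {c for cs in children.values() for c in cs}
    return sorted(p for p in children if p not in all_children)


# Toy dictionary and vocabulary (illustrative only).
toy_hypernyms = {
    "laptop": "computer",
    "desktop": "computer",
    "computer": "machine",
    "printer": "machine",
}
taxonomy = build_taxonomy(["laptop", "desktop", "printer"], toy_hypernyms)
top_level = roots(taxonomy)
```

In this toy case the terms "laptop" and "desktop" form one small tree under "computer", and the dictionary entry for "computer" joins that tree with "printer" under a single root.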
Funding: Sponsored by the National Natural Science Foundation of China (Grant No. 60496326 and No. 10671045)