Abstract
This paper proposes a new method for the semi-automatic extraction of semantic structures from unlabelled corpora in specific domains. The approach is statistical in nature. The extracted structures can be used for shallow parsing and semantic labeling. By iteratively extracting new words and clustering words, we obtain an initial semantic lexicon that groups words with the same meaning into classes. A bootstrapping algorithm is then applied to extract semantic structures. These semantic structures are in turn used to extract new
Funding
Foundation Research Program, Science & Technology Committee of Shanghai Municipality (No. 01JC14033)
Keywords
semantic structure
language model
semi-automatic extraction
semantic grouping
NLU
and augment the semantic lexicon. The resulting semantic structures are human-interpretable and amenable to hand-editing for refinement. In our experiment, the semi-automatically extracted structures S_SA provide a recall rate of 84.