Abstract
Chinese combinatory categorial grammar (CCG) is difficult to parse. To address this, the paper studies how to apply two mutually independent techniques together in Chinese CCG parsing. First, a supertagging step uses a log-linear model to prune the potential parse space of a sentence by removing lexical categories with low probability. Second, A* search (a heuristic search algorithm) is applied to further accelerate parsing. Finally, the approach is evaluated along two dimensions: time efficiency and parsing accuracy. Experiments show that the parsing algorithm based on A* search and supertagging significantly improves both parsing efficiency and parsing accuracy.
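The pruning step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `prune_categories`, the example scores, and the rule of keeping categories whose probability is at least `beta` times that of the best category are all assumptions made for illustration. The abstract only states that low-probability lexical categories are removed using a log-linear model.

```python
import math

def prune_categories(scores, beta=0.1):
    """Keep lexical categories whose probability is at least beta times the best one.

    `scores` maps each candidate category of one word to an unnormalised
    log-linear score (hypothetical values; a real model would compute them
    from features of the word and its context).
    """
    # Log-linear model: p(c | word) is proportional to exp(score(c)).
    # Subtracting the maximum score keeps the exponentials numerically stable.
    max_score = max(scores.values())
    exps = {c: math.exp(s - max_score) for c, s in scores.items()}
    total = sum(exps.values())
    probs = {c: e / total for c, e in exps.items()}

    # Assumed pruning rule: a threshold relative to the most probable category.
    best = max(probs.values())
    return {c: p for c, p in probs.items() if p >= beta * best}

# Hypothetical scores for the candidate categories of a single word.
scores = {"NP": 2.0, "(S\\NP)/NP": 1.5, "N/N": -3.0}
kept = prune_categories(scores, beta=0.1)
print(sorted(kept))
```

With these example scores the very unlikely category `N/N` is dropped, so the parser has fewer categories to combine for that word. A smaller `beta` keeps more categories and prunes less aggressively.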
Source
《计算机应用与软件》
CSCD
Peking University Core Journals
2014, No. 9, pp. 231-235 (5 pages)
Computer Applications and Software
Funding
National Natural Science Foundation of China (61003091)
Keywords
Chinese parsing
Combinatory categorial grammar (CCG)
A* search
Supertagging