Abstract
This paper proposes a rule-based method for chunking examination paper texts, which efficiently solves the problem of initializing large volumes of question data in an examination question bank. Recognition rules for text chunks are defined and an automaton-based recognition model is constructed, giving a theoretical description of the recognition process for examination paper texts. Experimental results show that the model performs well. Building on this work, a prototype system was implemented, and a practical application example verifies the feasibility and effectiveness of the method.
Source
《计算机应用研究》
CSCD
PKU Core (北大核心)
2009, No. 4, pp. 1391-1393, 1401 (4 pages in total)
Application Research of Computers
Funding
National Natural Science Foundation of China (70572099)
Natural Science Foundation of Liaoning Province (1050349)
Keywords
rules
chunk
examination paper texts
recognition model