Abstract
Mathematical formula extraction is the first step in mathematical formula recognition, yet research on it remains scarce. This paper studies the extraction of mathematical formulas from printed documents and proposes an approach that combines Parzen windows with heuristic rules. The Parzen window method is used to extract isolated formulas from the document, and heuristic rules are used to extract embedded formulas from text lines. Experiments show that combining the two methods achieves good results.
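To illustrate the Parzen window idea named in the abstract, here is a minimal sketch of a Parzen-window (kernel density) classifier that labels a text line as an isolated formula or as ordinary text. The abstract does not say which features, kernel, or window width the authors used, so the two features (relative line height and relative left indent), the Gaussian kernel, the width `h`, and the sample values below are all assumptions chosen for illustration, not the paper's method.

```python
import math

def parzen_density(x, samples, h):
    """Parzen-window estimate of p(x) using an isotropic Gaussian kernel of width h."""
    d = len(x)
    norm = (2.0 * math.pi * h * h) ** (d / 2.0)  # Gaussian normalising constant
    total = 0.0
    for s in samples:
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, s))
        total += math.exp(-sq_dist / (2.0 * h * h))
    return total / (len(samples) * norm)

def classify_line(x, formula_samples, text_samples, h=0.2):
    """Pick the class with the larger estimated density (equal priors assumed)."""
    p_formula = parzen_density(x, formula_samples, h)
    p_text = parzen_density(x, text_samples, h)
    return "formula" if p_formula > p_text else "text"

# Hypothetical training features per line: (relative line height, relative left indent).
# These numbers are made up for the example and do not come from the paper.
formula_samples = [(1.8, 0.30), (2.1, 0.35), (1.6, 0.28)]
text_samples = [(1.0, 0.00), (1.1, 0.05), (0.9, 0.00)]

if __name__ == "__main__":
    print(classify_line((1.9, 0.32), formula_samples, text_samples))
    print(classify_line((1.0, 0.02), formula_samples, text_samples))
```

In a sketch like this the window width `h` controls how smooth the estimate is: a small `h` follows the training lines closely, while a large `h` blurs the two classes together. In practice it would be tuned on held-out data.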
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2005, No. 23, pp. 200-202 (3 pages)
Computer Engineering and Applications
Funding
Supported by the Natural Science Foundation of Hebei Province (Grant No. F2004000132)