Abstract
Converting between document formats is a necessary step in retrieving, filtering, and extracting information from web resources. XML has become an open standard for exchanging data of different types and across different fields on the web. Converting PDF documents to XML is important for analysing the content and structure of PDF documents. This paper describes the design and implementation principles of document conversion from PDF to XML.
Source
《计算机工程与应用》
CSCD
PKU Core Journal (北大核心)
2004, No. 14, pp. 120-122 (3 pages)
Computer Engineering and Applications