Extensible Markup Language (XML) files, widely used for storing and exchanging information on the web, require efficient parsing mechanisms to improve application performance. With existing Document Object Model (DOM) based parsing, performance degrades due to sequential processing and large memory requirements, so a more efficient XML parser is needed to mitigate these issues. In this paper, we propose a Parallel XML Tree Generator (PXTG) algorithm for accelerating the parsing of XML files and a Regression-based XML Parsing Framework (RXPF) that analyzes and predicts performance through profiling, regression, and code generation for efficient parsing. The PXTG algorithm divides the XML file into n parts and produces n trees in parallel. The profiling phase of the RXPF framework produces a dataset by measuring the performance of various parsing models, including StAX, SAX, DOM, JDOM, and PXTG, on different cores using multiple file sizes. The regression phase produces the prediction model, based on which the final code for efficient parsing of XML files is produced through the code generation phase. The RXPF framework has shown a significant improvement in performance, varying from 9.54% to 32.34%, over other existing models used for parsing XML files.
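The idea of dividing an XML file into n parts and building n trees in parallel can be illustrated with a minimal Python sketch. This is not the PXTG algorithm itself, whose splitting strategy the abstract does not describe. It assumes a flat document whose root contains repeated, non-nested, non-self-closing record elements; the names `split_records`, `parse_parallel`, the `record_tag` parameter, and the synthetic `<part>` wrapper are all hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def split_records(xml_text, record_tag, n):
    """Split a flat XML document into at most n well-formed parts.

    Assumption: records are written as <tag ...>...</tag>, are not nested
    inside each other, and are never self-closing.
    """
    tag = re.escape(record_tag)
    pattern = re.compile(r"<{0}\b.*?</{0}>".format(tag), re.DOTALL)
    records = pattern.findall(xml_text)
    if not records:
        return []
    size = -(-len(records) // n)  # ceiling division: records per part
    # Wrap each group in a synthetic <part> root so it parses on its own.
    return ["<part>" + "".join(records[i:i + size]) + "</part>"
            for i in range(0, len(records), size)]


def parse_parallel(xml_text, record_tag, n):
    """Parse each part into its own tree, one task per part."""
    parts = split_records(xml_text, record_tag, n)
    # Threads keep the sketch simple; in CPython a process pool may be
    # needed to see a real speedup on CPU-bound parsing.
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(ET.fromstring, parts))


if __name__ == "__main__":
    doc = "<root>" + "".join(
        "<record id='{0}'><v>{0}</v></record>".format(i) for i in range(10)
    ) + "</root>"
    trees = parse_parallel(doc, "record", 4)
    print(len(trees), [len(t) for t in trees])
```

Splitting on raw text rather than on an already parsed tree matters here, because the parse step is the work being parallelized. A production splitter would also have to handle nesting, comments, and CDATA sections, which this sketch ignores.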
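Automatic grading systems for Excel operation exercises are often designed around RTF or VBA, but the results are unsatisfactory. To address this, a method is proposed for an automatic grading and scoring system for operation exercises that is based on the Office Open XML format and the Python language and parses Excel files using the open-source libraries Openpyxl and ElementTree together with custom classes. The paper summarizes the commonly used tags of the SpreadsheetML markup language in the Office Open XML format and their functions, and then designs the automatic grading workflow, which is divided into three stages: format parsing, format saving, and format comparison with quantified scoring. Format parsing is further split into two steps, Openpyxl parsing and ElementTree parsing, which solves the problem of customizing fault tolerance during parsing. Testing shows that the method can reliably grade operation exercises for Excel 2010 automatically.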
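The parse-then-compare flow described above can be sketched in Python using only the standard library. This is an illustrative simplification, not the paper's system: it uses `zipfile` and `ElementTree` directly instead of Openpyxl, assumes the worksheet lives at the conventional path `xl/worksheets/sheet1.xml`, and compares raw stored cell values without resolving shared strings or styles. The function names `read_cells` and `score` are hypothetical.

```python
import zipfile
import xml.etree.ElementTree as ET

# Main SpreadsheetML namespace used by worksheet parts in .xlsx files.
NS = {"m": "http://schemas.openxmlformats.org/spreadsheetml/2006/main"}


def read_cells(xlsx_file, sheet_path="xl/worksheets/sheet1.xml"):
    """Return {cell reference: (cell type, raw stored value)} for one sheet.

    Assumption: sheet_path is where the worksheet part is stored; a robust
    reader would look it up through the workbook relationships instead.
    """
    with zipfile.ZipFile(xlsx_file) as zf:
        root = ET.fromstring(zf.read(sheet_path))
    cells = {}
    for c in root.iterfind(".//m:c", NS):
        v = c.find("m:v", NS)
        # "n" (number) is the default cell type when the t attribute is absent.
        cells[c.get("r")] = (c.get("t", "n"), v.text if v is not None else None)
    return cells


def score(student, reference):
    """Fraction of reference cells that the student workbook matches exactly."""
    if not reference:
        return 0.0
    hits = sum(1 for ref, expected in reference.items()
               if student.get(ref) == expected)
    return hits / len(reference)
```

A call such as `score(read_cells("student.xlsx"), read_cells("answer.xlsx"))` (both file names are placeholders) would yield a value between 0 and 1. Grading formatting rather than values would additionally require reading `xl/styles.xml`, since the style index on a cell is only meaningful within its own workbook.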