Compression Algorithms for Chemical Abstract Generation

Research on the generation of chemical abstract based on compression technology
Abstract Chemists often have to read large volumes of English-language material to keep up with the latest research results and resources in their field. Extensive reading, however, consumes valuable time, increases the effort spent on searching, and lowers productivity. This paper studies text compression algorithms: using compression algorithms already established in the text processing field, information was extracted manually from a set of chemistry articles to generate the corresponding chemical abstracts. Chemistry professionals were then asked to evaluate these manually generated abstracts. The evaluation indicates that the abstracts are satisfactory and are a substantial help to chemists when searching for and collecting material. Compression is an important application area of text processing, and automatic sentence compression is its foundation. Of the many compression algorithms available in this field, this paper adopts an English sentence compression method based on the noisy-channel model, which uses probabilistic syntactic parsing to select and compress sentence constituents. Experiments show that this method is well suited to generating chemical abstracts: the resulting abstracts are concise and accurate, and they preserve the main content, core ideas, and key terms of the original chemistry articles.
Author: 杜文洁 (Du Wenjie)
Source: 《计算机与应用化学》 (Computers and Applied Chemistry), 2010, No. 2, pp. 249-252 (4 pages); indexed in CAS, CSCD, and 北大核心 (Peking University Core Journals)
Keywords: compression technology, chemistry, abstract
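
The noisy-channel idea named in the abstract can be illustrated with a small sketch. In this formulation, a long sentence t is treated as a short sentence s to which "noise" words were added, and the compressor looks for the s that maximizes P(s) · P(t | s). The code below is not the paper's implementation: the paper works with probabilistic syntactic parsing, whereas this sketch assumes a much simpler word-level setup, with an add-one-smoothed bigram model standing in for P(s) and a hand-written table of word insertion probabilities standing in for P(t | s). The corpus, the probability values, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations

# Tiny made-up corpus of short sentences, used to estimate the source model P(s).
CORPUS = [
    "the catalyst increases the yield",
    "the reaction increases the yield",
    "the catalyst lowers the temperature",
]

# Hypothetical channel model: probability that a word was "added" when a short
# sentence was expanded into a long one. The values are illustrative assumptions.
ADD_PROB = {"new": 0.6, "significantly": 0.6, "overall": 0.6}
DEFAULT_ADD_PROB = 1e-6  # any other word is very unlikely to be noise


def train_bigrams(corpus):
    """Count bigrams and their left contexts, with sentence boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for line in corpus:
        tokens = ["<s>"] + line.split() + ["</s>"]
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    vocab = {w for line in corpus for w in line.split()} | {"<s>", "</s>"}
    return contexts, bigrams, len(vocab)


def log_source(tokens, model):
    """log P(s): add-one-smoothed bigram log-probability of a candidate."""
    contexts, bigrams, vocab_size = model
    padded = ["<s>"] + list(tokens) + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(padded, padded[1:])
    )


def log_channel(kept_indices, words):
    """log P(t | s): each word missing from s must have been added by the channel."""
    kept = set(kept_indices)
    return sum(
        math.log(ADD_PROB.get(w, DEFAULT_ADD_PROB))
        for i, w in enumerate(words)
        if i not in kept
    )


def compress(sentence, model, min_len=1):
    """Exhaustively score every word subsequence and return the best one."""
    words = sentence.split()
    best, best_score = words, -math.inf
    for k in range(min_len, len(words) + 1):
        for idx in combinations(range(len(words)), k):
            candidate = [words[i] for i in idx]
            score = log_source(candidate, model) + log_channel(idx, words)
            if score > best_score:
                best, best_score = candidate, score
    return " ".join(best)


MODEL = train_bigrams(CORPUS)
print(compress("the new catalyst significantly increases the overall yield", MODEL))
# prints "the catalyst increases the yield"
```

The exhaustive search is exponential in sentence length and is only workable for short inputs; a practical system would search over parse-tree reductions instead, which is closer to the constituent selection the abstract describes.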