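1. Deep Learning-Based Digitization of Tangut Documents (基于深度学习的西夏文献数字化), cited by: 1
Author: Zhang Guangwei (张光伟). Xixia Xue (《西夏学》), 2020, No. 2, pp. 206-213 (8 pages)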
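Digitizing Tangut documents is something Tangut scholars have been working toward for nearly half a century. Until now, Tangut texts have been entered almost entirely by hand, which is very inefficient. In recent years, artificial intelligence techniques, deep learning chief among them, have reached or exceeded human-expert accuracy in handwritten character recognition; the deep neural network model we trained achieves a recognition accuracy above 97% on a dataset of individual handwritten Tangut characters. Applying deep learning to Tangut character recognition requires a large amount of labeled data, that is, tens to hundreds of real examples for every Tangut character. Finding multiple instances of every character in the Tangut dictionary across a large body of Tangut documents is an enormous amount of work, and some characters are used so rarely that collecting many instances is harder still. We propose a target-oriented hybrid algorithm for labeling Tangut characters: characters that already have many instances are used to generate virtual characters that stand in for characters with too few instances. Experiments show that these virtual characters are about as effective as real data for training neural networks. They let us find, among large numbers of unlabeled Tangut character images, the characters that the network cannot yet recognize, and label them. The virtual characters can also be added to the training set together with real instances of low-frequency characters, so that neural networks with progressively stronger recognition ability are trained iteratively. Building on single-character recognition, we introduce recurrent neural network units into the model to recognize whole columns of Tangut text, which changes the traditional workflow of cutting scanned documents into individual characters for recognition into one that recognizes column by column. Combining TDM with the column-level recognition model allows us to build an efficient automatic transcription system for Tangut documents and thereby accelerate their digitization.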
Keywords: deep learning; Tangut script; automatic recognition; training dataset; virtual instances; whole-column recognition
2. Comparative study of de novo assembly and genome-guided assembly strategies for transcriptome reconstruction based on RNA-Seq, cited by: 2
Authors: LU BingXin, ZENG ZhenBing, SHI TieLiu. Science China (Life Sciences), SCIE, CAS, 2013, No. 2, pp. 143-155 (13 pages)
Transcriptome reconstruction is an important application of RNA-Seq, providing critical information for further analysis of the transcriptome. Although RNA-Seq offers the potential to capture the whole picture of the transcriptome, it still presents special challenges. To handle these difficulties and reconstruct the transcriptome as completely as possible, current computational approaches mainly employ two strategies: de novo assembly and genome-guided assembly. To find the similarities and differences between them, we first chose five representative assemblers from the two classes, and then investigated and compared their algorithmic features in theory and their real performance in practice. We found that all the methods can be reduced to graph reduction problems, yet they differ in conceptual and practical implementation; each assembly method therefore has specific advantages and disadvantages, performing worse than others in certain aspects while outperforming them in other aspects. Finally, we merged the assemblies of the five assemblers and obtained a much better assembly. Additionally, we evaluated an assembler that uses a genome-guided de novo assembly approach, and it achieved good performance. Based on these results, we suggest that to obtain a comprehensive set of recovered transcripts, it is better to use a combination of de novo assembly and genome-guided assembly.
Keywords: transcriptome reconstruction; RNA-Seq; de novo assembly; genome-guided assembly