A Preliminary Study on Distant Reading of Classical Literature Based on Proper Name Entity Recognition: Take General Records of the Capital Area as an Example

Original title: 基于专名识别技术的古典文献“远读”初探——以雍正《畿辅通志》为例
Abstract Using the BERT model, this paper designs a tool for annotating ancient Chinese texts based on multi-task joint learning, which automatically labels punctuation and proper names. Compared with earlier tools of the same kind, it recognizes person names, place names, time expressions, and book titles more effectively, and it should help bring the "distant reading" method to the study of ancient texts. Taking the Yongzheng-era General Records of the Capital Area (《畿辅通志》), as collected in the Siku Quanshu (Complete Library in the Four Branches of Literature), as an example, automatic proper name recognition can quickly extract historical information such as the sources of cited texts, the construction dates of buildings and facilities, and population distribution, as well as writers, their works, and classic literary imagery. In the accounts of the construction of water conservancy facilities and of Yellow River flooding, one can discern the personal intentions of Li Wei, a minister renowned for river management, in compiling the General Records of the Capital Area.
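Authors Zhu Yuchen (诸雨辰); Li Shen (李绅); Hu Renfen (胡韧奋)
Source Journal of School of Chinese Language and Culture Nanjing Normal University (《南京师范大学文学院学报》), 2023, No. 1, pp. 53-61 (9 pages)
Funding National Natural Science Foundation of China Youth Project "Research on Knowledge Representation and Processing for the Intelligent Collation of Ancient Books" (62006021); Beijing Social Science Key Project "Research on Intelligent Analysis and Linking Technology for Classical Literature" (21DTR037)
Keywords named entity recognition; distant reading; General Records of the Capital Area (《畿辅通志》)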