Abstract
[Purpose/significance] Named entity recognition (NER) on classical Chinese chrysanthemum poetry helps uncover the associations among these texts, pass on chrysanthemum culture, and support the chrysanthemum industry and rural cultural tourism. It also offers an approach for deep text mining of poetry on other flowers. [Method/process] Chrysanthemum poems were collected from websites, papers, and books. Based on an analysis of the data, seven types of named entities in the poems (time, place, season, flower name, flower colour, person, and festival) were annotated and recognized. Recognition results were obtained with BiLSTM, BiLSTM-CRF, and BERT models and compared with those of a CRF model. [Result/conclusion] The BERT model performs best on named entity recognition in classical chrysanthemum poetry: its F-score (the harmonic mean of precision and recall) is higher than that of the other models, with a best value of 91.60%. The BERT model can be used for deeper mining of classical chrysanthemum poetry texts and can be extended to poetry on other flowers. The named entity annotation scheme for classical poetry texts can serve as a reference for subsequent research.
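The F-score reported in the abstract is the harmonic mean of precision and recall. As a minimal sketch of how such a score is commonly computed for NER, the Python below scores entity spans extracted from BIO tag sequences. The BIO tagging format, the tag names (FLOWER, PLACE), the helper names extract_spans and prf, and the example annotation are all illustrative assumptions; they are not taken from the paper or its dataset.

```python
from typing import List, Set, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def extract_spans(tags: List[str]) -> Set[Span]:
    """Collect entity spans from a BIO tag sequence (assumed format)."""
    spans: Set[Span] = set()
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and label == tag[2:]
        if not continues:
            if label is not None:
                spans.add((start, i, label))
            start, label = None, None
            if tag.startswith("B-"):
                start, label = i, tag[2:]
    return spans


def prf(gold: List[str], pred: List[str]) -> Tuple[float, float, float]:
    """Entity-level precision, recall, and F-score (exact span match)."""
    g, p = extract_spans(gold), extract_spans(pred)
    tp = len(g & p)
    precision = tp / len(p) if p else 0.0
    recall = tp / len(g) if g else 0.0
    denom = precision + recall
    f_score = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f_score


# Hypothetical annotation of the five characters 采 菊 东 篱 下.
gold = ["O", "B-FLOWER", "B-PLACE", "I-PLACE", "O"]
pred = ["O", "B-FLOWER", "B-PLACE", "O", "O"]  # place span cut short
print(prf(gold, pred))  # (0.5, 0.5, 0.5)
```

Under exact span matching, the truncated place entity counts as both a false positive and a false negative, so one of two predicted spans and one of two gold spans match.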
Source
《情报理论与实践》
CSSCI
PKU Core Journal (北大核心)
2020, Issue 11, pp. 150-155 (6 pages)
Information Studies: Theory & Application
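Funding
An outcome of the Fundamental Research Funds for the Central Universities project "Research on Mobile Knowledge Service Models for Rural Characteristic Industries", project No. SKYZ2019030.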