Abstract
Web pages contain large amounts of heterogeneous semi-structured or unstructured data, so accurately extracting valuable information from them is very important. Building on deep learning and BERT, this paper proposes a new BERT+BiLSTM+CRF information extraction model. Experimental results demonstrate the effectiveness of the proposed method.
Authors
俞鑫
吴明晖
Yu Xin; Wu Minghui (Computer and Computing Science School, Zhejiang University City College, Hangzhou 310015, China)
Source
《计算机时代》
2019, No. 9, pp. 30-32 (3 pages)
Computer Era