Funding: the Beijing Municipal Science and Technology Program (Z231100001323004).
Abstract: With the help of pre-trained language models, the accuracy of entity linking has made great strides in recent years. However, most high-performing models require fine-tuning a large pre-trained language model on a large amount of training data, which imposes a hardware threshold on the task. Some researchers have achieved competitive results with less training data through ingenious methods, such as exploiting information provided by a named entity recognition model. This paper presents a novel semantic-enhancement-based entity linking approach, semantically enhanced hardware-friendly entity linking (SHEL), which is designed to be hardware friendly and efficient while maintaining good performance. Specifically, SHEL's semantic enhancement consists of three parts: (1) semantic compression of entity descriptions using a text summarization model; (2) maximizing the captured mention context using an asymmetric heuristic; (3) computing a fixed-size mention representation through pooling operations. This series of semantic enhancement methods effectively improves the model's ability to capture semantic information while respecting hardware constraints, and improves the model's convergence speed by more than 50% compared with the strong baseline model proposed in this paper. In terms of performance, SHEL is comparable to previous methods and performs better on six well-established datasets, even though SHEL is trained with a smaller pre-trained language model as the encoder.
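The abstract does not specify which pooling operation SHEL uses to obtain its fixed-size mention representation, so the following is only an illustrative sketch: mean pooling, a common choice, maps a variable-length span of token embeddings to a single vector of fixed dimension. The function name and the 768-dimensional embeddings are assumptions for illustration, not details from the paper.

```python
import numpy as np

def mention_representation(token_embeddings: np.ndarray) -> np.ndarray:
    """Collapse a variable-length mention span of shape (num_tokens, dim)
    into one fixed-size vector via mean pooling over the token axis."""
    return token_embeddings.mean(axis=0)

# Mentions of different lengths map to vectors of the same dimension,
# which is what makes the representation hardware friendly to batch.
short_mention = np.random.rand(3, 768)   # 3 tokens
long_mention = np.random.rand(12, 768)   # 12 tokens
assert mention_representation(short_mention).shape == (768,)
assert mention_representation(long_mention).shape == (768,)
```

Max pooling (`token_embeddings.max(axis=0)`) is an equally common alternative; either way the downstream linker sees a fixed-size input regardless of mention length.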
Abstract: Researchers and scientists need rapid access to text documents such as research papers, source code and dissertations. Many research documents are available on the Internet, and retrieving the exact documents from keywords is time-consuming. An efficient classification algorithm for retrieving documents based on keywords is therefore required. Traditional algorithms perform poorly because they consider neither word polysemy nor the relationships among the bag-of-words in the keywords. To solve this problem, the Semantic Featured Convolutional Neural Network (SF-CNN) is proposed to capture the key relationships among the search keywords and build a structure for matching words so that the correct text documents are retrieved. The proposed SF-CNN is based on a deep semantic bag-of-words representation for document retrieval. Traditional deep learning methods such as convolutional neural networks and recurrent neural networks do not use a semantic representation of the bag-of-words. Experiments were performed on different document datasets to evaluate the proposed SF-CNN method, which classifies documents with an accuracy of 94%, higher than the traditional algorithms.
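The abstract does not describe SF-CNN's architecture in detail, so the sketch below shows only the generic building block such text-classification CNNs share: a 1-D convolution slid over a sequence of word embeddings, followed by max-over-time pooling to produce one feature per filter. All names and dimensions here are illustrative assumptions, not details from the paper.

```python
import numpy as np

def conv_max_pool(embeddings: np.ndarray, filt: np.ndarray) -> float:
    """Slide one convolutional filter of shape (width, dim) over a
    word-embedding sequence of shape (seq_len, dim), then take the
    maximum response over all window positions (max-over-time pooling)."""
    seq_len, _ = embeddings.shape
    width = filt.shape[0]
    # One scalar activation per window position.
    feats = [float(np.sum(embeddings[i:i + width] * filt))
             for i in range(seq_len - width + 1)]
    return max(feats)

rng = np.random.default_rng(0)
doc = rng.normal(size=(10, 50))    # 10 words, 50-dim embeddings
filt = rng.normal(size=(3, 50))    # one trigram filter
feature = conv_max_pool(doc, filt)
```

In a full classifier, many such filters of varying widths would each contribute one pooled feature, and the concatenated feature vector would feed a softmax layer over document classes; the "semantic" part of SF-CNN would enter through how the input embeddings are constructed.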