Funding: Supported by the National Natural Science Foundation of China under Grant No. 61502259, the National Key R&D Program of China under Grant No. 2018YFC0831704, and the Natural Science Foundation of Shandong Province under Grant No. ZR2017MF056.
Abstract: Sentence semantic matching (SSM) is a fundamental task in solving natural language processing problems such as question answering and machine translation. The latest SSM research benefits from deep learning techniques by incorporating attention mechanisms to semantically match given sentences. However, fully capturing the semantic context without losing significant features during sentence encoding remains a challenge. To address this challenge, we propose a deep feature fusion model and integrate it into the most popular deep learning architecture for the sentence matching task. The integrated architecture mainly consists of an embedding layer, a deep feature fusion layer, a matching layer, and a prediction layer. In addition, we compare commonly used loss functions and propose a novel hybrid loss function that integrates MSE and cross entropy, using a confidence interval and threshold setting to preserve the indistinguishable instances during training. To evaluate our model's performance, we experiment on two real-world public data sets: LCQMC and Quora. The experimental results demonstrate that our model outperforms most existing advanced deep learning models for sentence matching, benefiting from our enhanced loss function and deep feature fusion model for capturing semantic context.
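As a rough illustration of how such a hybrid loss might be assembled, the sketch below mixes binary cross entropy and MSE and up-weights instances whose predicted probability lies near the decision boundary. The mixing weight alpha, the threshold tau, and the up-weighting rule are illustrative assumptions, since the abstract does not spell out the exact confidence-interval setting.

```python
# A minimal, illustrative sketch of a hybrid loss combining MSE and cross entropy.
# `alpha` and `tau` are hypothetical hyper-parameters, not values from the paper.
import torch
import torch.nn.functional as F

def hybrid_loss(probs, labels, alpha=0.5, tau=0.1):
    """probs: predicted match probabilities in (0, 1); labels: 0/1 gold labels."""
    ce = F.binary_cross_entropy(probs, labels, reduction="none")
    mse = F.mse_loss(probs, labels, reduction="none")
    loss = alpha * ce + (1.0 - alpha) * mse
    # Illustrative threshold rule: up-weight "indistinguishable" instances whose
    # predicted probability falls close to the decision boundary (0.5 +/- tau).
    hard = (probs - 0.5).abs() < tau
    weights = torch.where(hard, torch.full_like(loss, 2.0), torch.ones_like(loss))
    return (weights * loss).mean()

# usage
probs = torch.sigmoid(torch.randn(8))
labels = torch.randint(0, 2, (8,)).float()
print(hybrid_loss(probs, labels))
```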
Funding: The research work is supported by the National Key R&D Program of China under Grant No. 2018YFC0831704, the National Natural Science Foundation of China under Grant No. 61502259, the Natural Science Foundation of Shandong Province under Grant No. ZR2017MF056, and the Taishan Scholar Program of Shandong Province in China (directed by Prof. Yinglong Wang).
Abstract: Word sense disambiguation (WSD) is a fundamental and significant task in natural language processing that directly affects the performance of downstream applications. However, WSD is very challenging due to the knowledge bottleneck problem, i.e., it is hard to acquire abundant disambiguation knowledge, especially in Chinese. To solve this problem, this paper proposes a graph-based Chinese WSD method with multi-knowledge integration. In particular, a graph model combining various Chinese and English knowledge resources through word sense mapping is designed. First, the content words in a Chinese ambiguous sentence are extracted and mapped to English words with BabelNet. Then, English word similarity is computed based on English word embeddings and a knowledge base, while Chinese word similarity is evaluated with Chinese word embeddings and HowNet, respectively. The weights of the three kinds of word similarity are optimized with a simulated annealing algorithm to obtain their overall similarities, which are used to construct a disambiguation graph. A graph scoring algorithm evaluates the importance of each word sense node and judges the correct senses of the ambiguous words. Extensive experimental results on the SemEval dataset show that our proposed WSD method significantly outperforms the baselines.
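The weight optimization step names simulated annealing; the sketch below shows one plausible way to tune the three similarity weights against a development-set objective. The dev_accuracy callback, the perturbation scale, and the cooling schedule are assumptions for illustration, not details taken from the paper.

```python
# A minimal sketch of tuning the three similarity weights (English embedding,
# Chinese embedding, HowNet) with simulated annealing, assuming a caller-supplied
# dev_accuracy(weights) function that runs graph-based disambiguation on a dev set.
import math
import random

def normalize(w):
    s = sum(w)
    return [x / s for x in w]

def anneal_weights(dev_accuracy, steps=200, t0=1.0, cooling=0.98):
    w = normalize([random.random() + 1e-6 for _ in range(3)])
    best_w, best_acc = w, dev_accuracy(w)
    cur_acc, t = best_acc, t0
    for _ in range(steps):
        # Perturb one weight, re-normalize, and evaluate on the dev set.
        cand = w[:]
        i = random.randrange(3)
        cand[i] = max(1e-6, cand[i] + random.gauss(0.0, 0.1))
        cand = normalize(cand)
        acc = dev_accuracy(cand)
        # Accept improvements always; accept worse candidates with Boltzmann probability.
        if acc >= cur_acc or random.random() < math.exp((acc - cur_acc) / t):
            w, cur_acc = cand, acc
            if acc > best_acc:
                best_w, best_acc = cand, acc
        t *= cooling
    return best_w, best_acc
```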
Abstract: Algorithms based on combination learning are usually superior to a single classification algorithm on the task of protein secondary structure prediction. However, the assignment of base classifier weights usually lacks decision-making evidence. In this paper, we propose a protein secondary structure prediction method with a dynamic self-adaptive combination strategy based on entropy, where the weights are assigned according to the entropy of the posterior probabilities output by the base classifiers. A higher entropy value means a lower weight for the base classifier. The final structure prediction is decided by the weighted combination of the posterior probabilities. Extensive experiments on the CB513 dataset demonstrate that the proposed method outperforms the existing methods and can effectively improve the prediction performance.
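A minimal sketch of the entropy-based weighting described above: each base classifier's posterior over the secondary-structure classes is weighted inversely to its entropy, and the weighted posteriors are summed. The inverse-entropy mapping and the three-class set (H, E, C) are assumptions for illustration; the paper's exact entropy-to-weight rule may differ.

```python
# Entropy-weighted combination of base-classifier posteriors (illustrative sketch).
import numpy as np

def entropy(p, eps=1e-12):
    p = np.asarray(p, dtype=float)
    return -np.sum(p * np.log(p + eps))

def combine_posteriors(posteriors):
    """posteriors: list of per-classifier class distributions for one residue."""
    ents = np.array([entropy(p) for p in posteriors])
    # Higher entropy (less confident classifier) -> lower weight.
    weights = 1.0 / (ents + 1e-6)
    weights /= weights.sum()
    combined = np.sum(weights[:, None] * np.asarray(posteriors, dtype=float), axis=0)
    return combined, int(combined.argmax())

# usage: three base classifiers predicting over classes [H, E, C]
post = [[0.7, 0.2, 0.1], [0.4, 0.35, 0.25], [0.6, 0.3, 0.1]]
probs, label = combine_posteriors(post)
print(probs, label)
```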