Forward stagewise additive modeling for entity ranking in documents
Abstract: The key entities of a document summarize the subjects of the events or topics that the document describes, which benefits applications such as entity-oriented information retrieval and question answering. However, the entities in free text are unordered, so ranking the entities of a document is important. In this paper, we first extract entity features from the document itself and draw on Wikipedia and distributed word representations (word embeddings) to generate external features. We then propose LA-FSAM (FSAM based on AUC Metric and Logistic Function), a ranking model built on forward stagewise additive modeling (FSAM). LA-FSAM uses the AUC (Area Under the Curve) metric to construct the loss function and the logistic function to integrate entity features. Finally, stochastic gradient descent is used to solve for the model parameters. Experimental comparison of LA-FSAM against baseline methods demonstrates the effectiveness of the proposed method.
Author: WANG Yan-hua (王燕华), School of Data Science and Engineering, East China Normal University, Shanghai 200062, China
Source: Journal of East China Normal University (Natural Science), indexed in CAS, CSCD, and the Peking University Core Journals list, 2018, No. 1, pp. 91-102, 145 (13 pages)
Funding: Shanghai Science and Technology Promotion Project for Agriculture (2015, No. 3-2)
Keywords: entity ranking; forward stagewise additive modeling; area under the curve; logistic function; stochastic gradient descent
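
The abstract names three ingredients: an AUC-based loss, a logistic function that combines entity features, and stochastic gradient descent inside a forward stagewise additive model. The sketch below shows one way these pieces could fit together. It is not the paper's formulation, which the abstract does not give. The pairwise logistic surrogate for AUC, the additive form F(x) = sum of beta_m * sigmoid(w_m . x), and every name here (`fit_stage`, `fit`, `model_score`, `auc`, the step count, and the learning rate) are assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def base_score(w, x):
    # Assumed base learner: logistic function of a weighted sum of features.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))


def model_score(stages, x):
    # Assumed additive model: F(x) = sum_m beta_m * sigmoid(w_m . x).
    return sum(beta * base_score(w, x) for beta, w in stages)


def auc(scores, labels):
    # Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
    pos = [s for s, t in zip(scores, labels) if t == 1]
    neg = [s for s, t in zip(scores, labels) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit_stage(stages, X, y, rng, steps=2000, lr=0.1):
    # Fit one new base learner with SGD; earlier stages stay frozen.
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    w = [rng.uniform(-0.1, 0.1) for _ in X[0]]
    beta = 1.0
    for _ in range(steps):
        xp, xn = rng.choice(pos), rng.choice(neg)  # one sampled pair per step
        hp, hn = base_score(w, xp), base_score(w, xn)
        margin = (model_score(stages, xp) + beta * hp) \
            - (model_score(stages, xn) + beta * hn)
        # Assumed surrogate for 1 - AUC: loss = log(1 + exp(-margin)).
        g = -sigmoid(-margin)  # derivative of the loss w.r.t. the margin
        grad_beta = g * (hp - hn)
        for k in range(len(w)):
            grad_wk = g * beta * (hp * (1 - hp) * xp[k]
                                  - hn * (1 - hn) * xn[k])
            w[k] -= lr * grad_wk
        beta -= lr * grad_beta
    return beta, w


def fit(X, y, n_stages=3, seed=0):
    # Forward stagewise loop: add base learners one at a time.
    rng = random.Random(seed)
    stages = []
    for _ in range(n_stages):
        stages.append(fit_stage(stages, X, y, rng))
    return stages
```

To try it, pass a list of feature vectors `X` (one per entity) and binary labels `y` (1 for a key entity, 0 otherwise) to `fit`, then sort the entities by `model_score(stages, x)` in descending order. Sampling one positive-negative pair per step keeps each update cheap, at the cost of noisier gradients than summing over all pairs.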