5 articles found
Sememe knowledge computation: a review of recent advances in application and expansion of sememe knowledge bases (Cited by: 1)
1
Authors: Fanchao QI, Ruobing XIE, Yuan ZANG, Zhiyuan LIU, Maosong SUN. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, No. 5, pp. 13-23 (11 pages)
A sememe is defined as the minimum semantic unit of languages in linguistics. Sememe knowledge bases are built by manually annotating sememes for words and phrases. HowNet is the most well-known sememe knowledge base. It has been extensively utilized in many natural language processing tasks in the era of statistical natural language processing and proven to be effective and helpful to understanding and using languages. In the era of deep learning, although data are thought to be of vital importance, there are some studies working on incorporating sememe knowledge bases like HowNet into neural network models to enhance system performance. Some successful attempts have been made in tasks including word representation learning, language modeling, semantic composition, etc. In addition, considering the high cost of manual annotation and update for sememe knowledge bases, some work has tried to use machine learning methods to automatically predict sememes for words and phrases to expand sememe knowledge bases. Besides, some studies try to extend HowNet to other languages by automatically predicting sememes for words and phrases in a new language. In this paper, we summarize recent studies on application and expansion of sememe knowledge bases and point out some future directions of research on sememes.
Keywords: natural language processing; semantics; knowledge base; sememe; HowNet
Extracting Variable-Depth Logical Document Hierarchy from Long Documents: Method, Evaluation, and Application
2
Authors: Rong-Yu Cao, Yi-Xuan Cao, Gan-Bin Zhou, Ping Luo. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2022, No. 3, pp. 699-718 (20 pages)
In this paper, we study the problem of extracting variable-depth "logical document hierarchy" from long documents, namely organizing the recognized "physical document objects" into hierarchical structures. The discovery of logical document hierarchy is the vital step to support many downstream applications (e.g., passage-based retrieval and high-quality information extraction). However, long documents, containing hundreds or even thousands of pages and a variable-depth hierarchy, challenge the existing methods. To address these challenges, we develop a framework, namely Hierarchy Extraction from Long Document (HELD), where we "sequentially" insert each physical object at the proper position on the current tree. Determining whether each possible position is proper or not can be formulated as a binary classification problem. To further improve its effectiveness and efficiency, we study the design variants in HELD, including traversal orders of the insertion positions, heading extraction explicitly or implicitly, tolerance to insertion errors in predecessor steps, and so on. As for evaluations, we find that previous studies ignore the error that the depth of a node is correct while its path to the root is wrong. Since such mistakes may worsen the downstream applications seriously, a new measure is developed for a more careful evaluation. The empirical experiments based on thousands of long documents from the Chinese financial market, the English financial market and English scientific publications show that the HELD model with the "root-to-leaf" traversal order and explicit heading extraction is the best choice to achieve the tradeoff between effectiveness and efficiency, with accuracies of 0.9726, 0.7291 and 0.9578 on the Chinese financial, English financial and arXiv datasets, respectively. Finally, we show that the logical document hierarchy can be employed to significantly improve the performance of the downstream passage retrieval task. In summary, we conduct a systematic study on this task in terms of methods, evaluations, and applications.
Keywords: logical document hierarchy; long documents; passage retrieval
A survey on causal inference for recommendation (Cited by: 2)
3
Authors: Huishi Luo, Fuzhen Zhuang, Ruobing Xie, Hengshu Zhu, Deqing Wang, Zhulin An, Yongjun Xu. The Innovation (EI), 2024, No. 2, pp. 130-144 (15 pages)
Causal inference has recently garnered significant interest among recommender system (RS) researchers due to its ability to dissect cause-and-effect relationships and its broad applicability across multiple fields. It offers a framework to model the causality in RSs such as confounding effects and deal with counterfactual problems such as offline policy evaluation and data augmentation. Although there are already some valuable surveys on causal recommendations, they typically classify approaches based on the practical issues faced in RS, a classification that may disperse and fragment the unified causal theories. Considering RS researchers' unfamiliarity with causality, it is necessary yet challenging to comprehensively review relevant studies from a coherent causal theoretical perspective, thereby facilitating a deeper integration of causal inference in RS. This survey provides a systematic review of up-to-date papers in this area from a causal theory standpoint and traces the evolutionary development of RS methods within the same causal strategy. First, we introduce the fundamental concepts of causal inference as the basis of the following review. Subsequently, we propose a novel theory-driven taxonomy, categorizing existing methods based on the causal theory employed, namely those based on the potential outcome framework, the structural causal model, and general counterfactuals. The review then delves into the technical details of how existing methods apply causal inference to address particular recommender issues. Finally, we highlight some promising directions for future research in this field. Representative papers and open-source resources will be progressively available at https://github.com/Chrissie-Law/Causal-Inference-forRecommendation.
Keywords: survey; details; causal
Bayesian dual neural networks for recommendation (Cited by: 3)
4
Authors: Jia HE, Fuzhen ZHUANG, Yanchi LIU, Qing HE, Fen LIN. Frontiers of Computer Science (SCIE, EI, CSCD), 2019, No. 6, pp. 1255-1265 (11 pages)
Most traditional collaborative filtering (CF) methods only use the user-item rating matrix to make recommendations, and so usually suffer from cold-start and sparsity problems. To address these problems, on the one hand, some CF methods are proposed to incorporate auxiliary information such as user/item profiles; on the other hand, deep neural networks, which have a powerful ability to learn effective representations, have achieved great success in recommender systems. However, these neural network based recommendation methods rarely consider the uncertainty of weights in the network and only obtain point estimates of the weights. Therefore, they may lack calibrated probabilistic predictions and make overly confident decisions. To this end, we propose a new Bayesian dual neural network framework, named BDNet, to incorporate auxiliary information for recommendation. Specifically, we design two neural networks: one learns a common low-dimensional space for users and items from the rating matrix, and the other projects the attributes of users and items into another shared latent space. After that, the outputs of these two neural networks are combined to produce the final prediction. Furthermore, we introduce uncertainty to all weights, which are represented by probability distributions in our neural networks, to make calibrated probabilistic predictions. Extensive experiments on real-world data sets are conducted to demonstrate the superiority of our model over various kinds of competitors.
Keywords: collaborative filtering; Bayesian neural network; hybrid recommendation algorithm
Rich-text document styling restoration via reinforcement learning (Cited by: 1)
5
Authors: Hongwei LI, Yingpeng HU, Yixuan CAO, Ganbin ZHOU, Ping LUO. Frontiers of Computer Science (SCIE, EI, CSCD), 2021, No. 4, pp. 93-103 (11 pages)
Richly formatted documents, such as financial disclosures, scientific articles, and government regulations, widely exist on the Web. However, since most of these documents are only for public reading, the styling information inside them is usually missing, making them improper or even burdensome to display and edit in different formats and platforms. In this study we formulate the task of document styling restoration as an optimization problem, which aims to identify the styling settings on the document elements, e.g., lines, table cells, text, so that rendering with the output styling settings results in a document where each element inside it holds the (closely) exact position with the one in the original document. Considering that each styling setting is a decision, this problem can be transformed into a multi-step decision-making task over all the document elements, and then be solved by reinforcement learning. Specifically, Monte-Carlo Tree Search (MCTS) is leveraged to explore the different styling settings, and the policy function is learnt under the supervision of the delayed rewards. As a case study, we restore the styling information inside tables, where structural and functional data in the documents are usually presented. Experiments show that our best reinforcement method successfully restores the stylings in 87.65% of the tables, with a 25.75% absolute improvement over the greedy method. We also discuss the tradeoff between the inference time and restoration success rate, and argue that although the reinforcement methods cannot be used in real-time scenarios, they are suitable for offline tasks with high-quality requirements. Finally, this model has been applied in a PDF parser to support cross-format display.
Keywords: styling restoration; Monte-Carlo tree search; reinforcement learning; richly formatted documents; tables