
Cross-Domain Text Generation Method Based on Semantic Conduction of Intermediate Domains
Abstract Deep neural networks are widely used in natural language processing. In text generation tasks with multi-domain data, the data in different domains often differ, and introducing a new domain also brings the problem of data deficiency. Conventional supervised methods require a large amount of labeled data in the target domain to train a deep neural network text generation model, and the trained model does not generalize well to a new domain. To address data distribution differences and data deficiency in multi-domain tasks, a comprehensive transfer text generation method inspired by transfer learning is designed. It reduces the distribution differences between text data from different domains and leverages the semantic correlation between text data in the existing (source) domains and the new (target) domain to help deep neural network text generation models generalize to new domains. Experiments on a publicly available dataset verify the effectiveness of the proposed method for domain transfer in multi-domain settings: the model performs well when generating text in new domains, and it improves on all text generation evaluation metrics compared with other existing transfer text generation methods.
Authors Ma Tinghuai (马廷淮); Yu Xin (于信); Rong Huan (荣欢) (School of Software, Nanjing University of Information Science & Technology, Nanjing 210044; School of Artificial Intelligence (School of Future Technology), Nanjing University of Information Science & Technology, Nanjing 210044)
Source Journal of Computer Research and Development (《计算机研究与发展》), indexed in EI, CSCD, and the Peking University Core Journals list; 2023, No. 12, pp. 2844-2863 (20 pages)
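Funding National Natural Science Foundation of China (62102187, 62372243); Natural Science Foundation of Jiangsu Province, Basic Research Program (BK20210639); National Key Research and Development Program of China (2021YFE0104400).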
Keywords deep neural network; text generation model; data distribution alignment; maximum mean discrepancy; zero-shot learning; semantic conduction
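
The keywords name maximum mean discrepancy (MMD) as the measure behind data distribution alignment. As an illustration only, the sketch below computes the standard biased estimate of squared MMD with a Gaussian kernel on toy feature vectors. The kernel choice, the `gamma` bandwidth, the function names, and the toy data are all assumptions made for this example; they are not taken from the paper, whose actual alignment loss may differ.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel between two equal-length feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mean_kernel(xs, ys, gamma):
    """Average kernel value over all pairs drawn from xs and ys."""
    total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
    return total / (len(xs) * len(ys))


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of squared MMD between two samples.

    MMD^2 = E[k(s, s')] + E[k(t, t')] - 2 E[k(s, t)]
    A value near zero suggests the two samples have similar distributions.
    """
    return (
        mean_kernel(source, source, gamma)
        + mean_kernel(target, target, gamma)
        - 2.0 * mean_kernel(source, target, gamma)
    )


# Hypothetical 2-D features standing in for encoded text from three domains.
source = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.2]]
near = [[0.1, 0.1], [0.0, 0.2], [0.2, 0.1]]   # close to the source sample
far = [[3.0, 3.1], [3.2, 2.9], [2.9, 3.0]]    # far from the source sample

print("MMD^2(source, near):", mmd_squared(source, near))
print("MMD^2(source, far): ", mmd_squared(source, far))
```

In a training setting, a quantity of this kind would typically be added to the generation loss so that encoder features from different domains are pulled toward a shared distribution; how the paper combines it with the generation objective is not stated in this record.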
