
Research on Automatic Generation of Question and Answer Text Abstract Based on Topic Feature
Cited by: 1
Abstract [Purpose/Significance] This study aims to help users of question-and-answer communities, which hold massive amounts of text, locate information that meets their needs efficiently and with high quality. [Method/Process] The paper proposes a topic-feature-based model for generating summaries of question-and-answer text. The model combines Word2Vec and the SLDA algorithm to represent the semantic features of question-and-answer text at multiple levels. Then, following the idea of graph ranking, it combines the MRR redundancy control algorithm with sentence feature tags to adjust sentence weights and efficiently select summary content that fits the question tags. [Result/Conclusion] The model was validated on question-and-answer text data from multiple questions in the Zhihu Q&A community, and the results show that it is feasible and effective. However, the empirical analysis used only 500 answer texts; future work can expand the data volume for more thorough validation.
Authors Liu Menghao (刘梦豪); Xiong Huixiang (熊回香); Wang Niuniu (王妞妞); He Yuhang (贺宇航) (School of Information Management, Central China Normal University, Wuhan 430079, China; Undergraduate School, Central China Normal University, Wuhan 430079, China)
Source Journal of Modern Information (《现代情报》), CSSCI, 2023, No. 8, pp. 114-124, 177 (12 pages in total)
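Funding Key Project of the National Social Science Fund of China, "Research on Data-Driven Online Health Resource Mining and Smart Services" (Grant No. 22ATQ004); 2022 Central China Normal University Fundamental Research Funds (Humanities and Social Sciences) Interdisciplinary Research Project, "Research on Individual Health Management Based on Quantified-Self Technology" (Grant No. CCNU22JC033); Central China Normal University Graduate Education Innovation Funding Project, "Research on Academic Community Discovery and Knowledge Growth Point Detection from the Perspective of Interdisciplinary Research Collaboration" (Grant No. 2022CXZZ106).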
Keywords automatic summary generation; Zhihu; Q&A community; supervised topic model; graph ranking; Word2Vec
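
The abstract describes ranking sentences on a graph and then controlling redundancy when selecting summary content. The following is a minimal sketch of that general idea, not the authors' implementation. It makes several assumptions: plain term counts stand in for the Word2Vec and SLDA features, a PageRank-style iteration stands in for the graph ranking step, and a greedy relevance-minus-similarity rule stands in for the redundancy control algorithm. All function names and the `damping` and `trade_off` parameters are hypothetical.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_sentences(vectors, damping=0.85, iterations=50):
    """PageRank-style scores over a sentence similarity graph (assumed stand-in)."""
    n = len(vectors)
    sim = [
        [cosine(vectors[i], vectors[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to edge weight.
            incoming = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores, sim


def select_with_redundancy_control(scores, sim, k=2, trade_off=0.7):
    """Greedy pick: reward graph score, penalise similarity to chosen sentences."""
    chosen = []
    candidates = set(range(len(scores)))
    top = max(scores) or 1.0
    while candidates and len(chosen) < k:

        def gain(i):
            redundancy = max((sim[i][j] for j in chosen), default=0.0)
            return trade_off * scores[i] / top - (1 - trade_off) * redundancy

        # sorted() makes tie-breaking deterministic (lowest index wins).
        best = max(sorted(candidates), key=gain)
        chosen.append(best)
        candidates.remove(best)
    return chosen


def summarize(sentences, k=2):
    """Return k sentences in their original order."""
    vectors = [Counter(s.lower().split()) for s in sentences]  # assumed features
    scores, sim = rank_sentences(vectors)
    picked = select_with_redundancy_control(scores, sim, k=k)
    return [sentences[i] for i in sorted(picked)]


if __name__ == "__main__":
    answers = [
        "graph ranking scores sentences",
        "graph ranking scores sentences",
        "topic models capture themes",
        "topic themes guide ranking",
    ]
    print(summarize(answers, k=2))
```

In this toy input the first two sentences are identical, so the similarity penalty is what keeps the second copy out of the summary. Setting `trade_off=1.0` disables the penalty and lets both copies through.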
