Purpose: This study aims to build an automatic survey generation tool, named CitationAS, based on citation content as represented by the set of citing sentences in the original articles. Design/methodology/approach: First, we apply LDA to analyse the topic distribution of citation content. Second, in CitationAS, we use bisecting K-means, Lingo and STC to cluster the retrieved citation content. Word2Vec, WordNet and a combination of the two are then applied to generate cluster labels. Next, we employ TF-IDF and MMR, together with sentence location information, to extract important sentences, which are used to generate surveys. Finally, we evaluate the generated surveys manually. Findings: In the experiments, we choose 20 high-frequency phrases as search terms. Results show that Lingo-Word2Vec, STC-WordNet and bisecting K-means-Word2Vec give better clustering results. On a 5-point evaluation scale, the survey quality scores obtained by the designed methods are close to 3, indicating that the surveys are within acceptable limits. Survey quality improves when sentence location information is considered. Combining Lingo and Word2Vec with TF-IDF or MMR yields higher survey quality. Research limitations: The manual evaluation method may be somewhat subjective. We use a simple linear function to combine Word2Vec and WordNet, which may not bring out their strengths. The generated surveys may miss newly created knowledge in some articles, since such knowledge may be concentrated in sentences that contain no citations. Practical implications: The CitationAS tool can automatically generate a comprehensive, detailed and accurate survey according to the user's search terms. It can also help researchers learn about the research status of a given field. Originality/value: The CitationAS tool is practical. It merges cluster labels at the semantic level to improve clustering results. The tool also considers sentence location information when calculating sentence scores with TF-IDF and MMR.
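The sentence-extraction step described in this abstract (TF-IDF scoring, MMR selection, and an adjustment for sentence location) could be sketched roughly as follows. This is a minimal illustration and not the CitationAS implementation: the whitespace tokenisation, the smoothed IDF, the use of the cluster centroid as the MMR "query", the `lam` trade-off parameter and the optional `location_scores` bonus are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def tfidf_vectors(sentences):
    """Return one sparse TF-IDF dict per sentence (whitespace tokens, smoothed IDF)."""
    tokenised = [s.lower().split() for s in sentences]
    n = len(tokenised)
    df = Counter(t for tokens in tokenised for t in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vectors.append({
            t: (c / len(tokens)) * (math.log((1 + n) / (1 + df[t])) + 1.0)
            for t, c in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def mmr_select(sentences, k, lam=0.7, location_scores=None):
    """Pick up to k sentence indices by Maximal Marginal Relevance.

    Relevance is similarity to the centroid of all sentences (an assumption),
    plus an optional caller-supplied location bonus per sentence (hypothetical).
    """
    vectors = tfidf_vectors(sentences)
    centroid = {}
    for v in vectors:
        for t, w in v.items():
            centroid[t] = centroid.get(t, 0.0) + w
    bonus = location_scores or [0.0] * len(sentences)
    relevance = [cosine(v, centroid) + b for v, b in zip(vectors, bonus)]

    candidates = list(range(len(sentences)))
    selected = []
    while candidates and len(selected) < k:
        def score(i):
            # Redundancy: highest similarity to any sentence already chosen.
            redundancy = max(
                (cosine(vectors[i], vectors[j]) for j in selected), default=0.0
            )
            return lam * relevance[i] - (1.0 - lam) * redundancy
        best = max(candidates, key=score)  # ties go to the earliest sentence
        selected.append(best)
        candidates.remove(best)
    return selected
```

With `lam` close to 1 the selection reduces to plain relevance ranking; lower values penalise sentences that repeat what has already been selected, which is the reason for using MMR on citing sentences that often paraphrase one another.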
In this paper, CiteSpace, a bibliometrics software package, is used to analyse research papers indexed in the Web of Science that concern biological models and effluent quality prediction for the activated sludge process in wastewater treatment. Trend maps, keyword knowledge maps and co-citation knowledge maps are used to visualise and identify the authors, institutions and regions involved. Furthermore, the topics and hotspots of water quality prediction in the activated sludge process are determined through literature co-citation cluster analysis and citation burst analysis. The results reflect the historical evolution of the field to a certain extent and provide direction and insight into the knowledge structure of water quality prediction and the activated sludge process for future research.
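Using scientometric methods such as cluster analysis and social network analysis, this study analyses the relationships among the 44 human geography journals indexed in the SSCI database. The journal co-citation count matrix is first retrieved from the Web of Science and the co-citation rate matrix is computed; CONCOR cluster analysis and network structure analysis are then carried out. The study also counts the SSCI geography papers published by Chinese geographers and analyses how Chinese geography journals are cited in the Web of Science, in order to quantitatively examine the influence of Chinese geography in the international academic community.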
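The step from a co-citation count matrix to a co-citation rate matrix can be illustrated with a short sketch. The abstract does not state which normalisation the authors used; the version below assumes Salton's cosine measure, count(i, j) / sqrt(total_i * total_j), and the function name `cocitation_rate` and its arguments are hypothetical.

```python
import math


def cocitation_rate(counts, totals):
    """Normalise a symmetric co-citation count matrix into a rate matrix.

    counts[i][j] is the number of times journals i and j are cited together;
    totals[i] is the total citation count of journal i.
    Assumes the Salton cosine normalisation (not confirmed by the source).
    """
    n = len(counts)
    rates = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                rates[i][j] = 1.0  # a journal is fully related to itself
            elif totals[i] > 0 and totals[j] > 0:
                rates[i][j] = counts[i][j] / math.sqrt(totals[i] * totals[j])
    return rates
```

A rate matrix of this kind is what would then be passed to a clustering procedure such as CONCOR, because it removes the advantage that heavily cited journals have in raw co-citation counts.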
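Based on data from the SCI-EXPANDED and SSCI databases of the Web of Science, CiteSpace III is used to carry out co-citation cluster analysis, key node analysis and burst term analysis on the literature on agricultural carbon emissions, with the aim of identifying the research fronts of the field and how they have evolved. The results show that: (1) The literature on agricultural carbon emissions is growing year by year and sits at the intersection of ecology and environment research and agriculture. (2) The main frontier topics are carbon emissions and carbon sinks of farmland and soil, non-CO2 greenhouse gas emissions from agricultural production, the carbon footprint of agricultural products, and the carbon emission and carbon sink effects of land-use change. (3) The evolution of the research front follows a relatively clear path: research on farmland and soil carbon emissions and sinks moved from experimental and scenario studies to meta-analyses that synthesise existing measurements or existing knowledge, and then to model-based studies built on process models (for example, the DeNitrification-DeComposition model, DNDC). (4) The carbon footprint of agricultural products based on life-cycle analysis, food security under climate change, and the use of biochar have shown strong citation bursts in the last two years and are areas that subsequent research should focus on.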
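Relevance feedback is a method that reformulates the initial query according to relevance judgements made by the user or the system, and it has been shown to improve retrieval effectiveness. For academic literature in particular, citation relationships indicate that documents are related in content and can therefore provide valuable auxiliary information for relevance feedback. This paper proposes an improved relevance feedback algorithm based on citation context, document co-citation and bibliographic coupling. Its basic ideas are: using the citation context of academic documents to extend the bag-of-words model for text representation; at the relevance judgement stage, expanding the set of relevant documents according to the co-citation strength and coupling strength between relevant documents and other documents in the citation network; and extracting query expansion terms following the idea of cluster-based relevance feedback. Experiments show that the algorithm improves relevance feedback performance. In addition, correlation analysis shows that document co-citation strength and bibliographic coupling strength are both significantly correlated with document content similarity.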
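The two citation-network measures the algorithm relies on have standard definitions: the co-citation strength of two documents is the number of documents that cite both, and their bibliographic coupling strength is the number of references they share. A minimal sketch of both follows; the `references` mapping and the function names are illustrative assumptions, and the sketch does not reproduce how the paper combines these strengths with clustering or query expansion.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(references):
    """Count, for each pair of cited documents, how many documents cite both.

    references maps each citing document id to the set of ids it cites.
    Keys of the result are (a, b) tuples with a < b.
    """
    counts = Counter()
    for cited in references.values():
        for a, b in combinations(sorted(cited), 2):
            counts[(a, b)] += 1
    return counts


def coupling_strength(references, doc_a, doc_b):
    """Number of references shared by doc_a and doc_b (bibliographic coupling)."""
    return len(references.get(doc_a, set()) & references.get(doc_b, set()))
```

For example, with `references = {"p1": {"a", "b"}, "p2": {"a", "b", "c"}}`, documents `a` and `b` are co-cited by both papers, while `p1` and `p2` are coupled through their two shared references.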
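With the rise of the learning sciences, deep learning has gradually become a core research topic in education. Using citation analysis and co-word cluster analysis, this study takes 459 papers indexed in the Web of Science database from 2005 to 2015 as its research object and, through visualised knowledge maps, explores the research status and research hotspots of deep learning outside China over the past decade. It then draws several implications in light of the state of research in China, in the hope of providing a useful reference for similar studies.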
Funding: supported by the Major Projects of the National Social Science Fund (No. 17ZDA291), the Fujian Provincial Key Laboratory of Information Processing and Intelligent Control (Minjiang University) (No. MJUKF201704), and the Qing Lan Project.