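Using two data sources, WOS (Web of Science) and Wikipedia, this paper applies word-frequency statistics and text classification analysis to big-data-related content, identifies where the two sources agree and differ on big data topics, and distills a set of topic categories for the big data field. The shared categories include the overall perspective, the technical level, the application level, and entities and activities. The finer-grained topics include data and data sources, big data processing and analysis technologies, big data systems and applications, promotion by countries, regions, and enterprises, discussions of society and people, and changes in industries and disciplines. Finally, the paper draws on related data to discuss research frontiers in the big data field.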
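The word-frequency comparison described in this abstract can be illustrated with a minimal sketch. It is not the paper's code: the function names `term_frequencies` and `compare_sources`, the simple lowercase tokenizer, and the toy input strings are all assumptions made for illustration, and the paper's actual corpus and classification steps are not reproduced.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Count lowercase alphabetic tokens in a text (hypothetical tokenizer)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def compare_sources(text_a, text_b, top_n=5):
    """Compare the top_n most frequent terms of two sources.

    Returns (shared, only_a, only_b) as sets of terms.
    """
    top_a = {term for term, _ in term_frequencies(text_a).most_common(top_n)}
    top_b = {term for term, _ in term_frequencies(text_b).most_common(top_n)}
    return top_a & top_b, top_a - top_b, top_b - top_a


# Toy stand-ins for text drawn from the two sources (assumed data).
wos_text = "data data hadoop analytics"
wiki_text = "data privacy data society"
shared, only_wos, only_wiki = compare_sources(wos_text, wiki_text, top_n=3)
```

Terms that appear in both top lists hint at consensus between the sources, while the two difference sets hint at where each source has its own emphasis.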
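The existing search module of Wikipedia relies on keyword matching, which makes search relatively inefficient. To improve the efficiency of knowledge acquisition in Wikipedia, this paper proposes TDL (Term Distance based on Linkage), a term-distance algorithm based on link analysis. Using a scalable computing model, it discovers term clusters by analyzing the internal link structure and introduces ranking and recommendation mechanisms. Experiments on the May 2009 Wikipedia snapshot show that TDL effectively improves the accuracy of Wikipedia knowledge retrieval, and user evaluation confirms that the TDL algorithm improves the recognition of user intent by as much as 7%.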
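The abstract does not define how TDL computes term distance, so the following is only a sketch of one plausible link-based distance: the hop count of the shortest path between two pages in the internal-link graph, found by breadth-first search. The functions `link_distance` and `rank_related` and the toy link graph are hypothetical and are not taken from the paper.

```python
from collections import deque


def link_distance(links, source, target):
    """Return the shortest hop count from source to target, or None if unreachable.

    `links` maps a page title to the list of page titles it links to.
    """
    if source == target:
        return 0
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        page, dist = queue.popleft()
        for nxt in links.get(page, ()):
            if nxt == target:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def rank_related(links, term, candidates):
    """Sort candidates by link distance from term; unreachable pages go last."""
    def sort_key(candidate):
        dist = link_distance(links, term, candidate)
        return (dist is None, dist if dist is not None else 0, candidate)
    return sorted(candidates, key=sort_key)


# Toy internal-link graph (assumed data, not a real Wikipedia snapshot).
links = {
    "Wiki": ["Encyclopedia", "Hyperlink"],
    "Encyclopedia": ["Knowledge"],
    "Hyperlink": [],
    "Knowledge": [],
}
ranking = rank_related(links, "Wiki", ["Knowledge", "Hyperlink", "Island"])
```

Ranking candidates by such a distance is one simple way to turn link structure into a recommendation order. A real system would need to handle a much larger graph and would likely weight links rather than count hops.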
In recent years, the Internet has become the primary source of health information for the general population, which may be attributed to improvements in digital technology and Internet accessibility [1]. Since the World Health Organization declared coronavirus disease (COVID-19) a pandemic in March 2020 [2], digital information has gained more importance, as seen in the rapid growth in the number of people searching online [3]. As seen in previous infectious disease outbreaks, the recent increase in monkeypox cases might compel individuals worldwide to broaden their searches for relevant online health information [3,4].
This paper proposes a method for constructing a conceptual semantic knowledge base for the software engineering domain based on Wikipedia. First, taking the concepts of SWEBOK V3 as the standard, it extracts the explanation of each concept from Wikipedia and extracts keywords as the semantics of the concept. Second, the conceptual semantic knowledge base is formed from the hierarchical relationships between concepts and the relationships to other concepts that appear in the Wikipedia explanatory text. Finally, the semantic similarity between concepts is calculated with a random walk algorithm for the construction of the conceptual semantic knowledge base. The semantic similarity of the knowledge base constructed by this method exceeds 84%, which verifies the effectiveness of the proposed method.
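The abstract names a random walk algorithm but does not say which variant is used. As a sketch under that caveat, the code below uses a random walk with restart over a small concept graph and treats the resulting visit probabilities as similarity scores relative to the start concept. The function name, the restart probability of 0.15, the iteration count, and the toy graph are assumptions for illustration only.

```python
def random_walk_with_restart(graph, start, restart=0.15, iterations=50):
    """Approximate visit probabilities of a walk that restarts at `start`.

    `graph` maps each concept to the list of concepts it is related to.
    Every neighbor must also appear as a key of `graph`.
    """
    nodes = sorted(graph)
    scores = {node: 0.0 for node in nodes}
    scores[start] = 1.0
    for _ in range(iterations):
        nxt = {node: 0.0 for node in nodes}
        for node in nodes:
            neighbors = graph[node]
            if not neighbors:
                # A dead end sends its walking mass back to the start concept.
                nxt[start] += (1 - restart) * scores[node]
                continue
            share = (1 - restart) * scores[node] / len(neighbors)
            for neighbor in neighbors:
                nxt[neighbor] += share
        # With probability `restart`, the walker jumps back to the start.
        nxt[start] += restart
        scores = nxt
    return scores


# Toy concept graph (assumed data, not taken from SWEBOK V3 or Wikipedia).
concepts = {
    "requirements": ["design"],
    "design": ["requirements", "testing"],
    "testing": ["design"],
    "finance": [],
}
similarity = random_walk_with_restart(concepts, "requirements")
```

Concepts that are closer to the start concept in the graph receive higher scores, and a concept with no path from the start concept receives a score of zero.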