
Organization and Integration of Chinese Encyclopedia Knowledge Based on the Semantic Web (cited by: 4)
Abstract: This work identifies important structured features in the large-scale knowledge data of the three largest Chinese encyclopedias (Baidu Baike, Hudong Baike, and Chinese Wikipedia) and generates RDF triples from them. The triples are then integrated into a distributed large-scale RDF data storage system, forming an RDF dataset for a Chinese encyclopedia knowledge base that meets the requirements of Linked Data. The main work includes: configuring a web crawler to crawl pages from Baidu Baike and Hudong Baike, parsing infoboxes and related content, generating RDF triples, and implementing dynamic insertion of those triples; downloading the required Chinese triple data from DBpedia, then integrating the triples and storing them in Jingwei, the research group's large-scale semantic data repository; and designing pages that display dynamic insertion and triple pattern queries. Experiments on a prototype system verify the effectiveness of the method.
Source: Computer Engineering and Applications (《计算机工程与应用》), indexed in CSCD and the Peking University Core Journals list, 2015, No. 14, pp. 120-126, 169 (8 pages in total)
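Funding: National Natural Science Foundation of China (No. 61100049, No. 61070202); National High Technology Research and Development Program of China (863 Program) (No. 2013AA013204)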
Keywords: semantic web; Resource Description Framework (RDF); Chinese encyclopedia; Linked Open Data; Nutch
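The abstract describes turning parsed infobox content into RDF triples and querying them by triple pattern. The following is a minimal sketch of that idea, not the authors' implementation: it assumes the infobox has already been parsed into attribute/value pairs, serializes them as N-Triples, and matches patterns in memory rather than in Jingwei. The namespaces `RESOURCE_NS` and `PROPERTY_NS` and all function names are hypothetical.

```python
from urllib.parse import quote

# Hypothetical namespaces, chosen only for illustration (not from the paper).
RESOURCE_NS = "http://example.org/resource/"
PROPERTY_NS = "http://example.org/property/"


def escape_literal(value):
    """Escape characters that may not appear raw inside an N-Triples literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def infobox_to_triples(title, infobox, lang="zh"):
    """Turn one page's infobox (attribute -> value) into (s, p, o) triples."""
    subject = f"<{RESOURCE_NS}{quote(title)}>"
    triples = []
    for attribute, value in infobox.items():
        predicate = f"<{PROPERTY_NS}{quote(attribute)}>"
        obj = f'"{escape_literal(value)}"@{lang}'
        triples.append((subject, predicate, obj))
    return triples


def to_ntriples(triples):
    """Serialize triples, one N-Triples statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


def match(triples, s=None, p=None, o=None):
    """Triple pattern query: a None position acts as a variable."""
    return [t for t in triples
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


if __name__ == "__main__":
    # Example infobox values are made up for the demonstration.
    triples = infobox_to_triples("天津", {"所属地区": "中国华北", "电话区号": "022"})
    print(to_ntriples(triples))
    print(match(triples, p=f"<{PROPERTY_NS}{quote('电话区号')}>"))
```

Percent-encoding the title and attribute names keeps the generated IRIs free of spaces and other characters that N-Triples does not allow inside angle brackets; a real pipeline would likely also map attribute names to a shared vocabulary before insertion.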

References (10)

  • 1 Berners-Lee T, Hendler J, Lassila O. The semantic web[J]. Scientific American, 2001, 284(5): 34-43.
  • 2 Klyne G, Carroll J J, McBride B. Resource Description Framework (RDF): concepts and abstract syntax[S]. [S.l.]: W3C Recommendation, 2004.
  • 3 Bizer C, Heath T, Berners-Lee T. Linked Data - the story so far[J]. International Journal on Semantic Web and Information Systems, 2009, 5(3): 1-22.
  • 4 Zhao J. Publishing Chinese medicine knowledge as Linked Data on the Web[J]. Chinese Medicine, 2010, 5(1): 1-12.
  • 5 Niu X, Sun X, Wang H, et al. Zhishi.me - weaving Chinese linking open data[C]//Proceedings of the 10th International Conference on the Semantic Web, Bonn, Germany, 2011: 205-220.
  • 6 Bizer C, Lehmann J, Kobilarov G, et al. DBpedia - a crystallization point for the Web of data[J]. Journal of Web Semantics, 2009, 7(3): 154-165.
  • 7 Wang Xin, Shi Hong, Jiang Longxiang, et al. Jingwei+: a distributed large-scale RDF data server[C]//Proceedings of the 14th Asia-Pacific Web Conference (APWeb 2012), 2012: 779-783.
  • 8 Khare R, Cutting D, Sitaker K, et al. Nutch: a flexible and scalable open-source web search engine[J]. Oregon State University, 2004.
  • 9 Chang F, Dean J, Ghemawat S, et al. Bigtable: a distributed storage system for structured data[J]. ACM Transactions on Computer Systems (TOCS), 2008, 26(2).
  • 10 Dean J, Ghemawat S. MapReduce: simplified data processing on large clusters[J]. Communications of the ACM, 2008, 51(1): 107-113.
