

Web data source selection for humanities information integration of tourism
Abstract Integrating humanities information is important for enriching the cultural connotation of a scenic spot. To improve the usefulness and efficiency of the integrated data, we propose a data source selection strategy for humanities information integration. First, a humanities information summary is built from celebrities, humanities themes, information length, and marker words. Second, a person-based expansion strategy enriches the content of the summary. Finally, data sources are selected according to the incremental humanities information they contribute about celebrities. Experiments on tourism-domain data sets show that the proposed method achieves high accuracy.
Author Deng Song (邓松)
Source Journal of Shandong University (Natural Science) (《山东大学学报(理学版)》), 2016, No. 3, pp. 70-76 (7 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list (北大核心).
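Funding National Natural Science Foundation of China (61462037, 61173146); Natural Science Foundation of Jiangxi Province (20142BAB217014); Jiangxi Provincial Higher Education Science and Technology Landing Program (industry-university-research cooperation) (KJLD12022)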
Keywords data source selection; summary; tourism; humanities information integration
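
The abstract names a selection step driven by incremental information about celebrities but does not give its scoring formula. As a rough illustration only, the Python sketch below assumes each source summary is a set of (celebrity, theme) pairs and that the gain of a source is the number of pairs it adds beyond those already covered. The function name `select_sources`, the pair representation, and the toy data are all hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical summary unit: one (celebrity, theme) pair found in a source.
Fact = Tuple[str, str]


def select_sources(summaries: Dict[str, Set[Fact]], k: int) -> List[str]:
    """Greedily pick up to k sources, each time taking the source
    that adds the most facts not yet covered (assumed gain measure)."""
    covered: Set[Fact] = set()
    chosen: List[str] = []
    remaining = dict(summaries)
    while remaining and len(chosen) < k:
        # Marginal gain = number of facts the source adds to the covered set.
        # Iterating over sorted names makes tie-breaking deterministic.
        best = max(sorted(remaining), key=lambda s: len(remaining[s] - covered))
        if not remaining[best] - covered:
            break  # no remaining source contributes anything new
        chosen.append(best)
        covered |= remaining.pop(best)
    return chosen


# Toy data with placeholder names, for illustration only.
summaries = {
    "site_a": {("celebrity_1", "poetry"), ("celebrity_2", "poetry")},
    "site_b": {("celebrity_1", "poetry")},
    "site_c": {("celebrity_3", "residence"), ("celebrity_2", "calligraphy")},
}
print(select_sources(summaries, 2))
```

Under this assumed gain measure, a source whose content is already covered (here `site_b`) is never selected, which is the intuition behind selecting by increment rather than by raw size.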








