Research on Deep Web Query Interface Clustering Based on Latent Semantic Analysis (cited by: 3)
Abstract Generating integrated query interfaces is an important part of Deep Web data integration, and clustering query interfaces from different domains effectively is one of the core problems to solve when generating them. The traditional vector space model relies solely on keyword matching when clustering Deep Web query interfaces. To address this shortcoming, latent semantic analysis (LSA) is introduced to uncover the semantic relationships between query interfaces, and an LSA-based algorithm for clustering Deep Web query interfaces is proposed. Experiments were carried out on data from the UIUC Web integration repository. The results show that LSA raises the similarity between query interfaces of the same domain and markedly improves the quality of Deep Web query interface clustering.
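The contrast the abstract draws between keyword matching and LSA can be illustrated with a minimal sketch. It is not the paper's algorithm, which the abstract does not specify: the six toy interfaces, the rank `k=2`, the 0.5 threshold, and the greedy grouping in `greedy_clusters` are all assumptions made for illustration. Only the general pipeline (term-by-interface matrix, truncated singular value decomposition, cosine similarity in the reduced space) follows the description above.

```python
import numpy as np

# Toy query interfaces, each reduced to its attribute labels (hypothetical data).
interfaces = [
    "title author isbn",                           # books
    "title author keyword publisher",              # books
    "keyword publisher price",                     # books
    "departure destination date",                  # airfares
    "departure destination date passenger class",  # airfares
    "passenger class airline",                     # airfares
]

def term_document_matrix(docs):
    """Build a term-by-interface count matrix (rows: terms, columns: interfaces)."""
    vocab = sorted({t for d in docs for t in d.split()})
    index = {t: i for i, t in enumerate(vocab)}
    A = np.zeros((len(vocab), len(docs)))
    for j, d in enumerate(docs):
        for t in d.split():
            A[index[t], j] += 1.0
    return A

def cosine_matrix(X):
    """Pairwise cosine similarity between the rows of X."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    unit = X / np.where(norms == 0, 1.0, norms)
    return unit @ unit.T

def lsa_coordinates(A, k):
    """Interface coordinates in the rank-k latent space obtained from the SVD of A."""
    _, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (s[:k, None] * Vt[:k]).T

def greedy_clusters(sim, threshold):
    """Assumed stand-in clustering: join the first cluster whose first member is similar enough."""
    clusters = []
    for i in range(sim.shape[0]):
        for c in clusters:
            if sim[i, c[0]] >= threshold:
                c.append(i)
                break
        else:
            clusters.append([i])
    return clusters

A = term_document_matrix(interfaces)
vsm_sim = cosine_matrix(A.T)                      # plain vector space model
lsa_sim = cosine_matrix(lsa_coordinates(A, k=2))  # latent semantic space
vsm_clusters = greedy_clusters(vsm_sim, 0.5)
lsa_clusters = greedy_clusters(lsa_sim, 0.5)
```

In this toy data the first and third book interfaces share no attribute label, so their vector space similarity is zero and keyword matching splits them apart. Both co-occur with the second interface, which is enough for the rank-2 latent space to place all three together, and the two domains separate cleanly.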
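Source Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2013, No. 11, pp. 228-230, 247 (4 pages)
Funding Supported by the National Natural Science Foundation of China (61163057), the Guangxi Natural Science Foundation (2012jjAAG0063), the Open Fund of the Guangxi Key Laboratory of Trusted Software (KX201117), and the Guangxi Graduate Research Innovation Project (YCSZ2012070).
Keywords Latent semantic analysis, Singular value decomposition, Deep Web, Query interface clustering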