Web Page Optimal Segmentation Algorithm Based on Visual Features (cited by: 3)
Abstract: Web page segmentation is key to adaptive Web page presentation. The classical vision-based segmentation algorithm VIPS (Vision-based Page Segmentation Algorithm) fragments pages too finely and is only semi-automatic. To address these problems, a novel vision-based algorithm, VWOS (Vision-based Web Optimal Segmentation), is proposed, built on the idea of optimal graph partitioning. Taking visual features and page structure into account, the Web page is modeled as a weighted undirected connected graph, which turns page segmentation into the optimal partition of that graph. VWOS is then designed by combining Kruskal's algorithm with the page segmentation process. Experiments show that, compared with VIPS, pages segmented by VWOS have better semantic integrity and that no manual intervention is needed.
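Source: Computer Science (《计算机科学》), CSCD, Peking University Core Journal, 2015, No. 11, pp. 284-287, 309 (5 pages)
Funding: Supported by the Special Research Project on Rapid Sharing of Scientific Papers in the Network Era, Science and Technology Development Center of the Ministry of Education: Research on Multi-granularity Scientific Paper Sharing Technology Based on Academic Social Networks (2013123); and by the Fundamental Research Funds for the Central Universities: Research on an Optimal Adaptation Decision Engine Model and Distributed Optimization Algorithms in Content Adaptation Systems (CCNU14A02012).
Keywords: Web page optimal segmentation; Web page visual features; Web page adaptive presentation; optimal division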
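
The abstract does not give the details of VWOS, so the following is only a minimal sketch of the general idea it names: treat page blocks as nodes of a weighted undirected graph and group them with a Kruskal-style pass over the edges. The block names, the edge weights (read here as a visual distance between blocks), the `max_distance` stopping threshold, and the function `kruskal_partition` are all illustrative assumptions, not the authors' method.

```python
from typing import Dict, List, Tuple

# Hypothetical edge type: (visual distance, block_a, block_b).
Edge = Tuple[float, str, str]


def kruskal_partition(
    nodes: List[str], edges: List[Edge], max_distance: float
) -> List[List[str]]:
    """Merge blocks along the lightest edges first, as Kruskal's algorithm
    does, but stop once edges exceed max_distance (an assumed threshold)."""
    parent: Dict[str, str] = {n: n for n in nodes}

    def find(x: str) -> str:
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for weight, a, b in sorted(edges):
        if weight > max_distance:
            break  # every remaining edge is at least this heavy
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # join the two segments

    groups: Dict[str, List[str]] = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    # Sort for a deterministic result.
    return sorted(sorted(g) for g in groups.values())


# Made-up page blocks and visual distances, for illustration only.
blocks = ["logo", "nav", "title", "body", "footer"]
visual_edges = [
    (0.2, "logo", "nav"),
    (0.9, "nav", "title"),
    (0.1, "title", "body"),
    (0.8, "body", "footer"),
]
segments = kruskal_partition(blocks, visual_edges, max_distance=0.5)
print(segments)
```

With these made-up weights, the visually close pairs end up in the same segment and the footer stays on its own. A real implementation would need to derive the weights from the page's visual features and structure, which the abstract mentions but does not specify.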