A web page clustering algorithm called PageCluster and the improved algorithm ImPageCluster solving overlapping are proposed. These methods not only take the web structure and page hyperlink into account, but also con...A web page clustering algorithm called PageCluster and the improved algorithm ImPageCluster solving overlapping are proposed. These methods not only take the web structure and page hyperlink into account, but also consider the importance of each page which is described as in-weight and out-weight. Compared with the traditional clustering methods, the experiments show that the runtimes of the proposed algorithms are less with the improved accuracies.展开更多
Recently, 3D display technology, and content creation tools have been undergone rigorous development and as a result they have been widely adopted by home and professional users. 3D digital repositories are increasing...Recently, 3D display technology, and content creation tools have been undergone rigorous development and as a result they have been widely adopted by home and professional users. 3D digital repositories are increasing and becoming available ubiquitously. However, searching and visualizing 3D content remains a great challenge. In this paper, we propose and present the development of a novel approach for creating hypervideos, which ease the 3D content search and retrieval. It is called the dynamic hyperlinker for 3D content search and retrieval process. It advances 3D multimedia navigability and searchability by creating dynamic links for selectable and clickable objects in the video scene whilst the user consumes the 3D video clip. The proposed system involves 3D video processing, such as detecting/tracking clickable objects, annotating objects, and metadata engineering including 3D content descriptive protocol. Such system attracts the attention from both home and professional users and more specifically broadcasters and digital content providers. The experiment is conducted on full parallax holoscopic 3D videos “also known as integral images”.展开更多
目前主流开源爬虫框架在分析页面与主题领域关联性上,常采用基于关键词的量化和向量空间模型算法相融合,但融合疏忽了界面语义与特定主题间的关联,导致爬取内容与主题产生偏差。为了给金融等领域的舆情分析提供准确的数据支撑,提出一种...目前主流开源爬虫框架在分析页面与主题领域关联性上,常采用基于关键词的量化和向量空间模型算法相融合,但融合疏忽了界面语义与特定主题间的关联,导致爬取内容与主题产生偏差。为了给金融等领域的舆情分析提供准确的数据支撑,提出一种面向领域扩展主题库的爬虫及系统,通过扩展主题特征库,融合向量空间模型(Vector Space Model,VSM)与超链接主题搜索算法(Hyperlink-Induced Topic Search,HITS),优化了主题页面相关度计算,并针对股票舆情信息爬取进行仿真。结果表明,上述扩展主题型爬虫在爬取准确率和效率等方面有较好地提升,能够有效地完成领域主题信息的爬取任务。展开更多
基金Sponsored bythe Huo Ying-Dong Education Foundation of China(91101)
文摘A web page clustering algorithm called PageCluster and the improved algorithm ImPageCluster solving overlapping are proposed. These methods not only take the web structure and page hyperlink into account, but also consider the importance of each page which is described as in-weight and out-weight. Compared with the traditional clustering methods, the experiments show that the runtimes of the proposed algorithms are less with the improved accuracies.
文摘Recently, 3D display technology, and content creation tools have been undergone rigorous development and as a result they have been widely adopted by home and professional users. 3D digital repositories are increasing and becoming available ubiquitously. However, searching and visualizing 3D content remains a great challenge. In this paper, we propose and present the development of a novel approach for creating hypervideos, which ease the 3D content search and retrieval. It is called the dynamic hyperlinker for 3D content search and retrieval process. It advances 3D multimedia navigability and searchability by creating dynamic links for selectable and clickable objects in the video scene whilst the user consumes the 3D video clip. The proposed system involves 3D video processing, such as detecting/tracking clickable objects, annotating objects, and metadata engineering including 3D content descriptive protocol. Such system attracts the attention from both home and professional users and more specifically broadcasters and digital content providers. The experiment is conducted on full parallax holoscopic 3D videos “also known as integral images”.
文摘目前主流开源爬虫框架在分析页面与主题领域关联性上,常采用基于关键词的量化和向量空间模型算法相融合,但融合疏忽了界面语义与特定主题间的关联,导致爬取内容与主题产生偏差。为了给金融等领域的舆情分析提供准确的数据支撑,提出一种面向领域扩展主题库的爬虫及系统,通过扩展主题特征库,融合向量空间模型(Vector Space Model,VSM)与超链接主题搜索算法(Hyperlink-Induced Topic Search,HITS),优化了主题页面相关度计算,并针对股票舆情信息爬取进行仿真。结果表明,上述扩展主题型爬虫在爬取准确率和效率等方面有较好地提升,能够有效地完成领域主题信息的爬取任务。