298 articles found
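Research on a Distributed Web Crawler: Architecture, Algorithms, and Strategies Cited by: 23
1
Authors: 叶允明, 于水, 马范援, 宋晖, 张岭. 《电子学报》 (EI, CAS, CSCD, PKU Core), 2002, No. 12A, pp. 2008-2011, 4 pages
This paper presents Igloo 1.2, a large-scale distributed Web crawler system. It adopts a distributed architecture, and a two-level hash mapping algorithm designed by the authors lets the system partition tasks efficiently while allowing its scale to be extended dynamically. The quality of crawled pages is an important measure of a crawler; Igloo uses PageRank as the page-quality criterion, which improves crawl quality. The key to faster crawling is removing the performance bottlenecks in the crawler system; the paper discusses this in detail and proposes a URL database access method based on a "lazy merge" strategy. Experiments show that Igloo quickly reaches high-quality pages while maintaining high performance.
Keywords: Web crawler; crawling strategy; distributed system; computer network; Web page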
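The Igloo abstract above names a two-level hash mapping algorithm but does not describe it. The following is only a minimal sketch of one common way to build such a scheme, under the assumption that the first level hashes a URL's host into a fixed number of logical buckets and the second level is a table from buckets to crawler nodes, so that adding a node reassigns buckets rather than rehashing URLs. `NUM_BUCKETS`, `bucket_of`, and `TwoLevelPartitioner` are hypothetical names, not taken from the paper.

```python
import hashlib
from urllib.parse import urlsplit

NUM_BUCKETS = 1024  # hypothetical fixed number of logical buckets


def bucket_of(url: str) -> int:
    """Level 1: hash the host name into a fixed logical bucket."""
    host = urlsplit(url).hostname or ""
    digest = hashlib.md5(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % NUM_BUCKETS


class TwoLevelPartitioner:
    """Level 2: a table mapping each logical bucket to a crawler node."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        # Spread the buckets round-robin over the initial nodes.
        self.table = [self.nodes[b % len(self.nodes)] for b in range(NUM_BUCKETS)]

    def node_for(self, url: str) -> str:
        return self.table[bucket_of(url)]

    def add_node(self, node: str) -> int:
        """Give every k-th bucket to the new node; return how many moved."""
        self.nodes.append(node)
        k = len(self.nodes)
        moved = 0
        for b in range(k - 1, NUM_BUCKETS, k):
            self.table[b] = node
            moved += 1
        return moved
```

Because all URLs of one host fall into the same bucket, a host is always crawled by a single node, and growing the cluster only touches the buckets handed to the new node. The every-k-th-bucket rule is a simplification and does not guarantee perfect balance for arbitrary cluster histories.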
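Design of a URL Allocation Algorithm for a Parallel Crawler System Cited by: 1
2
Authors: 万源, 万方, 王大震. 《计算机工程与应用》 (CSCD, PKU Core), 2006, No. A01, pp. 117-119, 3 pages
The paper studies a parallel crawler collection model under a distributed architecture, analyzes the function of each component and the basic rules the crawlers should follow during parallel search to keep the system load balanced, and proposes a hash-based URL scheduling algorithm.
Keywords: distributed crawler; hash algorithm; URL allocation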
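A Crawler Architecture for Crawling Dynamic Web Pages Cited by: 7
3
Authors: 严亚兰. 《图书情报知识》 (CSSCI, PKU Core), 2003, No. 4, pp. 51-53, 3 pages
This paper analyzes the crawler's capability for crawling dynamic Web pages, proposes a crawler architecture oriented toward dynamic pages, and discusses the corresponding modules.
Keywords: crawler architecture; crawling; dynamic Web page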
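Research on the Revisit Frequency of Incremental Crawlers Based on Neural Networks Cited by: 1
4
Authors: 周英飚, 王军. 《华中科技大学学报(自然科学版)》 (EI, CAS, CSCD, PKU Core), 2004, No. 12, pp. 32-33, 45, 3 pages
The crawler is an essential core component of a search engine, and the main problem an incremental crawler must solve is how often to revisit changing Web pages. The paper builds a page-change model with an artificial neural network and uses the model to determine the incremental crawler's revisit times. It also analyzes how the model applies in practice and proposes an application scheme with good adaptivity.
Keywords: search engine; crawler; incremental crawler; neural network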
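Design of the Drive System for a Crawler-Type King Grass Harvester
5
Authors: 赵彦瑞, 尤泳, 惠云婷, 王海翼, 张学宁, 金峤, 王德成. 《农机化研究》 (PKU Core), 2025, No. 2, pp. 101-104, 4 pages
King grass is a high-quality perennial forage crop widely planted in southern China. It grows upright in clumps, which makes mechanized stubble-cutting harvest difficult. Based on the technical requirements of mechanized king grass harvesting, the paper proposes a technical scheme for a drive system, describes its structure and working principle, and designs and builds the drive system. Theoretical analysis and calculation were used to match the main parameters of the drive system, which was then mounted on a prototype. Performance tests show that the prototype reaches a maximum travel speed of 9.13 km/h, supports continuously variable speed within the low-speed gear, and has a straight-line deviation rate of 4.115%, meeting the operating requirements of mechanized king grass harvesting.
Keywords: king grass harvester; drive system; crawler type; hydrostatic continuously variable transmission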
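Design and Implementation of a Topic-Oriented Crawler Cited by: 1
6
Authors: 苗长芬, 冯伟华. 《平原大学学报》, 2005, No. 3, pp. 110-112, 3 pages
General-purpose search engines currently return too many results that are only weakly related to the topic. To address this, the paper proposes a topic-oriented search engine and designs a topic crawler centered on topic relevance, laying a sound foundation for research on topic search engines.
Keywords: crawler; topic search engine; relevance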
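A Hyperlink Testing System Based on Crawler Technology
7
Authors: 吉向东. 《信息技术》, 2009, No. 9, pp. 106-108, 3 pages
The paper designs and implements a hyperlink testing system based on search-engine crawler technology. By restricting the crawler's range to a single website, the system automatically scans the site under test and effectively finds broken links and orphan pages. Tests show that, compared with other hyperlink testing products, the system offers a higher degree of test automation and gives testers a fairly rich set of controls.
Keywords: crawler; hyperlink; testing; broken link; orphan page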
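The abstract above does not give the scanning procedure. As a rough illustration of the idea only, the sketch below assumes the site is already available as an in-memory link map plus a set of pages known to exist on the server (both hypothetical inputs, as is the name `scan_site`). A breadth-first crawl from the start page reports links to missing pages as broken, and existing pages that were never reached as orphans.

```python
from collections import deque


def scan_site(start, links, existing):
    """Crawl from `start`; return (broken_links, orphan_pages).

    links:    dict mapping each page to the same-site pages it links to
    existing: set of pages that actually exist on the server
    """
    seen = {start}
    queue = deque([start])
    broken = []
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in existing:
                broken.append((page, target))  # link points to a missing page
            elif target not in seen:
                seen.add(target)
                queue.append(target)
    orphans = sorted(existing - seen)  # pages that exist but were never reached
    return broken, orphans
```

A real tester would fetch pages over HTTP and filter out off-site URLs before following them; the in-memory inputs keep the sketch self-contained.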
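Research and Design of a Distributed Crawler System
8
Authors: 万方, 王大震. 《软件导刊》, 2007, No. 5, pp. 45-46, 2 pages
A distributed crawler system combines a traditional centralized information-collection system with distributed parallel technology and is an important part of a search engine. The paper studies the main implementation techniques for parallel scheduling and URL processing in a distributed crawler system, designs such a system, and describes its task-partitioning mechanism and URL retrieval algorithm in detail.
Keywords: distributed crawler; parallel scheduling; URL retrieval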
Experimental Study on the Ride Comfort of a Crawler Power Chassis Scale Model Based on the Similitude Theory Cited by: 2
9
Authors: ZHAO Jianzhu, WANG Fengchen, YU Bin, TONG Pengcheng, CHEN Kuifu. 《Chinese Journal of Mechanical Engineering》 (SCIE, EI, CAS, CSCD), 2015, No. 3, pp. 496-503, 8 pages
The experimental assessment of ride comfort for crawler off-road vehicles is relatively overlooked, and it is expensive and difficult to carry out as ride comfort requirements become more demanding. To trade off between precision and cost, an experimental method based on the similitude theory is proposed. Under the guidance of the similitude theory, a 1:5 scale model of a crawler power chassis equipped with a variable-stiffness suspension system is used. The power spectral density (PSD), the root mean square (RMS) of weighted acceleration, peak factor, average absorbed power (AAP) and vibration dose value (VDV) are selected as ride comfort evaluation indexes, and test results are transformed via similarity indexes to predict the performance of the full-scale power chassis. The PSD shows that the low-order vertical (z-axis) natural frequency is 1.1 Hz, and the RMS, AAP and VDV values indicate that the ride comfort of this kind of power chassis lies between "A little uncomfortable" and "Rather uncomfortable". The low-order vertical natural frequency obtained from the PSD validates that the 1:5 scale model used in the experiment satisfies the similarity relationship with the full-scale model; consequently, preliminary ride comfort evaluation with the 1:5 scale model is feasible. Applying the similitude theory to crawler vehicle ride comfort testing decreases the cost and improves test feasibility while retaining sufficient test precision.
Keywords: crawler power chassis; scale model; similitude theory; off-road vehicle; ride comfort
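Three of the indexes named in the abstract above have standard discrete-time definitions: RMS is the square root of the mean squared acceleration, the peak (crest) factor is the peak magnitude divided by the RMS, and VDV is the fourth root of the time integral of the fourth power of acceleration. The sketch below computes them under the assumption that the samples have already been frequency-weighted and are evenly spaced; the function name and signature are illustrative and the paper's own processing chain is not given in the abstract.

```python
import math


def ride_comfort_indexes(a_w, dt):
    """Return (rms, peak_factor, vdv) for weighted acceleration samples.

    a_w: frequency-weighted acceleration samples in m/s^2 (assumed input)
    dt:  sampling period in seconds
    """
    n = len(a_w)
    rms = math.sqrt(sum(a * a for a in a_w) / n)
    peak_factor = max(abs(a) for a in a_w) / rms
    # Rectangle-rule approximation of the integral of a_w^4 over time.
    vdv = (sum(a ** 4 for a in a_w) * dt) ** 0.25  # units: m/s^1.75
    return rms, peak_factor, vdv
```

VDV weights peaks more heavily than RMS does because of the fourth power, which is why it is often reported alongside RMS for signals with occasional shocks.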
Hierarchical Stream Clustering Based NEWS Summarization System Cited by: 2
10
Authors: M. Arun Manicka Raja, S. Swamynathan. 《Computers, Materials & Continua》 (SCIE, EI), 2022, No. 1, pp. 1263-1280, 18 pages
News feeds are one of the potential information sources that give updates on various topics in different domains. These updates need to be collected, since users interested in a specific domain need important updates in their domains with organized data from various sources. In this paper, a news summarization system is proposed for news data streams from RSS feeds and Google News. Since news stream analysis requires live content, the news data are continuously collected for the experiments. The major contributions of this work are domain-corpus-based news collection, news content extraction, hierarchical clustering of the news, and news summarization. Many existing news summarization systems do not provide dynamic content with domain-wise representation. The proposed system alleviates this by tagging the news feed with domain corpora and organizing the news streams in a hierarchical structure with topic-wise representation. Further, the news streams are summarized for users with a novel summarization algorithm. The proposed system generates topic-wise summaries effectively, and no system in the literature has handled news summarization by collecting the data dynamically and organizing the content hierarchically. Compared with existing systems, the proposed system achieves better results in generating news summaries. Online news content editors benefit from this system by instantly getting news summaries in their domains of interest.
Keywords: news feed; content similarity; parallel crawler; collaborative filtering; hierarchical clustering; news summarization
Understanding pollution dynamics in large-scale peer-to-peer IPTV system Cited by: 2
11
Authors: 王海舟, 陈兴蜀, 王文贤, 郝正鸿. 《Journal of Central South University》 (SCIE, EI, CAS), 2012, No. 8, pp. 2203-2217, 15 pages
With the great commercial success of several IPTV (internet protocol television) applications, PPLive has received more and more attention from both industry and academia. At present, PPLive is one of the most popular IPTV applications and attracts a large number of users across the globe; however, its dramatic rise in popularity makes it more likely to become a target of attack. The main contribution of this work is twofold. Firstly, a dedicated distributed crawler system was proposed and its crawling performance was analyzed; the crawler was used to evaluate the impact of pollution attacks in a P2P live streaming system. The measurement results reveal that the crawler system with a distributed architecture captures PPLive overlay snapshots more efficiently than previous crawlers. To the best of our knowledge, this study is the first to employ a distributed architecture to design a crawler system and to discuss the crawling performance of capturing accurate overlay snapshots for a P2P live streaming system. Secondly, a feasible and effective pollution architecture was proposed to deploy a content pollution attack in a real-world P2P live streaming system, PPLive, and the impact of the attack was evaluated in depth from five aspects: dynamic evolution of participating users, user lifetime characteristics, user connectivity performance, dynamic evolution of uploaded polluted chunks, and dynamic evolution of the pollution ratio. Specifically, the experimental results show that a single polluter is capable of compromising the whole system and that its destructiveness is severe.
Keywords: peer-to-peer technology; internet protocol television; active measurement; distributed crawler; pollution attack; PPLive
Stress Path Analysis of Deep-Sea Sediments Under the Compression-Shear Coupling Load of Crawler Collectors Cited by: 1
12
Authors: ZHANG Ning, MA Ning, YIN Shiyang, CHEN Xuguang, SONG Yuheng. 《Journal of Ocean University of China》 (SCIE, CAS, CSCD), 2023, No. 1, pp. 65-74, 10 pages
The mechanical properties of deep-sea sediments during the driving process of crawler collectors are essential factors in the design of mining systems. In this study, a crawler load is divided into a normal compression load and a horizontal shear load. Then, the internal stress state of the sedimentary soil is examined through theoretical calculation and finite element numerical simulation. Finally, the driving of crawlers is simulated by changing the relative spatial position between the load and the stress unit, which yields the stress path of the soil unit. Based on the calculation results, the effect of the horizontal shear load on the soil stress response is analyzed at different depths, and the spatial variation law of the soil stress path is examined. The results demonstrate that the horizontal shear load has a significant effect on the rotation of the principal stress, and the reverse rotation of the principal stress axis becomes more obvious as the burial depth increases. The stress path curve of the soil differs with depth. The spatial variation of the stress path in shallow soil is complex, whereas the stress path curve of deep soil tends to shrink as the depth increases. The stress path for the corresponding depth should be selected according to the actual research purpose and applied in laboratory tests.
Keywords: deep-sea sediment; crawler collector; compression-shear coupling load; stress path; principal stress axis direction
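Design and Development of a Focused Web Crawler Tool
13
Authors: 唐详. 《情报杂志》 (CSSCI, PKU Core), 2005, No. 4, pp. 58-60, 3 pages
The paper reports a practical effort in domain-specific topic search: a focused Web crawler. It analyzes the general working principles of search engines and clustering algorithms and points out their shortcomings. On that basis it combines the strengths of both into a focused Web crawler tool and describes the tool's main techniques and implementation.
Keywords: topic mining; search engine; Web crawler; automatic classification; clustering algorithm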
Prototype Line Crawler for Power Line Inspection Cited by: 1
14
Authors: Rupert Gouws, Nicolaas du Plessis. 《Journal of Energy and Power Engineering》, 2013, No. 11, pp. 2174-2180, 7 pages
In South Africa, electricity is supplied through thousands of kilometers of overhead power cables, which are owned by Eskom, the national energy supplier. Currently these overhead power cables are monitored by helicopter inspection flights and foot patrols, which are infrequent and expensive. In this paper, the authors present the design of a prototype power line crawler (inspection robot) for monitoring these overhead power lines in South Africa. The prototype is capable of driving on the wire, balancing on the wire, and maneuvering past certain obstacles found on the overhead power cables. It is designed to host a monitoring system that monitors the power line as the inspection robot drives along it. Various experimental tests were performed and are presented in this paper, showing that the prototype can perform these tasks. This prototype inspection robot provides a platform for future development in this area.
Keywords: inspection robot; line crawler; power lines; maneuverability; center of gravity; balancing
Body Structure Design, Simulation and Parameter Optimization of a Crawler Type Mobile Flat Bedplate Frame Composite Car Cited by: 1
15
Authors: LUO Qing-sheng, HOU Yuan, ZHANG Ze-zheng, ZHAO Hai-bo. 《Computer Aided Drafting,Design and Manufacturing》, 2013, No. 4, pp. 53-57, 5 pages
The report mainly studies a crawler frame motion platform, with the aim of reducing weight and increasing strength. It describes the structural design process, which uses CAD/CAE technology for solid modeling, simulation analysis, and parameter optimization, and it explains the outstanding advantages of CAD/CAE technology in mechanical design and simulation analysis.
Keywords: crawler robot; frame; solid modeling; simulation analysis; parameter optimization; CAD/CAE technology
A Survey about Algorithms Utilized by Focused Web Crawler
16
Authors: Yong-Bin Yu, Shi-Lei Huang, Nyima Tashi, Huan Zhang, Fei Lei, Lin-Yang Wu. 《Journal of Electronic Science and Technology》 (CAS, CSCD), 2018, No. 2, pp. 129-138, 10 pages
Focused crawlers (also known as subject-oriented crawlers), as the core part of a vertical search engine, collect as many topic-specific web pages as they can to form a subject-oriented corpus for later data analysis or user querying. This paper shows that the popular algorithms used in focused web crawling are essentially webpage analyzing algorithms and crawling strategies (which prioritize the uniform resource locators (URLs) in the queue). The advantages and disadvantages of three crawling strategies are shown in the first experiment, which indicates that best-first search with an appropriate heuristic is a smart choice for topic-oriented crawling, while depth-first search is of little help in focused crawling. A second experiment compares improved versions (with a webpage analyzing algorithm added) to verify that crawling strategies alone are not very efficient for focused crawling and that in most cases the two kinds of algorithm are used together. In light of the experimental results and recent research, some points on the research trends of focused crawler algorithms are suggested.
Keywords: crawling strategies; focused crawler; harvest rate; uniform resource locator (URL) prioritizing; webpage analyzing
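The survey abstract above singles out best-first search as the crawling strategy that suits focused crawling. A minimal sketch of the URL queue behind that strategy follows, assuming each URL arrives with a relevance score from some heuristic (for example, topic similarity of the linking page). The class name `BestFirstFrontier` and the scoring input are hypothetical, and the survey's own implementations are not described in the abstract.

```python
import heapq
import itertools


class BestFirstFrontier:
    """URL queue that always yields the highest-scoring pending URL."""

    def __init__(self):
        self._heap = []
        self._seen = set()
        self._counter = itertools.count()  # tie-breaker: first in, first out

    def push(self, url, score):
        if url in self._seen:
            return  # each URL is queued at most once
        self._seen.add(url)
        # heapq is a min-heap, so store the negated score.
        heapq.heappush(self._heap, (-score, next(self._counter), url))

    def pop(self):
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

Swapping the priority changes the strategy: a constant score degenerates to breadth-first order, which is one way to compare strategies on the same crawl loop.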
Analysis on steering characteristics of crawler pipeline robot
17
Authors: 耿林康, RAO Jinjun, LEI Jingtao. 《High Technology Letters》 (EI, CAS), 2023, No. 1, pp. 60-67, 8 pages
To improve the elbow-passing performance and diameter adaptability of pipeline robots, a supported crawler pipeline robot is designed. It adopts a screw-nut mechanism and a hinged four-bar mechanism to adapt to complex environments such as variable-diameter pipelines and elbows. The steering characteristics when passing through an elbow are studied, a kinematic model of the robot's steering in bends is established, the geometric constraint (GC) and steering constraint (SC) in the elbow are analyzed, and a steering experiment is conducted. The results show that the robot can pass through the elbow using the SC model, and that the SC model reduces the motor current and energy consumption when the robot passes through the elbow.
Keywords: crawler pipeline robot; steering characteristics; geometric constraint (GC); steering constraint (SC)
Crawler for Nodes in the Internet of Things
18
Authors: Xuemeng Li, Yongyi Wang, Fan Shi, Wenchao Jia. 《ZTE Communications》, 2015, No. 3, pp. 46-50, 5 pages
Determining the application and version of nodes in the Internet of Things (IoT) is very important for warning about and managing vulnerabilities in the IoT. This article defines the attributes for determining the application and version of nodes in the IoT. By improving the structure of the Internet web crawler, which obtains raw data from nodes, we can obtain data from nodes in the IoT. We improve on the existing strategy, in which only determinations are stored, by also storing downloaded raw data locally in MongoDB. This stored raw data can be conveniently used to determine application type and node version when a new determination method emerges or when there is a new application type or node version. In such instances, the crawler does not have to scan the Internet again. We show through experimentation that our crawler can crawl the IoT and obtain the data necessary for determining the application type and node version.
Keywords: crawler; local storage; nodes; Internet of Things
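A Distributed Crawler System Based on Crawler4j and Quartz
19
Authors: 刘晓东, 林凤德, 朱文欢. 《科技创新与应用》, 2020, No. 13, pp. 15-16, 2 pages
Web crawlers are an important foundation for data analysis, since they are how the data is obtained. To address the wide range of crawl sources, the large and heterogeneous volume of data, and the low efficiency of a single node, the article introduces the lightweight multithreaded crawler framework Crawler4j and the distributed scheduled-task framework Quartz, and builds a stable, efficient distributed crawler system on these two frameworks.
Keywords: Web crawler; distributed; crawler4j; Quartz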
Research on Key Technology of Web Vulnerability Detection System Based on Cloud Environment
20
Authors: Zhang Zhen. 《International Journal of Technology Management》, 2013, No. 12, pp. 121-124, 4 pages
In light of the defects of existing web vulnerability detection systems, and drawing on the high efficiency and sharing characteristics of the cloud environment, a design based on the cloud environment is presented. The paper analyzes the key technologies of URL acquisition, task allocation and scheduling, and attack detection design. Experiments show the feasibility and effectiveness of the design.
Keywords: cloud technology; web crawler; task allocation and scheduling; detection of SQL injection; XSS detection