Web crawlers are an important part of modern search engines. With the development of the times, data has exploded and humans have entered a "big data era". For example, Wikipedia carries knowledge from all over the world, records the real-time news that occurs every day, and provides users with a good database, but the sheer amount of data puts a lot of pressure on users who search it. At present, single-threaded crawling can no longer meet the requirements of text crawling. To improve on the performance and program versatility of single-threaded crawlers, a high-speed multi-threaded web crawler is designed to crawl a hyper-scale online text database. Multi-threaded crawling uses multiple threads to process web pages in parallel, combining breadth-first and depth-first algorithms to control the crawl. The practice project uses Python to implement a multi-thread-optimized method for crawling books from Wikipedia, a hyper-large-scale online text database; the project was inspired by an article about Wikipedia on the Big Data Digest public account.
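The idea of processing pages in parallel under breadth-first control can be sketched as follows. This is a minimal illustration, not the paper's crawler: the in-memory `LINKS` graph, `fetch_links`, and `crawl` are hypothetical names standing in for real page downloads, and only the breadth-first half of the combined strategy is shown.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical link graph standing in for real pages (no network access).
LINKS = {
    "Main": ["Physics", "History"],
    "Physics": ["Energy", "Main"],
    "History": ["Energy"],
    "Energy": [],
}


def fetch_links(page):
    """Stand-in for downloading a page and extracting its outgoing links."""
    return LINKS.get(page, [])


def crawl(start, max_depth=2, workers=4):
    """Breadth-first crawl in which each depth level is fetched in parallel."""
    visited = {start}
    frontier = [start]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(max_depth):
            next_frontier = []
            # pool.map returns results in input order; the visited set is
            # only updated here, in the calling thread, so no lock is needed.
            for links in pool.map(fetch_links, frontier):
                for link in links:
                    if link not in visited:
                        visited.add(link)
                        next_frontier.append(link)
            if not next_frontier:
                break
            frontier = next_frontier
    return visited
```

Calling `crawl("Main")` visits all four hypothetical pages, while `crawl("Main", max_depth=1)` stops after the pages linked directly from the start page. Keeping the shared `visited` set in the calling thread is one simple way to avoid locking; a worker-driven queue design would need explicit synchronization.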
This paper introduces the general structure of search algorithms through the knight problem. Based on the characteristics of the problem, we discuss the DFS (Depth-First Search) and BFS (Breadth-First Search) algorithms in detail and combine the two to solve the knight coverage problem. The article is a useful reference for mixed scenarios that require a variety of search algorithms.
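The abstract does not state the exact formulation of the coverage problem, so the sketch below shows only the BFS building block on a knight's move graph: the minimum number of moves between two squares. The function name `knight_distance` and the zero-based `(row, column)` coordinates are assumptions made for illustration.

```python
from collections import deque

# The eight possible knight moves as (row, column) offsets.
KNIGHT_MOVES = [(1, 2), (2, 1), (2, -1), (1, -2),
                (-1, -2), (-2, -1), (-2, 1), (-1, 2)]


def knight_distance(start, goal, size=8):
    """Minimum number of knight moves from start to goal on a size x size board."""
    queue = deque([(start, 0)])
    seen = {start}
    while queue:
        (row, col), dist = queue.popleft()
        if (row, col) == goal:
            return dist
        for d_row, d_col in KNIGHT_MOVES:
            nxt = (row + d_row, col + d_col)
            on_board = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if on_board and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # goal cannot be reached, e.g. the centre of a 3 x 3 board
```

BFS explores squares in order of distance, so the first time the goal is dequeued its distance is minimal; a DFS over the same move list would instead be the natural choice for enumerating complete placements or tours.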
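Power system simulation and verification often aim to analyze a network's power flow distribution and dynamic characteristics intuitively through a topology diagram. However, electromechanical transient simulation software for power systems, such as BPA, PSS/E, and PSASP, cannot automatically lay out network components sensibly according to the system's electrical connections; the position of each component has to be adjusted by hand to produce a readable electrical wiring diagram. This manual adjustment not only adds to the simulation workload but may also introduce additional human error. To address this, the paper proposes a method, based on the graph-theoretic depth-first search (DFS) algorithm, for generating a spanning tree of the power system from its electrical topology. A visual interface for the IEEE 9-bus test system generated with the proposed method verifies the effectiveness and accuracy of the algorithm.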
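How a depth-first traversal yields a spanning tree from a connectivity description can be sketched as below. This is a generic illustration, not the paper's implementation: the four-bus `NETWORK` and the function name `dfs_spanning_tree` are hypothetical, and no layout step is included.

```python
# Hypothetical four-bus network given as an adjacency list (not the IEEE system).
NETWORK = {
    "bus1": ["bus2", "bus3"],
    "bus2": ["bus1", "bus3", "bus4"],
    "bus3": ["bus1", "bus2"],
    "bus4": ["bus2"],
}


def dfs_spanning_tree(adjacency, root):
    """Return the tree edges discovered by an iterative depth-first search."""
    visited = {root}
    tree_edges = []
    # Each stack entry holds a node and an iterator over its neighbours,
    # so the traversal resumes where it left off after backtracking.
    stack = [(root, iter(adjacency[root]))]
    while stack:
        node, neighbours = stack[-1]
        for nxt in neighbours:
            if nxt not in visited:
                visited.add(nxt)
                tree_edges.append((node, nxt))
                stack.append((nxt, iter(adjacency[nxt])))
                break
        else:
            stack.pop()  # every neighbour explored: backtrack
    return tree_edges
```

For a connected network with n buses the result always has n - 1 edges; branches left out of the tree (here the one between bus1 and bus3) are the ones that close loops and would need to be drawn separately in a diagram.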
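Fast simulation and analysis of power flow transfer ratios under various anticipated contingencies is an important guarantee of the secure and stable operation of the power grid. To address the difficulty of power flow transfer analysis under actual operating modes, a multi-core parallel batch-processing method for power flow transfer ratios in large-scale power systems is proposed. Built on a widely used commercial large-system analysis tool, and on the basis of parameter parsing and classification, automatic fault setting, and result parsing, the method introduces the depth-first search (DFS) algorithm to detect isolated nodes and islanded regions so as to ensure network integrity, and combines this with automatic checking of the plausibility of power flow results to achieve batch analysis of power flow transfer ratios. In a multi-core environment, a parallel framework based on Fork/Join is built, and a divide-and-conquer pattern is used to recursively decompose the computing tasks, thereby parallelizing the analysis across cores. Case simulations and practical application in the Yunnan power grid verify the effectiveness and speed of the proposed method.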
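The island-detection step amounts to finding connected components of the network graph after a fault removes a branch. The sketch below illustrates that idea only; `find_islands`, the bus labels, and the branch list are hypothetical, and the Fork/Join parallel framework and the power flow calculation are not modeled.

```python
def find_islands(buses, branches):
    """Group buses into electrically connected islands using iterative DFS."""
    adjacency = {bus: [] for bus in buses}
    for a, b in branches:
        adjacency[a].append(b)
        adjacency[b].append(a)

    seen = set()
    islands = []
    for bus in buses:
        if bus in seen:
            continue
        # Start a new DFS from every bus not yet assigned to an island.
        seen.add(bus)
        stack = [bus]
        island = []
        while stack:
            node = stack.pop()
            island.append(node)
            for nxt in adjacency[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        islands.append(sorted(island))
    return islands
```

With buses A to F and in-service branches A-B, B-C, and D-E (a hypothetical C-D branch having been tripped), the function returns three groups: `["A", "B", "C"]`, `["D", "E"]`, and the isolated node `["F"]`. More than one group signals that the network is no longer intact for that contingency.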
Funding: This research is funded by the Open Foundation for the University Innovation Platform in the Hunan Province, grant number 16K013; the Hunan Provincial Natural Science Foundation of China, grant number 2017JJ2016; the 2016 Science Research Project of Hunan Provincial Department of Education, grant number 16C0269; "Accurate crawler design and implementation with a data cleaning function", National Students innovation and entrepreneurship of training program, grant number 201811532010; and Open project, grant numbers 20181901CRP03, 20181901CRP04, 20181901CRP05. This research work is implemented at the 2011 Collaborative Innovation Center for Development and Utilization of Finance and Economics Big Data Property, Universities of Hunan Province.