Funding: Supported in part by the National Natural Science Foundation of China (NSFC) under Grant No. 61976242; in part by the Natural Science Fund of Hebei Province for Distinguished Young Scholars under Grant No. F2021202010; in part by the Fundamental Scientific Research Funds for Interdisciplinary Team of Hebei University of Technology under Grant No. JBKYTD2002; by the Science and Technology Project of Hebei Education Department under Grant No. JZX2023007; and by the 2022 Interdisciplinary Postgraduate Training Program of Hebei University of Technology under Grant No. HEBUT-YXKJC-2022122.
Abstract: Most neural network architectures are based on human experience, which requires a long and tedious trial-and-error process. Neural architecture search (NAS) attempts to find effective architectures without human intervention. Evolutionary algorithms (EAs) for NAS can find better solutions than human-designed architectures by exploring a large search space of possible architectures. Using multiobjective EAs for NAS, optimal neural architectures that meet various performance criteria can be explored and discovered efficiently. Furthermore, hardware-accelerated NAS methods can improve the efficiency of NAS. While existing reviews have mainly focused on different strategies for carrying out NAS, few studies have explored the use of EAs for NAS. In this paper, we summarize and explore the use of EAs for NAS, as well as large-scale multiobjective optimization strategies and hardware-accelerated NAS methods. NAS performs well in healthcare applications such as medical image analysis, classification for disease diagnosis, and health monitoring. EAs for NAS can automate the search process and optimize multiple objectives simultaneously in a given healthcare task. Deep neural networks have been used successfully in healthcare, but they lack interpretability. Medical data is highly sensitive, and privacy leaks are frequently reported in the healthcare industry. To solve these problems in healthcare, we propose an interpretable neuroevolution framework based on federated learning to address search efficiency and privacy protection. Moreover, we point out future research directions for evolutionary NAS. Overall, for researchers who want to use EAs to optimize NNs in healthcare, we analyze the advantages and disadvantages of doing so to provide detailed guidance, and propose an interpretable privacy-preserving framework for healthcare applications.
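The multiobjective selection step mentioned in this abstract can be illustrated with a minimal sketch of Pareto dominance. This is not the paper's framework: the candidate names, the two objectives (validation error and parameter count, both minimized), and the numbers are hypothetical values chosen only for illustration.

```python
from typing import List, Tuple

# Each candidate is (name, validation error, parameters in millions).
# Both objectives are minimized; all values are hypothetical.
Candidate = Tuple[str, float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def pareto_front(population: List[Candidate]) -> List[Candidate]:
    """Keep the candidates that no other candidate dominates."""
    return [c for c in population if not any(dominates(o, c) for o in population)]


population = [
    ("arch_a", 0.08, 12.0),
    ("arch_b", 0.10, 4.0),
    ("arch_c", 0.12, 6.0),   # worse than arch_b on both objectives
    ("arch_d", 0.07, 30.0),
]
front = pareto_front(population)
print([c[0] for c in front])
```

A multiobjective EA would typically apply a filter of this kind each generation, so that trade-offs such as accuracy versus model size are preserved rather than collapsed into a single score.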
Funding: Supported by the National Natural Science Foundation of China under Grant 52077146.
Abstract: With the construction of the power Internet of Things (IoT), communication between smart devices in urban distribution networks has been gradually moving towards high speed, high compatibility, and low latency, which provides reliable support for reconfiguration optimization in urban distribution networks. This study therefore proposes a deep reinforcement learning based multi-level dynamic reconfiguration method for urban distribution networks in a cloud-edge collaboration architecture, to obtain a real-time optimal multi-level dynamic reconfiguration solution. First, the multi-level dynamic reconfiguration method, which covers the feeder, transformer, and substation levels, is discussed. Subsequently, the multi-agent system is combined with the cloud-edge collaboration architecture to build a deep reinforcement learning model for multi-level dynamic reconfiguration in an urban distribution network. The cloud-edge collaboration architecture can effectively support the multi-agent system in the "centralized training and decentralized execution" operation mode and improve the learning efficiency of the model. Thereafter, for the multi-agent system, this study adopts a combination of offline and online learning so that the model can automatically optimize and update its strategy. In the offline learning phase, a Q-learning-based multi-agent conservative Q-learning (MACQL) algorithm is proposed to stabilize the learning results and reduce the risk of the subsequent online learning phase. In the online learning phase, a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on policy gradients is proposed to explore the action space and update the experience pool. Finally, the effectiveness of the proposed method is verified through a simulation analysis of a real-world 445-node system.
Funding: This project was supported by the National Natural Science Foundation of China (60135020).
Abstract: The flexibility of traditional image processing systems is limited because those systems are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is presented. It has many promising characteristics, such as powerful computing capability, broad I/O bandwidth, topology flexibility, and expansibility. The performance of the parallel system is evaluated by practical experiment.
Funding: National Natural Science Foundation of China (Nos. 60171043, 60371046).
Abstract: In this paper, we propose two modifications to speaker recognition. First, the correlations between frequency channels are of prime importance for speaker recognition, and some of these correlations are lost when the frequency domain is divided into sub-bands. Consequently, we propose a partially redundant parallel architecture in which most of the correlations are kept. Second, in the classical spectrum calculation, the log transformation used to modify the power spectrum is generally applied after the filter bank. We show that performing this transformation before the filter bank is more advantageous in our case. For recognition, the Gaussian mixture model (GMM) algorithm is adopted. Experiments on speech corrupted by noise show that this approach adapts better to noisy environments than a conventional device, especially when some of the recognizers are pruned.
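The difference between applying the log transformation after or before the filter bank can be seen in a small numerical sketch. The four-bin power spectrum and the two averaging filters below are hypothetical and are not the paper's front end; they only show that the two orderings give different features.

```python
import math


def filter_bank(spectrum, weights):
    """Apply each filter (one weight per frequency bin) as a weighted sum."""
    return [sum(w * s for w, s in zip(row, spectrum)) for row in weights]


# Hypothetical power spectrum over four frequency bins.
power = [1.0, 10.0, 100.0, 1000.0]
# Hypothetical two-band filter bank; each filter averages two adjacent bins.
weights = [
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.0, 0.5, 0.5],
]

# Classical order: filter bank first, then log of each band energy.
log_after = [math.log10(e) for e in filter_bank(power, weights)]
# Modified order: log of each bin first, then the filter bank.
log_before = filter_bank([math.log10(p) for p in power], weights)

print("log after filter bank: ", [round(v, 3) for v in log_after])
print("log before filter bank:", [round(v, 3) for v in log_before])
```

Because the logarithm is concave, averaging log values never exceeds the log of the average when the filter weights sum to one, so the log-first features are less dominated by the strongest bin in each band.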
Abstract: We propose a content-based parallel image retrieval system that achieves high responsiveness. The system is developed on a cluster architecture and has several retrieval servers that provide the content-based image retrieval service. It adopts the Browser/Server (B/S) mode, so users can access the system through web pages. It uses symmetrical color-spatial features (SCSF) to represent the content of an image. The SCSF is effective and efficient for image matching because it is independent of image distortions such as rotation and flipping, and it increases matching accuracy. The SCSF is organized in an M-tree, which speeds up the search procedure. Our experiments show that image matching is fast and efficient with the SCSF, and that, with the support of several retrieval servers, the system can respond to many users at the same time.
Key words: content-based image retrieval - cluster architecture - color-spatial feature - B/S mode - task parallel - WWW - Internet
CLC number: TP391
Foundation item: Supported by the National Natural Science Foundation of China (60173058)
Biography: ZHOU Bing (1975-), male, Ph.D. candidate, research direction: data mining, content-based image retrieval.
Funding: Natural Science Foundation of Shandong Province (No. ZR2019MD034); Education Reform Project of Shandong Province (No. M2020266).
Abstract: Computing resources are one of the key factors restricting the extraction of marine targets with deep learning. To increase computing speed and shorten computing time, a parallel distributed architecture is adopted to extract marine targets. The advantages of two distributed architectures, Parameter Server and Ring-allreduce, are combined to design a parallel distributed architecture suitable for deep learning: the Optimal Interleaved Distributed Architecture (OIDA). Three marine target extraction methods, OTD_StErf, OTD_Loglogistic, and OTD_Sgmloglog, are used to test OIDA, and a total of 18 experiments in 3 categories are carried out. The results show that the OIDA architecture can meet the timeliness requirements of marine target extraction. The average speed of parallel target extraction on a single machine with an 8-core CPU is 5.75 times that of a single machine with a single-core CPU, and the average speed on 5 machines with 40 CPU cores in total is 20.75 times faster.
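The two reported figures can be put on a common footing by computing parallel efficiency, that is, speedup divided by the number of cores. The short sketch below does that arithmetic; it assumes the reported values are speedup factors relative to the single-core run, which is an interpretation of the abstract rather than something it states explicitly.

```python
def parallel_efficiency(speedup: float, cores: int) -> float:
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / cores


# Speedups taken from the abstract; read here as factors over the single-core run.
configs = {
    "1 machine, 8 cores": (5.75, 8),
    "5 machines, 40 cores": (20.75, 40),
}

efficiencies = {
    label: parallel_efficiency(speedup, cores)
    for label, (speedup, cores) in configs.items()
}
for label, eff in efficiencies.items():
    print(f"{label}: efficiency = {eff:.3f}")
```

Under that reading, efficiency is about 0.72 on one machine and about 0.52 across five machines, which is consistent with the extra communication cost that a distributed setup normally incurs.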
Abstract: The architecture singularity of a parallel mechanism with five degrees of freedom (DOF) is analyzed. The mechanism consists of a movable platform connected to the base by five active limbs. Four of them are identical 6-DOF limbs, and the last one has the same DOF as the specified DOF of the movable platform. Based on the kinematic analysis, two categories of architecture singularity are proposed for this mechanism, and a sufficient condition for each singularity is then derived. The results show that the mechanism is singular when it employs either category of the proposed architecture, provided that it satisfies the corresponding sufficient condition. It can be concluded that the two proposed categories of architecture singularity should be avoided in the subsequent dimensional synthesis of such a mechanism.
Abstract: Ray tracing is a computer graphics method that renders images realistically. As the name suggests, this technique traces the paths of light rays interacting with objects in a scene [1], permitting the calculation of lighting and reflection effects [2]. Because ray tracing is time-consuming, it needs to be parallelized. One downside of parallelization is the possibility of race conditions. In this work, we explore and experiment with different well-known solutions to these race conditions. The introduction and background sections give a brief overview of the topic, followed by a detailed account of how race conditions may occur in the ray tracing algorithm. In the methods and results sections, we use OpenMP to parallelize the ray tracing algorithm with the critical and atomic directives and the firstprivate clause. We conclude that critical and atomic are not efficient solutions for producing a good-quality picture, whereas firstprivate succeeded in producing a high-quality picture.
Abstract: An optimal algorithmic approach to task scheduling for the triplet-based architecture (TriBA) is proposed in this paper. TriBA is considered to be a high-performance, distributed parallel computing architecture. It consists of a 2D grid of small, programmable processing units, each physically connected to its three neighbors. In a parallel or distributed environment, an efficient assignment of tasks to the processing elements is imperative to achieve fast job turnaround time. Moreover, the sojourn time experienced by each individual job should be minimized. The arriving jobs are parallel applications, each consisting of multiple independent tasks that must be assigned to processor queues as soon as they arrive. The processors service these tasks independently and concurrently. The key scheduling issue is that, when some queue backlogs are small, an incoming job should first spread its tasks across those lightly loaded queues in order to take advantage of the parallel processing gain. Our algorithmic approach achieves optimality in task scheduling by assigning consecutive tasks to a triplet of processors, exploiting locality among tasks. The experimental results show that allocating tasks to triplets of processing elements is efficient and optimal. A comparison with a well-accepted interconnection strategy, the 2D mesh, demonstrates the effectiveness of our algorithmic approach for TriBA. Finally, we conclude that TriBA can be an efficient interconnection strategy for computation-intensive applications if task assignment is carried out optimally using the algorithmic approach.
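The idea of spreading an incoming job's tasks across lightly loaded queues can be sketched with a generic greedy heuristic. This is not the TriBA triplet scheduler described in the abstract: the function name `assign_tasks`, the backlog values, and the unit task costs are all hypothetical, and the sketch ignores topology and locality.

```python
import heapq
from typing import List, Tuple


def assign_tasks(backlogs: List[int], task_costs: List[int]) -> Tuple[List[int], List[int]]:
    """Greedy heuristic: give each task to the queue with the smallest current backlog.

    Returns the chosen queue index per task and the final backlog per queue.
    """
    heap = [(load, queue) for queue, load in enumerate(backlogs)]
    heapq.heapify(heap)
    placement = []
    for cost in task_costs:
        load, queue = heapq.heappop(heap)       # least-loaded queue right now
        placement.append(queue)
        heapq.heappush(heap, (load + cost, queue))
    final = [0] * len(backlogs)
    for load, queue in heap:
        final[queue] = load
    return placement, final


# Three processor queues with hypothetical backlogs, and one job of four unit-cost tasks.
placement, final = assign_tasks([5, 0, 2], [1, 1, 1, 1])
print("task -> queue:", placement)
print("final backlogs:", final)
```

In this example the heavily loaded queue 0 receives no new work, while the job's tasks fill queues 1 and 2 until their backlogs even out.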
Funding: Supported by the National Natural Science Foundation of China (No. 11172134) and the Funding of Jiangsu Innovation Program for Graduate Education (No. CXLX13_132).
Abstract: A personal desktop platform with teraflops peak performance and thousands of cores can be realized at the price of a conventional workstation by using programmable graphics processing units (GPUs). A GPU-based parallel Euler/Navier-Stokes solver is developed for 2-D compressible flows using NVIDIA's Compute Unified Device Architecture (CUDA) programming model in the CUDA Fortran programming language. The implementation techniques for CUDA kernels, the double-layered thread hierarchy, and the varied memory hierarchy are presented to form the GPU-based algorithm for the Euler/Navier-Stokes equations. The resulting parallel solver is validated on a set of typical test flow cases. The numerical results show that a speedup of dozens of times relative to a serial CPU implementation can be achieved on a single-GPU desktop platform, which demonstrates that a GPU desktop can serve as a cost-effective parallel computing platform that substantially accelerates computational fluid dynamics (CFD) simulations.
Abstract: This paper provides an overview of the main recommendations and approaches of a methodology for developing parallel computation applications for hybrid structures. The methodology was developed within the master's thesis project "Optimization of complex tasks' computation on hybrid distributed computational structures" carried out by Orekhov, whose main research objective was to determine patterns in the behavior of scaling efficiency and other parameters that define the performance of different algorithm implementations executed on hybrid distributed computational structures. The major outcomes and dependencies obtained in the thesis project were formed into a methodology that covers the problems of applications based on parallel computation and describes the development process in detail, offering easy ways of avoiding potentially crucial problems. The paper is backed by real-life examples, such as clustering algorithms, rather than artificial benchmarks.
Abstract: Parallel sorting techniques on different architectures have been studied for many years. Given the need for faster systems, parallelism can be used to accelerate applications, and parallel operations are now used for computing problems such as sorting and searching with reasonable speed. Sorting is one of the most important operations in computing. Researchers continually seek improvements in several respects, of which speedup is the foremost. In this paper, the authors present a sort with O(log n) time complexity on the PRAM EREW (Parallel Random Access Machine, Exclusive Read Exclusive Write) model. The algorithm is designed to balance the tradeoff between the number of processing elements in the architecture and the execution time. Simulation of the algorithm confirms its theoretical analysis. The results of this research can be used in developing faster embedded systems. The Sorting on Centralized Diamond (SOCD) algorithm is presented on the novel Centralized Diamond architecture, which takes advantage of the Single Instruction Multiple Data (SIMD) architecture. This architecture and the sort on it are intuitive and optimal.
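Abstract: Traffic data loss is a common problem in network systems and is usually caused by sensor failures, transmission errors, and storage loss. Existing data repair methods cannot learn the multidimensional features of traffic data. This paper therefore proposes a dual-channel parallel architecture (ST-MFCN) that combines a bidirectional long short-term memory network with a multi-scale convolutional network to fill in missing values in traffic data. A new adversarial loss function is also designed to further improve prediction accuracy, and the model effectively learns the temporal features and dynamic spatial features of traffic data. The model is tested on the Web traffic time series dataset and compared with existing repair methods. The experimental results show that ST-MFCN reduces data recovery error and improves repair accuracy, providing a robust and efficient solution for traffic data repair in network systems.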
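Abstract: A gene-heuristic multiple sequence alignment algorithm based on the Yarn cloud platform is proposed. A nucleic acid substitution equivalence matrix is established as the gene-heuristic mathematical model, and the logical architecture of the Yarn cloud platform is constructed. Through the steps of gene data preprocessing, gene data storage, gene sequence alignment, gene data management, and gene data analysis, the data are classified and stored, and long sequences with high error rates are split into multiple shorter gene fragments. The fragments are then located, variable-length seeds are generated from them, and skeleton construction and gap filling are carried out, which realizes gene-heuristic multiple sequence alignment. The results show that the designed algorithm shortens processing time on different datasets and achieves high SP (Sum of Pairs) scores for multiple sequence alignment. The experiments verify that the method has good application value.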
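The SP (Sum of Pairs) score used as the quality measure here can be sketched as follows: for every alignment column, score every pair of residues and add the results. The scoring values (match +1, mismatch -1, residue against gap -2, gap against gap 0) and the example alignment are assumptions for illustration; the paper's substitution matrix is not given in the abstract.

```python
from itertools import combinations
from typing import List


def pair_score(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> int:
    """Score one pair of aligned symbols; the scoring values are illustrative assumptions."""
    if a == "-" and b == "-":
        return 0
    if a == "-" or b == "-":
        return gap
    return match if a == b else mismatch


def sp_score(alignment: List[str]) -> int:
    """Sum pairwise scores over every column of an alignment with equal-length rows."""
    if len({len(row) for row in alignment}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    return sum(
        pair_score(a, b)
        for column in zip(*alignment)
        for a, b in combinations(column, 2)
    )


# Hypothetical three-sequence alignment; "-" marks a gap.
alignment = ["ACG-T", "ACGAT", "A-GAT"]
print(sp_score(alignment))
```

A higher SP score indicates that more columns agree across sequence pairs, which is why it serves as a comparison metric between alignment methods.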