This paper presents an algorithm that combines the chaos optimization algorithm with the maximum entropy ( COA-ME) by using entropy model based on chaos algorithm,in which the maximum entropy is used as the second met...This paper presents an algorithm that combines the chaos optimization algorithm with the maximum entropy ( COA-ME) by using entropy model based on chaos algorithm,in which the maximum entropy is used as the second method of searching the excellent solution. The search direction is improved by chaos optimization algorithm and realizes the selective acceptance of wrong solution. The experimental result shows that the presented algorithm can be used in the partitioning of hardware/software of reconfigurable system. It effectively reduces the local extremum problem,and search speed as well as performance of partitioning is improved.展开更多
In this paper, the storage capacity of communication among cores and processors is taken into account and a maximum D-value-first algorithm is proposed. By improving the hardware parallelism in the task execution proc...In this paper, the storage capacity of communication among cores and processors is taken into account and a maximum D-value-first algorithm is proposed. By improving the hardware parallelism in the task execution process, the maximum storage requirements for communication are minimized. Experimental results with various directed acyclic graph models showed that compared with the earliest-task-first algorithm, the storage requirements for communication were reduced by 22.46%, on average, while the average of makespan only increased by 0.82%,.展开更多
This paper deals with a new hardware/software embedded system design methodology based on design pattern approach by development of a new design tool called smartcell. Three main constraints of embedded systems design...This paper deals with a new hardware/software embedded system design methodology based on design pattern approach by development of a new design tool called smartcell. Three main constraints of embedded systems design process are investigated: the complexity, the partitioning between hardware and software aspects and the reusability. Two intermediate models are carried out in order to solve the complexity problem. The partitioning problem deals with the proposed hardware/software partitioning algorithm based on Ant Colony Optimisation. The reusability problem is resolved by synthesis of intellectual property blocks. Specification and integration of an intelligent controller on heterogeneous platform are considered to illustrate the proposed approach.展开更多
Hardware/software partitioning is an important step in the design of embedded systems. In this paper, the hardware/software partitioning problem is modeled as a constrained binary integer programming problem, which is...Hardware/software partitioning is an important step in the design of embedded systems. In this paper, the hardware/software partitioning problem is modeled as a constrained binary integer programming problem, which is further converted equivalently to an unconstrained binary integer programming problem by a penalty method. A local search method, HSFM, is developed to obtain a discrete local minimizer of the unconstrained binary integer programming problem. Next, an auxiliary function, which has the same global optimal solutions as the unconstrained binary integer programming problem, is constructed, and its properties are studied. We show that applying HSFM to minimize the auxiliary function can escape from previous local optima by the increase of the parameter value successfully. Finally, a discrete dynamic convexized method is developed to solve the hardware/software partitioning problem. Computational results and comparisons indicate that the proposed algorithm can get high-quality solutions.展开更多
With the development of the design complexity in embedded systems, hardware/software (HW/SW) partitioning becomes a challenging optimization problem in HW/SW co-design. A novel HW/SW partitioning method based on pos...With the development of the design complexity in embedded systems, hardware/software (HW/SW) partitioning becomes a challenging optimization problem in HW/SW co-design. A novel HW/SW partitioning method based on position disturbed particle swarm optimization with invasive weed optimization (PDPSO-IWO) is presented in this paper. It is found by biologists that the ground squirrels produce alarm calls which warn their peers to move away when there is potential predatory threat. Here, we present PDPSO algorithm, in each iteration of which the squirrel behavior of escaping from the global worst particle can be simulated to increase population diversity and avoid local optimum. We also present new initialization and reproduction strategies to improve IWO algorithm for searching a better position, with which the global best position can be updated. Then the search accuracy and the solution quality can be enhanced. PDPSO and improved IWO are synthesized into one single PDPSO-IWO algorithm, which can keep both searching diversification and searching intensification. Furthermore, a hybrid NodeRank (HNodeRank) algorithm is proposed to initialize the population of PDPSO-IWO, and the solution quality can be enhanced further. Since the HW/SW communication cost computing is the most time-consuming process for HW/SW partitioning algorithm, we adopt the GPU parallel technique to accelerate the computing. In this way, the runtime of PDPSO-IWO for large-scale HW/SW partitioning problem can be reduced efficiently. Finally, multiple experiments on benchmarks from state-of-the-art publications and large-scale HW/SW partitioning demonstrate that the proposed algorithm can achieve higher performance than other algorithms.展开更多
This paper focuses on the algorithmic aspects for the hardware/software (HW/SW) partitioning which searches a reasonable composition of hardware and software components which not only satisfies the constraint of har...This paper focuses on the algorithmic aspects for the hardware/software (HW/SW) partitioning which searches a reasonable composition of hardware and software components which not only satisfies the constraint of hardware area but also optimizes the execution time. The computational model is extended so that all possible types of communications can be taken into account for the HW/SW partitioning. Also, a new dynamic programming algorithm is proposed on the basis of the computational model, in which source data, rather than speedup in previous work, of basic scheduling blocks are directly utilized to calculate the optimal solution. The proposed algorithm runs in O(n·A) for n code fragments and the available hardware area A. Simulation results show that the proposed algorithm solves the HW/SW partitioning without increase in running time, compared with the algorithm cited in the literature.展开更多
Hardware/software partitioning is an essential step in hardware/software co-design.For large size problems,it is difficult to consider both solution quality and time.This paper presents an efficient GPU-based parallel...Hardware/software partitioning is an essential step in hardware/software co-design.For large size problems,it is difficult to consider both solution quality and time.This paper presents an efficient GPU-based parallel tabu search algorithm(GPTS)for HW/SW partitioning.A single GPU kernel of compacting neighborhood is proposed to reduce the amount of GPU global memory accesses theoretically.A kernel fusion strategy is further proposed to reduce the amount of GPU global memory accesses of GPTS.To further minimize the transfer overhead of GPTS between CPU and GPU,an optimized transfer strategy for GPU-based tabu evaluation is proposed,which considers that all the candidates do not satisfy the given constraint.Experiments show that GPTS outperforms state-of-the-art work of tabu search and is competitive with other methods for HW/SW partitioning.The proposed parallelization is significant when considering the ordinary GPU platform.展开更多
Dynamic optimization relies on runtime profile information to improve the performance of program execution. Traditional profiling techniques incur significant overhead and are not suitable for dynamic optimization. In...Dynamic optimization relies on runtime profile information to improve the performance of program execution. Traditional profiling techniques incur significant overhead and are not suitable for dynamic optimization. In this paper, a new profiling technique is proposed, that incorporates the strength of both software and hardware to achieve near-zero overhead profiling. The compiler passes profiling requests as a few bits of information in branch instructions to the hardware, and the processor executes profiling operations asynchronously in available free slots or on dedicated hardware. The compiler instrumentation of this technique is implemented using an Itanium research compiler. The result shows that the accurate block profiling incurs very little overhead to the user program in terms of the program scheduling cycles. For example, the average overhead is 0.6% for the SPECint95 benchmarks. The hardware support required for the new profiling is practical. The technique is extended to collect edge profiles for continuous phase transition detection. It is believed that the hardware-software collaborative scheme will enable many profile-driven dynamic optimizations for EPIC processors such as the Itanium processors.展开更多
基金Sponsored by the Natural Science Foundation of Heilongjiang Province( Grant No B2007-07)Industrial Research Projects in Qiqihaer( Grant No GYGG-09009)
文摘This paper presents an algorithm that combines the chaos optimization algorithm with the maximum entropy ( COA-ME) by using entropy model based on chaos algorithm,in which the maximum entropy is used as the second method of searching the excellent solution. The search direction is improved by chaos optimization algorithm and realizes the selective acceptance of wrong solution. The experimental result shows that the presented algorithm can be used in the partitioning of hardware/software of reconfigurable system. It effectively reduces the local extremum problem,and search speed as well as performance of partitioning is improved.
基金Supported by the National Natural Science Foundation of China(No.61179045 and No.61350009)
文摘In this paper, the storage capacity of communication among cores and processors is taken into account and a maximum D-value-first algorithm is proposed. By improving the hardware parallelism in the task execution process, the maximum storage requirements for communication are minimized. Experimental results with various directed acyclic graph models showed that compared with the earliest-task-first algorithm, the storage requirements for communication were reduced by 22.46%, on average, while the average of makespan only increased by 0.82%,.
文摘This paper deals with a new hardware/software embedded system design methodology based on design pattern approach by development of a new design tool called smartcell. Three main constraints of embedded systems design process are investigated: the complexity, the partitioning between hardware and software aspects and the reusability. Two intermediate models are carried out in order to solve the complexity problem. The partitioning problem deals with the proposed hardware/software partitioning algorithm based on Ant Colony Optimisation. The reusability problem is resolved by synthesis of intellectual property blocks. Specification and integration of an intelligent controller on heterogeneous platform are considered to illustrate the proposed approach.
基金Supported by the National Natural Science Foundation of China(11301255)the Fund by Collaborative Innovation Center of IoT Industrialization and Intelligent Production,Minjiang University(IIC1703)+1 种基金Foundation of Minjiang University(MYK17032)the Program for New Century Excellent Talents in Fujian Province University
文摘Hardware/software partitioning is an important step in the design of embedded systems. In this paper, the hardware/software partitioning problem is modeled as a constrained binary integer programming problem, which is further converted equivalently to an unconstrained binary integer programming problem by a penalty method. A local search method, HSFM, is developed to obtain a discrete local minimizer of the unconstrained binary integer programming problem. Next, an auxiliary function, which has the same global optimal solutions as the unconstrained binary integer programming problem, is constructed, and its properties are studied. We show that applying HSFM to minimize the auxiliary function can escape from previous local optima by the increase of the parameter value successfully. Finally, a discrete dynamic convexized method is developed to solve the hardware/software partitioning problem. Computational results and comparisons indicate that the proposed algorithm can get high-quality solutions.
基金The work was supported by the National Natural Science Foundation of China under Grant No. 61472289 and the National Key Research and Development Project of China under Grant No. 2016YFC0106305.
文摘With the development of the design complexity in embedded systems, hardware/software (HW/SW) partitioning becomes a challenging optimization problem in HW/SW co-design. A novel HW/SW partitioning method based on position disturbed particle swarm optimization with invasive weed optimization (PDPSO-IWO) is presented in this paper. It is found by biologists that the ground squirrels produce alarm calls which warn their peers to move away when there is potential predatory threat. Here, we present PDPSO algorithm, in each iteration of which the squirrel behavior of escaping from the global worst particle can be simulated to increase population diversity and avoid local optimum. We also present new initialization and reproduction strategies to improve IWO algorithm for searching a better position, with which the global best position can be updated. Then the search accuracy and the solution quality can be enhanced. PDPSO and improved IWO are synthesized into one single PDPSO-IWO algorithm, which can keep both searching diversification and searching intensification. Furthermore, a hybrid NodeRank (HNodeRank) algorithm is proposed to initialize the population of PDPSO-IWO, and the solution quality can be enhanced further. Since the HW/SW communication cost computing is the most time-consuming process for HW/SW partitioning algorithm, we adopt the GPU parallel technique to accelerate the computing. In this way, the runtime of PDPSO-IWO for large-scale HW/SW partitioning problem can be reduced efficiently. Finally, multiple experiments on benchmarks from state-of-the-art publications and large-scale HW/SW partitioning demonstrate that the proposed algorithm can achieve higher performance than other algorithms.
文摘This paper focuses on the algorithmic aspects for the hardware/software (HW/SW) partitioning which searches a reasonable composition of hardware and software components which not only satisfies the constraint of hardware area but also optimizes the execution time. The computational model is extended so that all possible types of communications can be taken into account for the HW/SW partitioning. Also, a new dynamic programming algorithm is proposed on the basis of the computational model, in which source data, rather than speedup in previous work, of basic scheduling blocks are directly utilized to calculate the optimal solution. The proposed algorithm runs in O(n·A) for n code fragments and the available hardware area A. Simulation results show that the proposed algorithm solves the HW/SW partitioning without increase in running time, compared with the algorithm cited in the literature.
基金This paper was supported by the National Natural Science Foundation of China(Grant No.61472289)National Key Research and Development Project(2016YFC0106305).We also would like to thank the anonymous reviewers for their valuable and constructive comments.
文摘Hardware/software partitioning is an essential step in hardware/software co-design.For large size problems,it is difficult to consider both solution quality and time.This paper presents an efficient GPU-based parallel tabu search algorithm(GPTS)for HW/SW partitioning.A single GPU kernel of compacting neighborhood is proposed to reduce the amount of GPU global memory accesses theoretically.A kernel fusion strategy is further proposed to reduce the amount of GPU global memory accesses of GPTS.To further minimize the transfer overhead of GPTS between CPU and GPU,an optimized transfer strategy for GPU-based tabu evaluation is proposed,which considers that all the candidates do not satisfy the given constraint.Experiments show that GPTS outperforms state-of-the-art work of tabu search and is competitive with other methods for HW/SW partitioning.The proposed parallelization is significant when considering the ordinary GPU platform.
文摘Dynamic optimization relies on runtime profile information to improve the performance of program execution. Traditional profiling techniques incur significant overhead and are not suitable for dynamic optimization. In this paper, a new profiling technique is proposed, that incorporates the strength of both software and hardware to achieve near-zero overhead profiling. The compiler passes profiling requests as a few bits of information in branch instructions to the hardware, and the processor executes profiling operations asynchronously in available free slots or on dedicated hardware. The compiler instrumentation of this technique is implemented using an Itanium research compiler. The result shows that the accurate block profiling incurs very little overhead to the user program in terms of the program scheduling cycles. For example, the average overhead is 0.6% for the SPECint95 benchmarks. The hardware support required for the new profiling is practical. The technique is extended to collect edge profiles for continuous phase transition detection. It is believed that the hardware-software collaborative scheme will enable many profile-driven dynamic optimizations for EPIC processors such as the Itanium processors.