Ethernet over SDH/SONET (EOS) is a focal point of today's data transmission technology because it combines the merits of both Ethernet and SDH/SONET. However, implementing an EOS system on a chip is complex and requires thorough verification. This paper introduces our design of a hardware/software co-verification platform for EOS designs. The hardware platform contains a microprocessor board and an FPGA (Field Programmable Gate Array)-based verification board; the corresponding software includes test benches running in the FPGAs, control programs for the microprocessor, and a console program with a GUI (Graphical User Interface) for configuration, management, and supervision. The design is cost-effective and has been successfully employed to verify several IP (Intellectual Property) blocks of our EOS chip. Moreover, it is flexible and can be applied as a general-purpose verification platform.
In this paper, the storage capacity required for communication among cores and processors is taken into account and a maximum D-value-first algorithm is proposed. By improving hardware parallelism in the task execution process, the maximum storage requirement for communication is minimized. Experimental results with various directed acyclic graph models showed that, compared with the earliest-task-first algorithm, the storage requirements for communication were reduced by 22.46% on average, while the average makespan increased by only 0.82%.
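The abstract does not define the D-value itself, but the skeleton it plugs into is a standard priority-driven list scheduler over a DAG. Below is a minimal Python sketch under that assumption; `d_value` is a stand-in for the paper's priority metric, and the function names are hypothetical:

```python
# Illustrative priority-driven list scheduler on a DAG. The d_value
# priority function is an assumption; the paper's metric is not given here.
from collections import defaultdict

def list_schedule(tasks, deps, d_value, num_procs):
    """tasks: {task: exec_time}; deps: list of (u, v) edges u -> v;
    d_value: task -> priority; the highest D-value is scheduled first."""
    succ, pred_count = defaultdict(list), defaultdict(int)
    for u, v in deps:
        succ[u].append(v)
        pred_count[v] += 1
    ready = [t for t in tasks if pred_count[t] == 0]
    proc_free = [0.0] * num_procs          # next free time per processor
    finish, order = {}, []
    while ready:
        ready.sort(key=d_value, reverse=True)   # maximum D-value first
        t = ready.pop(0)
        p = min(range(num_procs), key=proc_free.__getitem__)
        start = max(proc_free[p],
                    max((finish[u] for u, v in deps if v == t), default=0.0))
        finish[t] = start + tasks[t]
        proc_free[p] = finish[t]
        order.append((t, p, start))
        for v in succ[t]:                       # release successors
            pred_count[v] -= 1
            if pred_count[v] == 0:
                ready.append(v)
    return order
```

Swapping in an earliest-ready-time priority for `d_value` would emulate the earliest-task-first baseline the paper compares against.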
This paper presents an algorithm that combines the chaos optimization algorithm with the maximum entropy method (COA-ME), using an entropy model based on the chaos algorithm in which maximum entropy serves as a secondary method for finding good solutions. The search direction is improved by the chaos optimization algorithm, and selective acceptance of inferior solutions is realized. The experimental results show that the presented algorithm can be used for the hardware/software partitioning of reconfigurable systems: it effectively alleviates the local-extremum problem, and both the search speed and the quality of the partitioning are improved.
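For readers unfamiliar with chaos optimization, the core idea is to use a chaotic map as the candidate generator, exploiting its ergodicity to cover the search space. A minimal sketch using the logistic map follows; the maximum-entropy acceptance stage of COA-ME is omitted, and all names are illustrative:

```python
import numpy as np

def chaos_optimize(cost, dim, lo, hi, iters=2000):
    """Minimal chaos-optimization sketch: the logistic map x' = 4x(1 - x)
    generates ergodic candidates that are mapped into [lo, hi]."""
    rng = np.random.default_rng(0)
    x = rng.uniform(0.01, 0.99, dim)        # chaotic state per dimension
    best, best_cost = None, float("inf")
    for _ in range(iters):
        x = 4.0 * x * (1.0 - x)             # logistic map, fully chaotic at r=4
        cand = lo + (hi - lo) * x           # carrier mapping into search space
        c = cost(cand)
        if c < best_cost:
            best, best_cost = cand.copy(), c
    return best, best_cost
```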
In order to improve the efficiency of embedded software running on a processor core, this paper proposes a hardware/software co-optimization approach for embedded software from the system point of view. The proposed stepwise methods aim to exploit the structure and resources of the processor as fully as possible for software algorithm optimization. To achieve low memory usage and a low clock-frequency requirement for the same performance, this co-optimization approach was used to optimize the embedded software of an MP3 decoder based on a 16-bit fixed-point DSP core. After optimization, decoding 128 kbps, 44.1 kHz stereo MP3 on the DSP evaluation platform requires 45.9 MIPS and 20.4 kbytes of memory. The optimization rate reaches 65.6% for memory and 49.6% for frequency, respectively, compared with the results produced by the compiler using floating-point computation. The experimental results demonstrate the effectiveness of the hardware/software co-optimization approach, which depends on both the algorithm and the architecture.
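A concrete flavor of such fixed-point co-optimization is mapping floating-point arithmetic onto the 16-bit Q15 operations that a fixed-point DSP executes natively. The snippet below is an illustrative sketch, not the paper's actual scaling scheme:

```python
# Q15 fixed-point sketch: how a floating-point multiply is mapped onto a
# 16-bit DSP core (illustrative; the paper's exact scaling choices differ).
Q = 15                      # 1 sign bit + 15 fractional bits

def to_q15(x: float) -> int:
    """Quantize x in [-1, 1) to a 16-bit two's-complement integer."""
    return max(-1 << Q, min((1 << Q) - 1, int(round(x * (1 << Q)))))

def q15_mul(a: int, b: int) -> int:
    """16x16 -> 32-bit multiply, then shift back to Q15 (as a DSP MAC would)."""
    return (a * b) >> Q

a, b = to_q15(0.5), to_q15(-0.25)
print(q15_mul(a, b) / (1 << Q))   # ~ -0.125
```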
This paper presents a new hardware/software embedded system design methodology based on a design-pattern approach, supported by a new design tool called smartcell. Three main constraints of the embedded system design process are investigated: complexity, the partitioning between hardware and software aspects, and reusability. Two intermediate models are introduced to address the complexity problem. The partitioning problem is handled by the proposed hardware/software partitioning algorithm based on Ant Colony Optimisation. The reusability problem is resolved by the synthesis of intellectual property blocks. The specification and integration of an intelligent controller on a heterogeneous platform are used to illustrate the proposed approach.
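As a rough illustration of how Ant Colony Optimisation can drive binary HW/SW partitioning, the sketch below keeps a pheromone value per block-to-side assignment and reinforces the best solution found; the update rule and parameters are simplified assumptions, not the paper's algorithm:

```python
import random

def aco_partition(n, cost, ants=20, iters=100, rho=0.1):
    """Minimal ant-colony sketch for binary HW/SW partitioning.
    tau[i][s] is the pheromone for mapping block i to side s (0=SW, 1=HW)."""
    tau = [[1.0, 1.0] for _ in range(n)]
    best, best_cost = None, float("inf")
    for _ in range(iters):
        for _ in range(ants):
            # Each ant samples a side per block, proportional to pheromone.
            sol = [0 if random.random() < tau[i][0] / (tau[i][0] + tau[i][1])
                   else 1 for i in range(n)]
            c = cost(sol)
            if c < best_cost:
                best, best_cost = sol, c
        for i in range(n):                  # evaporate, then reinforce the best
            tau[i][0] *= (1 - rho)
            tau[i][1] *= (1 - rho)
            tau[i][best[i]] += 1.0 / (1.0 + best_cost)
    return best, best_cost
```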
A hardware/software co-synthesis method is presented for SoC designs consisting of both hardware IP cores and software components, based on a graph-theoretic formulation. Given an SoC integrating a set of functions and a set of performance factors, a core for each function is selected from a set of alternative IP cores and software components, and optimal partitions are found so as to evenly balance the performance factors and ultimately reduce the overall cost, size, power consumption, and runtime of the core-based SoC. The algorithm formulates IP cores and components as mathematical models, presents a graph-theoretic model for finding the optimal partitions of an SoC design, and transforms the SoC hardware/software co-synthesis problem into finding optimal paths in a weighted, directed graph. Overcoming three main deficiencies of traditional methods, this method works automatically, evaluates more performance factors at the same time, and meets the particular needs of SoC designs. Finally, the approach is shown to be practical and effective by partitioning a practical system.
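The path formulation can be pictured as one graph stage per function, one node per candidate core, and edge weights encoding the blended cost of consecutive choices. A minimal dynamic-programming sketch of that idea follows; the cost blend and all names are assumptions, not the paper's model:

```python
# Sketch of the path formulation: one stage per function, one node per
# candidate core, edge weight = composite cost of choosing the next core.
def cheapest_selection(candidates, step_cost):
    """candidates: list of stages, each a list of hashable core options.
    step_cost(prev, cur): composite cost (price, area, power, time, ...);
    prev is None for the first stage."""
    dist = {opt: step_cost(None, opt) for opt in candidates[0]}
    choice = {opt: [opt] for opt in candidates[0]}
    for stage in candidates[1:]:
        nd, nc = {}, {}
        for cur in stage:
            # Cheapest predecessor for this option (shortest-path relaxation).
            prev = min(dist, key=lambda p: dist[p] + step_cost(p, cur))
            nd[cur] = dist[prev] + step_cost(prev, cur)
            nc[cur] = choice[prev] + [cur]
        dist, choice = nd, nc
    end = min(dist, key=dist.get)
    return choice[end], dist[end]
```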
We present a simulation framework for wireless sensor networks developed to allow design exploration and complete microprocessor-instruction-level debugging of network formation, data congestion, and node interaction, all in one simulation environment. A specifically innovative feature is the co-emulation of selected nodes at the clock-cycle-accurate hardware processing level, allowing code debugging and exact execution-latency evaluation (covering both the protocol stack and the application), together with other nodes simulated at the abstract protocol level, meeting a designer's needs for simulation speed, scalability, and reliability. The simulator is centered on the Zigbee protocol and can be retargeted to different node micro-architectures.
Graph processing has been widely used in many scenarios, from scientific computing to artificial intelligence. Unlike traditional workloads, graph processing exhibits irregular computational parallelism and random memory accesses. Therefore, running graph processing workloads on conventional architectures (e.g., CPUs and GPUs) often shows a significantly low compute-memory ratio with few performance benefits, and can be, in many cases, even slower than a specialized single-thread graph algorithm. While domain-specific hardware designs are essential for graph processing, it is still challenging to transform the hardware capability into a performance boost without coupled software codesigns. This article presents a graph processing ecosystem spanning hardware to software. We start by introducing a series of hardware accelerators as the foundation of this ecosystem. Subsequently, the codesigned parallel graph systems and their distributed techniques are presented to support graph applications. Finally, we introduce our efforts on novel graph applications and hardware architectures. Extensive results show that various graph applications can be efficiently accelerated in this graph processing ecosystem.
Hardware/software partitioning is an essential step in hardware/software co-design. For large problem sizes, it is difficult to achieve both solution quality and speed. This paper presents an efficient GPU-based parallel tabu search algorithm (GPTS) for HW/SW partitioning. A single GPU kernel that compacts the neighborhood is proposed to theoretically reduce the amount of GPU global memory accesses. A kernel fusion strategy is further proposed to reduce the GPU global memory accesses of GPTS. To further minimize the CPU-GPU transfer overhead of GPTS, an optimized transfer strategy for GPU-based tabu evaluation is proposed, which accounts for the case in which none of the candidates satisfies the given constraint. Experiments show that GPTS outperforms state-of-the-art tabu search implementations and is competitive with other methods for HW/SW partitioning. The proposed parallelization is significant even on an ordinary GPU platform.
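For orientation, the serial skeleton that GPTS parallelizes looks roughly like the tabu search below: flip-one-block moves, a tabu list with an aspiration criterion, and constraint filtering of candidates. This is an illustrative CPU-only sketch; the paper's contribution lies in the GPU kernels, which are not reproduced here:

```python
def tabu_partition(n, cost, area, max_area, iters=500, tenure=15):
    """Serial tabu-search sketch for HW/SW partitioning. A solution is a
    bit vector (0=SW, 1=HW); a move flips one block to the other side."""
    sol = [0] * n                              # start all-software
    best, best_cost = sol[:], cost(sol)
    tabu = {}                                  # block index -> expiry iteration
    for it in range(iters):
        cand = None
        for i in range(n):                     # evaluate the flip neighborhood
            nb = sol[:]
            nb[i] ^= 1
            if sum(a for a, s in zip(area, nb) if s) > max_area:
                continue                       # infeasible candidate: skip
            c = cost(nb)
            ok = tabu.get(i, -1) < it or c < best_cost   # aspiration criterion
            if ok and (cand is None or c < cand[1]):
                cand = (i, c, nb)
        if cand is None:
            break
        i, c, sol = cand
        tabu[i] = it + tenure
        if c < best_cost:
            best, best_cost = sol[:], c
    return best, best_cost
```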
With the growth of design complexity in embedded systems, hardware/software (HW/SW) partitioning has become a challenging optimization problem in HW/SW co-design. A novel HW/SW partitioning method based on position-disturbed particle swarm optimization with invasive weed optimization (PDPSO-IWO) is presented in this paper. Biologists have found that ground squirrels produce alarm calls that warn their peers to move away when there is a potential predatory threat. Accordingly, we present the PDPSO algorithm, in each iteration of which the squirrel-like behavior of escaping from the globally worst particle is simulated to increase population diversity and avoid local optima. We also present new initialization and reproduction strategies to improve the IWO algorithm for searching better positions, with which the global best position can be updated; the search accuracy and solution quality are thereby enhanced. PDPSO and the improved IWO are synthesized into a single PDPSO-IWO algorithm, which maintains both search diversification and search intensification. Furthermore, a hybrid NodeRank (HNodeRank) algorithm is proposed to initialize the population of PDPSO-IWO, enhancing solution quality further. Since computing the HW/SW communication cost is the most time-consuming part of a HW/SW partitioning algorithm, we adopt GPU parallelism to accelerate this computation, so the runtime of PDPSO-IWO on large-scale HW/SW partitioning problems can be reduced efficiently. Finally, experiments on benchmarks from state-of-the-art publications and on large-scale HW/SW partitioning demonstrate that the proposed algorithm achieves higher performance than other algorithms.
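The distinguishing ingredient of PDPSO is the repulsion from the globally worst particle. A compact sketch of a velocity update with such a disturbance term is shown below; the coefficient c3 and the exact form of the term are assumptions, and the IWO and HNodeRank components are omitted:

```python
import numpy as np

def pdpso_sketch(cost, dim, n=30, iters=200, w=0.7, c1=1.5, c2=1.5, c3=0.5):
    """Position-disturbed PSO sketch: besides the usual pulls toward the
    personal and global bests, particles are pushed AWAY from the globally
    worst particle (the 'alarm call' behavior)."""
    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, (n, dim))
    v = np.zeros((n, dim))
    pbest, pcost = x.copy(), np.array([cost(p) for p in x])
    for _ in range(iters):
        c = np.array([cost(p) for p in x])
        better = c < pcost
        pbest[better], pcost[better] = x[better], c[better]
        g = pbest[np.argmin(pcost)]            # global best position
        b = x[np.argmax(c)]                    # globally worst current particle
        r1, r2, r3 = rng.random((3, n, dim))
        v = w*v + c1*r1*(pbest - x) + c2*r2*(g - x) + c3*r3*(x - b)
        x = x + v
    return pbest[np.argmin(pcost)], float(pcost.min())
```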
This paper focuses on the algorithmic aspects of hardware/software (HW/SW) partitioning, which seeks a composition of hardware and software components that not only satisfies the hardware area constraint but also optimizes the execution time. The computational model is extended so that all possible types of communication can be taken into account in the HW/SW partitioning. Also, a new dynamic programming algorithm is proposed on the basis of this computational model, in which the source data of basic scheduling blocks, rather than the speedups used in previous work, are directly utilized to calculate the optimal solution. The proposed algorithm runs in O(n·A) time for n code fragments and an available hardware area A. Simulation results show that the proposed algorithm solves the HW/SW partitioning problem without an increase in running time, compared with the algorithm cited in the literature.
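Ignoring the communication terms, an O(n·A) recurrence of this shape is a knapsack-style dynamic program: for each fragment, either pay its software time or spend hardware area to pay its hardware time. A simplified sketch follows, assuming additive execution times and integer areas (the paper's full model refines both):

```python
def hw_sw_partition(frags, A):
    """DP sketch in the O(n*A) spirit. frags: list of
    (hw_time, sw_time, hw_area) per code fragment, hw_area an int;
    dp[a] = minimal total time using at most a units of hardware area."""
    INF = float("inf")
    dp = [0.0] * (A + 1)
    for hw_t, sw_t, hw_a in frags:
        nxt = [INF] * (A + 1)
        for a in range(A + 1):
            nxt[a] = dp[a] + sw_t                    # implement in software
            if a >= hw_a and dp[a - hw_a] + hw_t < nxt[a]:
                nxt[a] = dp[a - hw_a] + hw_t         # implement in hardware
        dp = nxt
    return dp[A]
```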
Due to the interdependency of frame synchronization (FS) and channel estimation (CE), joint FS and CE (JFSCE) schemes are proposed to enhance both functions and thereby boost the overall performance of wireless communication systems. Although traditional JFSCE schemes alleviate the mutual influence between FS and CE, they show deficiencies in dealing with hardware imperfection (HI) and the deterministic line-of-sight (LOS) path. To tackle this challenge, we propose a cascaded ELM-based JFSCE to alleviate the influence of HI in the Rician fading channel scenario. Specifically, a conventional JFSCE method is first employed to extract initial features, forming the non-neural-network (NN) solutions for FS and CE, respectively. Then, two ELM-based networks, named FS-NET and CE-NET, are cascaded to capture the NN solutions of FS and CE. Simulation and analysis results show that, compared with conventional JFSCE methods, the proposed cascaded ELM-based JFSCE significantly reduces the error probability of FS and the normalized mean square error (NMSE) of CE, even under the impact of parameter variations.
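The extreme learning machine (ELM) building block behind networks like FS-NET and CE-NET can be stated in a few lines: a random, frozen hidden layer and output weights solved in closed form by least squares. A generic sketch, not the paper's exact network configuration:

```python
import numpy as np

def elm_fit(X, Y, hidden=64, seed=0):
    """ELM sketch: random, frozen hidden layer; output weights solved by
    least squares via the Moore-Penrose pseudo-inverse."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], hidden))
    b = rng.standard_normal(hidden)
    H = np.tanh(X @ W + b)                      # hidden activations
    beta = np.linalg.pinv(H) @ Y                # closed-form output weights
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```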
The purpose of software defect prediction is to identify defect-prone code modules to assist software quality assurance teams with the appropriate allocation of resources and labor. In previous software defect prediction studies, transfer learning was effective in solving the problem of inconsistent project data distributions. However, target projects often lack sufficient data, which affects the performance of the transfer learning model. In addition, the presence of uncorrelated features between projects can decrease the prediction accuracy of the transfer learning model. To address these problems, this article proposes a software defect prediction method based on stable learning (SDP-SL) that combines code visualization techniques and residual networks. The method first transforms code files into code images using code visualization techniques and then builds a defect prediction model on these code images. During model training, target project data are not required as prior knowledge. Following the principles of stable learning, the weights of source project samples are dynamically adjusted to eliminate dependencies between features, thereby capturing the "invariance mechanism" within the data. This approach explores the genuine relationship between code defect features and labels, thereby enhancing defect prediction performance. To evaluate the performance of SDP-SL, comparative experiments were conducted on 10 open-source projects in the PROMISE dataset. The experimental results demonstrate that, in terms of the F-measure, the proposed SDP-SL method outperforms other within-project defect prediction methods by 2.11%-44.03%. In cross-project defect prediction, SDP-SL provides an improvement of 5.89%-25.46% in prediction performance compared with other cross-project defect prediction methods. Therefore, SDP-SL can effectively enhance both within-project and cross-project defect prediction.
Software Defined Networking (SDN) is programmable through the separation of forwarding control and the centralization of the controller. The controller plays the role of the 'brain' that provides the intelligent part of SDN technology. Various versions of SDN controllers exist in response to the diverse demands and functions expected of them. Several SDN controllers are available on the open market besides a large number of commercial controllers; some are developed to meet carrier-grade service levels, and one of the recent trends in open-source SDN controllers is the Open Network Operating System (ONOS). This paper presents a comparative study of open-source SDN controllers: the Network Controller Platform (NOX), the Python-based Network Controller (POX), the component-based SDN framework Ryu, the Java-based OpenFlow controller Floodlight, OpenDayLight (ODL), and ONOS. The discussion is further extended to the ONOS architecture as well as the evolution of ONOS controllers. The article reviews use cases based on ONOS controllers in several application deployments. Moreover, the opportunities and challenges of open-source SDN controllers are discussed, exploring carrier-grade ONOS for future real-world deployments and ONOS's unique features, and identifying the suitable choice of SDN controller for service providers. In addition, we attempt to answer several critical questions about the implications of the open-source nature of SDN controllers for vendor lock-in, interoperability, and standards compliance. Similarly, real-world use cases of organizations using open-source SDN are highlighted, along with how the open-source community contributes to the development of SDN controllers. Furthermore, the challenges faced by open-source projects and the considerations when choosing an open-source SDN controller are underscored. The role of Artificial Intelligence (AI) and Machine Learning (ML) in the evolution of open-source SDN controllers is then discussed in light of recent research. We also present the challenges and limitations of deploying open-source SDN controllers in production networks, how they can be mitigated, and how open-source SDN controllers handle network security and ensure that network configurations and policies are robust and resilient. Potential opportunities and challenges for future open SDN deployment are outlined to conclude the article.
The Message Passing Interface (MPI) is a widely accepted standard for parallel computing on distributed memory systems. However, MPI implementations can contain defects that impact the reliability and performance of parallel applications. Detecting and correcting these defects is crucial, yet there is a lack of published models specifically designed for correcting MPI defects. To address this, we propose a model for detecting and correcting MPI defects (DC_MPI), which aims to detect and correct defects in various types of MPI communication, including blocking point-to-point (BPTP), nonblocking point-to-point (NBPTP), and collective communication (CC). The defects addressed by the DC_MPI model include illegal MPI calls, deadlocks (DL), race conditions (RC), and message mismatches (MM). To assess the effectiveness of the DC_MPI model, we performed experiments on a dataset consisting of 40 MPI codes. The results indicate that the model achieved a detection rate of 37 out of 40 codes, resulting in an overall detection accuracy of 92.5%. Additionally, the execution duration of the DC_MPI model ranged from 0.81 to 1.36 s. These findings show that the DC_MPI model is useful in detecting and correcting defects in MPI implementations, thereby enhancing the reliability and performance of parallel applications. The DC_MPI model fills an important research gap and provides a valuable tool for improving the quality of MPI-based parallel computing systems.
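As an example of the DL (deadlock) class such a model targets, consider the classic pattern in which two ranks both enter a blocking synchronous send before posting their receives. The sketch below uses mpi4py for illustration; the paper's dataset and detection machinery are not shown:

```python
# A classic deadlock pattern of the kind DC_MPI targets, with its fix.
from mpi4py import MPI

comm = MPI.COMM_WORLD
rank = comm.Get_rank()
peer = 1 - rank                      # assumes exactly 2 ranks

# BUGGY: both ranks block in a synchronous send, so neither reaches recv.
# comm.ssend({"from": rank}, dest=peer)
# msg = comm.recv(source=peer)

# FIX: a combined send-receive lets the MPI library pair the operations safely.
msg = comm.sendrecv({"from": rank}, dest=peer, source=peer)
print(f"rank {rank} got {msg}")
```

Run with `mpiexec -n 2 python demo.py` (hypothetical file name); uncommenting the buggy pair hangs both ranks, which is exactly the kind of defect a detector must flag.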
Software testing is a critical phase, partly because misconceptions about ambiguities in the requirements during specification affect the testing process, making it difficult to identify all faults in software. As requirements change continuously, irrelevancy and redundancy increase during testing. These challenges reduce fault detection capability, creating a need to improve the testing process based on changes in the requirements specification. In this research, we have developed a model to resolve testing challenges through requirement prioritization and prediction in an agile-based environment. The research objective is to identify the most relevant and meaningful requirements through semantic analysis for correct change analysis, and then to compute the similarity of requirements through case-based reasoning, which predicts the requirements for reuse, restricted to error-based requirements. Afterward, the apriori algorithm maps requirement frequency to select relevant test cases, based on frequently reused (or never reused) test cases, to increase the fault detection rate. Furthermore, the proposed model was evaluated through experiments. The results showed that requirement redundancy and irrelevancy improved due to semantic analysis, which correctly predicted the requirements, increasing the fault detection rate and resulting in high user satisfaction. The predicted requirements are mapped into test cases, increasing the fault detection rate after changes and achieving higher user satisfaction. The model improves the redundancy and irrelevancy of requirements by more than 90% compared with other clustering methods and the analytical hierarchical process, achieving an 80% fault detection rate at an earlier stage. Hence, it provides guidelines for practitioners and researchers in the modern era. In the future, we will provide a working prototype of this model as a proof of concept.
The Software Development Life Cycle (SDLC) is one of the major ingredients in the development of efficient software systems within a time frame and at low cost. From the literature, it is evident that various kinds of process models are used by the software industry for the development of small, medium, and long-term software projects, but many of them do not cover risk management. It is quite obvious that improper selection of the software development process model leads to failure of software products, as development is a time-bound activity. In the present work, a new software development process model is proposed that covers risks at any stage of the development of the software product. The model is named the Hemant-Vipin (HV) process model and may be helpful for software industries developing efficient software products with timely delivery to the client. The efficiency of the HV process model is assessed against various factors, such as requirement clarity, user feedback, change agility, predictability, risk identification, practical implementation, customer satisfaction, incremental development, use of ready-made components, quick design, and resource organization, and a case study shows that the presented approach covers more of these parameters than existing process models.
Agile transformations are challenging processes for organizations that seek to extend the benefits of the Agile philosophy and methods beyond software engineering. Despite the impact of these transformations on organizations, they have not been extensively studied in academia. We conducted a study grounded in workshops and interviews with 99 participants from 30 organizations, including organizations undergoing transformations ("final organizations") and companies supporting these processes ("consultants"). The study aims to understand the motivations, objectives, and factors driving and challenging these transformations. Over 700 responses to the study question were collected and categorized into 32 objectives. The findings show that organizations primarily aim to achieve customer centricity and adaptability, each with 8% of the mentions. Other important objectives, each with above 4% of mentions, include alignment of goals, lean delivery, sustainable processes, and a flatter, more team-based organizational structure. We also detect discrepancies between the objectives identified by the two kinds of organizations and those in the existing agile literature and models. This misalignment highlights the need for practitioners to understand the practical realities the organizations face.
Accurate software cost estimation in Global Software Development (GSD) remains challenging due to reliance on historical data and expert judgment. Traditional models, such as the Constructive Cost Model (COCOMO II), rely heavily on accurate historical data. In addition, expert judgment is required to set many input parameters, which can introduce subjectivity and variability into the estimation process. Consequently, there is a need to improve current GSD models to mitigate the reliance on historical data, the subjectivity in expert judgment, the inadequate consideration of GSD-based cost drivers, and the limited integration of modern technologies, all of which contribute to cost overruns. This study introduces a novel hybrid model that synergizes COCOMO II with Artificial Neural Networks (ANN) to address these challenges. The proposed hybrid model integrates additional GSD-based cost drivers identified through a systematic literature review and further vetted by industry experts. This article compares the effectiveness of the proposed model with state-of-the-art machine-learning-based models for software cost estimation. Evaluation on the NASA 93 dataset with twenty-six GSD-based cost drivers reveals that our hybrid model achieves superior accuracy, outperforming existing state-of-the-art models. The findings indicate the potential of combining COCOMO II, ANN, and additional GSD-based cost drivers to transform cost estimation in GSD.
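For reference, the closed-form COCOMO II post-architecture model that such hybrids build on is shown below with its standard published constants (A = 2.94, B = 0.91); the paper replaces part of this pipeline with an ANN fed by additional GSD-based drivers:

```python
# Nominal COCOMO II post-architecture effort model (standard published
# constants). Effort (person-months) = A * Size^E * prod(EM),
# with E = B + 0.01 * sum(scale factors).
def cocomo2_effort(ksloc, scale_factors, effort_multipliers, A=2.94, B=0.91):
    E = B + 0.01 * sum(scale_factors)
    effort = A * ksloc ** E
    for em in effort_multipliers:
        effort *= em
    return effort

# Example: 100 KSLOC with all-nominal scale factors and 17 nominal multipliers.
print(cocomo2_effort(100, [3.72, 3.04, 4.24, 3.29, 4.68], [1.0] * 17))
```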
The massive computational complexity and memory requirements of artificial intelligence models impede their deployability on edge computing devices in the Internet of Things (IoT). While Power-of-Two (PoT) quantization has been proposed to improve the efficiency of edge inference for Deep Neural Networks (DNNs), existing PoT schemes require a huge amount of bit-wise manipulation and have large memory overhead, and their efficiency is bounded by the bottlenecks of computation latency and memory footprint. To tackle this challenge, we present an efficient inference approach based on PoT quantization and model compression. An integer-only scalar PoT quantization (IOS-PoT) is designed jointly with a distribution-loss regularizer, wherein the regularizer minimizes quantization errors and training disturbances. Additionally, two-stage model compression is developed to effectively reduce memory requirements and alleviate bandwidth usage in communications of networked heterogeneous learning systems. The product look-up table (P-LUT) inference scheme replaces bit-shifting with only indexing and addition operations, achieving low-latency computation and enabling efficient edge accelerators. Finally, comprehensive experiments on Residual Networks (ResNets) and efficient architectures with the Canadian Institute for Advanced Research (CIFAR), ImageNet, and Real-world Affective Faces Database (RAF-DB) datasets indicate that our approach achieves a 2×-10× improvement in the reduction of both weight size and computation cost compared with state-of-the-art methods. A P-LUT accelerator prototype is implemented on the Xilinx KV260 Field Programmable Gate Array (FPGA) platform to accelerate convolution operations; performance results show that P-LUT reduces memory footprint by 1.45× and achieves more than 3× power efficiency and 2× resource efficiency compared with the conventional bit-shifting scheme.
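The essence of PoT quantization is snapping each weight to a signed power of two, so that multiplications become shifts or, in a P-LUT-style scheme, table lookups. A minimal sketch follows; the exponent range is an illustrative choice, not the paper's configuration:

```python
import numpy as np

def pot_quantize(w, e_min=-8, e_max=0):
    """Power-of-Two quantization sketch: each weight is snapped to
    sign(w) * 2^e with an integer exponent clipped to [e_min, e_max]."""
    sign = np.sign(w)
    e = np.clip(np.round(np.log2(np.abs(w) + 1e-12)), e_min, e_max)
    return sign * 2.0 ** e

w = np.array([0.37, -0.02, 0.9, -0.51])
print(pot_quantize(w))    # -> [0.5, -0.015625, 1.0, -0.5]
```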