26 articles found
A Multithreaded CGRA for Convolutional Neural Network Processing (cited by: 1)
1
Authors: Kota Ando, Shinya Takamaeda-Yamazaki, Masayuki Ikebe, Tetsuya Asai, Masato Motomura. Circuits and Systems, 2017, No. 6, pp. 149-170 (22 pages)
Convolutional neural network (CNN) is an essential model to achieve high accuracy in various machine learning applications, such as image recognition and natural language processing. One of the important issues for CNN acceleration with high energy efficiency and processing performance is efficient data reuse by exploiting the inherent data locality. In this paper, we propose a novel CGRA (Coarse Grained Reconfigurable Array) architecture with time-domain multithreading for exploiting input data locality. The multithreading on each processing element enables input data reuse through multiple computation periods. This paper presents the accelerator design and a performance analysis of the proposed architecture. We examine the structure of memory subsystems, as well as the architecture of the computing array, to supply required data with minimal performance overhead. We explore efficient architecture design alternatives based on the characteristics of modern CNN configurations. The evaluation results show that the available bandwidth of the external memory can be utilized efficiently when the output plane is wider (in earlier layers of many CNNs), while the input data locality can be utilized maximally when the number of output channels is larger (in later layers).
Keywords: CNN; convolutional neural network; deep learning; multithreaded architecture; CGRA
Multi-Plate Microbial Monitoring Terminal Based on Raspberry Pi 4B
2
Authors: Qirong Luo, Xichang Cai, Tongyuan Liu. Journal of Electronic Research and Application, 2024, No. 3, pp. 28-33 (6 pages)
We utilized Raspberry Pi 4B to develop a microbial monitoring system to simplify the microbial image-capturing process and facilitate the informatization of microbial observation results. The Raspberry Pi 4B firmware, developed in Python on the Linux platform, achieves sum verification of serial data, file upload based on the TCP protocol, control of the sequence light source and light valve, real-time self-test based on multithreading, and an experiment-oriented file management method. The system demonstrated improved code logic, scheduling, exception handling, and code readability.
Keywords: Raspberry Pi 4B; object-oriented; multithreading; serial port protocol and parsing; TCP
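The "sum verification of serial data" mentioned in this abstract can be sketched as follows. The frame layout (payload followed by a one-byte additive checksum, modulo 256) and the function names are assumptions made for illustration; the abstract does not specify the firmware's actual protocol.

```python
def checksum(payload: bytes) -> int:
    """Return the additive checksum of the payload, reduced modulo 256."""
    return sum(payload) % 256


def build_frame(payload: bytes) -> bytes:
    """Append the one-byte checksum to the payload (hypothetical frame layout)."""
    return payload + bytes([checksum(payload)])


def verify_frame(frame: bytes) -> bool:
    """Check that the last byte equals the checksum of the preceding bytes."""
    if len(frame) < 2:
        return False  # need at least one payload byte plus the checksum byte
    payload, received = frame[:-1], frame[-1]
    return checksum(payload) == received
```

A receiver would typically call `verify_frame` on each frame read from the port and discard or re-request frames that fail the check.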
An Efficient and Flexible Deterministic Framework for Multithreaded Programs (cited by: 1)
3
Authors: 卢凯, 周旭, 王小平, Tom Bergan, 陈沉. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2015, No. 1, pp. 42-56 (15 pages)
Determinism is very useful to multithreaded programs in debugging, testing, etc. Many deterministic approaches have been proposed, such as deterministic multithreading (DMT) and deterministic replay. However, these systems either are inefficient or target a single purpose, which is not flexible. In this paper, we propose an efficient and flexible deterministic framework for multithreaded programs. Our framework implements determinism in two steps: relaxed determinism and strong determinism. Relaxed determinism solves data races efficiently by using a proper weak memory consistency model. After that, we implement strong determinism by solving lock contentions deterministically. Since we can apply different approaches for these two steps independently, our framework provides a spectrum of deterministic choices, including nondeterministic system (fast), weak deterministic system (fast and conditionally deterministic), DMT system, and deterministic replay system. Our evaluation shows that the DMT configuration of this framework could even outperform a state-of-the-art DMT system.
Keywords: determinism; multithreading; framework; flexible
Chip Multithreaded Consistency Model
4
Authors: 李祖松, 郇丹丹, 胡伟武, 唐志敏. Journal of Computer Science & Technology (SCIE, EI, CSCD), 2008, No. 2, pp. 298-304, F0003 (8 pages)
Multithreading is a developing trend in high-performance processors. The memory consistency model is essential to the correctness, performance and complexity of a multithreaded processor. This paper proposes a chip multithreaded consistency model adapted to multithreaded processors. The restriction imposed on memory event ordering by chip multithreaded consistency is presented and formalized. Using the idea of the critical cycle introduced by Wei-Wu Hu, we prove that the proposed chip multithreaded consistency model satisfies the criterion of correct execution of the sequential consistency model. The chip multithreaded consistency model provides a way of achieving higher performance than the sequential consistency model and ensures software compatibility, in that the execution result on a multithreaded processor is the same as the execution result on a uniprocessor. An implementation strategy for the chip multithreaded consistency model in the Godson-2 SMT processor is also proposed. The Godson-2 SMT processor supports the chip multithreaded consistency model correctly through an exception scheme based on the sequential memory access queue of each thread.
Keywords: computer architecture; Godson-2; multithreading; memory consistency model; event ordering
PsmArena:Partitioned Shared Memory for NUMA-Awareness in Multithreaded Scientific Applications
5
Authors: Zhang Yang, Aiqing Zhang, Zeyao Mo. Tsinghua Science and Technology (SCIE, EI, CAS, CSCD), 2021, No. 3, pp. 287-295 (9 pages)
The Distributed Shared Memory (DSM) architecture is widely used in today's computer design to mitigate the ever-widening processing-memory gap, and it inevitably exhibits Non-Uniform Memory Access (NUMA) to shared-memory parallel applications. Failure to adapt to the NUMA effect can significantly downgrade application performance, especially on today's manycore platforms with tens to hundreds of cores. However, traditional approaches such as first-touch and memory policy fall short in false page-sharing, fragmentation, or ease of use. In this paper, we propose a partitioned shared-memory approach that allows multithreaded applications to achieve full NUMA-awareness with only minor code changes, and develop an accompanying NUMA-aware heap manager which eliminates false page-sharing and minimizes fragmentation. Experiments on a 256-core cc-NUMA computing node show that the proposed approach helps applications adapt to NUMA with only minor code changes and improves the performance of typical multithreaded scientific applications by up to 4.3-fold with the increased use of cores.
Keywords: partitioned shared memory; Non-Uniform Memory Access (NUMA); heap manager; multithread; manycore
A Low-Cost and High-Performance Cryptosystem Using Tripling-Oriented Elliptic Curve
6
Authors: Mohammad Alkhatib, Wafa S. Aldalbahy. Intelligent Automation & Soft Computing (SCIE), 2023, No. 8, pp. 1807-1831 (25 pages)
Developing a high-performance public key cryptosystem is crucial for numerous modern security applications. The Elliptic Curve Cryptosystem (ECC) has performance and resource-saving advantages compared to other types of asymmetric ciphers. However, the sequential design implementation for ECC does not satisfy the current applications' performance requirements. Therefore, several factors should be considered to boost the cryptosystem performance, including the coordinate system, the scalar multiplication algorithm, and the elliptic curve form. The tripling-oriented (3DIK) form is implemented in this work due to its minimal computational complexity compared to other elliptic curve forms. This experimental study explores the factors playing an important role in ECC performance to determine the best combination that leads to developing high-speed ECC. The proposed cryptosystem uses parallel software implementation to speed up ECC performance. To our knowledge, previous studies have no similar software implementation for 3DIK ECC. Supported by parallel design, projective coordinates, and a fast scalar multiplication algorithm, the proposed 3DIK ECC improved the speed of the encryption process compared with other counterparts and the usual sequential implementation. The highest performance level for 3DIK ECC was achieved when it was implemented using the Non-Adjacent Form algorithm and homogeneous projection. Compared to the costly hardware implementations, the proposed software implementation is cost effective and can be easily adapted to other environments. In addition, the power consumption of the proposed ECC is analyzed and compared with other known cryptosystems. Thus, the current study presents a detailed overview of the design and implementation of 3DIK ECC.
Keywords: security; cryptography; elliptic curves; software implementation; multithreading
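Research on Improving the Performance of OpenSSL-Based Software (cited by: 1)
7
Authors: 张妍, 许云峰, 张焕生. 《河北科技大学学报》 (CAS), 2007, No. 2, pp. 157-161 (5 pages)
OpenSSL is an open-source software package for developing network security software. Using OpenSSL can shorten the software development cycle and improve the running efficiency and stability of the software. However, most software currently built on OpenSSL does not fully exploit the high efficiency and high stability that the package offers, because its developers know little about the advanced uses of OpenSSL. This paper discusses the advanced uses of OpenSSL from the perspective of improving running efficiency and stability, and proposes two solutions for improving both.
Keywords: SSL; OpenSSL; multithread; thread safety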
Parallelization of a Branch and Bound Algorithm on Multicore Systems (cited by: 1)
8
Authors: Chia-Shin Chung, James Flynn, Janche Sang. Journal of Software Engineering and Applications, 2012, No. 8, pp. 621-629 (9 pages)
The general m-machine permutation flowshop problem with the total flow-time objective is known to be NP-hard for m ≥ 2. The only practical method for finding optimal solutions has been branch-and-bound algorithms. In this paper, we present an improved sequential algorithm which is based on a strict alternation of Generation and Exploration execution modes as well as Depth-First/Best-First hybrid strategies. The experimental results show that the proposed scheme exhibits improved performance compared with the algorithm in [1]. More importantly, our method can be easily extended and implemented with lightweight threads to speed up the execution times. Good speedups can be obtained on shared-memory multicore systems.
Keywords: parallel branch and bound; multithreaded programming; multicore system; permutation flowshop; software reuse
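For readers unfamiliar with the objective named in this abstract, the total flow time of a single permutation can be computed as below. This is a minimal sketch of the standard permutation-flowshop definition (every job visits the machines in the same order, all jobs are released at time zero, and a job's flow time is its completion time on the last machine). The function name and data layout are assumptions, and this is not the branch-and-bound algorithm of the paper.

```python
from typing import Sequence


def total_flow_time(proc: Sequence[Sequence[int]], order: Sequence[int]) -> int:
    """Sum of completion times on the last machine for jobs run in `order`.

    proc[j][k] is the processing time of job j on machine k.
    """
    m = len(proc[0])
    done = [0] * m  # done[k]: completion time of the latest job on machine k
    total = 0
    for j in order:
        for k in range(m):
            # The job can start on machine k once it has left machine k-1
            # (already updated for this job) and machine k is free
            # (done[k] still holds the previous job's completion time).
            ready = done[k - 1] if k > 0 else 0
            done[k] = max(done[k], ready) + proc[j][k]
        total += done[m - 1]
    return total
```

With two jobs whose processing times are `[[2, 3], [1, 2]]`, the order `[1, 0]` gives a smaller total flow time than `[0, 1]`, which is the kind of difference a branch-and-bound search exploits when pruning.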
The Serial Communication Based on Multithreading Technique of Windows (cited by: 2)
9
Authors: Chen Shu-zhen, Shi Bo. Wuhan University Journal of Natural Sciences (CAS), 2000, No. 3, p. 328 (1 page)
This article presents a method for dynamic, real-time communication between a serial port and peripheral equipment, using a multithreading technique based on the basic principles of communication and the multitasking mechanism of the Windows environment. The method effectively resolves the problem of real-time response in serial communication, reduces the data loss rate, and improves system reliability. It is a general and practical method for serial communication.
Keywords: multithreading; serial communication; real-time query
Data Processing Middleware in a High-Powered Neutral Beam Injection Control System (cited by: 1)
10
Authors: 盛鹏, 胡纯栋, 宋士花, 刘智民, 赵远哲, 张小丹, 窦少彬. Plasma Science and Technology (SCIE, EI, CAS, CSCD), 2013, No. 6, pp. 593-598 (6 pages)
A set of data-processing middleware for a high-powered neutral beam injection (NBI) control system is presented in this paper. The middleware, based on TCP/IP and multithreading technologies, focuses mainly on data processing and transmission. It separates data processing and compression from data acquisition and storage. It provides universal transmitting interfaces for different software environments, such as WinCC, LabView and other measurement systems. The experimental data acquired on Windows, QNX and Linux platforms are processed by the middleware and sent to the monitoring applications. There are three middleware deployment models: serial processing, parallel processing and alternate serial processing. By using these models, the middleware solves real-time data-processing problems on heterogeneous acquisition hardware with different operating systems and data applications.
Keywords: neutral beam injection control system; middleware; multithreading
TRSTR: A Fault-Tolerant Microprocessor Architecture Based on SMT (cited by: 1)
11
Authors: YANG Hua, CUI Gang, YANG Xiao-zong. Wuhan University Journal of Natural Sciences (CAS), 2005, No. 1, pp. 51-55 (5 pages)
Based on Simultaneous Multithreading (SMT), we propose a fault-tolerant scheme called Tri-modular Redundantly and Simultaneously Threaded processor with Recovery (TRSTR). TRSTR has the following features. First, we introduce an arbitrator context into the conventional SRT (Simultaneous and Redundantly Threaded) processor, which acts as an arbitrator when results from the other two contexts disagree, and otherwise acts as an ordinary thread, thus making full use of SMT's parallelism. Second, we add a reconfigurable feature to the sphere of replication in SRT, making it more flexible for changing demands and situations. Third, TRSTR has two working modes, Tri-Simultaneous with Voting (TSV) and Dual-Simultaneous with Arbitrator (DSA), which can be switched at will. Finally, in addition to transient-fault coverage, TRSTR has on-line self-checking and self-recovering abilities, so as to shield against some permanent faults and reconfigure itself without stopping the crucial job, improving its reliability and availability.
Keywords: fault-tolerant; high-performance; simultaneous multithreading; architecture
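The arbitration idea in this abstract can be illustrated in software, although the paper describes a hardware mechanism. The following is a minimal sketch, assuming results are plain comparable values; the function name `vote` and the callback-style arbitrator are hypothetical, and the structure (compare two results, consult a third only on disagreement) only loosely mirrors the arbitrator mode described above.

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def vote(a: T, b: T, arbitrate: Callable[[], T]) -> T:
    """Return the agreed result of two redundant computations.

    If a and b agree, the arbitrator is never invoked. Otherwise a third
    result is computed and whichever of a or b matches it is returned.
    """
    if a == b:
        return a
    c = arbitrate()
    if c == a or c == b:
        return c
    raise RuntimeError("all three results disagree; no majority")
```

Deferring the third computation until a mismatch occurs is what leaves the third context free to do ordinary work in the common, fault-free case.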
Design of Control Server Application Software for Neutral Beam Injection System
12
Authors: 施齐林, 胡纯栋, 盛鹏, 宋士花. Plasma Science and Technology (SCIE, EI, CAS, CSCD), 2012, No. 4, pp. 343-346 (4 pages)
For the remote control of a neutral beam injection (NBI) system, a software package, NBIcsw, is developed to run on the control server. It meets the requirements of data transmission and operation control between the NBI measurement and control layer (MCL) and the remote monitoring layer (RML). NBIcsw runs on a Linux system and is developed with the client/server (C/S) model and multithreading technology. Application shows that the software runs efficiently.
Keywords: NBI; OPC; socket; multithreading; C/S
Design of Timing Synchronization Software on EAST-NBI
13
Authors: 赵远哲, 胡纯栋, 盛鹏, 张小丹. Plasma Science and Technology (SCIE, EI, CAS, CSCD), 2013, No. 12, pp. 1237-1240 (4 pages)
To ensure the uniqueness and recognition of data and make it easy to analyze and process the data of all subsystems of the neutral beam injector (NBI), it is required that all subsystems have a unified system time. In this paper, timing synchronization software is presented which draws on several technologies, such as shared memory, multithreading and the TCP protocol. Shared memory helps the server save the information of clients and the system time, and multithreading handles different clients with different threads. The server runs under the Linux operating system, and the client runs under both Linux and Windows. With this design, synchronization of all subsystems can be achieved in less than one second. This accuracy is sufficient for the NBI system, and the reliability of data is thus ensured.
Keywords: EAST; NBI; timing synchronization; shared memory; multithreading; server/client
Key Technology in Telemetry System
14
Authors: Zheng Dong, Wu Zhi-bin, Chen Shu-zhen. Wuhan University Journal of Natural Sciences (EI, CAS), 1999, No. 4, pp. 454-458 (5 pages)
The recent development of telemetry systems is driven by the fast development of computer and network technology. A systematic introduction is provided to digital video and image processing, network communication, and the realization of those techniques on a computer.
Keywords: telemetry; Winsock; multithread; AVICAP
Improved Tomasulo algorithm
15
Authors: 崔光佐, 胡铭曾. Journal of Harbin Institute of Technology (New Series) (EI, CAS), 1999, No. 4, pp. 16-19 (4 pages)
1 EXPLOITING PARALLELISM IN PROGRAM BY MULTITHREAD SUPERSCALAR ARCHITECTURE. In superscalar architecture, instruction level parallelism wi...
Keywords: multithread superscalar architecture; Tomasulo algorithm; dynamic scheduling; instruction level parallelism
A Perfect Knob to Scale Thread Pool on Runtime
16
Authors: Faisal Bahadur, Arif Iqbal Umar, Insaf Ullah, Fahad Algarni, Muhammad Asghar Khan, Samih M. Mostafa. Computers, Materials & Continua (SCIE, EI), 2022, No. 7, pp. 1483-1493 (11 pages)
Scalability is one of the most important nonfunctional requirements of server applications, because it maintains effective performance under large, fluctuating and sometimes unpredictable workloads. In order to achieve scalability, the thread pool system (TPS) has been used extensively as a middleware service in server applications. The size of the thread pool is the most significant factor affecting the overall performance of servers. Determining the optimal size of the thread pool dynamically at runtime is a challenging problem. The most widely used and simplest method to tackle this problem is to keep the size of the thread pool equal to the request rate, i.e., the frequency-oriented thread pool (FOTP). FOTPs are the most widely used TPSs in the industry because of their implementation simplicity, negligible overhead and applicability to any system. However, the frequency-based schemes focus on only one aspect of changes in the load, namely the fluctuations in request rate. The request rate alone is an imperfect knob to scale the thread pool. Thus, this paper presents a workload-profiling-based FOTP that focuses on request size (the service time of a request) in addition to the request rate as a knob to scale the thread pool at runtime, because we argue that the combination of both truly represents the load fluctuation in server-side applications. We evaluated the proposed system against the state-of-the-art TPS of Oracle Corporation (using a client-server-based simulator) and concluded that our system outperformed it in terms of both response time and throughput.
Keywords: scalability; performance; middleware; workload profiling; multithreading; thread pool
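Why request rate alone is an "imperfect knob" can be seen from Little's law, which states that the mean number of requests in service equals the arrival rate multiplied by the mean service time. The sketch below illustrates that relation only; it is not the profiling scheme proposed in the paper, and the function name, parameter names and clamping bounds are assumptions.

```python
import math


def pool_size(request_rate: float, mean_service_time: float,
              min_threads: int = 1, max_threads: int = 256) -> int:
    """Estimate the threads needed as rate (req/s) times service time (s).

    The estimate is rounded up and clamped to [min_threads, max_threads].
    """
    needed = math.ceil(request_rate * mean_service_time)
    return max(min_threads, min(max_threads, needed))
```

At the same rate of 200 requests per second, a mean service time of 0.25 s suggests about 50 threads while 0.5 s suggests about 100, so two workloads with identical request rates can call for very different pool sizes.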
Redundant Multithreading Architecture Overview
17
Authors: YANG Hua, CUI Gang, LIU Hongwei, YANG Xiaozong. Wuhan University Journal of Natural Sciences (CAS), 2006, No. 6, pp. 1793-1796 (4 pages)
To overcome the ever-increasing susceptibility of processors to transient faults, various redundant multithreading (RMT) architectures have been proposed, and RMT is becoming one of the most effective approaches for detecting and recovering from transient faults. This paper surveys a wide range of RMT architectures, from the original AR-SMT (A-stream R-stream Simultaneous MultiThreading) to the most recent SD-SRT (Slack-Decode Simultaneous Redundant Threading), presenting traverse analyses and comparisons among them, and thereby demonstrates their evolution and tendency. Finally, some directions and suggestions are put forward for further RMT research and development.
Keywords: redundant multithreading; processor; reliability
Simultaneous Multithreading Fault Tolerance Processor
18
Authors: DONG Lan, HU Ming-zeng, JI Zhen-zhou, CUI Guang-zuo, TANG Xin-min, HE Feng. Wuhan University Journal of Natural Sciences (EI, CAS), 2005, No. 1, pp. 17-20 (4 pages)
A transient fault detection mechanism is added to the simultaneous multithreading architecture. By exploiting both ILP (Instruction Level Parallelism) and TLP (Thread Level Parallelism), the Simultaneous Multithreading (SMT) Fault Tolerance Processor can be expected to achieve a better tradeoff between performance and hardware cost than traditional fault tolerance processors. Detailed simulations of three SPEC95 benchmarks show that executing two redundant programs on the fault-tolerant microarchitecture takes only 40% to 61% longer than running a single version of the program. The new instruction fetch algorithm enhances performance by 0.4% to 1% for most of the benchmarks we chose at random.
Keywords: simultaneous multithreading; fault tolerance; TLP (Thread Level Parallelism); fetch policy
A Parallelization Research for FY Satellite Rainfall Estimate Day Knock off Product Algorithm
19
Authors: Weixia Lin, Xiangang Zhao, Cunqun Fan, Manyun Lin, Lizi Xie. Atmospheric and Climate Sciences, 2018, No. 2, pp. 248-261 (14 pages)
With the development of satellite remote sensing technology, more and more requirements are put forward on the timeliness and stability of the satellite weather service system. The FY satellite rainfall estimate day knock off product algorithm has a long run time, about 20 minutes, which affects the timeliness of the estimated rainfall product. Research and development of parallel optimization algorithms based on the needs of satellite meteorological services, and their effectiveness in practical applications, are necessary ways to enhance the high-performance and high-availability capabilities of satellite meteorological services. Aiming at this problem, we started the parallel algorithm research based on an analysis of the precipitation estimation algorithm. Firstly, we explained the steps of the rainfall estimate day knock off product algorithm; secondly, we analyzed the computational load of the four main calculation modules; thirdly, a multithreaded parallel algorithm and an MPI parallelization were designed. Finally, the multithreaded parallel algorithm and the MPI parallelization were implemented. Experimental results show that both could greatly improve the overall computational efficiency, and that the MPI parallelization mode has a higher operating efficiency. The performance of parallel processing is closely related to the architecture of the computer. From the perspective of service scheduling and product algorithms, the MPI parallelization approach is adopted to improve service quality.
Keywords: rainfall estimate; parallelization; multithreading; MPI
The BBC News Hunter:A Novel Crawler for BBC News
20
Authors: Mingxin Wang, Ning Wang, Boran Wang, Can Tian, Yanchun Liang, Guozhong Zhao, Xiaosong Han. 《国际计算机前沿大会会议论文集》, 2016, No. 2, pp. 63-64 (2 pages)
In order to distinguish and extract the topic information from other interfering information on the BBC news website for the study of social computing, the BBC News Hunter was proposed in this paper. The whole system consists of six subsystems, named UI, Control, Download, Analysis, Storage and Log. Numerical experiments show that satisfactory results can be obtained from the BBC news website, with acceptable average accuracy and efficiency.
Keywords: BBC; crawler; news; HTML parser; multithread