22 articles found
Design and Implementation of Virtual Experiments for Computer Architecture Based on Simulators
1
Authors: ZHANG Chen-xi, LIU Yi, LI Jiang-feng 《计算机教育》 2012, No. 10, pp. 28-30 (3 pages)
In China, many students are unable to do experiments in computer architecture courses, even though such experiments are very important in helping them understand many key points. The reason is that the required hardware costs too much. Besides, it is very difficult to carry out research studies in hardware experiments. In our course, we adopted an alternative approach: we use software simulators and have designed a set of virtual experiments based on them, which are described in detail in this paper.
Keywords: computer architecture course; experiment; simulator; design
CALL FOR PAPERS Special Section of JCST on Computer Architecture and Systems for Big Data (Cited by: 1)
2
《Journal of Computer Science & Technology》 SCIE EI CSCD 2014, No. 3, p. F0003 (1 page)
Introduction: Research on computer architecture and systems is typically driven by technology and applications. Big data has emerged as an important application domain that has had a huge impact on scientific research, business, and society. Big data is known for its large volume, high velocity, and variety of formats. The collection, storage, retrieval, processing, and visualization of big data pose many challenges to computer architecture and systems. This special section is an effort to encourage and promote research that addresses the big data challenges from the computer architecture and systems perspectives.
Keywords: call for papers; special section of JCST on computer architecture and systems for big data
New multi-DSP parallel computing architecture for real-time image processing (Cited by: 4)
3
Authors: Hu Junhong, Zhang Tianxu, Jiang Haoyang 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2006, No. 4, pp. 883-889 (7 pages)
The flexibility of traditional image processing systems is limited because those systems are designed for specific applications. In this paper, a new TMS320C64x-based multi-DSP parallel computing architecture is presented. It has many promising characteristics such as powerful computing capability, broad I/O bandwidth, topology flexibility, and expansibility. The parallel system performance is evaluated by practical experiment.
Keywords: parallel computing; image processing; real-time; computer architecture
Hardware Architecture for RSA Cryptography Based on Residue Number System
4
Authors: 郭炜, 刘亚灵, 白松辉, 魏继增, 孙达志 《Transactions of Tianjin University》 EI CAS 2012, No. 4, pp. 237-242 (6 pages)
A parallel architecture for efficient hardware implementation of Rivest-Shamir-Adleman (RSA) cryptography is proposed. A residue number system (RNS) is introduced to realize high parallelism: all the elements under the same base are independent of each other and can be computed in parallel. Moreover, a simple and fast base transformation is used to achieve an RNS Montgomery modular multiplication algorithm, which facilitates hardware implementation. Based on the transport triggered architecture (TTA), the proposed architecture is designed to evaluate the performance and feasibility of the algorithm. With these optimizations, a decryption rate of 106 kbps can be achieved for 1024-bit RSA at a frequency of 100 MHz.
Keywords: residue number system; RSA cryptography; Montgomery algorithm; computer architecture; parallel algorithm
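The abstract above relies on the fact that, in a residue number system, each residue channel can be processed independently. The following is a minimal Python sketch of that property only, not the paper's RNS Montgomery algorithm or its base transformation. The moduli and the helper names (`to_rns`, `rns_mul`, `from_rns`) are assumptions chosen for illustration.

```python
from math import prod

# Illustrative, pairwise-coprime moduli (an assumption, not the paper's base).
MODULI = (13, 17, 19, 23)  # dynamic range M = 13 * 17 * 19 * 23 = 96577


def to_rns(x, moduli=MODULI):
    """Represent integer x by its residue in every channel."""
    return tuple(x % m for m in moduli)


def rns_mul(a, b, moduli=MODULI):
    """Multiply channel by channel; no channel depends on another,
    so hardware could evaluate all channels in parallel."""
    return tuple((x * y) % m for x, y, m in zip(a, b, moduli))


def from_rns(residues, moduli=MODULI):
    """Recover the integer modulo M with the Chinese remainder theorem."""
    big_m = prod(moduli)
    total = 0
    for r, m in zip(residues, moduli):
        partial = big_m // m
        # pow(partial, -1, m) is the modular inverse (Python 3.8+).
        total += r * partial * pow(partial, -1, m)
    return total % big_m


product = from_rns(rns_mul(to_rns(123), to_rns(456)))
print(product)  # equals 123 * 456 because the result is below M
```

The result is only exact while the true product stays below the product of the moduli; a real RSA datapath needs much larger bases plus modular reduction, which is what the Montgomery step in the paper addresses.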
Fault-Tolerant Design Techniques in a CMP Architecture
5
Authors: YAO Wen-bin, WANG Dong-sheng 《Wuhan University Journal of Natural Sciences》 CAS 2005, No. 1, pp. 5-8 (4 pages)
A single-chip multiprocessor (CMP) combined with fault-tolerant (FT) techniques offers an ideal architecture for achieving high availability while sustaining high computing performance. The FT design of a single-chip multiprocessor is described, including techniques ranging from hardware redundancy to software support and firmware strategy. The design aims at masking the influence of errors and automatically correcting the system state.
Keywords: computer architecture; fault-tolerant design; single-chip multiprocessor
The Implementation of Ray Tracing Algorithm with OpenMP Parallelization
6
Authors: Noor Alnasser, Raghad Alabssi, Batool Faran, Latifah Alessa, Naya Nagy 《Journal of Computer and Communications》 2024, No. 1, pp. 120-130 (11 pages)
Ray tracing is a computer graphics method that renders images realistically. As the name suggests, this technique primarily traces the path of light rays interacting with objects in a scene [1], permitting the calculation of lighting and reflection effects [2]. Because ray tracing is a time-consuming process, the need for parallelization arises. One downside of parallelization is the existence of race conditions. In this work, we explore and experiment with different well-known solutions to these race conditions. After the introduction and background sections, a brief overview of the topic is followed by a detailed account of how race conditions may occur in the ray tracing algorithm. In the methods and results sections, we use OpenMP to parallelize the ray tracing algorithm with the compiler directives critical, atomic, and first-private. We conclude that critical and atomic are not efficient solutions for producing a good-quality picture, whereas first-private succeeded in producing a high-quality picture.
Keywords: parallelization; ray tracing; parallel computer architecture; OpenMP
ArchSim: A System-Level Parallel Simulation Platform for the Architecture Design of High Performance Computer (Cited by: 4)
7
Authors: 黄永勤, 李宏亮, 谢向辉, 钱磊, 郝子宇, 过锋, 张昆 《Journal of Computer Science & Technology》 SCIE EI CSCD 2009, No. 5, pp. 901-912 (12 pages)
A high performance computer (HPC) is a huge, complex system whose architecture design faces increasing difficulties and risks. Traditional methods, such as theoretical analysis, component-level simulation and sequential simulation, are not applicable to system-level simulations of HPC systems. Even parallel simulation on large-scale parallel machines faces many difficulties in scalability, reliability, generality, and efficiency. According to the current needs of HPC architecture design, this paper proposes a system-level parallel simulation platform: ArchSim. We first introduce the architecture of the ArchSim simulation platform, which is composed of a global server (GS), local server agents (LSA) and entities. Secondly, we emphasize some key techniques of ArchSim, including the synchronization protocol, the communication mechanism and the distributed checkpointing/restart mechanism. We then carry out a synthesized test of some main performance indices of ArchSim with the phold benchmark and analyze the extra overhead generated by ArchSim. Finally, based on ArchSim, we construct a parallel event-driven interconnection network simulator and a system-level simulator for a small-scale HPC system with 256 processors. The results of the performance test and the HPC system simulations demonstrate that ArchSim can achieve a high speedup ratio and high scalability on a parallel host machine and can support system-level simulations for the architecture design of HPC systems.
Keywords: high performance computer architecture; system-level parallel simulation; synchronization protocol; message communication; distributed checkpointing/restart
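The abstract above mentions the phold benchmark without describing it. As a rough orientation, the sketch below shows the usual PHOLD workload shape in a sequential, single-queue form: a fixed population of events circulates among logical processes (LPs), and every processed event schedules exactly one new event at a later time for a randomly chosen LP. This is not ArchSim code; the function name `run_phold`, the parameters, and the exponential delay distribution are assumptions for illustration.

```python
import heapq
import random


def run_phold(num_lps=4, events_per_lp=2, end_time=50.0, seed=0):
    """Sequential sketch of a PHOLD-style workload (hypothetical helper)."""
    rng = random.Random(seed)
    queue = []  # min-heap of (timestamp, sequence number, target LP)
    seq = 0     # tie-breaker so equal timestamps pop in insertion order

    # Seed every LP with a few initial events.
    for lp in range(num_lps):
        for _ in range(events_per_lp):
            heapq.heappush(queue, (rng.expovariate(1.0), seq, lp))
            seq += 1

    processed = [0] * num_lps
    now = 0.0
    while queue:
        timestamp, _, lp = heapq.heappop(queue)
        if timestamp > end_time:
            break
        now = timestamp
        processed[lp] += 1
        # Each event spawns one successor for a random LP in the future,
        # so the event population stays constant.
        target = rng.randrange(num_lps)
        heapq.heappush(queue, (now + rng.expovariate(1.0), seq, target))
        seq += 1
    return processed, now


counts, final_time = run_phold()
print(sum(counts), final_time)
```

A parallel simulator would partition the LPs across hosts and replace the single heap with a synchronization protocol; the sequential loop here only fixes the semantics that such a protocol has to preserve.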
IDSS: Designing to Extend the Cognitive Limits (Cited by: 1)
8
Author: Feng, Shan 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 1993, No. 1, pp. 33-44 (12 pages)
The paper presents the conceptual and operational basis for creating an IDSS, based on our recent research experience. In this paper, an intelligent decision support system (IDSS) is defined as any interactive system that is specially designed to improve the decision making of its user by extending the user's cognitive decision making abilities. This view of a man-machine joint cognitive system stresses the need to use computational technology to aid the user in the decision making process, while the human's role is to achieve the total system's objectives. The paper outlines the design procedure in successive steps. First, the decision maker's cognitive needs for decision support are identified. Second, the computationally realizable support functions that could be provided by the IDSS are defined. Then, the specific techniques that would best fill the decision needs are discussed. Finally, the modern computational technology infrastructure for system implementation is emphasized.
Keywords: Artificial intelligence; Cognitive systems; Computational methods; Computer architecture; Decision theory; Information science; Information theory; Interactive computer systems; Knowledge based systems; Man machine systems; Systems engineering
Research of Crossbar Switch of High Performance Network of Signal Processing System (Cited by: 1)
9
Authors: 何宾, 韩月秋 《Journal of Beijing Institute of Technology》 EI CAS 2006, No. 1, pp. 85-90 (6 pages)
A new type of embedded signal processing system based on a packet switched network is achieved. According to the application field and the characteristics of the signal processing system, the RapidIO protocol is used to solve the high-speed interconnection of multiple digital signal processors (DSPs). Based on this protocol, a crossbar switch module used to interconnect the DSPs in the system is introduced. A routing strategy and a set of flow control and error control rules, which adapt to different RapidIO network topologies, are also introduced. Crossbar switch performance is analyzed in detail by the probability module. The study of crossbar switch techniques and the analysis of system performance are significant for building general signal processing systems.
Keywords: RapidIO protocol; crossbar switch; signal processing system; computer architecture
CORBA-Based Discrete Event Simulation System
10
Authors: Luo, J.; Zheng, S.; Zhong, L.; Duan, F. 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 2001, No. 3, pp. 16-19 (4 pages)
The CORBA technique integrates the object-oriented concept with distributed computing. It can make applications within distributed heterogeneous environments reusable, portable and interoperable. In this paper, the distributed simulation objects (DSO) are identified and the synchronization mechanism among DSO is discussed; the architecture of CORBA-based discrete event simulation systems is then presented and the interface of the DSO is defined.
Keywords: Computational methods; Computer architecture; Computer simulation; Computer software portability; Computer software reusability; Interfaces (computer); Interoperability; Object oriented programming; Synchronization
Study on the System Design of Multiple Expert Systems Integrated Decision Support System
11
Author: Wang, Zongjun 《Journal of Systems Engineering and Electronics》 SCIE EI CSCD 1993, No. 1, pp. 74-81 (8 pages)
There has been increasing interest in integrating decision support systems (DSS) and expert systems (ES) to provide decision makers with a more accessible, productive and domain-independent information and computing environment. This paper aims to design a multiple expert systems integrated decision support system (MESIDSS) to enhance decision makers' ability in more complex cases. The basic framework, the management system for multiple ESs, and the functions of MESIDSS are presented. The applications of MESIDSS in large-scale decision making processes are discussed from the following aspects: problem decomposition, dynamic combination of multiple ESs, linking of multiple bases, and decision coordination. Finally, a summary and some ideas for the future are presented.
Keywords: Computational methods; Computer architecture; Database systems; Expert systems; Information management; Knowledge based systems; Large scale systems; Logic design; Systems analysis; User interfaces
Component-Based Software Reuse on the World Wide Web
12
Authors: Sang, Da-yong; Wang, Ying 《Wuhan University Journal of Natural Sciences》 EI CAS 2000, No. 1, pp. 31-34 (4 pages)
Component-based software reuse (CBSR) has been widely used in software development practice and has an even brighter future with the rapid expansion of the Internet, because the World Wide Web (WWW) makes large-scale component resources from different vendors available to software developers. In this paper, an abstract component model suitable for representing components on the WWW is proposed, which plays an important role in achieving interoperability both among components and among reusable component libraries (RCLs). Some necessary changes to many aspects of component management brought about by the WWW are also discussed, such as the classification of components and the corresponding search methods, and the certification of components.
Keywords: computer architecture; interoperability; software engineering; World Wide Web
Concept Learning in Neuromorphic Vision Systems: What Can We Learn from Insects?
13
Authors: Fredrik Sandin, Asad I. Khan, Adrian G. Dyer, Anang Hudaya M. Amin, Giacomo Indiveri, Elisabetta Chicca, Evgeny Osipov 《Journal of Software Engineering and Applications》 2014, No. 5, pp. 387-395 (9 pages)
Vision systems that enable collision avoidance, localization and navigation in complex and uncertain environments are common in biology, but are extremely challenging to mimic in artificial electronic systems, in particular when size and power limitations apply. The development of neuromorphic electronic systems implementing models of biological sensory-motor systems in silicon is one promising approach to addressing these challenges. Concept learning is a central part of animal cognition that enables appropriate motor response in novel situations by generalization of former experience, possibly from a few examples. These aspects make concept learning a challenging and important problem. Learning methods in computer vision are typically inspired by mammals, but recent studies of insects motivate an interesting complementary research direction. There are several remarkable results showing that honeybees can learn to master abstract concepts, providing a road map for future work to allow direct comparisons between bio-inspired computing architectures and information processing in miniaturized “real” brains. Considering that the brain of a bee has less than 0.01% as many neurons as a human brain, the task of inferring a minimal architecture and mechanism of concept learning from studies of bees appears well motivated. The relatively low complexity of insect sensory-motor systems makes them an interesting model for the further development of bio-inspired computing architectures, in particular for resource-constrained applications such as miniature robots, wireless sensors and handheld or wearable devices.
Work in that direction is a natural step towards understanding and making use of prototype circuits for concept learning, which eventually may also help us to understand the more complex learning circuits of the human brain. By adapting concept learning mechanisms to a polymorphic computing framework we could possibly create large-scale decentralized computer vision systems, for example in the form of wireless sensor networks.
Keywords: concept learning; computer vision; computer architecture; neuromorphic engineering; insect
Skyway: Accelerate Graph Applications with a Dual-Path Architecture and Fine-Grained Data Management
14
Authors: Mo Zou, Ming-Zhe Zhang, Ru-Jia Wang, Xian-He Sun, Xiao-Chun Ye, Dong-Rui Fan, Zhi-Min Tang 《Journal of Computer Science & Technology》 SCIE EI CSCD 2024, No. 4, pp. 871-894 (24 pages)
Graph processing is a vital component of many AI and big data applications. However, due to its poor locality and complex data access patterns, graph processing is also a known performance killer of AI and big data applications. In this work, we propose to enhance graph processing applications by leveraging fine-grained memory access patterns with a dual-path architecture on top of existing software-based graph optimizations. We first identify that memory accesses to the offset, edge, and state arrays have distinct locality and impact on performance. We then introduce the Skyway architecture, which consists of two primary components: 1) a dedicated direct data path between the core and memory to transfer state array elements efficiently, and 2) data-type-aware, fine-grained memory-side row buffer hardware for both the newly designed direct data path and the regular memory hierarchy data path. The proposed Skyway architecture is able to improve overall performance by reducing memory access interference and improving data access efficiency with minimal overhead. We evaluate Skyway on a set of diverse algorithms using large real-world graphs. On a simulated four-core system, Skyway improves performance by 23% on average over the best-performing graph-specialized hardware optimizations.
Keywords: graph application; computer architecture; memory hierarchy
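The abstract above distinguishes accesses to the offset, edge, and state arrays. The sketch below assumes these correspond to a compressed sparse row (CSR) graph layout, which the abstract does not state explicitly, and shows why the three arrays behave differently: the offset and edge arrays are read in order, while the state array is indexed by whatever neighbor IDs the edges contain. The helper names `build_csr` and `push_step` and the tiny example graph are illustrative assumptions, not part of Skyway.

```python
def build_csr(num_vertices, edges):
    """Build offset and edge (neighbor) arrays from (src, dst) pairs."""
    offsets = [0] * (num_vertices + 1)
    for src, _ in edges:
        offsets[src + 1] += 1
    for v in range(num_vertices):
        offsets[v + 1] += offsets[v]  # prefix sum gives each row's start
    neighbors = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot in each vertex's row
    for src, dst in edges:
        neighbors[cursor[src]] = dst
        cursor[src] += 1
    return offsets, neighbors


def push_step(offsets, neighbors, state):
    """One push-style iteration: every vertex adds its state to each
    out-neighbor. Also records the order of state-array writes."""
    new_state = [0] * len(state)
    access_order = []
    for v in range(len(state)):                     # offsets: sequential
        for i in range(offsets[v], offsets[v + 1]):  # edges: sequential
            dst = neighbors[i]
            new_state[dst] += state[v]               # state: data-dependent
            access_order.append(dst)
    return new_state, access_order


edges = [(0, 2), (0, 3), (1, 0), (2, 3), (3, 1)]
offsets, neighbors = build_csr(4, edges)
new_state, access_order = push_step(offsets, neighbors, [1, 1, 1, 1])
print(offsets, neighbors, new_state, access_order)
```

On a graph this small the irregularity is barely visible, but the recorded access order already jumps around the state array while the edge index `i` only ever increases, which is the locality difference the abstract refers to.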
CASA: A New IFU Architecture for Power-Efficient Instruction Cache and TLB Designs
15
Authors: 孙含欣, 杨鲲鹏, 赵雨来, 佟冬, 程旭 《Journal of Computer Science & Technology》 SCIE EI CSCD 2008, No. 1, pp. 141-153 (13 pages)
The instruction fetch unit (IFU) usually dissipates a considerable portion of total chip power. In traditional IFU architectures, as soon as the fetch address is generated, it needs to be sent to the instruction cache and TLB arrays for instruction fetch. Since only limited work can be done by the power-saving logic after fetch address generation and before the instruction fetch, previous power-saving approaches usually suffer from unnecessary restrictions imposed by traditional IFU architectures. In this paper, we present CASA, a new power-aware IFU architecture, which effectively reduces these unnecessary restrictions and provides sufficient time and information for the power-saving logic of both the instruction cache and the TLB. By analyzing, recording, and utilizing key information about the dynamic instruction flow early in the front-end pipeline, CASA brings the opportunity to maximize power efficiency and minimize performance overhead. Compared to the baseline configuration, the leakage and dynamic power of the instruction cache are reduced by 89.7% and 64.1% respectively, and the dynamic power of the instruction TLB is reduced by 90.2%. Meanwhile, the performance degradation in the worst case is only 0.63%. Compared to previous state-of-the-art power-saving approaches, the CASA-based approach saves IFU power more effectively, incurs less performance overhead and achieves better scalability. CASA may stimulate further work on architectural solutions for power-efficient IFU designs.
Keywords: computer architecture; instruction cache; instruction TLB; instruction fetch unit; power-efficient design; dynamic voltage scaling
Research Progress of UniCore CPUs and PKUnity SoCs (Cited by: 5)
16
Authors: 程旭, 王箫音, 陆俊林, 易江芳, 佟冬, 管雪涛, 刘锋, 刘先华, 杨春, 冯毅 《Journal of Computer Science & Technology》 SCIE EI CSCD 2010, No. 2, pp. 200-213 (14 pages)
The CPU and the System-on-Chip (SoC) are two key technologies of the IT industry. Over ten years of research, we have defined the UniCore instruction set architecture and designed the UniCore CPU and the PKUnity SoC family. This cross-disciplinary practice has also fostered many innovations in microprocessor architecture, optimizing compilers, low power design, functional verification, physical design, and so on. In the meantime, we have put technology transfer on the list of our top priorities. This effort has led to several marketable products, such as ultra mobile personal computers, secure micro-workstations and 3C-converged consumer electronics. The development of the next generation of products, the 64-bit multi-core CPU and SoC, is also underway. They will find applications in secure and adaptable computers for mobile and desktop use, as well as in personal digital multimedia devices. Consistent with our philosophy and long-term plan, and by leveraging cutting-edge process technology, we will continue to innovate in CPUs and SoCs and strengthen our commitment to technology transfer.
Keywords: computer architecture; UniCore; microprocessor; PKUnity; SoC
Performance characterization of illumination algorithms for reconfigurable graphics processor (Cited by: 2)
17
Authors: Deng Junyong, Liu Yang, Xie Xiaoyan 《The Journal of China Universities of Posts and Telecommunications》 EI CSCD 2019, No. 5, pp. 60-71 (12 pages)
Graphics processing is an increasingly important application domain, driven by the demand for real-time rendering, video streaming, virtual reality, and so on. Illumination is a critical module in graphics rendering and is typically compute-bound, memory-bound, or power-bound in different application cases. It is crucial to decide how to schedule illumination algorithms with different features according to practical requirements in reconfigurable graphics hardware. This paper analyzes the performance characteristics of four mainstream lighting algorithms (the Lambert, Phong, Blinn-Phong, and Cook-Torrance illumination algorithms) using hardware performance counters on the x86 processor platform KabyLake (KBL). Data movement, computation, power consumption, and memory accesses are evaluated over a range of application scenarios. Further, the pros and cons of these specific algorithms are obtained by analyzing their system-level behavior. The relationship between performance/energy and the evaluated metrics is analyzed through Pearson correlation coefficient (PCC) analysis. Based on these performance characterization data, this paper presents some reconfiguration suggestions for reconfigurable graphics processors.
Keywords: performance characterization; illumination algorithms; reconfigurable graphics processor; correlation analysis; computer architecture
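To make the difference between three of the lighting models named in the abstract above concrete, here is a small Python sketch of their textbook per-point terms: the Lambert diffuse factor and the Phong and Blinn-Phong specular factors. It is not the code characterized in the paper, it omits Cook-Torrance, and the function names and the `shininess` parameter are illustrative assumptions. All input vectors are assumed to be unit length and to point away from the surface.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    length = math.sqrt(dot(v, v))
    return tuple(c / length for c in v)


def lambert(n, l):
    """Diffuse factor: cosine of the angle between normal and light."""
    return max(0.0, dot(n, l))


def phong_specular(n, l, v, shininess):
    """Phong: compare the mirrored light direction with the view vector."""
    n_dot_l = dot(n, l)
    if n_dot_l <= 0.0:
        return 0.0  # light is behind the surface
    r = tuple(2.0 * n_dot_l * nc - lc for nc, lc in zip(n, l))
    return max(0.0, dot(r, v)) ** shininess


def blinn_phong_specular(n, l, v, shininess):
    """Blinn-Phong: compare the half vector between l and v with the normal.
    Assumes l is not exactly opposite to v (half vector would be zero)."""
    if dot(n, l) <= 0.0:
        return 0.0
    h = normalize(tuple(lc + vc for lc, vc in zip(l, v)))
    return max(0.0, dot(n, h)) ** shininess


normal = (0.0, 0.0, 1.0)
light = normalize((1.0, 0.0, 1.0))
view = normalize((-1.0, 0.0, 1.0))  # mirror direction of the light
print(lambert(normal, light),
      phong_specular(normal, light, view, 32),
      blinn_phong_specular(normal, light, view, 32))
```

Phong needs a reflection vector per light, while Blinn-Phong needs a normalization of the half vector; differences in arithmetic mix of this kind are what a performance characterization such as the one in the paper would measure.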
Improvements to the Control Techniques of Sequential Inference Machines: from Instructions to Hardware Organization
18
Authors: 邢汉承, 李春林, 邢东生 《Journal of Computer Science & Technology》 SCIE EI CSCD 1991, No. 1, pp. 66-73 (8 pages)
The nondeterminism of PROLOG execution requires that a block of control information, or choice point, be stored for each procedure call when there are other candidate clauses to be used. When the currently selected clause fails, the bindings made by the clause must be undone, the stored choice point is reactivated, and another of the candidate clauses is chosen to run. Storing and reactivating choice points and undoing bindings account for the great overhead required to control PROLOG execution, which is quite different from conventional programs. This paper focuses on the techniques used in the Sequential PROLOG Engine (SPE) to reduce the overhead of control operations. The control instructions of SPE store no more choice points than necessary. Its architecture takes the approach of analysing the potential parallelism in the control operations and exploiting a fraction of it, for cost-effectiveness reasons. The results of executing two sample programs on SPE, in the form of hand timings, are presented and favor the approach.
Keywords: Artificial Intelligence; Automata Theory; Sequential Machines; Computer Architecture; Computer Programming Languages; PROLOG; Data Storage, Digital
Chip Multithreaded Consistency Model
19
Authors: 李祖松, 郇丹丹, 胡伟武, 唐志敏 《Journal of Computer Science & Technology》 SCIE EI CSCD 2008, No. 2, pp. 298-304, F0003 (8 pages)
Multithreading is the development trend of high performance processors. The memory consistency model is essential to the correctness, performance and complexity of a multithreaded processor. A chip multithreaded consistency model adapted to multithreaded processors is proposed in this paper. The restrictions imposed on memory event ordering by chip multithreaded consistency are presented and formalized. Using the idea of the critical cycle built by Wei-Wu Hu, we prove that the proposed chip multithreaded consistency model satisfies the criterion of correct execution of the sequential consistency model. The chip multithreaded consistency model provides a way of achieving high performance compared with the sequential consistency model and ensures software compatibility: the execution result on a multithreaded processor is the same as the execution result on a uniprocessor. The implementation strategy of the chip multithreaded consistency model in the Godson-2 SMT processor is also proposed. The Godson-2 SMT processor supports the chip multithreaded consistency model correctly through an exception scheme based on the sequential memory access queue of each thread.
Keywords: computer architecture; Godson-2; multithreading; memory consistency model; event ordering
Design and Implementation of a Heterogeneous Distributed Database System
20
Authors: 金志权, 柳诚飞, 孙钟秀, 周晓方, 陈佩佩, 顾建明 《Journal of Computer Science & Technology》 SCIE EI CSCD 1990, No. 4, pp. 363-373 (11 pages)
This paper introduces a heterogeneous distributed database system called the LSZ system, where LSZ is an abbreviation of Li Shizhen, an ancient Chinese medical scientist. The LSZ system adopts a cluster as each distributed database node (or site). Each cluster consists of one or several microcomputers and one server. The paper describes the system's basic architecture and the prototype implementation, which includes query processing and optimization, the transaction manager and data language translation. The system provides a uniform retrieval and update user interface through the global relational data language GRDL.
Keywords: Computer Architecture; Database Systems; Relational; Information Retrieval Systems; Optimization