期刊文献+
共找到951篇文章
< 1 2 48 >
每页显示 20 50 100
Architecture-level performance/power tradeoff in network processor design
1
作者 陈红松 季振洲 胡铭曾 《Journal of Harbin Institute of Technology(New Series)》 EI CAS 2007年第1期45-48,共4页
Network processors are used in the core node of network to flexibly process packet streams. With the increase of performance, the power of network processor increases fast, and power and cooling become a bottleneck. A... Network processors are used in the core node of network to flexibly process packet streams. With the increase of performance, the power of network processor increases fast, and power and cooling become a bottleneck. Architecture-level power conscious design must go beyond low-level circuit design. Architectural power and performance tradeoff should be considered at the same time. Simulation is an efficient method to design modem network processor before making chip. In order to achieve the tradeoff between performance and power, the processor simulator is used to design the architecture of network processor. Using Netbeneh, Commubench benchmark and processor simulator-SimpleScalar, the performance and power of network processor are quantitatively evaluated. New performance tradeoff evaluation metric is proposed to analyze the architecture of network processor. Based on the high performance lnteI IXP 2800 Network processor eonfignration, optimized instruction fetch width and speed ,instruction issue width, instruction window size are analyzed and selected. Simulation resuits show that the tradeoff design method makes the usage of network processor more effectively. The optimal key parameters of network processor are important in architecture-level design. It is meaningful for the next generation network processor design. 展开更多
关键词 network processor design performance/power simulation tradeoff evaluation optimization
下载PDF
Reconfigurable Communication Processor: A New Approach for Network Processor
2
作者 孙华 陈青山 张文渊 《Journal of Shanghai Jiaotong university(Science)》 EI 2003年第1期43-47,共5页
As the traditional RISC+ASIC/ASSP approach for network processor design can not meet the today’s requirements, this paper described an alternate approach, Reconfigurable Processing Architecture, to boost the performa... As the traditional RISC+ASIC/ASSP approach for network processor design can not meet the today’s requirements, this paper described an alternate approach, Reconfigurable Processing Architecture, to boost the performance to ASIC level while reserve the programmability of the traditional RISC based system. This paper covers both the hardware architecture and the software development environment architecture. 展开更多
关键词 network processor reconfigurable processor run time reconfiguration field programmable gate array (FPGA) raduced instruction set circuit (RISC) application specific integrated circuit(ASIC)
下载PDF
SDN-Based Switch Implementation on Network Processors 被引量:1
3
作者 Yunchun Li Guodong Wang 《Communications and Network》 2013年第3期434-437,共4页
Virtualization is the key technology of cloud computing. Network virtualization plays an important role in this field. Its performance is very relevant to network virtualizing. Nowadays its implementations are mainly ... Virtualization is the key technology of cloud computing. Network virtualization plays an important role in this field. Its performance is very relevant to network virtualizing. Nowadays its implementations are mainly based on the idea of Software Define Network (SDN). Open vSwitch is a sort of software virtual switch, which conforms to the OpenFlow protocol standard. It is basically deployed in the Linux kernel hypervisor. This leads to its performance relatively poor because of the limited system resource. In turn, the packet process throughput is very low.In this paper, we present a Cavium-based Open vSwitch implementation. The Cavium platform features with multi cores and couples of hard ac-celerators. It supports zero-copy of packets and handles packet more quickly. We also carry some experiments on the platform. It indicates that we can use it in the enterprise network or campus network as convergence layer and core layer device. 展开更多
关键词 SDN OPEN vSwitch network processorS OpenFlow
下载PDF
An Improved Cache Mechanism for a Cache-Based Network Processor
4
作者 Hayato Yamaki Hiroaki Nishi 《通讯和计算机(中英文版)》 2013年第3期277-286,共10页
关键词 高速缓存机制 网络处理器 网络流量 上下文 网络内容 IP电话 仿真结果 数据包
下载PDF
High-Level Portable Programming Language for Optimized Memory Use of Network Processors
5
作者 Yasusi Kanada 《Communications and Network》 2015年第1期55-69,共15页
Network processors (NPs) are widely used for programmable and high-performance networks;however, the programs for NPs are less portable, the number of NP program developers is small, and the development cost is high. ... Network processors (NPs) are widely used for programmable and high-performance networks;however, the programs for NPs are less portable, the number of NP program developers is small, and the development cost is high. To solve these problems, this paper proposes an open, high-level, and portable programming language called “Phonepl”, which is independent from vendor-specific proprietary hardware and software but can be translated into an NP program with high performance especially in the memory use. A common NP hardware feature is that a whole packet is stored in DRAM, but the header is cached in SRAM. Phonepl has a hardware-independent abstraction of this feature so that it allows programmers mostly unconscious of this hardware feature. To implement the abstraction, four representations of packet data type that cover all the packet operations (including substring, concatenation, input, and output) are introduced. Phonepl have been implemented on Octeon NPs used in plug-ins for a network-virtualization environment called the VNode Infrastructure, and several packet-handling programs were evaluated. As for the evaluation result, the conversion throughput is close to the wire rate, i.e., 10 Gbps, and no packet loss (by cache miss) occurs when the packet size is 256 bytes or larger. 展开更多
关键词 network processorS PORTABILITY HIGH-LEVEL Language Hardware Independence MEMORY Usage DRAM SRAM network Virtualization
下载PDF
Efficiency of Cache Mechanism for Network Processors 被引量:2
6
作者 徐波 常剑 +2 位作者 黄诗萌 薛一波 李军 《Tsinghua Science and Technology》 SCIE EI CAS 2009年第5期575-585,共11页
With the explosion of network bandwidth and the ever-changing requirements for diverse network-based applications,the traditional processing architectures,i.e.,general purpose processor(GPP) and application specific... With the explosion of network bandwidth and the ever-changing requirements for diverse network-based applications,the traditional processing architectures,i.e.,general purpose processor(GPP) and application specific integrated circuits(ASIC) cannot provide sufficient flexibility and high performance at the same time.Thus,the network processor(NP) has emerged as an alternative to meet these dual demands for today's network processing.The NP combines embedded multi-threaded cores with a rich memory hierarchy that can adapt to different networking circumstances when customized by the application developers.In today's NP architectures,multithreading prevails over cache mechanism,which has achieved great success in GPP to hide memory access latencies.This paper focuses on the efficiency of the cache mechanism in an NP.Theoretical timing models of packet processing are established for evaluating cache efficiency and experiments are performed based on real-life network backbone traces.Testing results show that an improvement of nearly 70% can be gained in throughput with assistance from the cache mechanism.Accordingly,the cache mechanism is still efficient and irreplaceable in network processing,despite the existing of multithreading. 展开更多
关键词 CACHE network processor efficiency evaluation
原文传递
Building RNC in All-IP Wireless Networks using Network Processors
7
作者 CHENGSheng NIXian-le ZHUXin-ning DINGWei 《The Journal of China Universities of Posts and Telecommunications》 EI CSCD 2004年第2期86-91,共6页
This paper describes a solution to build network-processor-based Radio Network Controller (RNC) in all-IP wireless networks, it includes the structure of the 3rd Generation (3G) wireless networks and the role of netw... This paper describes a solution to build network-processor-based Radio Network Controller (RNC) in all-IP wireless networks, it includes the structure of the 3rd Generation (3G) wireless networks and the role of network nodes, such as Base Station (BS), RNC, and Packet-Switched Core Networks (PSCN). The architecture of IXP2800 network processor; the detailed implementation of the solution on IXP2800-based RNC are also covered. This solution can provide scalable IP forward features and it will be widely used in 3G RNCs. 展开更多
关键词 G 3GPP All-IP wireless networks RNC IP network processor IXP2800
原文传递
Architecture-Aware Session Lookup Design for Inline Deep Inspection on Network Processors
8
作者 徐波 何飞 +1 位作者 薛一波 李军 《Tsinghua Science and Technology》 SCIE EI CAS 2009年第1期19-28,共10页
Today's firewalls and security gateways are required to not only block unauthorized accesses by authenticating packet headers, but also inspect flow payloads against malicious intrusions. Deep inspection emerges as a... Today's firewalls and security gateways are required to not only block unauthorized accesses by authenticating packet headers, but also inspect flow payloads against malicious intrusions. Deep inspection emerges as a seamless integration of packet classification for access control and pattern matching for intrusion prevention. The two function blocks are linked together via well-designed session lookup schemes. This paper presents an architecture-aware session lookup scheme for deep inspection on network processors (NPs). Test results show that the proposed session data structure and integration approach can achieve the OC-48 line rate (2.5 Gbps) with inline stateful content inspection on the Intel IXP2850 NP. This work provides an insight into application design and implementation on NPs and principles for performance tuning of NP-based programming such as data allocation, task partitioning, latency hiding, and thread synchronization. 展开更多
关键词 session lookup deep inspection network processor performance optimization
原文传递
Hardwired Logic and Multithread Design in Network Processors
9
作者 李旭东 徐扬 +1 位作者 刘斌 王小军 《Tsinghua Science and Technology》 SCIE EI CAS 2004年第2期207-212,共6页
High-performance network processors are expected to play an important role in future high-speed routers. This paper focuses on two representative techniques needed for high-performance network processors: hardwired lo... High-performance network processors are expected to play an important role in future high-speed routers. This paper focuses on two representative techniques needed for high-performance network processors: hardwired logic design and multithread design. Using hardwired logic, this paper compares a single-thread design with a multithread design, and proposes general models and principles to analyze the clock frequency and the resource cost for these environments. Then, two IP header processing schemes, one in single-thread mode and the other in double-thread mode, are developed using these principles and the implementation results verified the theoretical calculation. 展开更多
关键词 network processor (NP) hardwired logic multithread IP header processing
原文传递
Speeding up the MATLAB complex networks package using graphic processors 被引量:1
10
作者 张百达 唐玉华 +1 位作者 吴俊杰 李鑫 《Chinese Physics B》 SCIE EI CAS CSCD 2011年第9期460-467,共8页
The availability of computers and communication networks allows us to gather and analyse data on a far larger scale than previously. At present, it is believed that statistics is a suitable method to analyse networks ... The availability of computers and communication networks allows us to gather and analyse data on a far larger scale than previously. At present, it is believed that statistics is a suitable method to analyse networks with millions, or more, of vertices. The MATLAB language, with its mass of statistical functions, is a good choice to rapidly realize an algorithm prototype of complex networks. The performance of the MATLAB codes can be further improved by using graphic processor units (GPU). This paper presents the strategies and performance of the GPU implementation of a complex networks package, and the Jacket toolbox of MATLAB is used. Compared with some commercially available CPU implementations, GPU can achieve a speedup of, on average, 11.3x. The experimental result proves that the GPU platform combined with the MATLAB language is a good combination for complex network research. 展开更多
关键词 complex networks graphic processors unit MATLAB Jacket Toolbox
下载PDF
Optimized Processor for Sensor Networks Applications
11
作者 Ali Elkateeb 《通讯和计算机(中英文版)》 2012年第3期311-316,共6页
关键词 嵌入式处理器 传感器节点 网络应用 优化 节点设计 软核处理器 可重构系统 核心处理器
下载PDF
长向量处理器高效RNN推理方法
12
作者 苏华友 陈抗抗 杨乾明 《国防科技大学学报》 EI CAS CSCD 北大核心 2024年第1期121-130,共10页
模型深度的不断增加和处理序列长度的不一致对循环神经网络在不同处理器上的性能优化提出巨大挑战。针对自主研制的长向量处理器FT-M7032,实现了一个高效的循环神经网络加速引擎。该引擎采用行优先矩阵向量乘算法和数据感知的多核并行方... 模型深度的不断增加和处理序列长度的不一致对循环神经网络在不同处理器上的性能优化提出巨大挑战。针对自主研制的长向量处理器FT-M7032,实现了一个高效的循环神经网络加速引擎。该引擎采用行优先矩阵向量乘算法和数据感知的多核并行方式,提高矩阵向量乘的计算效率;采用两级内核融合优化方法降低临时数据传输的开销;采用手写汇编优化多种算子,进一步挖掘长向量处理器的性能潜力。实验表明,长向量处理器循环神经网络推理引擎可获得较高性能,相较于多核ARM CPU以及Intel Golden CPU,类循环神经网络模型长短记忆网络可获得最高62.68倍和3.12倍的性能加速。 展开更多
关键词 多核DSP 长向量处理器 循环神经网络 并行优化
下载PDF
NM-SpMM:面向国产异构向量处理器的半结构化稀疏矩阵乘算法
13
作者 姜晶菲 何源宏 +2 位作者 许金伟 许诗瑶 钱希福 《计算机工程与科学》 CSCD 北大核心 2024年第7期1141-1150,共10页
深度神经网络在自然语言处理、计算机视觉等领域取得了优异的成果,由于智能应用处理数据规模的增长和大模型的快速发展,对深度神经网络的推理性能要求越来越高,N∶M半结构化稀疏化技术成为平衡算力需求和应用效果的热点技术之一。国产... 深度神经网络在自然语言处理、计算机视觉等领域取得了优异的成果,由于智能应用处理数据规模的增长和大模型的快速发展,对深度神经网络的推理性能要求越来越高,N∶M半结构化稀疏化技术成为平衡算力需求和应用效果的热点技术之一。国产异构向量处理器FT-M7032为智能模型处理中的数据并行和指令并行开发提供了较大空间。针对N∶M半结构化稀疏模型计算稀疏模式多样性,提出了一种面向FT-M7032的可灵活配置的稀疏矩阵乘算法NM-SpMM。NM-SpMM设计了一种高效的压缩偏移地址稀疏编码格式COA,避免了半结构化参数配置对稀疏数据访存计算的影响。基于COA编码,NM-SpMM对不同维度稀疏矩阵计算进行了细粒度优化。在FT-M7032单核上的实验结果表明,相较于稠密矩阵乘,NM-SpMM能获得1.73~21.00倍的加速,相较于采用CuSPARSE稀疏计算库的NVIDIA V100 GPU,能获得0.04~1.04倍的加速。 展开更多
关键词 深度神经网络 图形处理器 向量处理器 稀疏矩阵乘 流水线
下载PDF
分布式高性能自组网节点技术研究 被引量:1
14
作者 于哲 周舜民 +4 位作者 王彬 孙艺铭 陈方 赵子龙 李贝贝 《现代电子技术》 北大核心 2024年第5期1-7,共7页
针对当前主流Mesh自组网技术节点传输带宽不足百兆,级跳数小于10的问题,提出采用多处理器构建实现分布式多跳、高带宽低时延的无线跳频的高性能自组网节点,对节点自动化组网连接、多信道选择避让、漫游切换及低时延高带宽网络多跳实现... 针对当前主流Mesh自组网技术节点传输带宽不足百兆,级跳数小于10的问题,提出采用多处理器构建实现分布式多跳、高带宽低时延的无线跳频的高性能自组网节点,对节点自动化组网连接、多信道选择避让、漫游切换及低时延高带宽网络多跳实现等关键技术进行研究实现。由测试结果分析可知,在20级跳内,文中节点组网带宽损失在30%以内且带宽保持在200 Mb/s以上,时延控制在100 ms内,可以满足现实应急场景下多终端智能硬件实时进行图像、视频等大数据量信息交互对高带宽低时延网络通信的需求。 展开更多
关键词 多处理器 自组网连接 多级跳 高带宽 低时延 信息交互
下载PDF
一种基于PYNQ的神经网络加速系统
15
作者 赖嘉伟 魏洪健 +1 位作者 孙科学 王艳 《电子设计工程》 2024年第17期16-21,共6页
针对传统卷积神经网络计算复杂度高,耗时较长,难以应用到嵌入式移动端的问题,提出了一种以ZYNQ芯片作为主控的FPAG联合ARM实现的的神经网络加速系统。该系统的PL部分采用纯RTL开发,对卷积层的输入层和输出层进行了全并行化,对卷积窗口... 针对传统卷积神经网络计算复杂度高,耗时较长,难以应用到嵌入式移动端的问题,提出了一种以ZYNQ芯片作为主控的FPAG联合ARM实现的的神经网络加速系统。该系统的PL部分采用纯RTL开发,对卷积层的输入层和输出层进行了全并行化,对卷积窗口进行完全的展开,在一个时钟周期内可以同时完成81次乘法运算,同时对池化层和全连接层采用流水线的优化方式。相比常用的使用高层次综合工具进行优化的方法,该系统使用RTL语言从零开始设计卷积神经网络各个模块,进行了细粒度的优化,避免了冗余逻辑资源的产生,充分利用了片上资源。针对MINIST手写数字识别的网络模型,该系统的DSP利用率达到了95%,在100 MHz时钟频率下,硬件单帧图像处理时间仅为0.81 ms,功耗仅为1.601 W。 展开更多
关键词 PYNQ ARM处理器 神经网络 现场可编程门阵列 硬件加速器
下载PDF
适用于S-NUCA异构处理器的任务调度与热管理系统
16
作者 周义涛 李阳 +3 位作者 韩超 赵玉来 汪玲 李建华 《计算机工程》 CAS CSCD 北大核心 2024年第2期196-205,共10页
异构多核处理器凭借其高性能、低功耗和广泛的应用场景而成为当前计算机平台的主流方案,且大容量的非均匀缓存架构(S-NUCA)具有较低的平均访问时间。然而,不断上升的晶体管规模给异构多核处理器的资源调度和功耗控制带来挑战,传统的调... 异构多核处理器凭借其高性能、低功耗和广泛的应用场景而成为当前计算机平台的主流方案,且大容量的非均匀缓存架构(S-NUCA)具有较低的平均访问时间。然而,不断上升的晶体管规模给异构多核处理器的资源调度和功耗控制带来挑战,传统的调度算法在面对基于S-NUCA的多核处理器时忽略了核心之间的缓存访问延迟,且传统热管理方案只提供芯片级功率约束,容易使得系统因核心使用率降低而造成性能下降。为此,提出一种适用于S-NUCA异构多核系统、满足热安全约束的动态线程调度机制TSCDM。利用基于动态每周期指令(IPC)值的阶段检测技术,并基于人工神经网络预测线程的IPC值,以获取线程与核心类型的最佳绑定关系,依据S-NUCA缓存特性获得最优映射和基于任务分类的任务迁移策略。在此基础上,TSCDM基于片上热模型为每个核心实时分配功率预算。在HotSniper上运行SPLASH-2性能测试套件进行实验,结果表明,相较于传统调度方案与基于机器学习的调度方案,TSCDM在加速比和资源利用率上均表现出优势,TSCDM中使用的基于瞬态温度的安全功率算法相比传统热安全功率算法能够降低核心热余量,同时处理器的全频段均有更高的能效比。 展开更多
关键词 异构多核处理器 人工神经网络 线程调度 阶段检测 热安全功率
下载PDF
基于网络处理器的防火墙模型设计
17
作者 范晓东 《自动化应用》 2024年第10期288-291,共4页
设计一种基于机器学习的神经网络算法,评价网络处理器的防火墙模型,并测试评价算法效能。考虑网络处理器的防火墙设计模型的影响因素,分析网络处理器的数据结构,对防火墙模型的任务进行分解实现防火墙模型的设计,并在神经网络算法下完... 设计一种基于机器学习的神经网络算法,评价网络处理器的防火墙模型,并测试评价算法效能。考虑网络处理器的防火墙设计模型的影响因素,分析网络处理器的数据结构,对防火墙模型的任务进行分解实现防火墙模型的设计,并在神经网络算法下完成算法的仿真测试。测试结果证实模糊神经网络对配网防火墙的稳定性预警能力优于传统的机器学习算法。该算法对防火墙模型设计的稳定性评价过程有积极意义。 展开更多
关键词 网络处理器 防火墙模型 模糊神经网络
下载PDF
Trends of Communication Processors
18
作者 LIU Dake CAI Zhaoyun WANG Wei 《China Communications》 SCIE CSCD 2016年第1期1-16,共16页
Processors have been playing important roles in both communication infrastructure systems and terminals.In this paper,both application specific and general purpose processors for communications are discussed including... Processors have been playing important roles in both communication infrastructure systems and terminals.In this paper,both application specific and general purpose processors for communications are discussed including the roles,the history,the current situations,and the trends.One trend is that ASIPs(Application Specific Instruction-set Processors) are taking over ASICs(Application Specific Integrated Circuits) because of the increasing needs both on performance and compatibility of multi-modes.The trend opened opportunities for researchers crossing the boundary between communications and computer architecture.Another trend is the serverlization,i.e.,more infrastructure equipments are replaced by servers.The trend opened opportunities for researchers working towards high performance computing for communication,such as research on communication algorithm kernels and real time programming methods on servers. 展开更多
关键词 ASIP baseband processor network processor application processor server processor
下载PDF
数控机床工作台DSP定位误差系统设计及分析
19
作者 路晓云 杨光 《机械管理开发》 2024年第3期187-188,191,共3页
为进一步优化数控机床对于测试误差的补偿功能,开发通过DSP硬件系统对误差进行准确预测并设置补偿措施。建立的定位误差模型预测补偿系统包含数控系统进给轴反馈结构、DSP建模预测系统以及数控系统。研究结果表明,采用Matlab软件运行得... 为进一步优化数控机床对于测试误差的补偿功能,开发通过DSP硬件系统对误差进行准确预测并设置补偿措施。建立的定位误差模型预测补偿系统包含数控系统进给轴反馈结构、DSP建模预测系统以及数控系统。研究结果表明,采用Matlab软件运行得到的优化权值与阈值建立的GA-BP网络进行误差预测共需251μs;采用GA-BP网络构建的模型进行预测时达到了更高精度。该研究有助于提高数控机床加工精度,对提高加工参数的优化起到很好的指导意义以及控制效果。 展开更多
关键词 数控机床 定位误差 数字信号处理器 遗传算法 反向传播网络
下载PDF
申威平台高速网络数据处理框架的设计与实现
20
作者 曹建军 佘平 聂世强 《计算机技术与发展》 2024年第7期184-191,共8页
随着大数据时代网络流量的激增,传统内核网络协议栈由于内核切换开销占比高等原因导致现有基于内核的网络数据处理系统无法充分利用10 Gb乃至100 Gb的高速网卡收发能力。为了降低内核切换开销,开源DPDK用户态网络开发套件被提出以支持... 随着大数据时代网络流量的激增,传统内核网络协议栈由于内核切换开销占比高等原因导致现有基于内核的网络数据处理系统无法充分利用10 Gb乃至100 Gb的高速网卡收发能力。为了降低内核切换开销,开源DPDK用户态网络开发套件被提出以支持高速网络流量处理,并在x86平台得到大规模应用和部署。为了满足国产化信创和网络安全的要求,面向国产申威处理器平台设计并实现了一套基于DPDK的网络流量组包解析框架,充分利用DPDK的大页内存、无锁队列等机制,设计多线程并行以发挥申威处理器多核性能,支持常见基于TCP/UDP的多种应用层协议解析,并具有轻量化和可扩展特点。基于真实硬件平台实验结果表明,该框架性能比现有主流软件提高10%左右,为基于国产处理器平台的高速网络数据处理做了初步探索。 展开更多
关键词 DPDK 协议分析 高速网络 TCP/IP协议栈 国产处理器
下载PDF
上一页 1 2 48 下一页 到第
使用帮助 返回顶部