Research and design of matrix operation accelerator based on reconfigurable array
1
Authors: 邓军勇, ZHANG Pan, JIANG Lin, XIE Xiaoyan, DENG Jingwen. High Technology Letters (EI, CAS), 2024, No. 2, pp. 128-137 (10 pages)
In the case of massive data, matrix operations are very computationally intensive, and the memory limitation in standalone mode leads to system inefficiency. At the same time, it is difficult for matrix operations to switch flexibly between different requirements when implemented in hardware. To address this problem, this paper proposes a matrix operation accelerator based on reconfigurable arrays in the context of recommender systems (RS). Based on the reconfigurable array processor (APR-16), a flexible, parallelized design of matrix operations on the processing element (PE) array is realized. The experimental results show that, compared with a proposed central processing unit (CPU) and graphics processing unit (GPU) hybrid matrix multiplication framework, the energy efficiency ratio of the accelerator proposed in this paper is improved by about 35×. Compared with blocked alternating least squares (BALS), its energy efficiency ratio is improved by about 1×, and switching between matrix factorization (MF) schemes suited to different sparsity levels can be realized.
Keywords: matrix factorization (MF); recommender system (RS); array processor; reconfigurable; matrix multiplication
Optimizing Memory Access Efficiency in CUDA Kernel via Data Layout Technique
2
Authors: Neda Seifi, Abdullah Al-Mamun. Journal of Computer and Communications, 2024, No. 5, pp. 124-139 (16 pages)
Over the past decade, Graphics Processing Units (GPUs) have revolutionized high-performance computing, playing pivotal roles in advancing fields like IoT, autonomous vehicles, and exascale computing. Despite these advancements, efficiently programming GPUs remains a daunting challenge, often relying on trial-and-error optimization methods. This paper introduces an optimization technique for CUDA programs through a novel Data Layout strategy, aimed at restructuring memory data arrangement to significantly enhance data access locality. Focusing on the dynamic programming algorithm for chained matrix multiplication, a critical operation across various domains including artificial intelligence (AI), high-performance computing (HPC), and the Internet of Things (IoT), this technique facilitates more localized access. We specifically illustrate the importance of efficient matrix multiplication in these areas, underscoring the technique's broader applicability and its potential to address some of the most pressing computational challenges in GPU-accelerated applications. Our findings reveal a remarkable reduction in memory consumption and a substantial 50% decrease in execution time for CUDA programs utilizing this technique, thereby setting a new benchmark for optimization in GPU computing.
Keywords: Data Layout Optimization; CUDA Performance Optimization; GPU Memory Optimization; Dynamic Programming; Matrix Multiplication; Memory Access Pattern Optimization in CUDA
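The dynamic program for chained matrix multiplication mentioned in the abstract above can be sketched in plain Python. This is only the textbook recurrence, not the paper's CUDA kernel or its data layout; the function names `matrix_chain_order` and `parenthesize` are illustrative assumptions.

```python
def matrix_chain_order(dims):
    """Return (cost, split) tables for multiplying a chain of matrices.

    Matrix i has shape dims[i] x dims[i + 1], so the chain holds
    len(dims) - 1 matrices. cost[i][j] is the minimum number of scalar
    multiplications needed to compute the product of matrices i..j.
    """
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]
    split = [[0] * n for _ in range(n)]
    for length in range(2, n + 1):          # length of the sub-chain
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j] = float("inf")
            for k in range(i, j):           # try every split point
                c = (cost[i][k] + cost[k + 1][j]
                     + dims[i] * dims[k + 1] * dims[j + 1])
                if c < cost[i][j]:
                    cost[i][j] = c
                    split[i][j] = k
    return cost, split


def parenthesize(split, i, j):
    """Rebuild the optimal parenthesization of matrices i..j as a string."""
    if i == j:
        return f"A{i}"
    k = split[i][j]
    return "(" + parenthesize(split, i, k) + parenthesize(split, k + 1, j) + ")"


# Small usage example: three matrices of shapes 10x30, 30x5, 5x60.
cost, split = matrix_chain_order([10, 30, 5, 60])
print(cost[0][2], parenthesize(split, 0, 2))  # 4500 ((A0A1)A2)
```

The table `cost` is filled diagonal by diagonal, which is the access pattern a data layout optimization would have to account for on a GPU.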
An Evaluation of the Critical Success Factors in Sustainable Food Supply Chains in Developing Countries
3
Authors: Muhammad Ahasan Habib, Deen Islam Preyo, Muhammad Kamruzzaman Ahasan, Md. Maruf Hossain. World Journal of Engineering and Technology, 2024, No. 3, pp. 466-492 (27 pages)
Food is one of the biggest industries in developed and underdeveloped countries. Supply chain sustainability is essential in established and emerging economies because of the rising acceptance of cost-based outsourcing and growing technological, social, and environmental concerns. The food business faces serious sustainability and growth challenges in developing countries. This study presents a comprehensive analysis of the critical success factors (CSFs) influencing the performance outcome and the sustainable supply chain management (SSCM) process. A theoretical framework is established to explain how they are used to examine the organizational aspect of the food supply chain life cycle analysis. This study examined the CSFs and revealed the relationships between them using a methodology that included a review of literature, interpretative structural modeling (ISM), and cross-impact matrix multiplication applied in classification (MICMAC) tool analysis of soil liquefaction factors. The findings of this research demonstrate that the quality and safety of food are important factors and have a direct effect on other factors. To make sustainable food supply chain management more adequate, legislators, managers, and experts need to pay attention to this factor. This work also shows that companies aiming to create a sustainable business model must make sustainability a fundamental tenet of their organization. Practitioners and managers may devise effective long-term plans for establishing a sustainable food supply chain utilizing the recommended methodology.
Keywords: Supply Chain Collaboration; Interpretative Structural Modeling; Cross-Impact Matrix Multiplication; Sustainability; Critical Success Factors; Multi-Criteria Decision Making; Technique for Order of Preference by Similarity to Ideal Solution
A General Representation of Hankel Matrix about Bell Numbers (cited by: 2)
4
Authors: 刘麦学, 张海模. Chinese Quarterly Journal of Mathematics (CSCD), 2003, No. 4, pp. 338-342 (5 pages)
The purpose of this note is to establish a general representation of Hankel matrices of Bell numbers and the convoluted Bell numbers. As a special case, the results of Aigner are extended.
Keywords: Bell number; Hankel matrix; matrix multiplication; recurrence
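As a concrete illustration of the objects named in the abstract above, the sketch below builds the Hankel matrix of Bell numbers, with entry (i, j) equal to B(i+j), and checks a classical determinant identity for it: the (n+1)×(n+1) determinant equals 0!·1!·…·n!. This is not the paper's general representation; the helper names `bell_numbers`, `hankel_bell`, and `determinant` are assumptions made for the example.

```python
from fractions import Fraction
from math import factorial


def bell_numbers(count):
    """Return the first `count` Bell numbers B(0), B(1), ... via the Bell triangle."""
    bells = []
    row = [1]
    for _ in range(count):
        bells.append(row[0])
        nxt = [row[-1]]            # next row starts with the last entry
        for x in row:
            nxt.append(nxt[-1] + x)
        row = nxt
    return bells


def hankel_bell(n):
    """(n+1) x (n+1) Hankel matrix with entry (i, j) equal to B(i + j)."""
    b = bell_numbers(2 * n + 1)
    return [[b[i + j] for j in range(n + 1)] for i in range(n + 1)]


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            det = -det             # a row swap flips the sign
        det *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n):
                m[r][k] -= factor * m[c][k]
    return det


for n in range(5):
    expected = 1
    for k in range(n + 1):
        expected *= factorial(k)
    print(n, determinant(hankel_bell(n)), expected)
```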
Code Design and Latency Analysis of Distributed Matrix Multiplication with Straggling Servers in Fading Channels (cited by: 1)
5
Authors: Ning Liu, Kuikui Li, Meixia Tao. China Communications (SCIE, CSCD), 2021, No. 10, pp. 15-29 (15 pages)
This paper exploits coding to speed up computation offloading in a multi-server mobile edge computing (MEC) network with straggling servers and channel fading. The specific task we consider is to compute the product between a user-generated input data matrix and a large-scale model matrix that is stored distributively across the multiple edge nodes. The key idea of coding is to introduce computation redundancy to improve robustness against straggling servers and to create communication redundancy to improve reliability against channel fading. We utilize the hybrid design of maximum distance separable (MDS) coding and repetition coding. Based on the hybrid coding scheme, we conduct theoretical analysis on the average task uploading time, average edge computing time, and average output downloading time, respectively, and then obtain the end-to-end task execution time. Numerical results demonstrate that when the task uploading phase or the edge computing phase is the performance bottleneck, the hybrid coding reduces to MDS coding; when the downlink transmission is the bottleneck, the hybrid coding reduces to repetition coding. The hybrid coding also outperforms the entangled polynomial coding that causes higher uplink and downlink communication loads.
Keywords: mobile edge computing; distributed matrix multiplication; coded computing; cooperative transmission
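The idea of using computation redundancy against stragglers, as described in the abstract above, can be illustrated with the smallest possible MDS code. The sketch below assumes a (3, 2) code: the model matrix is split into two row blocks and a third worker stores their sum, so the results of any two of the three workers suffice. It ignores channel fading and repetition coding entirely, and the names `encode`, `decode`, and the worker ids 0, 1, 2 are assumptions for this example, not the paper's scheme.

```python
def matvec(rows, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in rows]


def encode(model):
    """Split the model into two row blocks and add one parity block.

    Worker 0 stores the top half, worker 1 the bottom half, and
    worker 2 the element-wise sum of the two halves.
    """
    if len(model) % 2:
        raise ValueError("this sketch needs an even number of rows")
    half = len(model) // 2
    top, bottom = model[:half], model[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(top, bottom)]
    return {0: top, 1: bottom, 2: parity}


def decode(results):
    """Recover model @ x from the outputs of any two workers."""
    if len(results) < 2:
        raise ValueError("need results from at least two workers")
    if 0 in results and 1 in results:
        top, bottom = results[0], results[1]
    elif 0 in results:                       # worker 1 straggled
        top = results[0]
        bottom = [p - t for p, t in zip(results[2], top)]
    else:                                    # worker 0 straggled
        bottom = results[1]
        top = [p - b for p, b in zip(results[2], bottom)]
    return top + bottom


model = [[1, 2, 3], [4, 5, 6], [7, 8, 9], [10, 11, 12]]
x = [1, 0, -1]
blocks = encode(model)
# Pretend worker 1 never answers; the other two are enough.
partial = {w: matvec(b, x) for w, b in blocks.items() if w != 1}
print(decode(partial) == matvec(model, x))  # True
```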
Research and Design of Reconfigurable Matrix Multiplication over Finite Field in VLIW Processor
6
Authors: Yang Su, Xiaoyuan Yang, Yuechuan Wei. China Communications (SCIE, CSCD), 2016, No. 10, pp. 222-232 (11 pages)
Matrix multiplication plays a pivotal role in symmetric cipher algorithms, but it is one of the most complex and time-consuming units, and its performance directly affects the efficiency of cipher algorithms. Combining the characteristics of the VLIW processor and of matrix multiplication in symmetric cipher algorithms, this paper extracts the reconfigurable elements, analyzes the principle of matrix multiplication, and then designs a reconfigurable matrix multiplication architecture for the VLIW processor. Finally, single instructions are put forward for matrix multiplication between a 4×1 and a 4×4 matrix, or between two 4×4 matrices, over GF(2^8); through instruction extension, the instructions can support operations of larger dimension. The experiment shows that the designed instructions support matrix multiplication of different dimensions and greatly improve the processing speed of multiplication.
Keywords: cryptography; reconfigurable matrix multiplication; research and design; dedicated instruction; VLIW processor
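For readers unfamiliar with matrix multiplication over GF(2^8), the following is a minimal software sketch of the 4×4 by 4×1 and 4×4 by 4×4 products that the abstract above turns into single instructions. The abstract does not state the reduction polynomial; the AES polynomial x^8 + x^4 + x^3 + x + 1 (0x11B) and the AES MixColumns matrix are assumed here purely for illustration, and the function names are hypothetical.

```python
def gf_mul(a, b, poly=0x11B):
    """Multiply two bytes in GF(2^8), reducing by `poly` (assumed: AES polynomial)."""
    result = 0
    while b:
        if b & 1:
            result ^= a        # addition in GF(2^8) is XOR
        a <<= 1
        if a & 0x100:
            a ^= poly          # reduce back below 2^8
        b >>= 1
    return result


def gf_matvec(m, v):
    """Multiply a matrix by a column vector over GF(2^8)."""
    out = []
    for row in m:
        acc = 0
        for coeff, x in zip(row, v):
            acc ^= gf_mul(coeff, x)
        out.append(acc)
    return out


def gf_matmul(a, b):
    """Multiply two matrices over GF(2^8), column by column."""
    product_columns = [gf_matvec(a, col) for col in zip(*b)]
    return [list(row) for row in zip(*product_columns)]


# AES MixColumns matrix, used only as a familiar 4x4 example.
MIX = [[2, 3, 1, 1], [1, 2, 3, 1], [1, 1, 2, 3], [3, 1, 1, 2]]
print([hex(x) for x in gf_matvec(MIX, [0xDB, 0x13, 0x53, 0x45])])
```

A hardware instruction would compute all sixteen byte products of the 4×4 by 4×1 case in parallel; the nested loops here only show the arithmetic being performed.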
Multiple extended target tracking algorithm based on Gaussian surface matrix (cited by: 2)
7
Authors: Jinlong Yang, Peng Li, Zhihua Li, Le Yang. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2016, No. 2, pp. 279-289 (11 pages)
In this paper, we consider the problem of irregular shape tracking for multiple extended targets by introducing the Gaussian surface matrix (GSM) into the framework of random finite set (RFS) theory. The Gaussian surface function is first constructed from the measurements, and it is used to define the GSM via a mapping function. We then integrate the GSM with the probability hypothesis density (PHD) filter; the Bayesian recursion formulas of GSM-PHD are derived, and the Gaussian mixture implementation is employed to obtain the closed-form solutions. Moreover, the estimated shapes are designed to guide the measurement set sub-partition, which can cope with the problem of tracking spatially close targets. Simulation results show that the proposed algorithm can effectively estimate irregular target shapes and exhibits good robustness in cross extended target tracking.
Keywords: multiple extended target tracking; irregular shape; Gaussian surface matrix (GSM); probability hypothesis density (PHD)
Cache performance optimization of irregular sparse matrix multiplication on modern multi-core CPU and GPU
8
Authors: 刘力 (Liu Li), Yang Guangwen. High Technology Letters (EI, CAS), 2013, No. 4, pp. 339-345 (7 pages)
This paper focuses on how to optimize the cache performance of sparse matrix-matrix multiplication (SpGEMM). It classifies the cache misses into two categories: one is caused by the irregular distribution pattern of the multiplier matrix, and the other is caused by the multiplicand. For each of them, the paper puts forward an optimization method. The first, a hash-based method, removes cache misses of the 1st category effectively and improves the performance by a factor of 6 on an Intel 8-core CPU in the best cases. For cache misses of the 2nd category, it proposes a new cache replacement algorithm, which achieves a cache hit rate much higher than other historical-knowledge-based algorithms, and the algorithm is applicable on CELL and GPU. To further verify the effectiveness of our methods, we implement our algorithm on GPU, and the performance scales perfectly with the size of on-chip storage.
Keywords: sparse matrix multiplication; cache miss; scalability; multi-core CPU; GPU
Multiple Regression and Big Data Analysis for Predictive Emission Monitoring Systems
9
Authors: Zinovi Krougly, Vladimir Krougly, Serge Bays. Applied Mathematics, 2023, No. 5, pp. 386-410 (25 pages)
Predictive Emission Monitoring Systems (PEMS) offer a cost-effective and environmentally friendly alternative to Continuous Emission Monitoring Systems (CEMS) for monitoring pollution from industrial sources. Multiple regression is one of the fundamental statistical techniques to describe the relationship between dependent and independent variables. This model can be effectively used to develop a PEMS, to estimate the amount of pollution emitted by industrial sources, where the fuel composition and other process-related parameters are available. These parameters are often sufficient to predict the emission discharge with acceptable accuracy. In cases where PEMS are accepted as an alternative method to CEMS, which use gas analyzers, they can provide cost savings and substantial benefits for ongoing system support and maintenance. The described mathematical concept is based on the matrix algebra representation in multiple regression involving multiple precision arithmetic techniques. Challenging numerical examples for statistical big data analysis are investigated. Numerical examples illustrate the computational accuracy and efficiency of statistical analysis gained by increasing the precision level. The programming language C++ is used for mathematical model implementation. The data for research and development, including the dependent fuel and independent NOx emissions data, were obtained from CEMS software installed on a petrochemical plant.
Keywords: Matrix Algebra in Multiple Linear Regression; Numerical Integration; High Precision Computation; Applications in Predictive Emission Monitoring Systems
A practical and dynamic key management scheme for a user hierarchy (cited by: 2)
10
Authors: JENG Fuh-gwo, WANG Chung-ming. Journal of Zhejiang University-Science A (Applied Physics & Engineering) (SCIE, EI, CAS, CSCD), 2006, No. 3, pp. 296-301 (6 pages)
In this paper, we propose a practical and dynamic key management scheme based on the Rabin public key system and a set of matrices with canonical matrix multiplication to solve the access control problem in an arbitrary partially ordered user hierarchy. The advantage is that a security class at a higher level can derive any of its successors' secret keys directly and efficiently, and that the scheme remains dynamic when a new security class is added to, or an existing class is removed from, the hierarchy. Even the ex-member problem can be solved efficiently. Moreover, any user can freely change its own key for security reasons.
Keywords: User hierarchy; Key management; Rabin public key; Matrix multiplication
A Revisited Definition of the Three Solute Descriptors Related to the Van der Waals Forces in Solutions (cited by: 2)
11
Authors: Paul Laffort. Open Journal of Physical Chemistry, 2016, No. 4, pp. 86-100 (15 pages)
It is currently admitted that the intermolecular forces implicated in Gas Liquid Chromatography (GLC) can be expressed as a product of parameters (or descriptors) of solutes and of parameters of solvents. The present study is limited to those of solutes and, among them, to the three involved in the Van der Waals forces, whereas the two involved in hydrogen bonding are left aside at this stage. These three studied parameters, which we call δ, ω and ε, respectively reflect the three types of Van der Waals forces: dispersion, orientation (or polarity, strictly speaking), and induction-polarizability. These parameters have been experimentally obtained in previous studies for 121 Volatile Organic Compounds (VOC) via an original Multiplicative Matrix Analysis (MMA) applied to a superabundant and accurate GLC data set. Then, also in previous studies, attempts have been made to predict these parameters via a Simplified Molecular Topology (SMT) procedure. Because these last published results have been somewhat disappointing, a promising new prediction strategy is developed and detailed in the present article.
Keywords: Van der Waals Intermolecular Forces; Solute Descriptors; Gas Liquid Chromatography; Chemo Informatics; Multiplicative Matrix Analysis
ADJACENT MATRIX METHOD OF IDENTIFYING ISOMORPHISM TO PLANAR KINEMATIC CHAIN WITH MULTIPLE JOINTS AND HIGHER PAIRS (cited by: 2)
12
Authors: SONG Li, HUANG Yong, CHENG Ling. Chinese Journal of Mechanical Engineering (SCIE, EI, CAS, CSCD), 2006, No. 4, pp. 605-609 (5 pages)
The adjacent matrix method for identifying isomorphism of planar kinematic chains with multiple joints and higher pairs is presented. The topological invariants of the planar kinematic chain can be calculated and compared by means of the adjacent matrix. The amount of calculation can be reduced effectively by using several divisions of bars and the reconfiguration of the adjacent matrix. Two structural characteristics of the adjacent matrix, the number of divisions and the division code, are presented. Whether two kinematic chains are isomorphic can be determined by comparing the structural characteristics of their adjacent matrices using a method called row-to-row matching. This method may be applied to planar linkage chains too. Thus, the methods of identifying isomorphism are unified for planar kinematic chains with or without higher pairs and with or without multiple joints. The method is also visual, simple, and convenient for computer processing.
Keywords: Planar kinematic chain; Higher pair; Multiple joint; Adjacent matrix; Identifying isomorphism
HOLDITCH THEOREM FOR THE CLOSED SPACE CURVES IN LORENTZIAN 3-SPACE
13
Authors: Handan Yıldırım, Salim Yüce, Nuri Kuruoğlu. Acta Mathematica Scientia (SCIE, CSCD), 2011, No. 1, pp. 172-180 (9 pages)
In this article, we give the area formula of the closed projection curve of a closed space curve in Lorentzian 3-space L3. For the 1-parameter closed Lorentzian space motion in L3, we obtain a Holditch Theorem taking into account the Lorentzian matrix multiplication for the closed space curves by using their orthogonal projections onto the Euclidean plane in the fixed Lorentzian space. Moreover, we generalize this Holditch Theorem for three noncollinear fixed points of the moving Lorentzian space and any other fixed point on the plane which is determined by these three fixed points.
Keywords: Lorentzian matrix multiplication; Lorentzian motion; Holditch Theorem; orthogonal projection; area
MapReduce based computation of the diffusion method in recommender systems
14
Authors: 彭飞, You Jiali, Zeng Xuewen, Deng Haojiang. High Technology Letters (EI, CAS), 2016, No. 3, pp. 288-296 (9 pages)
The performance of existing diffusion-based algorithms in recommender systems is still limited by the processing ability of a single computer. In order to conduct the diffusion computation on large data sets, a parallel implementation of the classic diffusion method on the MapReduce framework is proposed. First, the diffusion computation is transformed from a summation format to a cascade matrix multiplication format. Then, a parallel matrix multiplication algorithm based on dynamic vectors is proposed to reduce the CPU and I/O cost on the MapReduce framework; this algorithm can also be applied to other parallel matrix multiplication scenarios. Finally, block partitioning is used to further improve the performance, while the order of matrix multiplication is also taken into consideration. Experiments on different kinds of data sets have verified the efficiency of the proposed method.
Keywords: MapReduce; recommender system; diffusion; parallel; matrix multiplication
Optical SDMA for applying compressive sensing in WSN (cited by: 1)
15
Authors: Xuewen Liu, Song Xiao, Lei Quan. Journal of Systems Engineering and Electronics (SCIE, EI, CSCD), 2016, No. 4, pp. 780-789 (10 pages)
In order to apply compressive sensing in a wireless sensor network, inside a node cluster classified by spatial correlation, we propose that the cluster head adopts free space optical communication with space division multiple access, and that each sensor node uses a modulating retro-reflector for communication. Thus, while a random sampling matrix is used to guide the establishment of links between the cluster head and the sensor nodes, the random linear projection is accomplished. To establish multiple links at the same time, an optical space division multiple access antenna is designed. It works in fixed-beam switching mode and consists of an optic lens with a large field of view (FOV), a fiber array on the focal plane which is used to realize virtual channel segmentation, a direction of arrival sensor, an optical matrix switch, and a controller. Based on the angles of the nodes' laser beams, by dynamically changing the route, the optical matrix switch realizes multi-beam full duplex tracking, receiving, and transmission. Due to the structure of the fiber array, there will be several fade zones both in the focal plane and in the lens' FOV. In order to lower the impact of fade zones and harmonize the multiple beams, a fiber array adjustment is designed. The antenna's qualitative feasibility is validated by theoretical, simulated, and experimental study.
Keywords: wireless sensor network; compressive sensing; space division multiple access; optical matrix switch; laser beam tracking
On Testing Equality of K Multiple Correlation Matrices
16
Authors: A. K. Gupta, D. G. Kabe. Northeastern Mathematical Journal (CSCD), 2000, No. 4, pp. 405-410 (6 pages)
Coutsourides derived an ad hoc nuisance parameter removal test for testing equality of two multiple correlation matrices of two independent p-variate normal populations under the assumption that a sample of size n is available from each population. This paper presents a likelihood ratio test criterion for testing equality of K multiple correlation matrices and extends the results to the testing of equality of K partial correlation matrices.
Keywords: normal population; multiple correlation matrix; partial correlations matrix; distribution theory; test of hypothesis; likelihood ratio test
An Introduction to the Computational Complexity of Matrix Multiplication (cited by: 1)
17
Authors: Yan Li, Sheng-Long Hu, Jie Wang, Zheng-Hai Huang. Journal of the Operations Research Society of China (EI, CSCD), 2020, No. 1, pp. 29-43 (15 pages)
This article introduces the approach to studying the computational complexity of matrix multiplication via the ranks of the matrix multiplication tensors. Basic results and recent developments in this area are reviewed.
Keywords: Matrix multiplication; Computational complexity; Tensor rank; Bilinear mapping; Border rank
An all-optical matrix multiplication scheme with non-linear material based switching system (cited by: 2)
18
Authors: Archan Kumar Das, Sourangshu Mukhopadhyay. Chinese Optics Letters (SCIE, EI, CAS, CSCD), 2005, No. 3, pp. 172-175 (4 pages)
Optics is a potential candidate in information, data, and image processing. In all-optical data and information processing, optics has been used as the information-carrying signal because of its inherent advantage of parallelism. Several optical methods have been proposed in support of such processing. In many algebraic, arithmetic, and image processing schemes, fundamental logic and memory operations are conducted using all-optical devices. In this communication we report an all-optical matrix multiplication operation with a non-linear material based switching circuit.
Keywords: An all-optical matrix multiplication scheme with non-linear material based switching system
Fast rectangular matrix multiplication and some applications
19
Authors: Victor Y PAN. Science China Mathematics (SCIE), 2008, No. 3, pp. 389-406 (18 pages)
We study asymptotically fast multiplication algorithms for matrix pairs of arbitrary dimensions, and optimize the exponents of their arithmetic complexity bounds. For a large class of input matrix pairs, we improve the known exponents. We also show some applications of our results: (i) we decrease from O(n^2 + n^(1+o(1)) log q) to O(n^1.9998 + n^(1+o(1)) log q) the known arithmetic complexity bound for the univariate polynomial factorization of degree n over a finite field with q elements; (ii) we decrease from 2.837 to 2.7945 the known exponent of the work and arithmetic processor bounds for fast deterministic (NC) parallel evaluation of the determinant, the characteristic polynomial, and the inverse of an n × n matrix, as well as for the solution to a nonsingular linear system of n equations; (iii) we decrease from O(m^1.575 n) to O(m^1.5356 n) the known bound for computing basic solutions to a linear programming problem with m constraints and n variables.
Keywords: rectangular matrix multiplication; asymptotic arithmetic complexity; bilinear algorithm; polynomial factorization over finite fields; 68Q25; 11Y16
Optical tensor core architecture for neural network training based on dual-layer waveguide topology and homodyne detection (cited by: 2)
20
Authors: Shaofu Xu, Weiwen Zou. Chinese Optics Letters (SCIE, EI, CAS, CSCD), 2021, No. 8, pp. 84-89 (6 pages)
We propose an optical tensor core (OTC) architecture for neural network training. The key computational components of the OTC are the arrayed optical dot-product units (DPUs). The homodyne-detection-based DPUs can conduct the essential computational work of neural network training, i.e., matrix-matrix multiplication. Dual-layer waveguide topology is adopted to feed data into these DPUs with ultra-low insertion loss and cross talk. Therefore, the OTC architecture allows a large-scale dot-product array and can be integrated into a photonic chip. The feasibility of the OTC and its effectiveness on neural network training are verified with numerical simulations.
Keywords: optical tensor core; neural network training; matrix multiplication; homodyne detection; dual-layer waveguides