Abstract: Sparse matrix-vector multiplication (SpMV) is one of the most essential operations in linear algebra, and predicting its performance on GPUs has attracted increasing attention in recent years. In 2012, Guo and Wang put forward a new approach to predicting the performance of SpMV on GPUs. However, they did not fully consider the matrix structure, so the execution time predicted by their model tends to be inaccurate for general sparse matrices. To address this problem, we propose two new, similar models that take the structure of the matrices into account and make the performance prediction more accurate. In addition, we use the new models to predict the execution time of SpMV for the CSR-V, CSR-S, ELL and JAD sparse matrix storage formats on the CUDA platform. Our experimental results show that the prediction accuracy of our models is, on average, 1.69 times better than that of Guo and Wang's model for most general matrices.
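For orientation, the storage formats named in this abstract hold the same matrix in different layouts. The following is a minimal sequential sketch in Python of SpMV over the CSR and ELL layouts. It is not the CUDA kernels or the prediction models of the paper; the function names and the example matrix are illustrative assumptions.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for A stored in CSR (sequential reference)."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Non-zeros of this row occupy values[row_ptr[row]:row_ptr[row + 1]]
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


def csr_to_ell(row_ptr, col_idx, values):
    """Convert CSR to ELL by padding every row to the longest row length."""
    n_rows = len(row_ptr) - 1
    width = max(row_ptr[r + 1] - row_ptr[r] for r in range(n_rows))
    ell_cols = [[0] * width for _ in range(n_rows)]
    ell_vals = [[0.0] * width for _ in range(n_rows)]
    for r in range(n_rows):
        for j, k in enumerate(range(row_ptr[r], row_ptr[r + 1])):
            ell_cols[r][j] = col_idx[k]
            ell_vals[r][j] = values[k]
    return ell_cols, ell_vals


def spmv_ell(ell_cols, ell_vals, x):
    """Compute y = A @ x for A stored in ELL; padded slots hold 0.0 and add nothing."""
    return [sum(v * x[c] for c, v in zip(cols, vals))
            for cols, vals in zip(ell_cols, ell_vals)]


# Illustrative 3x3 matrix [[4, 0, 1], [0, 2, 0], [3, 0, 5]] in CSR form
ROW_PTR = [0, 2, 3, 5]
COL_IDX = [0, 2, 1, 0, 2]
VALUES = [4.0, 1.0, 2.0, 3.0, 5.0]
```

ELL pads short rows, so its cost depends on the longest row, whereas CSR cost follows the actual number of non-zeros. This is one way in which matrix structure can affect execution time.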
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 62071015 and 62171264).
Abstract: Toeplitz matrix-vector multiplication is widely used in various fields, including optimal control, systolic finite field multipliers, and multidimensional convolution. In this paper, we first present a non-asymptotic quantum algorithm for Toeplitz matrix-vector multiplication with time complexity $O(\kappa\,\mathrm{polylog}\,n)$, where $\kappa$ and $2n$ are the condition number and the dimension, respectively, of the circulant matrix extended from the Toeplitz matrix. For the case with an unknown generating function, we also give a corresponding non-asymptotic quantum version that eliminates the dependency on the $L_1$-norm $\rho$ of the displacement of the structured matrices. By exploiting the special properties of Toeplitz matrices, the proposed quantum algorithms are sufficiently accurate and efficient compared with existing quantum algorithms under certain circumstances.
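The extension of an $n \times n$ Toeplitz matrix to a $2n \times 2n$ circulant matrix mentioned in this abstract can be illustrated classically. The sketch below is not the quantum algorithm of the paper. It is the standard FFT-based product, written in Python under the assumptions that $n$ is a power of two and that the matrix is given by its first column `c` and first row `r` (hypothetical parameter names).

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n)
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out


def toeplitz_matvec(c, r, x):
    """T @ x where T[i][j] = c[i - j] if i >= j else r[j - i]."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes n is a power of two")
    # First column of the 2n x 2n circulant that contains T as its top-left block
    col = list(c) + [0.0] + list(r[:0:-1])
    padded = list(x) + [0.0] * n
    # A circulant matrix is diagonalized by the DFT: C @ v = IFFT(FFT(col) * FFT(v))
    prod = [u * v for u, v in zip(fft(col), fft(padded))]
    y = fft(prod, invert=True)
    return [v.real / (2 * n) for v in y[:n]]


def toeplitz_matvec_direct(c, r, x):
    """O(n^2) reference used only to check the FFT version."""
    n = len(x)
    return [sum((c[i - j] if i >= j else r[j - i]) * x[j] for j in range(n))
            for i in range(n)]
```

The FFT version costs $O(n \log n)$ operations instead of the $O(n^2)$ of the direct product, which is the classical benefit of the circulant embedding.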
Funding: Supported by the National High Technology Research and Development Programme of China (Nos. 2010AA012302, 2009AA01A134) and the Tsinghua National Laboratory for Information Science and Technology (TNList) Cross-discipline Foundation.
Abstract: This paper focuses on optimizing the cache performance of sparse matrix-matrix multiplication (SpGEMM). It classifies cache misses into two categories: those caused by the irregular distribution pattern of the multiplier matrix, and those caused by the multiplicand. An optimization method is put forward for each category. The first, a hash-based method, effectively removes cache misses of the first category and improves performance by a factor of 6 on an Intel 8-core CPU in the best cases. For cache misses of the second category, the paper proposes a new cache replacement algorithm that achieves a much higher cache hit rate than other algorithms based on historical knowledge and is applicable on CELL and GPU. To further verify the effectiveness of the methods, the algorithm is implemented on a GPU, where performance scales perfectly with the size of on-chip storage.
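As background for the hash-based method mentioned in this abstract, the sketch below shows a generic row-wise (Gustavson-style) SpGEMM in Python that accumulates each output row in a hash map. It is not the paper's cache optimization. The representation, one `{column: value}` dictionary per row, and the function name are assumptions chosen for brevity.

```python
def spgemm_hash(a_rows, b_rows):
    """Compute C = A @ B, each matrix given as a list of {column: value} dicts."""
    c_rows = []
    for a_row in a_rows:
        acc = {}  # hash accumulator for one output row
        for k, a_val in a_row.items():
            # Row k of B is scaled by A[i][k] and merged into the accumulator
            for j, b_val in b_rows[k].items():
                acc[j] = acc.get(j, 0.0) + a_val * b_val
        c_rows.append(acc)
    return c_rows


# Illustrative inputs: A is 2x3, B is 3x2
A_ROWS = [{0: 1.0, 2: 2.0}, {1: 3.0}]
B_ROWS = [{1: 4.0}, {0: 5.0, 1: 6.0}, {0: 7.0}]
```

The rows of B that are read depend on the column indices of A, so the access pattern is irregular in general. Locality effects of this kind are what the paper's two categories of cache misses describe.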
Funding: Sponsored by the National Natural Science Foundation of China (Grant Nos. 61405041, 61571145), the Key Program of Heilongjiang Natural Science Foundation (Grant No. ZD201216), the Program Excellent Academic Leaders of Harbin (Grant No. RC2013XK009003), the China Postdoctoral Science Foundation (Grant No. 2014M551221), and the Heilongjiang Postdoctoral Science Fund (Grant No. LBH-Z13057).
Abstract: In the conventional linear spectral mixture analysis model, a class is represented by a single endmember. However, intra-class spectral variability is usually very large, which makes it difficult to represent a class with one endmember and leads to incorrect unmixing results. Some proposed algorithms help to overcome endmember variability, but they have shortcomings such as intensive computation and unsatisfactory unmixing results. Recently, sparse regression has been applied to unmixing, under the assumption that each mixed pixel can be expressed as a linear combination of only a few spectra in a spectral library. This is essentially the same as multiple endmember spectral unmixing. Orthogonal matching pursuit (OMP), a sparse reconstruction algorithm, has the advantages of a simple structure and high efficiency. However, it does not take into account the abundance non-negativity constraint (ANC) and the abundance sum-to-one constraint (ASC), leading to undesirable unmixing results. To solve these issues, this paper presents an improved OMP algorithm, fully constrained OMP (FOMP), for multiple endmember hyperspectral sparse unmixing. The proposed algorithm overcomes the shortcomings of OMP and also addresses the problem of endmember variability. The ANC and ASC constraints are first added to the OMP algorithm; the endmember set is then refined according to the relative increase in root-mean-square error (RMSE) to avoid over-fitting; finally, pixels are unmixed using their optimal endmember sets. Experiments on simulated and real hyperspectral data show that FOMP gives favorable unmixing results, with an abundance RMSE much lower than that of OMP and simple spectral mixture analysis (sSMA), and strong robustness to noise. This indicates that multiple endmember spectral mixture analysis is more reasonable.
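For reference, plain OMP, the baseline that this abstract improves on, can be sketched as follows in Python. This is the unconstrained algorithm only: it enforces neither ANC nor ASC and does not implement FOMP or its endmember refinement step. The function names, the list-of-columns representation of the spectral library, and the use of normal equations for the least-squares step are assumptions made for a short, dependency-free example.

```python
import math


def solve(mat, rhs):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(mat[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[row][c] -= factor * aug[col][c]
    sol = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * sol[j] for j in range(i + 1, n))
        sol[i] = (aug[i][n] - tail) / aug[i][i]
    return sol


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def omp(atoms, y, k):
    """Select k atoms greedily; returns {atom index: coefficient}."""
    support, coef, residual = [], [], list(y)
    for _ in range(k):
        # Pick the unused atom with the largest normalized correlation to the residual
        best = max((j for j in range(len(atoms)) if j not in support),
                   key=lambda j: abs(dot(atoms[j], residual)) / math.sqrt(dot(atoms[j], atoms[j])))
        support.append(best)
        # Refit all selected atoms by least squares (normal equations)
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        coef = solve(gram, [dot(atoms[i], y) for i in support])
        residual = [yi - sum(c * atoms[j][t] for c, j in zip(coef, support))
                    for t, yi in enumerate(y)]
    return dict(zip(support, coef))
```

Because the refit is an unconstrained least-squares solve, the returned coefficients can be negative and need not sum to one, which is the gap that the ANC and ASC constraints are meant to close.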
Funding: Supported in part by the National Natural Science Foundation of China (no. 61571373, no. 61501383, no. U1734209, no. U1709219), in part by the Key International Cooperation Project of Sichuan Province (no. 2017HH0002), in part by a Marie Curie Fellowship (no. 792406), in part by the National Science and Technology Major Project under Grant 2016ZX03001018-002, and in part by the NSFC China-Swedish project (no. 6161101297).
Abstract: A polar-coded sparse code multiple access (SCMA) system is conceived in this paper. A simple but new iterative multiuser detection framework is proposed, consisting of a message passing algorithm (MPA) based multiuser detector and a soft-input soft-output (SISO) successive cancellation (SC) polar decoder. In particular, the SISO polar decoding process is realized by a specifically designed soft re-encoder, which is concatenated to the original SC decoder. This soft re-encoder is capable of reconstructing the soft information of the entire polar codeword from the previously detected log-likelihood ratios (LLRs) of the information bits. Benefiting from the soft re-encoding algorithm, the resultant iterative detection strategy obtains a salient coding gain. Our simulation results demonstrate that the proposed polar-coded SCMA achieves a significant improvement in error performance in additive white Gaussian noise (AWGN) channels, where the conventional SISO belief propagation (BP) polar decoder aided SCMA, turbo-coded SCMA and low-density parity-check (LDPC) coded SCMA are employed as benchmarks.
Abstract: In this paper, ambient IoT is used as a typical use case of massive connections for sixth-generation (6G) mobile communications, from which we derive performance requirements to facilitate the evaluation of technical solutions. A rather complete design of unsourced multiple access is proposed, in which two key parts, a compressed sensing module for active user detection and a sparse interleaver-division multiple access (SIDMA) module, are simulated side by side on the same platform at balanced signal-to-noise ratio (SNR) operating points. With a proper combination of compressed sensing matrix, convolutional encoder and receiver algorithms, the simulated performance appears superior to the state-of-the-art benchmark, yet with relatively less complicated processing.
Funding: Supported by the National Natural Science Foundation of China (61907014, 11871248, 11701410, 61901160) and the Youth Science Foundation of Henan Normal University (2019QK03).
Abstract: Multiple orthogonal least squares (MOLS), a greedy algorithm for the recovery of sparse signals, has recently attracted considerable attention. In this paper, we consider the number of iterations required by the MOLS algorithm to recover a $K$-sparse signal $x \in \mathbb{R}^n$. We show that MOLS provides stable reconstruction of all $K$-sparse signals $x$ from $y = Ax + w$ in $\lceil 6K/M \rceil$ iterations when the matrix $A$ satisfies the restricted isometry property (RIP) with isometry constant $\delta_{7K} \le 0.094$. Compared with existing results, our sufficient condition does not depend on the sparsity level $K$.
Funding: Supported by the National Basic Research Program of China (973 Program 2012CB316000) and the National Major Projects of China (2015ZX03002010).
Abstract: Sparse code multiple access (SCMA) is a novel non-orthogonal multiple access technology considered a key component in 5G air interface design. In SCMA, the incoming bits are directly mapped to multi-dimensional constellation vectors known as SCMA codewords, which are then mapped onto blocks of physical resource elements in a sparse manner. The number of codewords that can be non-orthogonally multiplexed in each SCMA block is much larger than the number of resource elements therein, so the system is overloaded and can support a larger number of users. The joint optimization of multi-dimensional modulation and low-density spreading in SCMA codebook design enables the SCMA receiver to recover the coded bits with high reliability and low complexity. The flexibility in design and the robustness in performance further make SCMA a promising technology for meeting 5G communication demands such as massive connectivity and low-latency transmission.
Abstract: We consider the problem of reconstructing one sparse signal from a few measurements. This problem has been extensively addressed in the literature, which provides many sub-optimal methods that assure convergence to a locally optimal solution under specific conditions. There are a few measurements associated with every signal, where the size of each measurement vector is less than the size of the sparse signal. All of the sparse signals have the same unknown support. We generalize an existing algorithm for the recovery of one sparse signal from a single measurement to this problem and analyze its performance through simulations. We also compare the reconstruction performance with that of other existing algorithms. Finally, the proposed method also shows advantages over the orthogonal matching pursuit (OMP) algorithm in terms of computational complexity.
Funding: Supported by the National Natural Science Foundation of China (61077022).
Abstract: A sparse channel estimation method is proposed for doubly selective channels in multiple-input multiple-output (MIMO) orthogonal frequency division multiplexing (OFDM) systems. Based on the basis expansion model (BEM) of the channel, the joint sparsity of MIMO-OFDM channels is described. The sparse characteristics enable us to cast the channel estimation as a distributed compressed sensing (DCS) problem. Then, a low-complexity DCS-based estimation scheme is designed. Compared with conventional compressed channel estimators based on compressed sensing (CS) theory, the DCS-based method has improved efficiency because it reconstructs the MIMO channels jointly rather than addressing them separately. Furthermore, the group-sparse structure of each single channel is also depicted. To make effective use of this additional structure of the sparsity pattern, the DCS algorithm is modified. The modified algorithm further enhances the estimation performance. Simulation results demonstrate the superiority of our method over fast fading channels in MIMO-OFDM systems.
Abstract: In this paper, we propose a new semi-supervised multi-manifold learning method, called semi-supervised sparse multi-manifold embedding (S3MME), for dimensionality reduction of hyperspectral image data. S3MME exploits both labeled and unlabeled data to adaptively find neighbors of each sample from the same manifold by using an optimization program based on sparse representation, and naturally gives relative importance to the labeled samples through a graph-based methodology. It then tries to extract discriminative features on each manifold such that the data points in the same manifold become closer. The effectiveness of the proposed multi-manifold learning algorithm is demonstrated and compared through experiments on real hyperspectral images.
Funding: Projects (62125306, 62133003) supported by the National Natural Science Foundation of China; Project (TPL2019C03) supported by the Open Fund of Science and Technology on Thermal Energy and Power Laboratory, China; Project supported by the Fundamental Research Funds for the Central Universities (Zhejiang University NGICS Platform), China.
Abstract: With the increasing complexity of industrial processes, high-dimensional industrial data exhibit strong nonlinearity, bringing considerable challenges to the fault diagnosis of industrial processes. To efficiently extract deep, meaningful features that are crucial for fault diagnosis, a sparse Gaussian feature extractor (SGFE) is designed to learn a nonlinear mapping that projects the raw data into a feature space whose dimension equals the number of fault labels. The feature space is described by the one-hot encoding of the fault category label as an orthogonal basis. In this way, the deep sparse Gaussian features related to fault categories can be gradually learned from the raw data by SGFE. In the feature space, the sparse Gaussian (SG) loss function is designed to constrain the distribution of features to multiple sparse multivariate Gaussian distributions. The sparse Gaussian features are linearly separable in the feature space, which is conducive to improving the accuracy of the downstream fault classification task. The feasibility and practical utility of the proposed SGFE are verified on the MNIST handwritten digits benchmark and the Tennessee-Eastman (TE) benchmark process, respectively.
Funding: Supported by the National Natural Science Foundation of China (51509204), the Opening Project of State Key Laboratory of Acoustics (SKLA201501), and the Fundamental Research Funds for the Central Universities (3102015ZY011).
Abstract: A multiple-input multiple-output (MIMO) sonar can synthesize a large-aperture virtual uniform linear array (ULA) from a small number of physical elements. However, the large aperture is obtained at the cost of a great number of matched filters and a heavy computational load. To reduce this load, a MIMO sonar imaging method using a virtual sparse linear array (SLA) is proposed, comprising offline and online processing. In the offline processing, the virtual ULA of the MIMO sonar is thinned to a virtual SLA by the simulated annealing algorithm, and the matched filters corresponding to inactive virtual elements are removed. In the online processing, the outputs of the matched filters corresponding to active elements are collected for further multibeam processing; hence, the number of matched filters in the echo processing procedure is effectively reduced. Numerical simulations show that the proposed method reduces the computational load effectively while obtaining imaging performance similar to that of the traditional method.