Funding: Supported in part by the National Natural Science Foundation of China (Nos. 62071026, 62201152, and 61941106), the Natural Science Foundation of Fujian Province (No. 2021J05034), and the Key Project of Science and Technology Innovation of Fujian Province (No. 2021G02006).
Abstract: Low-density parity-check (LDPC) codes are not only capacity-approaching but also well suited to high-throughput implementation. They have therefore been the most popular codes for high-speed data transmission over the past two decades. Thanks to the low density of their parity-check matrices, the optimal maximum a posteriori probability decoding of LDPC codes can be approximated by message-passing decoding, which has linear complexity and is highly parallel. The paper then shows that this approximation must be carried out on Tanner graphs without short cycles or small trapping sets. Finally, it demonstrates that LDPC codes designed with the aid of computer simulation and asymptotic analysis tools can approach the channel capacity. Moreover, a quasi-cyclic (QC) structure is introduced to significantly facilitate high-throughput implementation. Compared with other capacity-approaching codes, QC-LDPC codes can provide better area efficiency and energy efficiency. As a result, they are widely used in communication systems such as the Landsat satellites, the Chinese Chang’e lunar mission, and 5G mobile communications. In addition, their extension to non-binary Galois fields has been adopted as the channel coding scheme for the BeiDou navigation satellite system.
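To make the idea of message-passing decoding on a Tanner graph concrete, here is a minimal sketch of min-sum decoding, a common simplification of sum-product decoding. It is not taken from the paper. The function name `min_sum_decode`, the LLR sign convention (positive means "bit is 0"), and the toy parity-check matrix are all assumptions for illustration. The matrix is that of the (7,4) Hamming code, which is neither sparse nor free of short cycles, so it only shows the mechanics.

```python
def min_sum_decode(H, llr, max_iters=20):
    """Min-sum message passing. llr[j] > 0 means bit j is more likely 0."""
    m, n = len(H), len(H[0])
    # Tanner graph: for each check node, the variable nodes it touches.
    check_nbrs = [[j for j in range(n) if H[i][j]] for i in range(m)]
    # Check-to-variable messages, one per edge, initialised to zero.
    c2v = {(i, j): 0.0 for i in range(m) for j in check_nbrs[i]}
    bits = [0 if x >= 0 else 1 for x in llr]
    for _ in range(max_iters):
        # Posterior LLR = channel LLR + all incoming check messages.
        total = [float(x) for x in llr]
        for (i, j), msg in c2v.items():
            total[j] += msg
        bits = [0 if t >= 0 else 1 for t in total]
        # Stop as soon as every parity check is satisfied.
        if all(sum(bits[j] for j in check_nbrs[i]) % 2 == 0 for i in range(m)):
            return bits, True
        # Check-node update: sign product and minimum magnitude of the
        # extrinsic variable-to-check messages on the other edges.
        new_c2v = {}
        for i in range(m):
            for j in check_nbrs[i]:
                sign, mag = 1.0, float("inf")
                for k in check_nbrs[i]:
                    if k == j:
                        continue
                    v2c = total[k] - c2v[(i, k)]
                    if v2c < 0:
                        sign = -sign
                    mag = min(mag, abs(v2c))
                new_c2v[(i, j)] = sign * mag
        c2v = new_c2v
    return bits, False


# Toy example (assumed values): all-zero codeword, bit 0 received in error.
H = [[1, 1, 0, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [0, 1, 1, 1, 0, 0, 1]]
bits, converged = min_sum_decode(H, [-1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0])
```

The work per iteration is proportional to the number of edges in the Tanner graph, and every check-node update is independent of the others, which is the source of the linear complexity and parallelism mentioned in the abstract.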
Funding: Project supported by the National Basic Research Program of China (Grant No. 2013CB932804) and the National Natural Science Foundation of China (Grant Nos. 11121403 and 11225526), with support from Fondazione CRT under project SIBYL, initiative "La Ricerca dei Talenti".
Abstract: Directed networks such as gene regulation networks and neural networks are connected by arcs (directed links). The nodes in a directed network are often strongly intertwined by a huge number of directed cycles, which leads to complex information-processing dynamics in the network and makes it highly challenging to infer the intrinsic direction of information flow. In this theoretical paper, based on the principle of minimum feedback, we explore the node hierarchy of directed networks and distinguish feedforward and feedback arcs. Nearly optimal node hierarchy solutions, which minimize the number of feedback arcs from lower-level nodes to higher-level nodes, are constructed by belief-propagation and simulated-annealing methods. For real-world networks, we quantify the extent of feedback scarcity by comparison with the ensemble of direction-randomized networks and identify the most important feedback arcs. Our methods are also useful for visualizing directed networks.
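The minimum-feedback objective can be illustrated with a small sketch. It makes simplifying assumptions that are not from the paper: the hierarchy is represented as a strict ordering of the nodes (earlier position means higher level), an arc counts as feedback when it points from a later node to an earlier one, and the search is a plain swap-based simulated annealing with a linear cooling schedule. The names `feedback_arcs` and `anneal_order` are hypothetical.

```python
import math
import random


def feedback_arcs(order, arcs):
    """Count arcs (u, v) that point from a later node to an earlier one."""
    pos = {node: i for i, node in enumerate(order)}
    return sum(1 for u, v in arcs if pos[u] > pos[v])


def anneal_order(nodes, arcs, steps=2000, t0=1.0, seed=0):
    """Swap-based simulated annealing over node orderings (assumed scheme)."""
    rng = random.Random(seed)
    order = list(nodes)
    cost = feedback_arcs(order, arcs)
    best, best_cost = order[:], cost
    for step in range(steps):
        temp = t0 * (1.0 - step / steps) + 1e-9  # linear cooling
        i, j = rng.sample(range(len(order)), 2)  # two distinct positions
        order[i], order[j] = order[j], order[i]
        new_cost = feedback_arcs(order, arcs)
        accept = new_cost <= cost or rng.random() < math.exp((cost - new_cost) / temp)
        if accept:
            cost = new_cost
            if cost < best_cost:
                best, best_cost = order[:], cost
        else:
            order[i], order[j] = order[j], order[i]  # undo the swap
    return best, best_cost


# A directed 3-cycle: every ordering leaves at least one feedback arc.
arcs = [("a", "b"), ("b", "c"), ("c", "a")]
best, best_cost = anneal_order(["a", "c", "b"], arcs)
```

Recomputing the full cost after each swap keeps the sketch short; for large networks one would update only the arcs incident to the two swapped nodes.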
Abstract: The object-oriented model possesses inherent concurrency, and the integration of concurrency and object orientation is a promising new field. MPI is a message-passing standard that has been adopted by more and more systems. This paper proposes a novel approach to concurrent object-oriented programming based on the Message-Passing Interface (MPI), in which future method communication is adopted between concurrent objects. A state behavior set is proposed to solve the inheritance anomaly, and a bounded buffer is taken as an example to illustrate the proposal. The definition of the ParaMPI class, which is the most important class in the concurrent class library, and implementation issues are briefly described.
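The abstract does not give the details of the state behavior set or of the ParaMPI class, so the following is only a rough sketch of the general idea using Python threads rather than MPI. It assumes that a state behavior set can be read as a table mapping each object state to the methods enabled in that state, and that future method communication means a call returns a future immediately. The class `BoundedBuffer` and its `ENABLED` table are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class BoundedBuffer:
    # Assumed reading of a "state behavior set": state -> enabled methods.
    ENABLED = {"empty": {"put"}, "partial": {"put", "get"}, "full": {"get"}}

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.cond = threading.Condition()

    def state(self):
        if not self.items:
            return "empty"
        return "full" if len(self.items) == self.capacity else "partial"

    def _call(self, name, action):
        # Block until the method is enabled in the current state.
        with self.cond:
            self.cond.wait_for(lambda: name in self.ENABLED[self.state()])
            result = action()
            self.cond.notify_all()
            return result

    def put(self, item):
        return self._call("put", lambda: self.items.append(item))

    def get(self):
        return self._call("get", lambda: self.items.pop(0))


# Future-style calls: submit() returns at once, result() waits for the value.
buf = BoundedBuffer(capacity=2)
with ThreadPoolExecutor(max_workers=2) as pool:
    consumer = pool.submit(lambda: [buf.get() for _ in range(5)])
    producer = pool.submit(lambda: [buf.put(i) for i in range(5)])
    received = consumer.result()
```

Because the synchronization constraints live in one table rather than inside each method body, a subclass could in principle add a state or a method by extending the table, which is the kind of separation usually proposed to avoid the inheritance anomaly.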
Abstract: The Fourier transform is very important to numerous applications in science and engineering. However, its usefulness is hampered by its computational expense. In this paper, in an attempt to develop a faster method for computing Fourier transforms, the authors present parallel implementations of two new algorithms developed for the type IV Discrete Cosine Transform (DCT-IV), which support the new interleaved fast Fourier transform method. The authors discuss the realizations of their implementations using two paradigms. The first involved commodity equipment and the Message-Passing Interface (MPI) library. The second used the RapidMind development platform and the Cell Broadband Engine (BE) processor. These experiments indicate that the authors' rotation-based algorithm is preferable to their lifting-based algorithm on the platforms tested, with increased efficiency demonstrated by their MPI implementation for large data sets. Finally, the authors outline future work by discussing an architecture-oriented method for computing DCT-IVs that promises further optimization. The results indicate a promising fresh direction in the search for efficient ways to compute Fourier transforms.
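For reference, the DCT-IV itself can be written directly from its textbook definition. The sketch below is a naive O(N²) evaluation, not the rotation-based or lifting-based algorithm of the paper; the function name `dct4` and the orthonormal scaling by sqrt(2/N) are choices made here. With that scaling the transform is its own inverse, which makes a convenient correctness check for any faster implementation.

```python
import math


def dct4(x):
    """Orthonormal DCT-IV by direct summation:
    X[k] = sqrt(2/N) * sum_j x[j] * cos(pi/N * (j + 1/2) * (k + 1/2)).
    """
    n = len(x)
    scale = math.sqrt(2.0 / n)
    return [
        scale * sum(x[j] * math.cos(math.pi / n * (j + 0.5) * (k + 0.5))
                    for j in range(n))
        for k in range(n)
    ]


# Applying the orthonormal DCT-IV twice recovers the input (up to rounding).
x = [1.0, 2.0, 3.0, 4.0]
roundtrip = dct4(dct4(x))
```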
Abstract: The distributed computer system described in this paper is a set of computer nodes interconnected in an interconnection network via packet-switching interfaces. The nodes communicate with each other by means of message-passing protocols. This paper presents the implementation of rendezvous facilities as high-level primitives provided by a parallel programming language to support interprocess communication and synchronisation.
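A rendezvous is a synchronous exchange: the sender does not proceed until the receiver has taken the message. The abstract does not describe the language primitives themselves, so the following is only a thread-based sketch of the semantics, with a hypothetical `Rendezvous` class built on the standard `queue.Queue`.

```python
import queue
import threading


class Rendezvous:
    """Synchronous channel: send() returns only after receive() took the message."""

    def __init__(self):
        self._slot = queue.Queue(maxsize=1)
        self._senders = threading.Lock()  # one sender in the rendezvous at a time

    def send(self, msg):
        with self._senders:
            self._slot.put(msg)
            self._slot.join()  # wait until the receiver calls task_done()

    def receive(self):
        msg = self._slot.get()
        self._slot.task_done()  # releases the blocked sender
        return msg


# The sender thread stays blocked until the main thread receives.
channel = Rendezvous()
sender = threading.Thread(target=channel.send, args=("ping",))
sender.start()
message = channel.receive()
sender.join()
```

In a distributed setting the same handshake would be carried by request and acknowledgement packets between nodes instead of a shared queue.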
Abstract: An area-efficient design methodology is proposed for analog decoding implementations of the rate-1/2 accumulate repeat-4 jagged-accumulate (AR4JA) low-density parity-check (LDPC) code. The proposed approach uses an optimized decoding architecture and a regularized routing network, so that the overall wiring overhead is minimized and the silicon area utilization is significantly improved. The prototype chip used to verify the approach is fully integrated in a four-metal double-poly 0.35 μm complementary metal oxide semiconductor (CMOS) technology and includes an input-output interface that maximizes the decoder throughput. The decoding core area is 2.02 mm² with a post-layout area utilization of 80%. The decoder was successfully tested at the maximum data rate of 10 Mbit/s, with a core power consumption of 6.78 mW at 3.3 V, which corresponds to an energy per decoded bit of 0.677 nJ. The proposed analog LDPC decoder, with low processing power and high reliability, is suitable for space- and power-constrained spacecraft systems.
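The energy-per-bit figure follows from dividing the core power by the data rate. As a quick check using the numbers quoted in the abstract (the symbols E_b, P_core, and R_b are notation introduced here, not the paper's):

```latex
E_b = \frac{P_{\text{core}}}{R_b}
    = \frac{6.78 \times 10^{-3}\ \text{W}}{10 \times 10^{6}\ \text{bit/s}}
    = 6.78 \times 10^{-10}\ \text{J/bit}
    \approx 0.68\ \text{nJ/bit}
```

This agrees with the reported 0.677 nJ to within rounding; the small difference presumably comes from rounding in the quoted power figure.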