Funding: Supported by the Sichuan Science and Technology Program (No. 2019YJ0356).
Abstract: Accurate automatic segmentation of gliomas into sub-regions, including peritumoral edema, necrotic core, and enhancing and non-enhancing tumor core, from 3D multimodal MRI images is challenging because of their highly heterogeneous appearance and shape. Deep convolutional neural networks (CNNs) have recently improved glioma segmentation performance. However, extensive down-sampling in CNNs, such as pooling or strided convolution, significantly decreases the initial image resolution, causing a loss of accurate spatial and object-part information, especially for small tumor sub-regions, which degrades segmentation performance. Hence, this paper proposes a novel multi-level parallel network comprising three parallel subnetworks at different levels to fully use low-level, mid-level, and high-level information and improve brain tumor segmentation. We also introduce the Combo loss function to address input class imbalance and the imbalance between false positives and false negatives in deep learning. The proposed method is trained and validated on the BraTS 2020 training and validation datasets. On the validation dataset, our method achieved mean Dice scores of 0.907, 0.830, and 0.787 for the whole tumor, tumor core, and enhancing tumor core, respectively. Compared with state-of-the-art methods, the multi-level parallel network achieves competitive results on the validation dataset.
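The Dice score reported above, and the idea of combining it with cross-entropy, can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not give the exact Combo loss definition, so the form below (a weighted sum of a class-weighted cross-entropy and a soft Dice term) is one common formulation, and the names `dice_score`, `combo_loss`, `alpha`, and `beta` are assumptions made for illustration.

```python
import math


def dice_score(pred, target, eps=1e-7):
    """Dice overlap between a prediction and a binary mask (flat sequences).

    `pred` may hold hard 0/1 labels or probabilities in [0, 1] (soft Dice).
    """
    intersection = sum(p * t for p, t in zip(pred, target))
    return (2.0 * intersection + eps) / (sum(pred) + sum(target) + eps)


def combo_loss(probs, target, alpha=0.5, beta=0.5, eps=1e-7):
    """Assumed Combo-style loss: alpha * weighted CE + (1 - alpha) * (1 - Dice).

    alpha balances the two terms; beta weights the positive-class term of the
    cross-entropy against the negative-class term. Both defaults are
    hypothetical, not values taken from the paper.
    """
    ce = 0.0
    for p, t in zip(probs, target):
        p = min(max(p, eps), 1.0 - eps)  # clip to keep log() finite
        ce -= beta * t * math.log(p) + (1.0 - beta) * (1 - t) * math.log(1.0 - p)
    ce /= len(probs)
    return alpha * ce + (1.0 - alpha) * (1.0 - dice_score(probs, target, eps))
```

In this sketch, raising `beta` above 0.5 penalizes missed positives more heavily than false alarms, which is the kind of control over the false-positive and false-negative balance that the abstract attributes to the Combo loss.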
Funding: Supported by the National High Technology Research and Development Program of China (863 Program) (2007AA04Z239) and the National Natural Science Foundation of China (60621001, 60975060).
Funding: Supported in part by the National Natural Science Foundation of China under Grant 61461013, in part by the Natural Science Foundation of Guangxi Province under Grant 2018GXNSFAA281179, and in part by the Dean Project of Guangxi Key Laboratory of Wireless Broadband Communication and Signal Processing under Grant GXKL06160103.
Abstract: In this study, we developed a system based on deep space–time neural networks for gesture recognition. When users change or the number of gesture categories increases, the accuracy of gesture recognition decreases considerably because most gesture recognition systems cannot accommodate both user differentiation and gesture diversity. To overcome the limitations of existing methods, we designed a one-dimensional parallel long short-term memory–fully convolutional network (LSTM–FCN) model to extract gesture features of different dimensions. The LSTM can learn complex temporal dynamics, whereas the FCN can predict gestures efficiently by extracting deep, abstract features of gestures in the spatial dimension. In the experiment, 50 types of gestures from five users were collected and evaluated. The experimental results demonstrate the effectiveness of this system and its robustness to various gestures and individual differences. Statistical analysis of the recognition results indicated that an average accuracy of approximately 98.9% was achieved.
Abstract: A new solution method for the direct displacement of parallel mechanisms, the wavelet network method, is proposed. Compared with classical analytical and numerical methods, this method can be extended to any parallel mechanism with any selected degree of freedom and configuration. A wavelet network suited to approximating multi-input, multi-output systems is constructed. The network is optimized by analyzing the sparseness of the input data and selecting the fitting wavelets by an orthogonalization method according to the output data. It is then applied to solve the direct displacement of a general six-degree-of-freedom parallel mechanism as a numerical example. For comparison, a BP neural network is also used for this problem. Simulation results show that the wavelet network performs better than the BP neural network. In addition, the wavelet network learns much faster than the BP network.
Abstract: A parallel neural network-based controller (PNNC) is presented for the motion control of underwater vehicles. It consists of a real-time part, a self-learning part, and a desired-state programmer, and it differs in structure from a normal adaptive neural network controller. Owing to the self-learning part, on-line learning can be performed without sample data within several sample periods, resulting in a high learning speed and good control performance. The desired-state programmer is used to obtain better learning samples for the neural network and so keep the controller stable. The developed controller is applied to the 4-degree-of-freedom control of the AUV "IUV-IV" and is successful on the simulation platform. The control performance is also compared with that of neural network controllers with different structures, such as a normal adaptive neural network, and with different learning methods. Current effects and surge velocity control are also included to demonstrate the controller's performance. The results show that the PNNC has strong potential to solve problems in the control system design of underwater vehicles.
Abstract: Training deep neural networks (DNNs) requires a significant amount of time and resources to obtain acceptable results, which severely limits their deployment on resource-limited platforms. This paper proposes DarkFPGA, a novel customizable framework to efficiently accelerate the entire DNN training on a single FPGA platform. First, we explore batch-level parallelism to enable efficient FPGA-based DNN training. Second, we devise a novel hardware architecture optimised by a batch-oriented data pattern and tiling techniques to effectively exploit parallelism. Moreover, an analytical model is developed to determine the optimal design parameters for the DarkFPGA accelerator with respect to a specific network specification and FPGA resource constraints. Our results show that, using 8-bit integers to train VGG-like networks on the CIFAR dataset on the Maxeler MAX5 platform, the accelerator runs about 10 times faster than CPU training and consumes about a third of the energy of GPU training.
Funding: Supported by the Science and Technology Plan Project of Henan Province (No. 192102310232).
Abstract: Accurate estimation of the solubility of a chemical compound is an important issue for many industrial processes. To overcome the defects of some thermodynamic models and simple correlations, a parallel neural network (PNN) model was conceived and optimized to predict the solubility of diosgenin in seven n-alkanols (C1 to C7). Linear regression analysis of the parity plots indicates that the PNN model describes the solubility of diosgenin more accurately than the ordinary neural network (ONN) model. Comparison of the average root mean square deviation (RMSD) shows that the suggested model has a slight advantage over the thermodynamic NRTL model in calculation precision. Moreover, the PNN model correctly reflects the effects of temperature and of the chain length of the alcohol solvent on the solution behavior of diosgenin, and it can estimate the solubility of diosgenin in n-alkanols with more carbon atoms.
Abstract: Scalability is an important issue in the design of interconnection networks for massively parallel systems. This paper introduces a scalable class of Hex-Cell interconnection networks for massively parallel systems, called Multilayer Hex-Cell (MLH). A node addressing scheme and a routing algorithm are also presented and discussed. An interesting feature of the proposed MLH is that it maintains a constant network degree regardless of the increase in network size, which facilitates modularity in the building blocks of scalable systems. The new node addressing scheme makes the proposed routing algorithm simple and efficient, in that it needs a minimum number of calculations to reach the destination node. Moreover, the diameter of the proposed MLH is smaller than that of the Hex-Cell network.
Funding: This work was supported by the National Natural Science Foundation of China (No. 50375001).
Abstract: This paper considers adaptive control of parallel manipulators combined with fuzzy-neural network algorithms (FNNA). With this algorithm, robustness is guaranteed by the adaptive control law and the parametric uncertainties are eliminated. FNNA is used to handle model uncertainties and external disturbances. In the proposed control scheme, we modify the weights of the fuzzy rules and apply these rules to a MIMO system of parallel manipulators with more than three degrees of freedom (DoF). The algorithm has the advantage of not requiring the inverse of the Jacobian matrix, especially for low-DoF parallel manipulators. The validity of the control scheme is shown through numerical simulations of a 6-RPS parallel manipulator with three DoF.
Abstract: To fully utilize the diversity of multi-radio nodes, a new parallel transmission method for wireless mesh networks is proposed. Unlike conventional packet transmission, which follows "one flow on one radio", it uses radio diversity to transmit packets on different radios simultaneously. Three components are presented to achieve parallel transmission: a control module, a selection module, and a schedule module. A localized selection algorithm chooses the right radios based on the quality of the wireless links. Two kinds of distributed scheduling algorithms are implemented to transmit packets on the selected radios. Finally, a parallel-adaptive routing metric is presented. NS2 simulation results show that this parallel transmission scheme can increase the average network throughput by more than 10%.
Funding: This work was supported by the National Grand Fundamental Research "973" Program of China (No. 2004CB719401).
Abstract: Objective: To reduce the execution time of neural network training. Methods: A parallel particle swarm optimization algorithm based on the master-slave model is proposed to train radial basis function neural networks; it is implemented on a cluster using MPI libraries for inter-process communication. Results: A high speed-up factor is achieved and execution time is greatly reduced. In addition, the resulting neural network has good classification accuracy on both training sets and test sets. Conclusion: Because the fitness evaluation is computationally intensive, parallel particle swarm optimization offers great advantages in speeding up neural network training.
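The master-slave pattern described above can be sketched as follows: a master updates particle velocities and positions, while workers evaluate fitness in parallel. This is only an illustration under stated assumptions. It uses a thread pool from Python's standard library in place of the MPI cluster named in the abstract, a simple sphere function in place of the radial basis function network training error, and hypothetical coefficients (inertia 0.7, acceleration 1.5) that do not come from the paper.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sphere(x):
    """Stand-in fitness; the paper's fitness would be an RBF network's error."""
    return sum(v * v for v in x)


def parallel_pso(fitness, dim=3, n_particles=12, iters=60, workers=4, seed=0):
    """Master-slave PSO sketch: the master moves particles, workers score them."""
    rng = random.Random(seed)
    pos = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        pbest_val = list(pool.map(fitness, pos))  # workers: initial evaluation
        g = min(range(n_particles), key=pbest_val.__getitem__)
        gbest, gbest_val = pbest[g][:], pbest_val[g]
        for _ in range(iters):
            # Master: update velocities and positions from personal/global bests.
            for i in range(n_particles):
                for d in range(dim):
                    r1, r2 = rng.random(), rng.random()
                    vel[i][d] = (0.7 * vel[i][d]
                                 + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                                 + 1.5 * r2 * (gbest[d] - pos[i][d]))
                    pos[i][d] += vel[i][d]
            # Workers: evaluate all particles concurrently.
            vals = list(pool.map(fitness, pos))
            # Master: collect results and update the bests.
            for i, v in enumerate(vals):
                if v < pbest_val[i]:
                    pbest[i], pbest_val[i] = pos[i][:], v
                    if v < gbest_val:
                        gbest, gbest_val = pos[i][:], v
    return gbest, gbest_val
```

Threads give little speed-up for pure-Python fitness functions; for a genuinely expensive fitness evaluation, the same structure would more plausibly use a process pool or MPI ranks, with only the fitness calls distributed.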
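Abstract: The continual increase in model depth and the inconsistency of processed sequence lengths pose great challenges for optimizing the performance of recurrent neural networks (RNNs) on different processors. For the independently developed long-vector processor FT-M7032, an efficient RNN acceleration engine is implemented. The engine uses a row-major matrix-vector multiplication algorithm and a data-aware multi-core parallelization scheme to improve the computational efficiency of matrix-vector multiplication; it uses a two-level kernel fusion optimization method to reduce the overhead of transferring temporary data; and it uses hand-written assembly to optimize several operators, further exploiting the performance potential of the long-vector processor. Experiments show that the RNN inference engine on the long-vector processor achieves high performance: compared with a multi-core ARM CPU and an Intel Golden CPU, the long short-term memory (LSTM) network, an RNN-type model, achieves speedups of up to 62.68 times and 3.12 times, respectively.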
Funding: Supported by the National Natural Science Foundation of China (Nos. 61201346, 61175005 and 61401049) and the Fundamental Research Funds for the Central Universities (No. CDJZR14125501).
Abstract: The 252Cf source-driven verification system (SDVS) can recognize the enrichment of fissile material from the enrichment-sensitive autocorrelation functions of a detector signal in 252Cf source-driven noise-analysis (SDNA) measurements. We propose a parallel and optimized genetic Elman network (POGEN) to identify the enrichment of 235U based on the physical properties of the measured autocorrelation functions. Theoretical analysis and experimental results indicate that, for fissile materials of 4 different enrichments, the POGEN-based algorithm achieves higher recognition accuracy than the integrated autocorrelation function (IAF) method, owing to higher information utilization, a more efficient network architecture, and optimized parameters.
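For readers unfamiliar with the input features mentioned above, the sketch below computes a normalized sample autocorrelation function of a signal. It is a generic textbook estimator, not the SDNA measurement procedure; the function name `autocorrelation` and the choice to remove the mean and normalize by the lag-0 energy are assumptions made for illustration.

```python
def autocorrelation(signal, max_lag):
    """Normalized sample autocorrelation for lags 0..max_lag.

    The signal is mean-centered, and each lag's sum of products is divided by
    the lag-0 energy, so the value at lag 0 is always 1.0.
    """
    n = len(signal)
    mean = sum(signal) / n
    centered = [v - mean for v in signal]
    energy = sum(v * v for v in centered)
    if energy == 0:
        raise ValueError("constant signal has no defined normalized autocorrelation")
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / energy
        for k in range(max_lag + 1)
    ]
```

A vector of such lag values is the kind of feature that a classifier like the network described in the abstract could take as input, although the abstract does not state how the measured functions are preprocessed.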
Abstract: The conventional methodology for designing QC-LDPC decoders targets the fixed configurations used in wireless communication standards, where the largest supported expansion factor Z (the parallelism of the layered decoding) is a fixed number. In this paper, we study a circular-shifting network for decoding LDPC codes with an arbitrary Z factor, especially codes with large Z (Z > P), where P is the decoder parallelism. By buffering P-length slices from memory and assembling the shifted slices in a fixed routine, the P-parallelism shift network can process Z-parallelism circular-shifting tasks. The implementation results show that the proposed network for arbitrary-sized data shifting incurs an additional resource cost of only one times that of the traditional solution, which supports data shifting only up to size P, and achieves significant savings in area and routing complexity.
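The idea of performing a Z-wide circular shift with only P-wide slices can be modeled functionally as below. This is a behavioral sketch, not the paper's hardware architecture: the function names, the left-rotation convention, and the element-by-element assembly of each output slice are assumptions, and the model ignores timing and buffering depth.

```python
def split_into_slices(data, p):
    """Model memory words: consecutive slices of at most p elements."""
    return [data[i:i + p] for i in range(0, len(data), p)]


def circular_shift_by_slices(data, shift, p):
    """Left-rotate `data` (length Z) by `shift`, producing p elements at a time.

    Each output slice is assembled from the buffered input slices that hold
    its source positions, so no step handles more than p output elements.
    """
    z = len(data)
    slices = split_into_slices(data, p)
    out = []
    for start in range(0, z, p):           # one output slice per step
        out_slice = []
        for i in range(start, min(start + p, z)):
            src = (i + shift) % z          # source position in the Z-length vector
            out_slice.append(slices[src // p][src % p])
        out.extend(out_slice)
    return out
```

For example, with Z = 10 and P = 4, shifting by 3 yields the same result as rotating the whole vector at once, while each step touches only P-length slices.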