The coupled vibration of a hydraulic pipe system consisting of two pipes is studied. The pipes are installed in parallel, fixed at their ends, and restrained by clips to a single bracket at their midpoints. The pipe subjected to base excitation at its left end is called the active pipe, while the pipe without excitation is called the passive pipe. The clips between the two pipes act as the bridge for vibration energy, and adjacent natural frequencies enhance the vibration coupling. The governing equation of the coupled system is derived from the generalized Hamilton principle and discretized in modal space, with modal correction applied during the discretization. The investigation of the natural characteristics indicates that the adjacent natural frequencies can be adjusted through the stiffness of the two clips and the bracket. The harmonic balance method (HBM) is used to study the responses in the adjacent natural frequency region. The results show that vibration energy is transmitted readily from the active pipe to the passive pipe via the clips together with a flexible bracket when their locations are not node points. The adjacent natural frequencies may produce wide resonance curves with two peaks for both pipes. The stiffness of the clips and bracket can relieve the vibration coupling. It is suggested that the clip on the passive pipe should be soft and the bracket should be sufficiently stiff. In this way, the vibration energy is reflected by the almost rigid bracket and is hard to transfer to the passive pipe through a soft clip. The best choice is to place the clips at the pipe node points. The work offers suggestions for weakening coupled vibration in the dynamic design of coupled hydraulic pipe systems.
Current applications, consisting of multiple replicas, are packaged into lightweight containers together with their execution dependencies. Because the distribution efficiency of very large images dominates container startup time (e.g., for distributed deep learning applications), the image "warm-up" technique, which prefetches the images of these replicas to destination nodes in the cluster, has been proposed. However, the current image "warm-up" technique focuses solely on distributing identical images and is ineffective when different images must be distributed to destination nodes. To address this problem, this paper proposes Hound, a simple but efficient cluster image distribution system based on Docker. To support the diverse image distribution requests of cluster nodes, Hound additionally adopts node-level parallelism (i.e., downloading images to destination nodes in parallel) to further improve the efficiency of image distribution. The experimental results demonstrate that Hound outperforms Docker, the Kubernetes container runtime interface (CRI-O), and Docker-compose in image distribution performance when cluster nodes request different images. Moreover, the scalability of Hound is evaluated in a ten-node scenario.
Energy conservation currently draws wide attention in industrial manufacturing systems. In recent years, many studies have aimed at reducing energy consumption in manufacturing, and scheduling is regarded as an effective approach. This paper puts forward a multi-objective stochastic parallel machine scheduling problem that considers deteriorating and learning effects. In this problem, the real processing time of a job is calculated from its processing speed and normal processing time. To describe the problem mathematically, a multi-objective stochastic programming model that minimizes makespan and energy consumption is formulated. Furthermore, we develop a multi-objective multi-verse optimization combined with a stochastic simulation method to solve it. In this approach, the multi-verse optimization is adopted to find favorable solutions in the huge solution domain, while the stochastic simulation method is employed to assess them. Comparison experiments on test problems verify that the developed approach performs better on the considered problem than two classic multi-objective evolutionary algorithms.
The growing number of uncertain factors in power systems and their increasingly complex operation modes place higher requirements on online transient stability assessment methods. Traditional model-driven methods have clear physical mechanisms and reliable evaluation results, but their calculation process is time-consuming, while data-driven methods have strong fitting ability and fast calculation speed, but their evaluation results lack interpretability. Combining these two kinds of methods is therefore a future development trend for transient stability assessment. In this paper, the rate of change of kinetic energy method is used to calculate the transient stability in the model-driven stage, and the support vector machine and the extreme learning machine, which have different internal principles, are each used to predict the transient stability in the data-driven stage. To quantify the credibility level of the data-driven methods, a credibility index of the output results is proposed. A switching function that controls whether the rate of change of kinetic energy method is activated is then established based on this index. Thus, a new parallel integrated model-driven and data-driven online transient stability assessment method is proposed. The accuracy, efficiency, and adaptability of the proposed method are verified by numerical examples.
Parallel connection of multiple inverters is an important means of addressing the expansion, reserve, and protection of distributed power generation such as photovoltaics. In view of the shortcomings of traditional droop control methods, such as weak anti-interference ability, low tracking accuracy of the inverter output voltage, and serious circulating currents, a finite control set model predictive control (FCS-MPC) strategy for a microgrid multi-inverter parallel system based on Mixed Logical Dynamical (MLD) modeling is proposed. First, the MLD modeling method introduces logical variables and combines discrete events and continuous events into an overall differential equation, which makes the modeling more accurate. Then a predictive controller is designed based on the model, and constraints are added to the objective function. Through online optimization, the controller can handle real-time changes of the control system, achieve higher tracking accuracy of the inverter output voltage and lower total harmonic distortion (THD), and suppress the circulating current between the inverters to obtain a good dynamic response. Finally, simulation is carried out in MATLAB/Simulink to verify the correctness of the model and the rationality of the proposed strategy. This paper aims to provide guidance for the design and optimal control of multi-inverter parallel systems.
The heterogeneous variational nodal method (HVNM) has emerged as a potential approach for solving high-fidelity neutron transport problems. However, achieving accurate results with HVNM in large-scale problems using high-fidelity models has been challenging due to the prohibitive computational costs. This paper presents an efficient parallel algorithm tailored for HVNM based on the Message Passing Interface standard. The algorithm evenly distributes the response matrix sets among processors during the matrix formation process, thus enabling independent construction without communication. Once the formation tasks are completed, a collective operation merges and shares the matrix sets among the processors. For the solution process, the problem domain is decomposed into subdomains assigned to specific processors, and the red-black Gauss-Seidel iteration is employed within each subdomain to solve the response matrix equation. Point-to-point communication is conducted between adjacent subdomains to exchange data along the boundaries. The accuracy and efficiency of the parallel algorithm are verified using the KAIST and JRR-3 test cases. Numerical results obtained with multiple processors agree well with those obtained from Monte Carlo calculations. The parallelization of HVNM results in eigenvalue errors of 31 pcm/-90 pcm and fission rate RMS errors of 1.22%/0.66%, respectively, for the 3D KAIST problem and the 3D JRR-3 problem. In addition, the parallel algorithm significantly reduces computation time, with an efficiency of 68.51% using 36 processors in the KAIST problem and 77.14% using 144 processors in the JRR-3 problem.
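The red-black Gauss-Seidel iteration named in the abstract above can be illustrated on a much simpler model problem. The sketch below is not the HVNM response matrix solver: it assumes a 2D Laplace equation on a small square grid with fixed boundary values, and the function names `red_black_gauss_seidel` and `make_grid` are hypothetical. It only shows the colouring idea, in which all "red" points are updated from "black" neighbours and vice versa, so that every point of one colour could be updated independently.

```python
def red_black_gauss_seidel(grid, sweeps):
    """In-place red-black Gauss-Seidel for the 2D Laplace equation.

    Boundary values of `grid` are held fixed; only interior points are relaxed.
    This is an assumed model problem, not the response matrix equation.
    """
    rows, cols = len(grid), len(grid[0])
    for _ in range(sweeps):
        # Colour 0 = "red" points (i + j even), colour 1 = "black" points.
        for colour in (0, 1):
            for i in range(1, rows - 1):
                for j in range(1, cols - 1):
                    if (i + j) % 2 == colour:
                        # A point of one colour only reads neighbours of the
                        # other colour, so same-colour updates are independent.
                        grid[i][j] = 0.25 * (grid[i - 1][j] + grid[i + 1][j]
                                             + grid[i][j - 1] + grid[i][j + 1])
    return grid


def make_grid(n):
    # Hypothetical test problem: top edge held at 1.0, other edges at 0.0.
    grid = [[0.0] * n for _ in range(n)]
    grid[0] = [1.0] * n
    return grid


if __name__ == "__main__":
    solution = red_black_gauss_seidel(make_grid(5), 200)
    print(solution[2][2])
```

In a domain-decomposed setting such as the one the abstract describes, each processor would run sweeps like these on its own subdomain and exchange boundary values with its neighbours between colours; that communication step is omitted here.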
The nonlinear stability of plane parallel shear flows with respect to tilted perturbations is studied by energy methods. A tilted perturbation is one that forms an angle θ∈(0,π/2) with the direction of the basic flow. By defining an energy functional, it is proven that plane parallel shear flows are unconditionally nonlinearly exponentially stable for tilted streamwise perturbations when the Reynolds number is below a certain critical value and the boundary conditions are either rigid or stress-free. In the case of stress-free boundaries, by taking advantage of the poloidal-toroidal decomposition of a solenoidal field to define energy functionals, it can even be shown that plane parallel shear flows are unconditionally nonlinearly exponentially stable for all Reynolds numbers, where the tilted perturbation can be either spanwise or streamwise.
The Message Passing Interface (MPI) is a widely accepted standard for parallel computing on distributed memory systems. However, MPI implementations can contain defects that impact the reliability and performance of parallel applications. Detecting and correcting these defects is crucial, yet there is a lack of published models specifically designed for correcting MPI defects. To address this, we propose a model for detecting and correcting MPI defects (DC_MPI), which aims to detect and correct defects in various types of MPI communication, including blocking point-to-point (BPTP), nonblocking point-to-point (NBPTP), and collective communication (CC). The defects addressed by the DC_MPI model include illegal MPI calls, deadlocks (DL), race conditions (RC), and message mismatches (MM). To assess the effectiveness of the DC_MPI model, we performed experiments on a dataset consisting of 40 MPI codes. The results indicate that the model detected defects in 37 out of 40 codes, an overall detection accuracy of 92.5%. Additionally, the execution duration of the DC_MPI model ranged from 0.81 to 1.36 s. These findings show that the DC_MPI model is useful in detecting and correcting defects in MPI implementations, thereby enhancing the reliability and performance of parallel applications. The DC_MPI model fills an important research gap and provides a valuable tool for improving the quality of MPI-based parallel computing systems.
In this research, we present pure open multi-processing (OpenMP), pure message passing interface (MPI), and hybrid MPI/OpenMP parallel solvers within the dynamic explicit central difference algorithm for the coining process to address the challenge of capturing fine relief features of approximately 50 microns. Achieving such precision demands at least 7 million tetrahedron elements, surpassing the capabilities of the traditional serial programs previously developed. To mitigate data races when calculating internal forces, intermediate arrays are introduced within the OpenMP directive. This helps ensure proper synchronization and avoid conflicts during parallel execution. Additionally, in the MPI implementation, the coins are partitioned into the desired number of regions. This division allows for efficient distribution of computational tasks across multiple processes. Numerical simulation examples are conducted to compare the three solvers with serial programs, evaluating correctness, acceleration ratio, and parallel efficiency. The results reveal a relative error of approximately 0.3% in forming force between the parallel and serial solvers, while the predicted insufficient material zones align with experimental observations. Additionally, speedup ratio and parallel efficiency are assessed for the coining process simulation. The pure MPI parallel solver achieves a maximum acceleration of 9.5 on a single computer (utilizing 12 cores), and the hybrid solver exhibits a speedup ratio of 136 in a cluster (using 6 compute nodes and 12 cores per compute node), showing the strong scalability of the hybrid MPI/OpenMP programming model. This approach effectively meets the simulation requirements for commemorative coins with intricate relief patterns.
This paper investigates the effective capacity of a point-to-point ultra-reliable low latency communication (URLLC) transmission over multiple parallel sub-channels at finite blocklength (FBL) with imperfect channel state information (CSI). Based on reasonable assumptions and approximations, we derive the effective capacity as a function of the pilot length, decoding error probability, transmit power, and the number of sub-channels. We then reveal the significant impact of these parameters on the effective capacity. A closed-form lower bound of the effective capacity is derived, and an alternating optimization based algorithm is proposed to find the optimal pilot length and decoding error probability. Simulation results validate our theoretical analysis and show that the closed-form lower bound is very tight. In addition, simulations of the optimized effective capacity provide insights into pilot length and decoding error probability optimization for evaluating the optimal parameters in realistic systems.
Millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) plays an important role in fifth-generation (5G) mobile communications and beyond owing to its potential for high capacity. However, channel estimation has become very challenging due to the use of massive MIMO antenna arrays. Fortunately, the mmWave channel has strong sparsity in the spatial angle domain, and compressed sensing can be used to convert the original channel matrix into a sparse matrix on a discrete angle grid. The high-dimensional channel matrix estimation is thus transformed into a sparse recovery problem with greatly reduced computational complexity. However, the path angles in an actual scene appear randomly and are unlikely to lie exactly on the quantized angle grid, which leads to the problem of power leakage. Moreover, multiple paths with randomly distributed angles cause serious inter-path interference and further degrade channel estimation performance. To address these off-grid issues, we propose a parallel interference cancellation assisted multi-grid matching pursuit (PIC-MGMP) algorithm in this paper. The proposed algorithm consists of three stages: coarse estimation, refined estimation, and inter-path cyclic iterative interference cancellation. More specifically, the angular resolution is improved by locally refining the grid to reduce power leakage, while the inter-path interference is eliminated by parallel interference cancellation (PIC), and the two together improve the estimation accuracy. Simulation results show that, compared with the traditional orthogonal matching pursuit (OMP) algorithm, the normalized mean square error (NMSE) of the proposed algorithm decreases by over 14 dB in the case of 2 paths.
A novel image encryption scheme based on parallel compressive sensing and edge detection embedding technology is proposed to improve visual security. Firstly, the plain image is sparsely represented using the discrete wavelet transform. Then, the coefficient matrix is scrambled and compressed to obtain a size-reduced image using the Fisher–Yates shuffle and parallel compressive sensing. Subsequently, to increase the security of the proposed algorithm, the compressed image is re-encrypted through permutation and diffusion to obtain a noise-like secret image. Finally, an adaptive embedding method based on edge detection for different carrier images is proposed to generate a visually meaningful cipher image. To improve the plaintext sensitivity of the algorithm, the counter mode is combined with the hash function to generate keys for chaotic systems. Additionally, an effective permutation method is designed to scramble the pixels of the compressed image in the re-encryption stage. The simulation results and analyses demonstrate that the proposed algorithm performs well in terms of visual security and decryption quality.
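The scrambling step mentioned in the abstract above relies on the Fisher–Yates shuffle. As a minimal sketch of that one ingredient, the code below builds a keyed permutation and its inverse for a flat list of coefficients. It assumes Python's `random.Random(seed)` as a stand-in for the paper's chaotic key stream, which is not described in the abstract, and the helper names are hypothetical. It is not a secure cipher and does not reproduce the paper's scheme.

```python
import random


def fisher_yates_permutation(n, seed):
    """Return a permutation of range(n) built with the Fisher-Yates shuffle."""
    rng = random.Random(seed)  # assumption: stand-in for a chaotic key stream
    perm = list(range(n))
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def scramble(values, seed):
    """Reorder `values` according to the keyed permutation."""
    perm = fisher_yates_permutation(len(values), seed)
    return [values[p] for p in perm]


def unscramble(scrambled, seed):
    """Invert `scramble` by regenerating the same permutation from the key."""
    perm = fisher_yates_permutation(len(scrambled), seed)
    restored = [None] * len(scrambled)
    for position, p in enumerate(perm):
        restored[p] = scrambled[position]
    return restored


if __name__ == "__main__":
    coefficients = list(range(10))
    mixed = scramble(coefficients, seed=2024)
    print(mixed, unscramble(mixed, seed=2024) == coefficients)
```

Because the receiver regenerates the permutation from the same key, only the key needs to be shared, not the permutation itself.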
This paper aims to solve large-scale and complex isogeometric topology optimization problems that consume significant computational resources. A novel isogeometric topology optimization method with a hybrid CPU/GPU parallel strategy is proposed, and the hybrid parallel strategies for stiffness matrix assembly, equation solving, sensitivity analysis, and design variable update are discussed in detail. To ensure the high efficiency of CPU/GPU computing, a workload balancing strategy is presented for optimally distributing the workload between the CPU and GPU. To illustrate the advantages of the proposed method, three benchmark examples are tested to verify the hybrid parallel strategy. The results show that the hybrid method is faster than serial CPU and parallel GPU implementations, with speedups of up to two orders of magnitude.
Accurate diagnosis of apple leaf diseases is crucial for improving the quality of apple production and promoting the development of the apple industry. However, apple leaf diseases do not differ significantly in image texture and structural information, and the difficulty of extracting disease features from complex backgrounds slows related research. To address these problems, this paper proposes an improved multi-scale inverse bottleneck residual network model based on a triplet parallel attention mechanism. The model is built upon ResNet-50 and improves and combines the inception module and ResNext inverse bottleneck blocks to recognize seven types of apple leaf (six diseases, namely alternaria leaf spot, brown spot, grey spot, mosaic, rust, and scab, plus healthy leaves). First, the 3×3 convolutions in some of the residual modules are replaced by multi-scale residual convolutions. The convolution kernels of different sizes in each branch of the multi-scale convolution extract feature maps of different sizes, and the outputs of these branches are fused by summation to enrich the output features of the images. Second, a global layer-wise dynamic coordinated inverse bottleneck structure is used to reduce the network feature loss. The inverse bottleneck structure reduces the loss of image information when transforming between feature spaces of different dimensions. The fusion of multi-scale and layer-wise dynamic coordinated inverse bottlenecks lets the model effectively balance computational efficiency and feature representation capability, and makes it more robust, through a combination of horizontal and vertical features, in the fine identification of apple leaf diseases. Finally, after each improved module, a triplet parallel attention module is integrated, with cross-dimensional interactions among channels through rotations and residual transformations. This improves the parallel search efficiency of important features and the recognition rate of the network at relatively small computational cost, while also improving the dimensional dependencies. To verify the validity of the model, we uniformly enhance apple leaf disease images screened from the public data sets of Plant Village, Baidu Flying Paddle, and the Internet. The final processed image count is 14,000. The ablation study, pre-processing comparison, and method comparison are conducted on the processed datasets. The experimental results demonstrate that the proposed method reaches 98.73% accuracy on the adopted datasets, which is 1.82% higher than the classical ResNet-50 model and 0.29% higher than on the apple leaf disease datasets before preprocessing. It also achieves competitive results in apple leaf disease identification compared with some state-of-the-art methods.
Accurate automatic segmentation of gliomas into sub-regions, including peritumoral edema, necrotic core, and enhancing and non-enhancing tumor core, from 3D multimodal MRI images is challenging because of their highly heterogeneous appearance and shape. Deep convolutional neural networks (CNNs) have recently improved glioma segmentation performance. However, extensive down-sampling in CNNs, such as pooling or strided convolution, significantly decreases the initial image resolution, resulting in the loss of accurate spatial and object-part information, especially information on small tumor sub-regions, which affects segmentation performance. Hence, this paper proposes a novel multi-level parallel network comprising three parallel subnetworks at different levels to make full use of low-level, mid-level, and high-level information and improve the performance of brain tumor segmentation. We also introduce the Combo loss function to address input class imbalance and the imbalance between false positives and false negatives in deep learning. The proposed method is trained and validated on the BraTS 2020 training and validation datasets. On the validation dataset, our method achieved mean Dice scores of 0.907, 0.830, and 0.787 for the whole tumor, tumor core, and enhancing tumor core, respectively. Compared with state-of-the-art methods, the multi-level parallel network achieves competitive results on the validation dataset.
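For readers unfamiliar with the Dice score reported in the abstract above, the sketch below computes the standard Dice coefficient, 2|A∩B| / (|A| + |B|), between a predicted and a reference binary mask. It assumes the masks are given as flat 0/1 sequences of equal length, and the function name `dice_score` is hypothetical. The convention of returning 1.0 when both masks are empty is an assumption made here, not something stated in the abstract.

```python
def dice_score(predicted, reference):
    """Dice coefficient between two binary masks given as flat 0/1 sequences."""
    if len(predicted) != len(reference):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled positive in both masks.
    intersection = sum(1 for p, r in zip(predicted, reference) if p and r)
    # Total positive voxels across the two masks.
    total = sum(1 for p in predicted if p) + sum(1 for r in reference if r)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


if __name__ == "__main__":
    prediction = [1, 1, 0, 0]
    ground_truth = [1, 0, 0, 0]
    print(dice_score(prediction, ground_truth))
```

A score of 1.0 means the two masks coincide exactly, and 0.0 means they share no positive voxel.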
Extensible Markup Language (XML) files, widely used for storing and exchanging information on the web, require efficient parsing mechanisms to improve the performance of applications. With existing Document Object Model (DOM) based parsing, performance degrades due to sequential processing and large memory requirements, so an efficient XML parser is needed to mitigate these issues. In this paper, we propose a Parallel XML Tree Generator (PXTG) algorithm for accelerating the parsing of XML files and a Regression-based XML Parsing Framework (RXPF) that analyzes and predicts performance through profiling, regression, and code generation for efficient parsing. The PXTG algorithm is based on dividing the XML file into n parts and producing n trees in parallel. The profiling phase of the RXPF framework produces a dataset by measuring the performance of various parsing models, including StAX, SAX, DOM, JDOM, and PXTG, on different numbers of cores using multiple file sizes. The regression phase produces the prediction model, based on which the final code for efficient parsing of XML files is produced in the code generation phase. The RXPF framework has shown a significant improvement in performance, varying from 9.54% to 32.34%, over other existing models used for parsing XML files.
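The idea of dividing an XML file into n parts and producing n trees in parallel can be sketched as follows. This is not the PXTG algorithm itself, whose splitting rules the abstract does not give. The sketch assumes the input has already been cut into complete, independent record elements, and it uses Python's standard `xml.etree.ElementTree` and `concurrent.futures`; the function names and the synthetic `<part>` wrapper element are hypothetical.

```python
import xml.etree.ElementTree as ET
from concurrent.futures import ThreadPoolExecutor


def split_records(records, n):
    """Split a list of XML record strings into n nearly equal chunks."""
    size, extra = divmod(len(records), n)
    chunks, start = [], 0
    for k in range(n):
        # The first `extra` chunks receive one additional record.
        end = start + size + (1 if k < extra else 0)
        chunks.append(records[start:end])
        start = end
    return chunks


def parse_chunk(chunk):
    # Wrap the chunk in a synthetic root so it forms a well-formed document.
    return ET.fromstring("<part>" + "".join(chunk) + "</part>")


def parse_in_parallel(records, n):
    """Build one tree per chunk, submitting the chunks to a worker pool."""
    with ThreadPoolExecutor(max_workers=n) as pool:
        return list(pool.map(parse_chunk, split_records(records, n)))


if __name__ == "__main__":
    items = ["<item id='%d'/>" % i for i in range(10)]
    trees = parse_in_parallel(items, 3)
    print([len(tree) for tree in trees])
```

Threads are used only to keep the sketch short. In CPython, pure-Python parsing work is limited by the global interpreter lock, so a process pool or a language with native threads would be needed to observe a real speedup.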
The high-resolution DEM-IMB-LBM model can accurately describe pore-scale fluid-solid interactions, but its potential for use in geotechnical engineering analysis has not been fully unleashed due to its prohibitive computational costs. To overcome this limitation, a message passing interface (MPI) parallel DEM-IMB-LBM framework is proposed to enhance computational efficiency. This framework utilises a static domain decomposition scheme, with the entire computation domain being decomposed into multiple subdomains according to predefined processors. A detailed parallel strategy is employed for both contact detection and hydrodynamic force calculation. In particular, a particle ID re-numbering scheme is proposed to handle particle transitions across sub-domain interfaces. Two benchmarks are conducted to validate the accuracy and overall performance of the proposed framework. Subsequently, the framework is applied to simulate scenarios involving multi-particle sedimentation and submarine landslides. The numerical examples effectively demonstrate the robustness and applicability of the MPI parallel DEM-IMB-LBM framework.
This paper presents partially asynchronous parallel simulation of continuous-system (PAPSoCS) and some approaches to the issues of its implementation on a multicomputer system. To guarantee correct simulation results and speed up the simulation, a scheme for efficient PAPSoCS is proposed, and a virtual star topology is constructed to match the path of message passing, solving the algorithm-architecture adequation problem. Because the messages frequently passed between processors are very short, typically within several 4 bytes, asynchronous communication mode is employed to reduce the communication ratio. Experimental results show that asynchronous parallel simulation has much higher efficiency than its synchronous counterpart.
In this paper, a homogeneous parallel simulation system for continuous-system simulation is presented in detail. The system is constructed from a host computer and 11 transputers connected in a 'Super-Node' topology, which is very suitable for the simulation of stiff systems. An automatic software interface that runs on the host is developed to partition a simulation model, given either as equations or as block diagrams, into several balanced segments and then pack them into a parallel simulation program to be executed on the parallel system. This interface frees simulation users from parallel programming so that they can focus on their simulation experiments.
Multicomputer systems (distributed memory computer systems) are becoming more and more popular and will be widely used in scientific research. In this paper, we present a parallel algorithm for the Fourier transform of a vector of complex numbers on a multicomputer system and give its computing times and speedup in a parallel environment supported by the EXPRESS system on a multicomputer system consisting of four SGI workstations. Our analysis shows that the results are ideal and that this scheme is suitable for multicomputer systems.
Funding (two-pipe hydraulic system coupled vibration study): Project supported by the National Natural Science Foundation of China (No. 12002195) and the Pujiang Project of Shanghai Science and Technology Commission of China (No. 20PJ1404000).
Funding: Supported by the National Natural Science Foundation of China (61872423), the Industry Prospective Primary Research & Development Plan of Jiangsu Province (BE2017111), the Scientific Research Foundation of the Higher Education Institutions of Jiangsu Province (19KJA180006), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX20_0764).
Abstract: Current applications, consisting of multiple replicas, are packaged into lightweight containers together with their execution dependencies. Because the efficiency of distributing very large images dominates container startup time (e.g., for distributed deep learning applications), the image "warm-up" technique, which prefetches the images of these replicas to destination nodes in the cluster, has been proposed. However, the current image "warm-up" technique focuses solely on distributing identical images and is ineffective when different images must be distributed to the destination nodes. To address this problem, this paper proposes Hound, a simple but efficient cluster image distribution system based on Docker. To support the diverse image distribution requests of cluster nodes, Hound additionally adopts node-level parallelism (i.e., downloading images to destination nodes in parallel) to further improve the efficiency of image distribution. The experimental results demonstrate that Hound outperforms Docker, the Kubernetes container runtime interface (CRI-O), and Docker-compose in image distribution performance when cluster nodes request different images. Moreover, the scalability of Hound is evaluated in a ten-node scenario.
Abstract: Energy conservation currently draws wide attention in industrial manufacturing systems. In recent years, many studies have aimed at reducing energy consumption in manufacturing, and scheduling is regarded as an effective approach. This paper puts forward a multi-objective stochastic parallel machine scheduling problem that considers deteriorating and learning effects. In this problem, the actual processing time of a job is calculated from its processing speed and normal processing time. To describe the problem mathematically, a multi-objective stochastic programming model that minimizes makespan and energy consumption is formulated. Furthermore, we develop a multi-objective multi-verse optimization combined with a stochastic simulation method to solve it. In this approach, multi-verse optimization is adopted to find favorable solutions in the huge solution domain, while the stochastic simulation method is employed to assess them. Comparison experiments on test problems verify that the developed approach performs better on the considered problem than two classic multi-objective evolutionary algorithms.
Funding: Funded by the Science and Technology Project of State Grid Shanxi Electric Power Co., Ltd. (Project No. 520530200013).
Abstract: The growing number of uncertain factors in power systems and their increasingly complex operation modes place higher requirements on online transient stability assessment methods. Traditional model-driven methods have clear physical mechanisms and reliable evaluation results, but their calculation process is time-consuming, while data-driven methods have strong fitting ability and fast calculation speed, but their evaluation results lack interpretability. Combining these two kinds of methods is therefore a future development trend of transient stability assessment. In this paper, the rate of change of kinetic energy method is used to calculate the transient stability in the model-driven stage, and a support vector machine and an extreme learning machine, which rely on different internal principles, are each used to predict the transient stability in the data-driven stage. To quantify the credibility level of the data-driven methods, a credibility index of the output results is proposed. A switching function that controls whether the rate of change of kinetic energy method is activated is then established based on this index. Thus, a new parallel integrated model-driven and data-driven online transient stability assessment method is proposed. The accuracy, efficiency, and adaptability of the proposed method are verified by numerical examples.
Funding: Supported by the Major Science and Technology Projects of Gansu Province (Grant No. 20ZD7GF011) and the Gansu Province Higher Education Industry Support Plan Project: Research on the Collaborative Operation of Solar Thermal Storage + Wind-Solar Hybrid Power Generation, Based on "Integrated Energy Demonstration of Wind-Solar Energy Storage in Gansu Province" (Project No. 2022CYZC-34).
Abstract: Parallel connection of multiple inverters is an important means of addressing the expansion, reserve, and protection of distributed power generation such as photovoltaics. In view of the shortcomings of traditional droop control methods, such as weak anti-interference ability, low tracking accuracy of the inverter output voltage, and serious circulating current, a finite control set model predictive control (FCS-MPC) strategy for a microgrid multi-inverter parallel system based on Mixed Logical Dynamical (MLD) modeling is proposed. Firstly, the MLD modeling method introduces logical variables and combines discrete events and continuous events into an overall differential equation, which makes the modeling more accurate. Then a predictive controller is designed based on the model, and constraints are added to the objective function. This not only handles real-time changes of the control system through online optimization, but also achieves higher tracking accuracy of the inverter output voltage and lower total harmonic distortion (THD), and suppresses the circulating current between the inverters to obtain a good dynamic response. Finally, simulations are carried out in MATLAB/Simulink to verify the correctness of the model and the rationality of the proposed strategy. This paper aims to provide guidance for the design and optimal control of multi-inverter parallel systems.
Funding: Supported by the National Key Research and Development Program of China (No. 2020YFB1901900), the National Natural Science Foundation of China (Nos. U20B2011, 12175138), and the Shanghai Rising-Star Program.
Abstract: The heterogeneous variational nodal method (HVNM) has emerged as a potential approach for solving high-fidelity neutron transport problems. However, achieving accurate results with HVNM in large-scale problems using high-fidelity models has been challenging due to the prohibitive computational costs. This paper presents an efficient parallel algorithm tailored for HVNM based on the Message Passing Interface standard. The algorithm evenly distributes the response matrix sets among processors during the matrix formation process, thus enabling independent construction without communication. Once the formation tasks are completed, a collective operation merges and shares the matrix sets among the processors. For the solution process, the problem domain is decomposed into subdomains assigned to specific processors, and the red-black Gauss-Seidel iteration is employed within each subdomain to solve the response matrix equation. Point-to-point communication is conducted between adjacent subdomains to exchange data along the boundaries. The accuracy and efficiency of the parallel algorithm are verified using the KAIST and JRR-3 test cases. Numerical results obtained with multiple processors agree well with those obtained from Monte Carlo calculations. The parallelization of HVNM results in eigenvalue errors of 31 pcm/-90 pcm and fission rate RMS errors of 1.22%/0.66%, respectively, for the 3D KAIST problem and the 3D JRR-3 problem. In addition, the parallel algorithm significantly reduces computation time, with an efficiency of 68.51% using 36 processors in the KAIST problem and 77.14% using 144 processors in the JRR-3 problem.
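The red-black Gauss-Seidel iteration named in this abstract can be illustrated on a much simpler model problem. The sketch below is not the HVNM response matrix solver: it assumes a 2D Poisson equation on a uniform square grid with fixed boundary values, and the function name and arguments are hypothetical. It only shows the colouring idea: points of one colour depend solely on points of the other colour, so each half-sweep could be updated in parallel.

```python
def red_black_gauss_seidel(u, f, h, sweeps):
    """Relax -laplace(u) = f on a square grid with spacing h.

    u and f are lists of lists of equal size; the boundary entries of u
    are treated as fixed Dirichlet values and are never modified.
    Illustrative model problem only (an assumption, not the HVNM equations).
    """
    n = len(u)
    for _ in range(sweeps):
        # Colour 0 ("red") is updated first, then colour 1 ("black").
        # Within one colour, every update reads only the other colour,
        # so the updates of that half-sweep are mutually independent.
        for colour in (0, 1):
            for i in range(1, n - 1):
                for j in range(1, n - 1):
                    if (i + j) % 2 == colour:
                        u[i][j] = 0.25 * (
                            u[i - 1][j] + u[i + 1][j]
                            + u[i][j - 1] + u[i][j + 1]
                            + h * h * f[i][j]
                        )
    return u
```

In a domain-decomposed setting such as the one the abstract describes, each processor would run these half-sweeps on its own subdomain and exchange boundary values with its neighbours between them; that communication step is omitted here.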
Funding: Supported by the National Natural Science Foundation of China (21627813).
Abstract: The nonlinear stability of plane parallel shear flows with respect to tilted perturbations is studied by energy methods. A tilted perturbation is one that forms an angle θ ∈ (0, π/2) with the direction of the basic flow. By defining an energy functional, it is proven that plane parallel shear flows are unconditionally nonlinearly exponentially stable for tilted streamwise perturbations when the Reynolds number is below a certain critical value and the boundary conditions are either rigid or stress-free. In the case of stress-free boundaries, by using the poloidal-toroidal decomposition of a solenoidal field to define energy functionals, it can even be shown that plane parallel shear flows are unconditionally nonlinearly exponentially stable for all Reynolds numbers, where the tilted perturbation can be either spanwise or streamwise.
Funding: The Deanship of Scientific Research at King Abdulaziz University, Jeddah, Saudi Arabia, under Grant No. RG-12-611-43.
Abstract: The Message Passing Interface (MPI) is a widely accepted standard for parallel computing on distributed memory systems. However, MPI implementations can contain defects that impact the reliability and performance of parallel applications. Detecting and correcting these defects is crucial, yet there is a lack of published models specifically designed for correcting MPI defects. To address this, we propose a model for detecting and correcting MPI defects (DC_MPI), which aims to detect and correct defects in various types of MPI communication, including blocking point-to-point (BPTP), nonblocking point-to-point (NBPTP), and collective communication (CC). The defects addressed by the DC_MPI model include illegal MPI calls, deadlocks (DL), race conditions (RC), and message mismatches (MM). To assess the effectiveness of the DC_MPI model, we performed experiments on a dataset consisting of 40 MPI codes. The results indicate that the model achieved a detection rate of 37 out of 40 codes, resulting in an overall detection accuracy of 92.5%. Additionally, the execution duration of the DC_MPI model ranged from 0.81 to 1.36 s. These findings show that the DC_MPI model is useful in detecting and correcting defects in MPI implementations, thereby enhancing the reliability and performance of parallel applications. The DC_MPI model fills an important research gap and provides a valuable tool for improving the quality of MPI-based parallel computing systems.
Funding: Supported by the fund from Shenyang Mint Company Limited (No. 20220056), the Senior Talent Foundation of Jiangsu University (No. 19JDG022), and the Taizhou City Double Innovation and Entrepreneurship Talent Program (No. Taizhou Human Resources Office [2022] No. 22).
Abstract: In this research, we present pure open multi-processing (OpenMP), pure message passing interface (MPI), and hybrid MPI/OpenMP parallel solvers within the dynamic explicit central difference algorithm for the coining process, to address the challenge of capturing fine relief features of approximately 50 microns. Achieving such precision demands at least 7 million tetrahedron elements, surpassing the capabilities of previously developed serial programs. To avoid data races when calculating internal forces, intermediate arrays are introduced within the OpenMP directive. This helps ensure proper synchronization and avoids conflicts during parallel execution. In the MPI implementation, the coins are partitioned into the desired number of regions, which allows computational tasks to be distributed efficiently across multiple processes. Numerical simulation examples are conducted to compare the three solvers with serial programs, evaluating correctness, speedup ratio, and parallel efficiency. The results reveal a relative error of approximately 0.3% in forming force between the parallel and serial solvers, while the predicted insufficient material zones align with experimental observations. The pure MPI parallel solver achieves a maximum speedup of 9.5 on a single computer (using 12 cores), and the hybrid solver exhibits a speedup ratio of 136 on a cluster (using 6 compute nodes with 12 cores per node), showing the strong scalability of the hybrid MPI/OpenMP programming model. This approach effectively meets the simulation requirements for commemorative coins with intricate relief patterns.
Funding: Supported by the National Natural Science Foundation of China under Grant 61941106.
Abstract: This paper investigates the effective capacity of a point-to-point ultra-reliable low latency communication (URLLC) transmission over multiple parallel sub-channels at finite blocklength (FBL) with imperfect channel state information (CSI). Based on reasonable assumptions and approximations, we derive the effective capacity as a function of the pilot length, decoding error probability, transmit power, and number of sub-channels. We then reveal the significant impact of these parameters on the effective capacity. A closed-form lower bound of the effective capacity is derived, and an alternating optimization based algorithm is proposed to find the optimal pilot length and decoding error probability. Simulation results validate our theoretical analysis and show that the closed-form lower bound is very tight. In addition, simulations of the optimized effective capacity provide insights into pilot length and decoding error probability optimization for evaluating the optimal parameters in realistic systems.
Funding: Supported in part by the Beijing Natural Science Foundation under Grant No. L202003, the National Natural Science Foundation of China under Grants U22B2001 and 62271065, and the Project of China Railway Corporation under Grant N2022G048.
Abstract: Millimeter wave (mmWave) massive multiple-input multiple-output (MIMO) plays an important role in fifth-generation (5G) mobile communications and beyond, owing to its potential for high capacity. However, channel estimation has become very challenging due to the use of massive MIMO antenna arrays. Fortunately, the mmWave channel is strongly sparse in the spatial angle domain, and compressed sensing can be used to convert the original channel matrix into a sparse matrix on a discrete angle grid. The high-dimensional channel matrix estimation is thus transformed into a sparse recovery problem with greatly reduced computational complexity. However, path angles in real scenarios appear randomly and are unlikely to fall exactly on the quantized angle grid, which leads to power leakage. Moreover, multiple paths with randomly distributed angles cause serious inter-path interference and further degrade channel estimation performance. To address these off-grid issues, we propose a parallel interference cancellation assisted multi-grid matching pursuit (PIC-MGMP) algorithm. The proposed algorithm consists of three stages: coarse estimation, refined estimation, and inter-path cyclic iterative interference cancellation. More specifically, the angular resolution is improved by locally refining the grid to reduce power leakage, while the inter-path interference is eliminated by parallel interference cancellation (PIC); together they improve the estimation accuracy. Simulation results show that, compared with the traditional orthogonal matching pursuit (OMP) algorithm, the normalized mean square error (NMSE) of the proposed algorithm decreases by over 14 dB in the case of 2 paths.
Funding: Supported by the Key Area R&D Program of Guangdong Province (Grant No. 2022B0701180001), the National Natural Science Foundation of China (Grant No. 61801127), the Science Technology Planning Project of Guangdong Province, China (Grant Nos. 2019B010140002 and 2020B111110002), and the Guangdong-Hong Kong-Macao Joint Innovation Field Project (Grant No. 2021A0505080006).
Abstract: A novel image encryption scheme based on parallel compressive sensing and edge detection embedding technology is proposed to improve visual security. Firstly, the plain image is sparsely represented using the discrete wavelet transform. Then, the coefficient matrix is scrambled and compressed to obtain a size-reduced image using the Fisher–Yates shuffle and parallel compressive sensing. Subsequently, to increase the security of the proposed algorithm, the compressed image is re-encrypted through permutation and diffusion to obtain a noise-like secret image. Finally, an adaptive embedding method based on edge detection for different carrier images is proposed to generate a visually meaningful cipher image. To improve the plaintext sensitivity of the algorithm, the counter mode is combined with the hash function to generate keys for chaotic systems. Additionally, an effective permutation method is designed to scramble the pixels of the compressed image in the re-encryption stage. The simulation results and analyses demonstrate that the proposed algorithm performs well in terms of visual security and decryption quality.
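The Fisher–Yates shuffle used for scrambling in this abstract can be sketched as a keyed, invertible permutation. The abstract states that keys come from chaotic systems seeded via counter mode and a hash function; that key schedule is not reproduced here. As a labeled stand-in, the sketch below assumes a seeded `random.Random` generator, which is not cryptographically secure, and all function names are hypothetical.

```python
import random


def fisher_yates_permutation(n, seed):
    """Return a permutation of range(n) built with the Fisher-Yates shuffle.

    Assumption: `seed` drives Python's random.Random as a placeholder for
    the chaotic key stream described in the abstract.
    """
    rng = random.Random(seed)
    perm = list(range(n))
    # Walk from the last index down, swapping each element with a
    # uniformly chosen element at or before it.
    for i in range(n - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive at both ends
        perm[i], perm[j] = perm[j], perm[i]
    return perm


def scramble(values, perm):
    """Place values[perm[k]] at position k."""
    return [values[p] for p in perm]


def unscramble(scrambled, perm):
    """Invert scramble() given the same permutation."""
    restored = [None] * len(perm)
    for k, p in enumerate(perm):
        restored[p] = scrambled[k]
    return restored
```

Because the permutation is fully determined by the seed, a receiver holding the same seed can regenerate it and call `unscramble` to restore the original coefficient order.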
Funding: The National Key R&D Program of China (2020YFB1708300), the National Natural Science Foundation of China (52005192), and the Project of Ministry of Industry and Information Technology (TC210804R-3).
Abstract: This paper aims to solve large-scale and complex isogeometric topology optimization problems that consume significant computational resources. A novel isogeometric topology optimization method with a hybrid CPU/GPU parallel strategy is proposed, and the hybrid parallel strategies for stiffness matrix assembly, equation solving, sensitivity analysis, and design variable update are discussed in detail. To ensure the high efficiency of CPU/GPU computing, a workload balancing strategy is presented for optimally distributing the workload between the CPU and GPU. To illustrate the advantages of the proposed method, three benchmark examples are tested to verify the hybrid parallel strategy. The results show that the hybrid method is faster than both serial CPU and parallel GPU computation, with speedups of up to two orders of magnitude.
Funding: Supported in part by the General Program Hunan Provincial Natural Science Foundation of 2022, China (2022JJ31022), the Undergraduate Education Reform Project of Hunan Province, China (HNJG-20210532), and the National Natural Science Foundation of China (62276276).
Abstract: Accurate diagnosis of apple leaf diseases is crucial for improving the quality of apple production and promoting the development of the apple industry. However, apple leaf diseases do not differ significantly in image texture and structural information, and the difficulty of extracting disease features from complex backgrounds slows related research. To address these problems, this paper proposes an improved multi-scale inverse bottleneck residual network model based on a triplet parallel attention mechanism. The model is built upon ResNet-50 and improves and combines the Inception module and ResNeXt inverse bottleneck blocks to recognize seven classes of apple leaves (six diseases: alternaria leaf spot, brown spot, grey spot, mosaic, rust, and scab, plus healthy leaves). First, the 3×3 convolutions in some of the residual modules are replaced by multi-scale residual convolutions; the convolution kernels of different sizes in each branch of the multi-scale convolution extract feature maps of different sizes, and the outputs of these branches are fused by summation to enrich the output features of the images. Second, the global layer-wise dynamic coordinated inverse bottleneck structure is used to reduce the network feature loss; the inverse bottleneck structure reduces the information lost when image features are transformed between feature spaces of different dimensions. The fusion of multi-scale and layer-wise dynamic coordinated inverse bottlenecks lets the model effectively balance computational efficiency and feature representation capability, and makes it more robust in the fine identification of apple leaf diseases by combining horizontal and vertical features. Finally, after each improved module, a triplet parallel attention module is integrated, with cross-dimensional interactions among channels through rotations and residual transformations. This improves the parallel search efficiency of important features and the recognition rate of the network at relatively small computational cost, while also improving the dimensional dependencies. To verify the validity of the model, we uniformly enhance apple leaf disease images screened from the public Plant Village and Baidu Flying Paddle data sets and from the Internet. The final processed image count is 14,000. An ablation study, a pre-processing comparison, and a method comparison are conducted on the processed datasets. The experimental results demonstrate that the proposed method reaches 98.73% accuracy on the adopted datasets, which is 1.82% higher than the classical ResNet-50 model and 0.29% higher than on the apple leaf disease datasets before preprocessing. It also achieves competitive results in apple leaf disease identification compared to some state-of-the-art methods.
Funding: Supported by the Sichuan Science and Technology Program (No. 2019YJ0356).
Abstract: Accurate automatic segmentation of gliomas into their various sub-regions, including peritumoral edema, necrotic core, and enhancing and non-enhancing tumor core, from 3D multimodal MRI images is challenging because of their highly heterogeneous appearance and shape. Deep convolutional neural networks (CNNs) have recently improved glioma segmentation performance. However, extensive down-sampling in CNNs, such as pooling or strided convolution, significantly decreases the initial image resolution, resulting in the loss of accurate spatial and object-part information, especially for small tumor sub-regions, which degrades segmentation performance. Hence, this paper proposes a novel multi-level parallel network comprising three parallel subnetworks at different levels to make full use of low-level, mid-level, and high-level information and improve the performance of brain tumor segmentation. We also introduce the Combo loss function to address input class imbalance and the imbalance between false positives and false negatives in deep learning. The proposed method is trained and validated on the BraTS 2020 training and validation datasets. On the validation dataset, our method achieved mean Dice scores of 0.907, 0.830, and 0.787 for the whole tumor, tumor core, and enhancing tumor core, respectively. Compared with state-of-the-art methods, the multi-level parallel network achieves competitive results on the validation dataset.
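The Dice score reported in this abstract measures the overlap between a predicted mask and a ground-truth mask as twice the intersection divided by the sum of the two mask sizes. The sketch below is a minimal illustration on flattened binary masks; the function name is hypothetical, and returning 1.0 when both masks are empty is a convention assumed here, not something the abstract specifies.

```python
def dice_score(pred, truth):
    """Dice coefficient 2*|A and B| / (|A| + |B|) for two binary masks.

    pred and truth are equal-length sequences of 0/1 (or False/True)
    voxel labels, e.g. a flattened 3D segmentation mask.
    """
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        # Assumed convention: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total
```

For a multi-class task such as the whole tumor, tumor core, and enhancing tumor core regions, the score would be computed once per region on the corresponding binary mask.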
Abstract: Extensible Markup Language (XML) files, widely used for storing and exchanging information on the web, require efficient parsing mechanisms to improve application performance. With existing Document Object Model (DOM) based parsing, performance degrades due to sequential processing and large memory requirements, so an efficient XML parser is needed to mitigate these issues. In this paper, we propose a Parallel XML Tree Generator (PXTG) algorithm for accelerating the parsing of XML files and a Regression-based XML Parsing Framework (RXPF) that analyzes and predicts performance through profiling, regression, and code generation for efficient parsing. The PXTG algorithm divides the XML file into n parts and produces n trees in parallel. The profiling phase of the RXPF framework produces a dataset by measuring the performance of various parsing models, including StAX, SAX, DOM, JDOM, and PXTG, on different numbers of cores using multiple file sizes. The regression phase produces the prediction model, based on which the code generation phase produces the final code for efficient parsing of XML files. The RXPF framework has shown a significant performance improvement, ranging from 9.54% to 32.34%, over other existing models used for parsing XML files.
Funding: Financially supported by the National Natural Science Foundation of China (Grant Nos. 12072217 and 42077254) and the Natural Science Foundation of Hunan Province, China (Grant No. 2022JJ30567).
Abstract: The high-resolution DEM-IMB-LBM model can accurately describe pore-scale fluid-solid interactions, but its potential for use in geotechnical engineering analysis has not been fully realized due to its prohibitive computational costs. To overcome this limitation, a message passing interface (MPI) parallel DEM-IMB-LBM framework is proposed to enhance computational efficiency. This framework uses a static domain decomposition scheme, with the entire computational domain decomposed into multiple subdomains according to the predefined processors. A detailed parallel strategy is employed for both contact detection and hydrodynamic force calculation. In particular, a particle ID re-numbering scheme is proposed to handle particle transitions across subdomain interfaces. Two benchmarks are conducted to validate the accuracy and overall performance of the proposed framework. Subsequently, the framework is applied to simulate scenarios involving multi-particle sedimentation and submarine landslides. The numerical examples effectively demonstrate the robustness and applicability of the MPI parallel DEM-IMB-LBM framework.
Abstract: This paper presents partially asynchronous parallel simulation of continuous systems (PAPSoCS) and some approaches to the issues of implementing it on a multicomputer system. To guarantee correct simulation results and speed up the simulation, a scheme for efficient PAPSoCS is proposed, and a virtual star topology is constructed to match the message passing paths, addressing the algorithm-architecture adequation problem. Because the messages frequently passed between processors are very short, typically within several 4 bytes, asynchronous communication mode is employed to reduce the communication ratio. Experimental results show that asynchronous parallel simulation has much higher efficiency than its synchronous counterpart.
Abstract: In this paper, a homogeneous parallel simulation system for continuous-system simulation is presented in detail. The system is constructed from a host computer and I I transputers connected in a "Super-Node" topology, which is very suitable for the simulation of stiff systems. An automatic software interface running on the host is developed to partition the simulation model, given as either equations or block diagrams, into several balanced segments and then pack them into a parallel simulation program to be executed on the parallel system. This interface frees simulation users from parallel programming so that they can focus on their simulation experiments.
Abstract: Multicomputer systems (distributed memory computer systems) are becoming more and more popular and will be widely used in scientific research. In this paper, we present a parallel algorithm for the Fourier transform of a vector of complex numbers on a multicomputer system, and give its computing times and speedup in a parallel environment supported by the EXPRESS system on a multicomputer system consisting of four SGI workstations. Our analysis shows that the results are ideal and that this scheme is suitable for multicomputer systems.
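The abstract does not describe how the Fourier transform is split across the workstations. As one common possibility, and purely as an assumption, the sketch below uses the radix-2 decimation-in-time FFT: the even-indexed and odd-indexed halves are transformed independently, so each half could in principle be assigned to a different node before the results are combined. The code runs serially, and the function names are hypothetical.

```python
import cmath


def fft(x):
    """Radix-2 decimation-in-time FFT of a sequence of complex numbers.

    The length of x must be a power of two. The two recursive calls are
    independent of each other, which is the property a parallel version
    could exploit (not done here).
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])  # transform of even-indexed samples
    odd = fft(x[1::2])   # transform of odd-indexed samples
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def dft(x):
    """Direct O(n^2) discrete Fourier transform, used only as a reference."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
```

Comparing `fft` against `dft` on a short input is a quick way to check the butterfly step before attempting any distribution across processes.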