Due to the restricted satellite payloads in LEO mega-constellation networks (LMCNs), remote sensing image analysis, online learning, and other big data services need onboard distributed processing (OBDP). In existing technologies, the efficiency of big data applications (BDAs) in distributed systems hinges on stable, low-latency links between worker nodes. However, LMCNs, with their highly dynamic nodes and long-distance links, cannot provide these conditions, which makes the performance of OBDP hard to measure intuitively. To bridge this gap, a multidimensional simulation platform is indispensable, one that can simulate the network environment of LMCNs and run BDAs in it for performance testing. Using STK's APIs and a parallel computing framework, we achieve real-time simulation of thousands of satellite nodes, which are mapped to application nodes through software-defined networking (SDN) and container technologies. We elaborate on the architecture and mechanisms of the simulation platform and take Starlink and Hadoop as realistic examples for simulation. The results indicate that LMCNs have dynamic end-to-end latency that fluctuates periodically with the constellation's movement. Compared with ground data center networks (GDCNs), LMCNs degrade computing and storage job throughput, which can be alleviated by using erasure codes and data flow scheduling of worker nodes.
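The abstract above credits erasure codes with easing the storage throughput penalty but does not say which code is used. As a minimal sketch of the idea only, the following Python uses a single XOR parity block, the simplest erasure code, to rebuild one lost block from the survivors. The function names and the sample block contents are hypothetical, and the platform described in the abstract may well rely on a different code (for example, Reed-Solomon).

```python
from typing import List, Sequence


def xor_blocks(blocks: Sequence[bytes]) -> bytes:
    """Byte-wise XOR of equally sized blocks."""
    if len({len(b) for b in blocks}) != 1:
        raise ValueError("all blocks must have the same length")
    out = bytearray(len(blocks[0]))
    for block in blocks:
        for i, byte in enumerate(block):
            out[i] ^= byte
    return bytes(out)


def encode_stripe(data_blocks: Sequence[bytes]) -> List[bytes]:
    """Return the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover_block(stripe: Sequence[bytes], lost_index: int) -> bytes:
    """Rebuild the block at lost_index by XOR-ing all surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)


# Hypothetical example: three data blocks held by three worker nodes.
data = [b"sat1", b"sat2", b"sat3"]
stripe = encode_stripe(data)
rebuilt = recover_block(stripe, lost_index=1)
```

The design trade-off this illustrates: one parity block costs 1/3 extra storage here, versus 2x extra for keeping three full replicas, at the price of reading every surviving block to repair a loss.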
With the rapid development of deep learning, the sizes of data sets and deep neural network (DNN) models are also growing rapidly. As a result, the intolerably long time needed for model training or inference with conventional strategies increasingly fails to meet the requirements of modern tasks. Moreover, devices stay idle in edge computing (EC) scenarios, which wastes resources, since they could share the load of busy devices but do not. To address this problem, distributed processing has been applied to offload computation tasks from a single processor to a group of devices, which accelerates training or inference of DNN models and promotes high utilization of devices in edge computing. Compared with existing papers, this paper presents an enlightening and novel review of applying distributed processing with data and model parallelism to improve deep learning tasks in edge computing. Considering practicalities, commonly used lightweight models in distributed systems are introduced as well. As the key technique, the parallel strategy is described in detail. Then some typical applications of distributed processing are analyzed. Finally, the challenges of distributed processing with edge computing are described.
The distributed hybrid processing optimization problem of non-cooperative targets is an important research direction for future networked air-defense and anti-missile firepower systems. In this paper, the air-defense anti-missile target defense problem is abstracted as a nonconvex constrained combinatorial optimization problem whose objective is to maximize the degree of contribution of the processing scheme to non-cooperative targets; the constraints mainly consider geographical conditions and anti-missile equipment resources. The grid discretization concept is used to partition the defense area into network nodes, and the overall defense strategy scheme is described as a nonlinear programming problem to solve the minimum defense cost within the maximum defense capability of the defense system network. In solving the minimum defense cost problem, the processing scheme, equipment coverage capability, constraints, and node cost requirements are characterized. A nonlinear mathematical model of the non-cooperative target distributed hybrid processing optimization problem is then established, a local optimal solution based on the sequential quadratic programming algorithm is constructed, and the optimal firepower processing scheme is given by using the sequential quadratic programming method containing non-convex quadratic equations and inequality constraints. Finally, the effectiveness of the proposed method is verified by simulation examples.
Query processing in distributed database management systems (DBMSs) faces more challenges, such as more operators and more factors in cost models and metadata, than in a single-node DBMS, in which query optimization is already an NP-hard problem. Learned query optimizers (mainly in single-node DBMSs) have received attention due to their capability to capture data distributions and their flexible ways of avoiding hand-crafted rules in refinement and adaptation to new hardware. In this paper, we focus on extensions of learned query optimizers to distributed DBMSs. Specifically, we propose one possible but general architecture of the learned query optimizer in the distributed context and highlight its differences from learned optimizers in the single-node setting. In addition, we discuss the challenges and possible solutions.
An active distribution network (ADN) is a solution for power systems with interconnected distributed energy resources (DERs), which may change the network operation and power flow of a traditional power distribution network. However, in some circumstances, protection and feeder automation in the distribution network malfunction due to uncertain bidirectional power flow. Therefore, a novel method of fault location, isolation, and service restoration (FLISR) for ADNs based on distributed processing is proposed in this paper. A differential-activated algorithm based on synchronous sampling for feeder fault location and isolation is studied, and a framework of fault restoration is established for the ADN. Finally, the effectiveness of the proposed algorithm is verified via computer simulation of a case study of an active distributed power system.
Purpose: The paper presents a resilient distributed processing technique (RDPT), in which the mapper and reducer are simplified with Spark contexts and support distributed parallel query processing. Design/methodology/approach: The proposed work is implemented in Pig Latin with Spark contexts to develop query processing in a distributed environment. Findings: Query processing in Hadoop influences distributed processing with the MapReduce model. MapReduce distributes the work across different nodes through the implementation of complex mappers and reducers. Its results are valid only up to a certain data size. Originality/value: Pig supports the required parallel processing framework with the following constructs during the processing of queries: FOREACH, FLATTEN, and COGROUP.
Active learning (AL) trains a high-precision predictor model from a small amount of labeled data by iteratively annotating the most valuable data sample from an unlabeled data pool with a class label throughout the learning process. However, most current AL methods start from the premise that the labels queried at AL rounds must be free of ambiguity, which may be unrealistic in some real-world applications where only a set of candidate labels can be obtained for selected data. Besides, most existing AL algorithms only consider centralized processing, which requires gathering all the unlabeled data in one fusion center for selection. Considering that data are collected and stored at different nodes over a network in many real-world scenarios, distributed processing is chosen here. This paper focuses on the distributed classification of partially labeled (PL) data obtained by a fully decentralized AL method and proposes a distributed active partial label learning (dAPLL) algorithm. The proposed algorithm is composed of a fully decentralized sample selection strategy and a distributed partial label learning (PLL) algorithm. During sample selection, both the uncertainty and the representativeness of the data are measured based on the global cluster centers obtained by a distributed clustering method, and the valuable samples are chosen in turn. Meanwhile, using the disambiguation-free strategy, a series of binary classification problems can be constructed, and the corresponding cost-sensitive classifiers can be cooperatively trained in a distributed manner. Experimental results on several datasets demonstrate that the performance of the dAPLL algorithm is comparable to that of the corresponding centralized method and is superior to the existing active PLL (APLL) method under different parameter configurations. Besides, the proposed algorithm outperforms several current PLL methods that use a random selection strategy, especially when only small amounts of data are selected to be assigned candidate labels.
Task scheduling plays a key role in effectively managing and allocating computing resources to meet various computing tasks in a cloud computing environment. Achieving short execution time and low load imbalance can be a challenge for some algorithms in resource scheduling scenarios. In this work, the Hierarchical Particle Swarm Optimization-Evolutionary Artificial Bee Colony (HPSO-EABC) algorithm is proposed, which hybridizes our Evolutionary Artificial Bee Colony (EABC) algorithm and the Hierarchical Particle Swarm Optimization (HPSO) algorithm. The HPSO-EABC algorithm incorporates the advantages of both HPSO and EABC. Comprehensive testing has been carried out, including evaluations of algorithm convergence speed, resource execution time, load balancing, and operational costs. The results indicate that the EABC algorithm exhibits greater parallelism than the Artificial Bee Colony algorithm. Compared with the Particle Swarm Optimization algorithm, the HPSO algorithm not only improves the global search capability but also effectively mitigates getting stuck in local optima. As a result, the hybrid HPSO-EABC algorithm demonstrates significant improvements in stability and convergence speed. Moreover, it exhibits enhanced resource scheduling performance in both homogeneous and heterogeneous environments, effectively reducing execution time and cost, which is also verified by the ablation experiments.
Most distributed stream processing engines (DSPEs) do not support online task management and cannot adapt to time-varying data flows. Recently, some studies have proposed online task deployment algorithms to solve this problem. However, these approaches do not guarantee Quality of Service (QoS) when the task deployment changes at runtime, because the task migrations caused by changes in task deployment impose an exorbitant cost. We study one of the most popular DSPEs, Apache Storm, and find that when a task needs to be migrated, Storm has to stop the resource (implemented as a Worker process in Storm) where the task is deployed. This stops and restarts all tasks in the resource, resulting in poor task migration performance. To solve this problem, in this paper we propose N-Storm (Nonstop Storm), a task-resource decoupling DSPE. N-Storm allows the tasks allocated to resources to be changed at runtime, which is implemented by a thread-level scheme for task migrations. In particular, we add a local shared key/value store on each node to make resources aware of changes in the allocation plan. Thus, each resource can manage its tasks at runtime. Based on N-Storm, we further propose Online Task Deployment (OTD). Unlike traditional task deployment algorithms, which deploy all tasks at once without considering the cost of task migrations caused by a task re-deployment, OTD can gradually adjust the current task deployment to an optimized one based on the communication cost and the runtime states of resources. We demonstrate that OTD can adapt to different kinds of applications, including computation- and communication-intensive applications. The experimental results on a real DSPE cluster show that N-Storm can avoid the system stop and save up to 87% of the performance degradation time compared with Apache Storm and other state-of-the-art approaches. In addition, OTD can increase the average CPU usage by 51% for computation-intensive applications and reduce network communication costs by 88% for communication-intensive applications.
This paper investigates the estimation problem for a spatially distributed process described by a partial differential equation with missing measurements. Randomly missing measurements are introduced to better reflect the reality of sensor networks. To improve the estimation performance for the spatially distributed process, a network of sensors that are allowed to move within the spatial domain is used. We aim to design an estimator that approximates the distributed process, together with the mobile trajectories of the sensors, such that, for all possible missing measurements, the estimation error system is globally asymptotically stable in the mean square sense. By constructing Lyapunov functionals and using inequality analysis, the guidance scheme of every sensor and the convergence of the estimation error system are obtained. Finally, a numerical example is given to verify the effectiveness of the proposed estimator utilizing the proposed guidance scheme for sensors.
We propose schemes to realize quantum state transfer and prepare quantum entanglement in coupled-cavity and cavity-fiber-cavity systems, respectively, by using the dressed state method. We first give the expression for the pulse shapes by using dressed states and then, by curve fitting, find a group of Gaussian pulses that are easy to realize experimentally to replace the ideal pulses. We also study the influence of parameter fluctuations, atomic spontaneous emission, and photon leakage on fidelity. The results show that our schemes have good robustness. Because the atoms are trapped in different cavities, it is easy to perform different operations on different atoms. The proposed schemes have potential applications of dressed states in distributed quantum information processing tasks.
Multiple Earth-observing satellites need to communicate with each other to jointly observe a large number of targets on the Earth. Factors such as external interference cause delays in satellite information interaction, so the integrity and timeliness of the information used for satellite decision making cannot be ensured, and the optimization of the planning result is affected. Therefore, the effect of communication delay is considered during the multi-satellite coordination process. First, a distributed cooperative optimization problem for multiple satellites in a delayed communication environment is formulated. Second, based on both the analysis of the temporal sequence of tasks in a single satellite and the dynamically decoupled characteristics of the multi-satellite system, the environment information of multi-satellite distributed cooperative optimization is constructed on the basis of a directed acyclic graph (DAG). Then, both a cooperative optimization decision-making framework and a model are built according to the decentralized partially observable Markov decision process (DEC-POMDP). After that, a satellite coordination strategy for different conditions of communication delay is analyzed, and a unified processing strategy for communication delay is designed. An approximate cooperative optimization algorithm based on simulated annealing is proposed. Finally, the effectiveness and robustness of the method presented in this paper are verified via simulation.
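The abstract above names simulated annealing as the basis of its approximate optimizer without giving details. The following Python is a generic simulated annealing sketch under stated assumptions, not the paper's algorithm: it ignores communication delay and the DEC-POMDP model entirely, and the toy problem (assigning task durations to two satellites to minimize the heaviest load), the names `DURATIONS`, `NUM_SATS`, `max_load`, and `move_one_task`, and all parameter values are hypothetical.

```python
import math
import random
from typing import Callable, Tuple

Assignment = Tuple[int, ...]  # Assignment[i] = satellite index that runs task i

DURATIONS = [4, 3, 3, 2, 2, 1]  # hypothetical task durations
NUM_SATS = 2                    # hypothetical number of satellites


def max_load(assign: Assignment) -> float:
    """Cost: total duration on the most heavily loaded satellite."""
    loads = [0.0] * NUM_SATS
    for task, sat in enumerate(assign):
        loads[sat] += DURATIONS[task]
    return max(loads)


def move_one_task(assign: Assignment, rng: random.Random) -> Assignment:
    """Neighbor move: reassign one random task to a different satellite."""
    new = list(assign)
    i = rng.randrange(len(new))
    new[i] = rng.choice([s for s in range(NUM_SATS) if s != new[i]])
    return tuple(new)


def simulated_annealing(
    initial: Assignment,
    cost: Callable[[Assignment], float],
    neighbor: Callable[[Assignment, random.Random], Assignment],
    t0: float = 1.0,
    cooling: float = 0.95,
    steps: int = 500,
    seed: int = 0,
) -> Tuple[Assignment, float]:
    rng = random.Random(seed)
    current, current_cost = initial, cost(initial)
    best, best_cost = current, current_cost
    temperature = t0
    for _ in range(steps):
        candidate = neighbor(current, rng)
        delta = cost(candidate) - current_cost
        # Always accept improvements; accept worse moves with
        # probability exp(-delta / T) (Metropolis criterion).
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_cost = candidate, current_cost + delta
            if current_cost < best_cost:
                best, best_cost = current, current_cost
        temperature = max(temperature * cooling, 1e-9)  # geometric cooling
    return best, best_cost


start = tuple(0 for _ in DURATIONS)  # everything on satellite 0
best_plan, best_cost = simulated_annealing(start, max_load, move_one_task)
```

Tracking the best solution separately from the current one means an occasional accepted uphill move can never make the returned plan worse than one already seen.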
Mobile agents are becoming a prosperous research field due to the recent prevalence of the Internet and Java. Although the field brings great challenges to traditional distributed processing technologies, its attractive advantages stimulate us to pursue further work on it with great ardor. This paper presents the basic concepts of this inchoate technique and describes the current state of research on it. Major obstacles and their possible solutions are discussed.
Velocity is a key parameter characterizing the movement of saltating particles. High-speed photography is an efficient method for recording the velocity, but manually extracting the relevant information from the photographs is quite laborious. Particle tracking velocimetry (PTV) can be used to measure the instantaneous velocity in fluids using tracer particles. The tracer particles have three basic features in fluids: similar movement patterns within a small region, a uniform particle distribution, and high particle density. Unfortunately, the saltation of sand particles in air is a stochastic process, and PTV has not yet been able to accurately determine the velocity field in a cloud of blowing sand. The aim of the present study was to develop an improved PTV technique to measure the downwind (horizontal) and vertical velocities of saltating sand. To demonstrate the feasibility of this new technique, we used it to investigate two-dimensional saltation of particles above a loose sand surface in a wind tunnel. We analyzed the properties of the saltating particles, including the probability distribution of particle velocity, variations in the mean velocity as a function of height, and particle turbulence. By automating much of the analysis, the improved PTV method can satisfy the requirement for a large sample size and can measure the velocity field of blowing sand more accurately than previously used techniques. The results shed new light on the complicated mechanisms involved in sand saltation.
This paper investigates electron-vibrational (e-V) energy exchange in nitrogen-containing plasma, which is very efficient in gas discharges and high-speed flows. Based on the harmonic oscillator approximation and the assumption that e-V relaxation proceeds through a continuous series of Boltzmann distributions over the vibrational states, an analytic approach is derived from the proposed scaling relation of e-V transition rates. A full kinetic model is then investigated by numerically solving the state-to-state master equation for all vibrational levels. The analytical approach leads to a Landau-Teller (LT)-type equation for the relaxation of vibrational energy and predicts the relaxation time to the right order of magnitude. Comparison with the kinetic model shows that the LT-type equation is valid at typical electron temperatures in gas discharges. However, the analytical approach cannot describe the vibrational distribution function during the e-V process, for which a full kinetic model is required.
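For orientation, the generic form of a Landau-Teller-type relaxation equation is sketched below. This is the textbook form under stated assumptions, not the paper's derived expression: the symbols $E_v$ (vibrational energy per unit volume), $T_e$ (electron temperature), $\tau_{eV}$ (e-V relaxation time), $N$ (molecular number density), and $\omega$ (oscillator angular frequency) are notation chosen here, and the abstract does not give the paper's formula for $\tau_{eV}$.

```latex
% Landau-Teller-type relaxation of vibrational energy toward
% equilibrium with the electrons at temperature T_e:
\[
  \frac{dE_v}{dt} = \frac{E_v^{\mathrm{eq}}(T_e) - E_v}{\tau_{eV}},
  \qquad
  E_v^{\mathrm{eq}}(T_e) = \frac{N\,\hbar\omega}{\exp\!\left(\hbar\omega / k_B T_e\right) - 1}
  \quad \text{(harmonic oscillator)} .
\]
% For constant T_e and tau_eV the solution is a single exponential:
\[
  E_v(t) = E_v^{\mathrm{eq}} + \bigl(E_v(0) - E_v^{\mathrm{eq}}\bigr)\, e^{-t/\tau_{eV}} .
\]
```

The single-exponential solution tracks only the total vibrational energy, which is consistent with the abstract's remark that the vibrational distribution function itself requires the full kinetic model.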
High-quality, concentrated sugar syrup crystal is produced in a critical step of cane sugar production: the clarification process. It is characterized by two variables: the color of the produced sugar and its degree of clarity. We show that the temporal variations of these variables follow power-law distributions and can be well modeled by multiplicative cascade multifractal processes. These interesting properties suggest that the degradation in color and clarity has a systemwide cause. In particular, the cascade multifractal model suggests that the degradation in color and clarity can be equivalently accounted for by the initial "impurities" in the sugarcane. Hence, more effective cleaning of the sugarcane before the clarification stage may lead to substantial improvement in the effect of clarification.
In this paper, we consider skyline queries in a mobile and distributed environment, where data objects are distributed across sites (database servers) that are interconnected through a high-speed wired network, and queries are issued by mobile units (laptops, cell phones, etc.) that access the data objects on the database servers over wireless channels. The inherent properties of the mobile computing environment, such as mobility, limited wireless bandwidth, and frequent disconnection, make skyline queries more complicated. We show how to efficiently perform distributed skyline queries in a mobile environment and propose a skyline query processing approach called efficient distributed skyline based on mobile computing (EDS-MC). In EDS-MC, a distributed skyline query is decomposed into five processing phases, and each phase is carefully designed to reduce network communication, network delay, and query response time. We conduct extensive experiments in a simulated mobile database system, and the experimental results demonstrate the superiority of EDS-MC over other skyline query processing techniques in mobile computing.
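The abstract above does not describe the five phases of EDS-MC, so the following Python only sketches the underlying notion of a skyline and the naive distributed baseline: each site computes its local skyline, and a coordinator merges the candidates. It assumes that smaller values are better in every dimension; the function names and the sample points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, ...]


def dominates(a: Point, b: Point) -> bool:
    """a dominates b if a is no worse in every dimension and strictly
    better in at least one (smaller is better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def skyline(points: List[Point]) -> List[Point]:
    """Points not dominated by any other point (quadratic-time scan)."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def merge_site_skylines(site_skylines: List[List[Point]]) -> List[Point]:
    """Global skyline from per-site local skylines. A globally
    non-dominated point is also non-dominated at its own site, so only
    local skyline points need to cross the network."""
    candidates = [p for sky in site_skylines for p in sky]
    return skyline(candidates)


# Hypothetical data held by two database servers, e.g. (price, distance).
site_a = [(1, 9), (3, 3), (4, 4)]
site_b = [(2, 8), (9, 1), (5, 5)]
local_a = skyline(site_a)
local_b = skyline(site_b)
global_skyline = merge_site_skylines([local_a, local_b])
```

In this example (4, 4) is pruned at its own site and (5, 5) is pruned only at merge time, which shows why local filtering reduces, but does not eliminate, the data sent to the coordinator.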
This paper considers the distributed estimation of a source parameter using quantized sensor observations in a wireless sensor network with noisy channels. Repetition codes are used to transmit the quantization bits of sensor observations, and a quasi-best linear unbiased estimate is constructed to estimate the source parameter. Simulations show that the estimation scheme achieves better power and spectral efficiency than the previous scheme.
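To make the role of the repetition code concrete, here is a minimal Python sketch of repetition encoding with majority-vote decoding. It covers only the channel coding step: the quantizer and the quasi-best linear unbiased estimator from the abstract are not shown, and the function names, the odd repetition factor `n=3`, and the sample bits are assumptions for illustration.

```python
from typing import List


def encode_repetition(bits: List[int], n: int = 3) -> List[int]:
    """Repeat every bit n times."""
    return [b for b in bits for _ in range(n)]


def decode_repetition(received: List[int], n: int = 3) -> List[int]:
    """Majority vote over each group of n received bits (n should be odd)."""
    if len(received) % n != 0:
        raise ValueError("received length must be a multiple of n")
    return [
        1 if 2 * sum(received[i:i + n]) > n else 0
        for i in range(0, len(received), n)
    ]


# Hypothetical quantization bits of one sensor observation.
quantized_bits = [1, 0, 1]
sent = encode_repetition(quantized_bits)

# Simulate a noisy channel that flips one bit in each of the first two groups.
received = list(sent)
received[1] ^= 1
received[4] ^= 1
decoded = decode_repetition(received)
```

With `n=3`, each group tolerates one flipped bit, at the cost of tripling the number of transmitted bits; that trade-off is what ties the code to the power and spectral efficiency discussed in the abstract.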
In this paper, we present a review of the current literature on distributed (or partially decentralized) control of chemical process networks. In particular, we focus on recent developments in distributed model predictive control in the context of the specific challenges faced in the control of chemical process networks. The paper concludes with some open problems and possible future research directions in the area.
This paper describes a distributed estimation scheme (DES) for a bandwidth-constrained ad hoc sensor network. The DES is universal in the sense that operations on all sensors are identical and independent of the noise distribution. The scheme requires each sensor to transmit just a 1-bit message per observation. Simulation results show that the scheme achieves much better mean-squared error (MSE) performance than the simplified isotropic universal DES and even outperforms the isotropic universal DES, which requires more than twice the bandwidth of this scheme.
Funding (LMCN simulation platform paper): Supported by the National Natural Science Foundation of China (Nos. 62271165, 62027802, 62201307), the Guangdong Basic and Applied Basic Research Foundation (No. 2023A1515030297), the Shenzhen Science and Technology Program (ZDSYS20210623091808025), the Stable Support Plan Program (GXWD20231129102638002), and the Major Key Project of PCL (No. PCL2024A01).
Funding (distributed deep learning in edge computing review): Supported by the Natural Science Foundation of Jiangsu Province of China under Grant No. BK20211284, the Financial and Science Technology Plan Project of Xinjiang Production and Construction Corps under Grant No. 2020DB005, and the National Natural Science Foundation of China under Grant Nos. 61872219, 62002276, and 62177014.
Funding (non-cooperative target hybrid processing paper): Supported by the National Natural Science Foundation of China (61903025) and the Fundamental Research Funds for the Central Universities (FRF-IDRY-20-013).
Funding (learned query optimizer paper): Partially supported by NSFC under Grant Nos. 61832001 and 62272008 and by the ZTE Industry-University-Institute Fund Project.
Funding: This paper was supported by the National High Technology Research and Development Program of China (863 Program) (No. 2014AA051902).
Abstract: An active distribution network (ADN) is a solution for power systems that interconnect distributed energy resources (DERs), which may change the network operation and power flow of a traditional power distribution network. However, in some circumstances protection and feeder automation in the distribution network malfunction because of the uncertain bidirectional power flow. Therefore, a novel method of fault location, isolation, and service restoration (FLISR) for ADNs based on distributed processing is proposed in this paper. A differential-activated algorithm based on synchronous sampling is studied for feeder fault location and isolation, and a framework for fault restoration is established for ADNs. Finally, the effectiveness of the proposed algorithm is verified via computer simulation of a case study for an active distributed power system.
Abstract: Purpose: The paper presents a resilient distributed processing technique (RDPT), in which the mapper and reducer are simplified with Spark contexts to support distributed parallel query processing. Design/methodology/approach: The proposed work is implemented in Pig Latin with Spark contexts to develop query processing in a distributed environment. Findings: Query processing in Hadoop relies on the MapReduce model for distributed processing. MapReduce distributes work across different nodes through the implementation of complex mappers and reducers, and its results hold only for data up to a certain size. Originality/value: Pig supports the required parallel processing framework with the following constructs during query processing: FOREACH, FLATTEN, and COGROUP.
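The three constructs named in the abstract above can be illustrated with a minimal plain-Python sketch. This is not Pig Latin or the Spark API, and it is not the paper's implementation; the function names `cogroup`, `foreach`, and `flatten` and the tuple-based relations are assumptions chosen only to mirror the semantics of the corresponding Pig operators on small in-memory lists.

```python
from collections import defaultdict


def cogroup(left, right, key=lambda row: row[0]):
    """Group two relations by key; each output row is (key, left_bag, right_bag)."""
    groups = defaultdict(lambda: ([], []))
    for row in left:
        groups[key(row)][0].append(row)
    for row in right:
        groups[key(row)][1].append(row)
    # Sort by key so the output order is deterministic.
    return [(k, l, r) for k, (l, r) in sorted(groups.items())]


def foreach(relation, fn):
    """Apply a projection or transformation to every row of a relation."""
    return [fn(row) for row in relation]


def flatten(relation, bag_index):
    """Un-nest the bag at bag_index: one output row per element of the bag.

    Rows whose bag is empty produce no output, as with Pig's FLATTEN.
    """
    out = []
    for row in relation:
        for item in row[bag_index]:
            out.append(row[:bag_index] + tuple(item) + row[bag_index + 1:])
    return out


# Hypothetical example data: (user_id, name) and (user_id, item).
users = [(1, "ann"), (2, "bob")]
orders = [(1, "book"), (1, "pen"), (3, "ink")]

grouped = cogroup(users, orders)
order_counts = foreach(grouped, lambda row: (row[0], len(row[2])))
```

In a distributed engine each of these steps would run in parallel over partitions of the data; the sketch only shows what each construct computes.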
Funding: Supported by the National Natural Science Foundation of China (62201398), the Natural Science Foundation of Zhejiang Province (LY21F020001), and the Science and Technology Plan Project of Wenzhou (ZG2020026).
Abstract: Active learning (AL) trains a high-precision predictor from a small amount of labeled data by iteratively annotating the most valuable sample from an unlabeled data pool with a class label throughout the learning process. However, most current AL methods assume that the labels queried in each AL round are free of ambiguity, which may be unrealistic in real-world applications where only a set of candidate labels can be obtained for the selected data. Besides, most existing AL algorithms consider only centralized processing, which requires gathering all the unlabeled data in one fusion center for selection. Because data are collected and stored at different nodes over a network in many real-world scenarios, distributed processing is chosen here. This paper focuses on the distributed classification of partially labeled (PL) data obtained by a fully decentralized AL method and proposes a distributed active partial label learning (dAPLL) algorithm. The proposed algorithm is composed of a fully decentralized sample selection strategy and a distributed partial label learning (PLL) algorithm. During sample selection, both the uncertainty and the representativeness of the data are measured based on the global cluster centers obtained by a distributed clustering method, and the valuable samples are chosen in turn. Meanwhile, using the disambiguation-free strategy, a series of binary classification problems can be constructed, and the corresponding cost-sensitive classifiers can be cooperatively trained in a distributed manner. Experimental results on several datasets demonstrate that the performance of the dAPLL algorithm is comparable to that of the corresponding centralized method and superior to that of the existing active PLL (APLL) method under different parameter configurations. Besides, the proposed algorithm outperforms several current PLL methods that use the random selection strategy, especially when only small amounts of data are selected to be assigned candidate labels.
Funding: Jointly supported by the Jiangsu Postgraduate Research and Practice Innovation Project under Grants KYCX22_1030, SJCX22_0283, and SJCX23_0293 and by the NUPTSF under Grant NY220201.
Abstract: Task scheduling plays a key role in effectively managing and allocating computing resources to meet various computing tasks in a cloud computing environment. Achieving short execution time and low load imbalance can be challenging for some algorithms in resource scheduling scenarios. In this work, the Hierarchical Particle Swarm Optimization-Evolutionary Artificial Bee Colony algorithm (HPSO-EABC) is proposed, which hybridizes our Evolutionary Artificial Bee Colony (EABC) algorithm and the Hierarchical Particle Swarm Optimization (HPSO) algorithm. The HPSO-EABC algorithm incorporates the advantages of both the HPSO and the EABC algorithms. Comprehensive testing has been carried out, including evaluations of algorithm convergence speed, resource execution time, load balancing, and operational costs. The results indicate that the EABC algorithm exhibits greater parallelism than the Artificial Bee Colony algorithm. Compared with the Particle Swarm Optimization algorithm, the HPSO algorithm not only improves the global search capability but also effectively mitigates getting stuck in local optima. As a result, the hybrid HPSO-EABC algorithm demonstrates significant improvements in stability and convergence speed. Moreover, it exhibits enhanced resource scheduling performance in both homogeneous and heterogeneous environments, effectively reducing execution time and cost, which is also verified by the ablation experiments.
Funding: The work was supported by the National Natural Science Foundation of China under Grant Nos. 62072419 and 61672479.
Abstract: Most distributed stream processing engines (DSPEs) do not support online task management and cannot adapt to time-varying data flows. Recently, some studies have proposed online task deployment algorithms to solve this problem. However, these approaches do not guarantee the Quality of Service (QoS) when the task deployment changes at runtime, because the task migrations caused by the change of task deployments impose an exorbitant cost. We study one of the most popular DSPEs, Apache Storm, and find that when a task needs to be migrated, Storm has to stop the resource (implemented as a Worker process in Storm) where the task is deployed. This leads to the stop and restart of all tasks in the resource, resulting in poor task migration performance. To solve this problem, in this paper we propose N-Storm (Nonstop Storm), a task-resource decoupling DSPE. N-Storm allows the tasks allocated to resources to be changed at runtime, which is implemented by a thread-level scheme for task migrations. In particular, we add a local shared key/value store on each node to make resources aware of changes in the allocation plan. Thus, each resource can manage its tasks at runtime. Based on N-Storm, we further propose Online Task Deployment (OTD). Unlike traditional task deployment algorithms, which deploy all tasks at once without considering the cost of the task migrations caused by a task re-deployment, OTD gradually adjusts the current task deployment to an optimized one based on the communication cost and the runtime states of resources. We demonstrate that OTD can adapt to different kinds of applications, including computation- and communication-intensive applications. The experimental results on a real DSPE cluster show that N-Storm can avoid the system stop and save up to 87% of the performance degradation time compared with Apache Storm and other state-of-the-art approaches. In addition, OTD can increase the average CPU usage by 51% for computation-intensive applications and reduce network communication costs by 88% for communication-intensive applications.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 61174021, 61473136, and 61104155) and the 111 Project (Grant No. B12018).
Abstract: This paper investigates the estimation problem for a spatially distributed process described by a partial differential equation with missing measurements. The randomly missing measurements are introduced to better reflect the reality of sensor networks. To improve the estimation performance for the spatially distributed process, a network of sensors that are allowed to move within the spatial domain is used. We aim to design an estimator that approximates the distributed process, together with mobile trajectories for the sensors, such that for all possible missing measurements the estimation error system is globally asymptotically stable in the mean square sense. By constructing Lyapunov functionals and using inequality analysis, the guidance scheme of every sensor and the convergence of the estimation error system are obtained. Finally, a numerical example is given to verify the effectiveness of the proposed estimator with the proposed guidance scheme for the sensors.
Funding: Project supported by the National Natural Science Foundation of China (Grant No. 11804308).
Abstract: We propose schemes to realize quantum state transfer and to prepare quantum entanglement in coupled-cavity and cavity-fiber-cavity systems, respectively, by using the dressed-state method. We first derive the expression for the pulse shapes by using dressed states and then, by curve fitting, find a group of Gaussian pulses that are easy to realize experimentally to replace the ideal pulses. We also study the influence of parameter fluctuations, atomic spontaneous emission, and photon leakage on the fidelity. The results show that our schemes have good robustness. Because the atoms are trapped in different cavities, it is easy to perform different operations on different atoms. The proposed dressed-state schemes have potential applications in distributed quantum information processing tasks.
Funding: Supported by the National Science Foundation for Young Scholars of China (61301234, 71401175).
Abstract: Multiple Earth-observing satellites need to communicate with each other to jointly observe many targets on the Earth. Factors such as external interference cause delays in the information exchanged between satellites, which undermine the integrity and timeliness of the information used for satellite decision making and thereby degrade the optimization of the planning result. Therefore, the effect of communication delay is considered during the multi-satellite coordination process. For this problem, first, a distributed cooperative optimization problem for multiple satellites in a delayed communication environment is formulated. Second, based on an analysis of the temporal sequence of tasks in a single satellite and the dynamically decoupled characteristics of the multi-satellite system, the environment information for multi-satellite distributed cooperative optimization is constructed on the basis of a directed acyclic graph (DAG). Then, a cooperative optimization decision-making framework and a model are built according to the decentralized partially observable Markov decision process (DEC-POMDP). After that, a satellite coordination strategy for different conditions of communication delay is analyzed, and a unified processing strategy for communication delay is designed. An approximate cooperative optimization algorithm based on simulated annealing is proposed. Finally, the effectiveness and robustness of the presented method are verified via simulation.
Abstract: Mobile agents are becoming a prosperous research field owing to the recent prevalence of the Internet and Java. Although the field brings great challenges to traditional distributed processing technologies, its attractive advantages stimulate us to pursue further work on it with great ardor. This paper presents the basic concepts of this inchoate technique and describes the current state of research on it. Major obstacles and their possible solutions are discussed.
Funding: Funded by the Young Talent Fund of University Association for Science and Technology in Shaanxi, China (20170303), the National Science Basic Research Plan in Shaanxi Province of China (2017JQ6080), and the Talent Development Project of Weinan Normal University, China (16ZRRC02).
Abstract: Velocity is a key parameter characterizing the movement of saltating particles. High-speed photography is an efficient method to record the velocity, but manually determining the relevant information from the photographs is quite laborious. Particle tracking velocimetry (PTV) can be used to measure the instantaneous velocity in fluids using tracer particles. The tracer particles have three basic features in fluids: similar movement patterns within a small region, a uniform particle distribution, and high particle density. Unfortunately, the saltation of sand particles in air is a stochastic process, and PTV has not yet been able to accurately determine the velocity field in a cloud of blowing sand. The aim of the present study was to develop an improved PTV technique to measure the downwind (horizontal) and vertical velocities of saltating sand. To demonstrate the feasibility of this new technique, we used it to investigate two-dimensional saltation of particles above a loose sand surface in a wind tunnel. We analyzed the properties of the saltating particles, including the probability distribution of particle velocity, variations in the mean velocity as a function of height, and particle turbulence. By automating much of the analysis, the improved PTV method can satisfy the requirement for a large sample size and can measure the velocity field of blowing sand more accurately than previously used techniques. The results shed new light on the complicated mechanisms involved in sand saltation.
Funding: Supported by the National Natural Science Foundation of China (No. 11505015) and the National High-Tech Research and Development Program of China (863 Program).
Abstract: This paper investigates the electron-vibrational (e-V) energy exchange in nitrogen-containing plasma, which is very efficient in gas discharges and high-speed flows. Based on the harmonic oscillator approximation and the assumption that e-V relaxation proceeds through a continuous series of Boltzmann distributions over the vibrational states, an analytic approach is derived from the proposed scaling relation of e-V transition rates. A full kinetic model is then investigated by numerically solving the state-to-state master equation for all vibrational levels. The analytical approach leads to a Landau-Teller (LT)-type equation for the relaxation of vibrational energy and predicts the relaxation time to the right order of magnitude. Comparison with the kinetic model shows that the LT-type equation is valid at electron temperatures typical of gas discharges. However, the analytical approach cannot describe the vibrational distribution function during the e-V process, for which a full kinetic model is required.
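For orientation, the generic Landau-Teller form of a relaxation equation can be written as below. This is the textbook shape of such an equation, not the paper's derived expression; the symbols $E_v$ (vibrational energy), $E_v^{\mathrm{eq}}(T_e)$ (its equilibrium value at electron temperature $T_e$), and $\tau_{eV}$ (the e-V relaxation time) are assumed notation.

```latex
\frac{\mathrm{d}E_v}{\mathrm{d}t}
  = \frac{E_v^{\mathrm{eq}}(T_e) - E_v(t)}{\tau_{eV}},
\qquad
E_v(t) = E_v^{\mathrm{eq}} + \bigl(E_v(0) - E_v^{\mathrm{eq}}\bigr)\, e^{-t/\tau_{eV}}
\quad \text{(for constant } T_e \text{ and } \tau_{eV}\text{)}.
```

Under these assumptions the vibrational energy approaches its equilibrium value exponentially, with $\tau_{eV}$ setting the time scale that the abstract says is predicted to the right order of magnitude.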
Abstract: High-quality, concentrated sugar syrup crystal is produced in a critical step of cane sugar production: the clarification process. The process is characterized by two variables: the color of the produced sugar and its clarity degree. We show that the temporal variations of these variables follow power-law distributions and can be well modeled by multiplicative cascade multifractal processes. These interesting properties suggest that the degradation in color and clarity degree has a systemwide cause. In particular, the cascade multifractal model suggests that the degradation in color and clarity degree can be equivalently accounted for by the initial "impurities" in the sugarcane. Hence, more effective cleaning of the sugarcane before the clarification stage may lead to substantial improvement in the effect of clarification.
Funding: Supported by the Natural Science Foundation of Tianjin under Grant No. 08JCYBJC12400, the Innovative Foundation of Small and Medium Enterprises under Grant No. 08ZXCXGX15000, the National High-Technology Research and Development 863 Program of China under Grant No. 2009AA01Z152, and the National Natural Science Foundation of China under Grant No. 60872064.
Abstract: In this paper, we consider skyline queries in a mobile and distributed environment, where data objects are distributed across sites (database servers) interconnected through a high-speed wired network, and queries are issued by mobile units (laptops, cell phones, etc.) that access the data objects of the database servers over wireless channels. The inherent properties of the mobile computing environment, such as mobility, limited wireless bandwidth, and frequent disconnection, make skyline queries more complicated. We show how to efficiently perform distributed skyline queries in a mobile environment and propose a skyline query processing approach called efficient distributed skyline based on mobile computing (EDS-MC). In EDS-MC, a distributed skyline query is decomposed into five processing phases, and each phase is designed to reduce the network communication, network delay, and query response time. We conduct extensive experiments in a simulated mobile database system, and the experimental results demonstrate the superiority of EDS-MC over other skyline query processing techniques in mobile computing.
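To make the notion of a distributed skyline query concrete, here is a minimal Python sketch of the generic "local skyline, then merge" idea. It is not the five-phase EDS-MC procedure, whose details the abstract does not give; the function names and the convention that smaller values are better in every dimension are assumptions for illustration.

```python
def dominates(a, b):
    """True if point a dominates point b (smaller is better in every dimension)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points):
    """Return the points that are not dominated by any other point (naive O(n^2))."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def distributed_skyline(sites):
    """Compute a global skyline from per-site data.

    Each site first reduces its own data to a local skyline, so only those
    candidates need to be shipped over the network; the coordinator then
    takes the skyline of the union of the candidates.
    """
    candidates = [p for site in sites for p in skyline(site)]
    return skyline(candidates)


# Hypothetical two-site example with (price, distance) tuples.
sites = [
    [(1, 9), (3, 3), (4, 4)],
    [(2, 8), (9, 1), (5, 5)],
]
result = distributed_skyline(sites)
```

Because dominance is transitive, every globally non-dominated point survives its site's local filter, so the merged result equals the skyline of all the data while shipping fewer points.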
Abstract: This paper considers the distributed estimation of a source parameter using quantized sensor observations in a wireless sensor network with noisy channels. Repetition codes are used to transmit the quantization bits of the sensor observations, and a quasi best linear unbiased estimate is constructed to estimate the source parameter. Simulations show that the estimation scheme achieves better power and spectral efficiency than the previous scheme.
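The two building blocks mentioned in the abstract above can be sketched in Python as follows. The sketch shows a plain repetition code with majority-vote decoding and the standard best linear unbiased estimate (BLUE) for independent observations with known variances; the paper's "quasi" BLUE is not specified in the abstract and is not reproduced here. All function names are assumptions, and the repetition factor `n` is assumed odd.

```python
def repetition_encode(bits, n):
    """Repeat every bit n times (n is assumed odd so majority voting has no ties)."""
    return [b for b in bits for _ in range(n)]


def majority_decode(received, n):
    """Decode each block of n received bits by majority vote."""
    blocks = [received[i:i + n] for i in range(0, len(received), n)]
    return [1 if 2 * sum(block) > n else 0 for block in blocks]


def blue(observations, variances):
    """Standard BLUE of a common mean: inverse-variance weighted average."""
    weights = [1.0 / v for v in variances]
    weighted_sum = sum(w * x for w, x in zip(weights, observations))
    return weighted_sum / sum(weights)


# Hypothetical usage: one bit flip per block is corrected by majority voting.
sent = repetition_encode([1, 0, 1], 3)
noisy = sent[:]
noisy[0] ^= 1   # channel error in the first block
noisy[4] ^= 1   # channel error in the second block
decoded = majority_decode(noisy, 3)

estimate = blue([1.0, 3.0], [1.0, 1.0])
```

The repetition factor trades bandwidth for reliability, which is why such a scheme is naturally evaluated in terms of power and spectral efficiency.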
Funding: Supported by the Australian Research Council (ARC) Discovery Project (No. DP130103330).
Abstract: In this paper, we present a review of the current literature on distributed (or partially decentralized) control of chemical process networks. In particular, we focus on recent developments in distributed model predictive control in the context of the specific challenges faced in the control of chemical process networks. The paper concludes with some open problems and possible future research directions in the area.
Funding: Supported by the Basic Research Foundation of Tsinghua National Laboratory for Information Science and Technology (TNList) and the Major Program of the National Natural Science Foundation of China (No. 60496311).
Abstract: This paper describes a distributed estimation scheme (DES) for a bandwidth-constrained ad hoc sensor network. The DES is universal in the sense that the operations on all sensors are identical and independent of the noise distribution. The scheme requires each sensor to transmit just a 1-bit message per observation. Simulation results show that the scheme achieves much better mean-squared error (MSE) performance than the simplified isotropic universal DES and even outperforms the isotropic universal DES, which requires more than twice the bandwidth of this scheme.
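As a rough illustration of how a 1-bit-per-observation estimator can be independent of the noise distribution, the Python sketch below uses a well-known dithered-threshold construction. It is not the scheme proposed in the paper, whose details the abstract does not give. The sketch assumes that every observation lies in a known interval [-bound, bound] and that the noise has zero mean; all names are hypothetical.

```python
import random


def one_bit_message(x, bound, rng):
    """Compare the observation with a threshold drawn uniformly from [-bound, bound].

    For x in [-bound, bound], P(bit = 1) = (x + bound) / (2 * bound),
    so the bit is an unbiased (rescaled) summary of x for any noise shape.
    """
    threshold = rng.uniform(-bound, bound)
    return 1 if x > threshold else 0


def fuse(bits, bound):
    """Fusion rule: invert E[bit] = (theta + bound) / (2 * bound)."""
    mean_bit = sum(bits) / len(bits)
    return bound * (2.0 * mean_bit - 1.0)


def simulate(theta, n_sensors, bound, noise_halfwidth, seed=0):
    """Every sensor runs the same rule on theta plus bounded zero-mean noise."""
    rng = random.Random(seed)
    bits = []
    for _ in range(n_sensors):
        x = theta + rng.uniform(-noise_halfwidth, noise_halfwidth)
        bits.append(one_bit_message(x, bound, rng))
    return fuse(bits, bound)


# Hypothetical usage: 20000 sensors, true parameter 1.0, observations within [-4, 4].
estimate = simulate(theta=1.0, n_sensors=20000, bound=4.0, noise_halfwidth=1.0)
```

Every sensor applies the same rule and the fusion rule never uses the noise density, which is the sense of "universal" used in the abstract; the price is a variance that grows with the assumed bound.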