The Qilian Mountains, a national key ecological function zone in Western China, play a pivotal role in ecosystem services. However, the distribution of the region's dominant tree species, Picea crassifolia (Qinghai spruce), has decreased dramatically in recent decades due to climate change and human activity, which may have influenced its ecological functions. Reasonable reforestation is the key measure for restoring these functions. Many previous efforts have predicted the potential distribution of Picea crassifolia, providing guidance for regional reforestation policy; however, all of them were performed at low spatial resolution, ignoring the naturally patchy distribution of the species. Here, we modeled the distribution of Picea crassifolia with species distribution models at high spatial resolutions. For many models, the area under the receiver operating characteristic curve (AUC) exceeds 0.9, indicating excellent precision. The AUC of the models at 30 m is higher than that of the models at 90 m, and the current potential distribution of Picea crassifolia aligns more closely with its actual distribution at 30 m, demonstrating that finer data resolution improves model performance. Moreover, at 90 m resolution, annual precipitation (Bio12) had the greatest influence on the distribution of Picea crassifolia, whereas aspect became the most important variable at 30 m, indicating the crucial role of finer topographic data in modeling species with patchy distributions. The current distribution of Picea crassifolia is concentrated in the northern and central parts of the study area, and this pattern will be maintained under future scenarios, although some habitat loss in the central parts and gains in the eastern regions are expected owing to increasing temperature and precipitation. Our findings can guide protection and restoration strategies for the Qilian Mountains, benefiting the regional ecological balance.
Funding: supported by the National Natural Science Foundation of China (No. 42071057).
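As a minimal illustration of the AUC evaluation step reported above, the sketch below fits a classifier to synthetic presence/absence data and scores it with scikit-learn; the predictors, model choice, and data are illustrative placeholders, not the authors' actual SDM pipeline.

```python
# Hedged sketch: computing the AUC of a species distribution classifier on
# synthetic presence/absence data. Predictor names are assumptions.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
# Two hypothetical predictors, e.g. annual precipitation (Bio12) and aspect.
X = rng.normal(size=(1000, 2))
# Synthetic presence/absence labels loosely driven by the predictors.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=1000) > 0).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
model = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, model.predict_proba(X_te)[:, 1])
print(f"AUC = {auc:.3f}")  # values above 0.9 are conventionally read as excellent
```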
To improve data distribution efficiency, a load-balancing data distribution (LBDD) method is proposed for the publish/subscribe mode. In the LBDD method, subscribers are involved in distribution tasks and transfer data while receiving data themselves. A dissemination tree is constructed among the subscribers based on MD5, with the publisher acting as the root. The proposed method provides bucket construction, target selection, and path updates; furthermore, the property of one-way dissemination is proven. The LBDD guarantees that the average out-going degree of a node is 2. Experiments on data distribution delay, data distribution rate, and load distribution are conducted. Experimental results show that the LBDD method helps balance the task load between the publisher and subscribers and outperforms the point-to-point approach.
Funding: the National Key Basic Research Program of China (973 Program).
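The sketch below illustrates the stated structure only: subscribers ordered by the MD5 digest of their identifier and laid out as a binary dissemination tree rooted at the publisher, so each node forwards to at most two children. It is not the paper's exact bucket-construction or path-update algorithm.

```python
# Hedged sketch of an MD5-ordered binary dissemination tree (publisher as root).
import hashlib

def md5_key(node_id: str) -> str:
    return hashlib.md5(node_id.encode()).hexdigest()

def build_tree(publisher: str, subscribers: list[str]) -> dict[str, list[str]]:
    ordered = sorted(subscribers, key=md5_key)   # deterministic MD5 ordering
    nodes = [publisher] + ordered
    children: dict[str, list[str]] = {n: [] for n in nodes}
    for i in range(1, len(nodes)):               # array-as-binary-tree layout
        children[nodes[(i - 1) // 2]].append(nodes[i])
    return children

tree = build_tree("pub", [f"sub{i}" for i in range(6)])
for parent, kids in tree.items():
    print(parent, "->", kids)                    # every node has <= 2 children
```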
The existence of three well-defined tongue-shaped zones of swell dominance, termed 'swell pools', in the Pacific, the Atlantic, and the Indian Oceans was reported by Chen et al. (2002) using satellite data. In this paper, the ECMWF re-analysis wind-wave data, including wind speed, significant wave height, averaged wave period, and direction, are applied to verify the existence of these swell pools. Swell indices calculated from wave height, wave age, and the correlation coefficient are used to identify swell events. The wave-age swell index can be more appropriately related to physical processes than the other two swell indices. Based on the ECMWF data, the swell pools in the Pacific and Atlantic Oceans are confirmed, but the expected swell pool in the Indian Ocean is not pronounced. The seasonal variations of global and hemispherical swell indices are investigated, and the argument that swells in the pools originate mostly from the winter hemisphere is supported by the seasonal variation of the averaged wave direction. The northward bending of the swell pools in the Pacific and Atlantic Oceans in summer is not revealed by the ECMWF data. The swell pool in the Indian Ocean and the summer northward bending of the swell pools in the Pacific and Atlantic Oceans need to be further verified with other datasets.
Funding: the National Natural Science Foundation of China (Nos. 40830959 and 40921004) and the Ministry of Science and Technology of China (No. 2011BAC03B01).
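A hedged sketch of the kind of wave-age criterion the abstract refers to: in deep water the phase speed of the peak wave is c_p = gT/(2π), and waves with wave age c_p/U10 above about 1.2 are conventionally classed as swell. The 1.2 threshold is the common textbook value, not necessarily the one used in this paper.

```python
# Wave-age swell criterion (deep-water approximation); threshold is an assumption.
import math

G = 9.81  # gravitational acceleration, m/s^2

def wave_age(period_s: float, wind_speed_ms: float) -> float:
    phase_speed = G * period_s / (2 * math.pi)   # deep-water phase speed of peak wave
    return phase_speed / wind_speed_ms

def is_swell(period_s: float, wind_speed_ms: float, threshold: float = 1.2) -> bool:
    return wave_age(period_s, wind_speed_ms) > threshold

print(wave_age(10.0, 8.0))   # ~1.95 -> swell-dominated conditions
print(is_swell(10.0, 8.0))
```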
Net Primary Productivity (NPP) is one of the important biophysical variables of vegetation activity, and it plays an important role in studying the global carbon cycle, ecosystem carbon sources and sinks, and the spatial and temporal distribution of CO2. Remote sensing provides a broad view quickly, timely, and multi-temporally, which makes it an attractive and powerful tool for studying ecosystem primary productivity at scales ranging from local to global. This paper uses Moderate Resolution Imaging Spectroradiometer (MODIS) data to estimate and analyze the spatial and temporal distribution of NPP of the northern Hebei Province in 2001 based on the Carnegie-Ames-Stanford Approach (CASA) model. The spatial distributions of the Absorbed Photosynthetically Active Radiation (APAR) of vegetation and of light use efficiency in three geographical subregions (the Bashang Plateau Region, the Basin Region in northwestern Hebei Province, and the Yanshan Mountainous Region in northern Hebei Province) were analyzed, and the total NPP spatial distribution of the study area in 2001 was discussed. Based on the 16-day MODIS Fraction of Photosynthetically Active Radiation absorbed by vegetation (FPAR) product, 16-day composite NPP dynamics were calculated using the CASA model, and the seasonal dynamics of vegetation NPP in the three subregions were also analyzed. The results reveal that the total NPP of the study area in 2001 was 25.1877 × 10^6 gC/(m^2·a); NPP in 2001 ranged from 2 to 608 gC/(m^2·a), with an average of 337.516 gC/(m^2·a). NPP of the study area in 2001 accumulated mainly from May to September (DOY 129-272), high NPP values appeared from June to August (DOY 177-204), and the maximum NPP appeared from late July to mid-August (DOY 209-224).
Funding: under the auspices of the National Natural Science Foundation of China (No. 40571117), the Knowledge Innovation Program of the Chinese Academy of Sciences (No. KZCX3-SW-338), and the Research Foundation of the State Key Laboratory of Remote Sensing Science, Institute of Remote Sensing Applications, Chinese Academy of Sciences (KQ060006).
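The core CASA relation behind this estimation is NPP = APAR × ε, with APAR = FPAR × PAR and PAR commonly taken as about half of incident solar radiation. A minimal hedged sketch follows; the temperature and water stress scalars that CASA applies to light-use efficiency are collapsed into a single `eps` value here for brevity.

```python
# Hedged sketch of the CASA core: NPP = APAR * eps, APAR = FPAR * PAR.
import numpy as np

def casa_npp(fpar, solar_rad, eps):
    """fpar: fraction of PAR absorbed (e.g. MODIS FPAR); solar_rad: incident
    solar radiation, MJ/m^2 per compositing period; eps: realized light-use
    efficiency, gC/MJ (stress scalars folded in)."""
    par = 0.5 * solar_rad          # photosynthetically active radiation
    apar = fpar * par              # absorbed PAR
    return apar * eps              # gC/m^2 per period

# One 16-day composite for a toy 2x2 region; 0.389 gC/MJ is the classic
# CASA maximum light-use efficiency from the literature, used here unstressed.
fpar = np.array([[0.6, 0.4], [0.7, 0.2]])
rad = np.array([[250.0, 240.0], [260.0, 230.0]])
print(casa_npp(fpar, rad, eps=0.389))
```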
The security of Federated Learning (FL) / Distributed Machine Learning (DML) is gravely threatened by data poisoning attacks, which destroy the usability of the model by contaminating training samples; such attacks are therefore called causative availability indiscriminate attacks. Because existing data sanitization methods are hard to apply to real-time applications due to their tedious processes and heavy computations, we propose a new supervised batch detection method for poison that can quickly sanitize the training dataset before local model training. We design a training dataset generation method that helps enhance accuracy, and we use data complexity features to train a detection model that is applied in an efficient batch hierarchical detection process. Our model stockpiles knowledge about poison, which can be expanded by retraining to adapt to new attacks. Being neither attack-specific nor scenario-specific, our method is applicable to FL/DML as well as other online or offline scenarios.
Funding: supported in part by the "Pioneer" and "Leading Goose" R&D Program of Zhejiang (Grant No. 2022C03174); the National Natural Science Foundation of China (No. 92067103); the Key Research and Development Program of Shaanxi, China (No. 2021ZDLGY06-02); the Natural Science Foundation of Shaanxi Province (No. 2019ZDLGY12-02); the Shaanxi Innovation Team Project (No. 2018TD-007); the Xi'an Science and Technology Innovation Plan (No. 201809168CX9JC10); the Fundamental Research Funds for the Central Universities (No. YJS2212); and the National 111 Program of China (B16037).
Due to the restricted satellite payloads in LEO mega-constellation networks (LMCNs), remote sensing image analysis, online learning, and other big data services urgently need onboard distributed processing (OBDP). In existing technologies, the efficiency of big data applications (BDAs) in distributed systems hinges on stable, low-latency links between worker nodes. However, LMCNs, with their highly dynamic nodes and long-distance links, cannot provide these conditions, which makes the performance of OBDP hard to measure intuitively. To bridge this gap, a multidimensional simulation platform is indispensable: one that can simulate the network environment of LMCNs and place BDAs in it for performance testing. Using STK's APIs and a parallel computing framework, we achieve real-time simulation of thousands of satellite nodes, which are mapped to application nodes through software-defined networking (SDN) and container technologies. We elaborate on the architecture and mechanisms of the simulation platform and take Starlink and Hadoop as realistic examples for simulations. The results indicate that LMCNs have dynamic end-to-end latency that fluctuates periodically with the constellation movement. Compared with ground data center networks (GDCNs), LMCNs degrade computing and storage job throughput, which can be alleviated by the use of erasure codes and data-flow scheduling of worker nodes.
Funding: supported by the National Natural Science Foundation of China (Nos. 62271165, 62027802, 62201307), the Guangdong Basic and Applied Basic Research Foundation (No. 2023A1515030297), the Shenzhen Science and Technology Program (ZDSYS20210623091808025), the Stable Support Plan Program (GXWD20231129102638002), and the Major Key Project of PCL (No. PCL2024A01).
Multimodal sentiment analysis utilizes multimodal data such as text, facial expressions, and voice to detect people's attitudes. With the advent of distributed data collection and annotation, we can easily obtain and share such multimodal data. However, due to professional discrepancies among annotators and lax quality control, noisy labels may be introduced. Recent research suggests that deep neural networks (DNNs) overfit noisy labels, leading to poor performance. To address this challenging problem, we present a Multimodal Robust Meta-Learning framework (MRML) for multimodal sentiment analysis that resists noisy labels and correlates distinct modalities simultaneously. Specifically, we propose a two-layer fusion net to deeply fuse the different modalities and improve the quality of the multimodal data features for label correction and network training. In addition, a multiple meta-learner (label corrector) strategy is proposed to enhance label correction and prevent models from overfitting noisy labels. We conducted experiments on three popular multimodal datasets to verify the superiority of our method over four baselines.
Funding: supported by STI 2030-Major Projects (2021ZD0200400), the National Natural Science Foundation of China (62276233 and 62072405), and the Key Research Project of Zhejiang Province (2023C01048).
A new method of establishing a rolling load distribution model was developed using online intelligent information-processing technology for plate rolling. Starting from knowledge discovery in databases (KDD) and data mining (DM), the model combines a knowledge model with a mathematical model. Online maintenance and optimization of the load model are realized. The effectiveness of this new method was verified by offline simulation and online application.
Traditional distribution network planning relies on the professional knowledge of planners, especially when analyzing the correlations between problems existing in the network and the crucial influencing factors. The inherent laws reflected by the historical data of the distribution network are ignored, which affects the objectivity of the planning scheme. In this study, to improve the efficiency and accuracy of distribution network planning, the characteristics of distribution network data were extracted using a data-mining technique, and correlation knowledge of existing problems in the network was obtained. A data-mining model based on association rules was established. The inputs of the model were the electrical characteristic indices screened using the gray correlation method. The Apriori algorithm was used to extract correlation knowledge from the operational data of the distribution network and obtain strong association rules. Lift and chi-square tests were used to verify the rationality of the strong association rules output by the model. In this study, the correlations between heavy-load or overload problems of distribution network feeders in different regions and the related characteristic indices were determined, and the confidence of the association rules was obtained. These results can provide an effective basis for formulating a distribution network planning scheme.
Funding: supported by the Science and Technology Project of China Southern Power Grid (GZHKJXM20210043-080041KK52210002).
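The rule metrics this study relies on (support, confidence, lift) have standard definitions; the toy sketch below computes them for a single rule A -> B over binary feeder records. The feature names are invented for illustration, and a full Apriori pass over all candidate itemsets is not reproduced.

```python
# Support, confidence, and lift of one association rule over toy records.
records = [
    {"summer_peak", "old_feeder", "overload"},
    {"summer_peak", "overload"},
    {"old_feeder"},
    {"summer_peak", "old_feeder", "overload"},
    {"new_feeder"},
]

def support(itemset: set) -> float:
    return sum(itemset <= r for r in records) / len(records)

A, B = {"summer_peak"}, {"overload"}
conf = support(A | B) / support(A)   # P(B | A)
lift = conf / support(B)             # lift > 1: A genuinely promotes B
print(f"confidence={conf:.2f}, lift={lift:.2f}")
```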
As measurement devices in the distribution network become more abundant, phasor measurement unit (PMU) devices are gradually being applied to the distribution network alongside the traditional Supervisory Control And Data Acquisition (SCADA) measurement system. Both kinds of devices therefore need to be used when estimating the state of the distribution network. However, because the data of the two measurement systems differ, these differences must be reconciled so that data from different systems become compatible and the estimated distribution state can be used effectively. To this end, this paper addresses three aspects of the two measurement systems (data accuracy, data time section, and data refresh frequency) to eliminate the differences between the system data, and then considers the actual three-phase asymmetry of the distribution network. Three-phase state estimation equations are constructed by the branch current method, and the state estimation results are finally solved by the weighted least squares method.
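A linearized, hedged sketch of the weighted least squares step: for a measurement model z = Hx + e with a diagonal weight matrix W of inverse measurement variances (so PMU-grade rows get larger weights than SCADA-grade rows), the WLS estimate is x = (HᵀWH)⁻¹HᵀWz. The estimator in the paper is nonlinear (branch-current based) and solved iteratively; the toy matrices below are assumptions.

```python
# Weighted least squares on a toy linear measurement model.
import numpy as np

H = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])   # toy measurement matrix
x_true = np.array([0.98, 1.02])
rng = np.random.default_rng(1)
sigma = np.array([0.001, 0.001, 0.01])                # PMU-grade vs SCADA-grade noise
z = H @ x_true + rng.normal(scale=sigma)

W = np.diag(1.0 / sigma**2)                           # weights = inverse variances
x_hat = np.linalg.solve(H.T @ W @ H, H.T @ W @ z)
print(x_hat)                                          # close to x_true
```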
A Type-I censoring mechanism arises when the number of units experiencing the event is random but the total duration of the study is fixed. A number of mathematical approaches have been developed to handle this type of data. The purpose of this research was to estimate the three parameters of the Frechet distribution via the frequentist maximum likelihood and Bayesian estimators. The maximum likelihood estimates (MLEs) of the three parameters are not available in closed form; therefore, they were obtained by numerical methods. Similarly, the Bayesian estimators are implemented using Jeffreys and gamma priors with two loss functions: the squared error loss function and the linear exponential loss function (LINEX). The Bayesian estimates of the Frechet parameters cannot be obtained analytically, so Markov chain Monte Carlo is used, where the full conditional distributions of the three parameters are sampled via the Metropolis-Hastings algorithm. The estimators are compared using mean squared errors (MSE) to determine the best estimator of the three parameters of the Frechet distribution. The results show that Bayesian estimation under the linear exponential loss function based on Type-I censored data provides the better estimates for all parameters when the value of the loss parameter is positive.
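To show the mechanics of the Metropolis-Hastings step described above, here is a compact hedged sketch reduced to a single Frechet shape parameter with known location 0 and scale 1, complete (uncensored) data, and a Gamma(2, 1) prior; the paper samples all three parameters from their full conditionals under censoring, which this does not reproduce.

```python
# Random-walk Metropolis-Hastings for the Frechet shape parameter (toy setup).
import numpy as np

rng = np.random.default_rng(0)
data = rng.weibull(2.0, size=200) ** -1     # 1/Weibull(2) is Frechet(shape=2)

def log_post(a: float) -> float:
    if a <= 0:
        return -np.inf
    loglik = np.sum(np.log(a) + (-1 - a) * np.log(data) - data ** (-a))
    logprior = (2 - 1) * np.log(a) - a      # Gamma(2, 1) prior kernel
    return loglik + logprior

a, chain = 1.0, []
for _ in range(5000):
    prop = a + rng.normal(scale=0.1)        # random-walk proposal
    if np.log(rng.uniform()) < log_post(prop) - log_post(a):
        a = prop                            # accept
    chain.append(a)
print(np.mean(chain[1000:]))                # posterior mean, close to 2
```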
In this review, we highlight some recent methodological and theoretical developments in the estimation and testing of large panel data models with cross-sectional dependence. The paper begins with a discussion of issues of cross-sectional dependence and introduces the concepts of weak and strong cross-sectional dependence. Then, the main attention is paid to spatial and factor approaches for modeling cross-sectional dependence in both linear and nonlinear (nonparametric and semiparametric) panel data models. Finally, we conclude with some speculations on future research directions.
Funding: supported by the National Natural Science Foundation of China (71131008 (Key Project) and 71271179).
Recent studies have pointed out the potential of the odd Fréchet family (or class) of continuous distributions for fitting data of all kinds. In this article, we propose an extension of this family through the so-called "Topp-Leone strategy", aiming to improve its overall flexibility by adding a shape parameter. The main objective is to offer original distributions with modifiable properties, from which adaptive and pliant statistical models can be derived. For the new family, these aspects are illustrated by means of comprehensive mathematical and numerical results. In particular, we emphasize a special three-parameter distribution based on the exponential distribution. The related model is shown to be skillful in fitting various lifetime data, more or less heterogeneous. Among the possible applications, we consider two data sets of current interest linked to the COVID-19 pandemic: daily cases confirmed and recovered in Pakistan from March 24 to April 28, 2020. Our analyses show that the proposed model gives the best fitting results in comparison with serious challengers, including the former odd Fréchet model.
Funding: funded by the Deanship of Scientific Research (DSR), King AbdulAziz University, Jeddah, under grant No. G:550-247-1441.
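For orientation, the "Topp-Leone strategy" as commonly defined in the distribution-generating literature maps a base CDF G to F(x) = [1 - (1 - G(x))²]^α, and the odd Fréchet transform of a baseline H is G(x) = exp(-((1 - H(x))/H(x))^θ). Composing the two with an exponential baseline gives a three-parameter lifetime model like the one emphasized here; the paper's exact parameterization may differ, so treat this sketch as an assumption.

```python
# Hedged sketch: Topp-Leone layer over the odd Frechet transform of an
# exponential baseline (standard literature forms, not verified against the paper).
import numpy as np

def topp_leone_odd_frechet_cdf(x, alpha, theta, lam):
    H = 1.0 - np.exp(-lam * x)                  # exponential baseline CDF
    G = np.exp(-((1.0 - H) / H) ** theta)       # odd Frechet transform
    return (1.0 - (1.0 - G) ** 2) ** alpha      # Topp-Leone layer

x = np.linspace(0.1, 5.0, 5)
print(topp_leone_odd_frechet_cdf(x, alpha=1.5, theta=0.8, lam=1.0))
```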
We analyze the co-seismic displacement field of the 26 December 2004 giant Sumatra-Andaman earthquake derived from Global Positioning System observations, geological vertical measurements of coral heads, and the pivot line observed through remote sensing. Using the co-seismic displacement field and the AK135 spherical layered Earth model, we invert the co-seismic slip distribution along the seismic fault. We also search for the fault geometry model that best fits the observed data. Assuming that the dip angle increases linearly in the downward direction, we plot the postfit residual variation of the inverted geometry models with dip angles changing linearly along the fault strike. The geometry model with the local minimum misfit is the one whose top-patch dip angle increases linearly along strike from 4.3° in the southernmost patch to 4.5° in the northernmost patch. Using this fault shape and the geodetic co-seismic data, we estimate the slip distribution on the curved fault. Our results show that the earthquake ruptured a width of about 200 km, down to a depth of about 60 km. Thrust slip of 0.5-12.5 m is resolved, with the largest slip centered around the central section of the rupture zone, 7°N-10°N in latitude. The estimated seismic moment is 8.2 × 10^22 N·m, which is larger than the estimate from the centroid moment magnitude (4.0 × 10^22 N·m) and smaller than the estimate from normal-mode oscillation data modeling (1.0 × 10^23 N·m).
Funding: supported by the Special Fund of Fundamental Scientific Research Business Expense for Higher Schools of the Central Government (Projects for creation teams ZY20110101), NSFC 41090294, and the talent selection and training plan project of Hebei University.
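As a quick consistency check on the inverted moment, the standard moment-magnitude relation Mw = (2/3)(log₁₀ M₀ - 9.1) converts the abstract's M₀ = 8.2 × 10^22 N·m into Mw ≈ 9.2, consistent with published magnitudes for the 2004 Sumatra-Andaman event.

```python
# Worked check: moment magnitude from the inverted seismic moment.
import math

M0 = 8.2e22  # N*m, from the inversion reported in the abstract
Mw = (2.0 / 3.0) * (math.log10(M0) - 9.1)
print(f"Mw = {Mw:.2f}")  # ~9.21
```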
Big Data (BD), which is simply a collection of huge amounts of data, has been used extensively in fields such as financial dealing, industry, business, and medicine. However, processing a massive amount of data is highly complicated and time-consuming. Thus, to design a distribution-preserving framework for BD, a novel methodology is proposed that utilizes Manhattan Distance-centered Partition Around Medoids (MD-PAM) along with a Conjugate Gradient Artificial Neural Network (CG-ANN), which undergoes various steps to reduce the complications of BD. First, the data are processed in the pre-processing phase by mitigating data repetition using the map-reduce function; subsequently, missing data are handled by substituting or ignoring the missed values. After that, the data are transformed into a normalized form. Next, to enhance classification performance, the dimensionality of the data is reduced by employing Gaussian Kernel Fisher Discriminant Analysis (GK-FDA). Afterwards, the processed data, transformed into a structured format, are submitted to the partitioning phase, where MD-PAM partitions and groups the data into clusters. Lastly, CG-ANN classifies the data in the classification phase so that the needed data can be effortlessly retrieved by the user. To compare the outcomes of the CG-ANN with prevailing methodologies, the openly accessible NSL-KDD datasets are utilized. The experimental outcomes show that the proposed CG-ANN achieves efficient results at reduced computation cost and outperforms existing systems in terms of accuracy, sensitivity, and specificity.
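A minimal hedged sketch of the MD-PAM building block: assign each point to its nearest medoid under the Manhattan (L1) metric, then replace each medoid with the cluster member minimizing total intra-cluster L1 cost. The full pipeline (map-reduce cleaning, GK-FDA, CG-ANN) is not reproduced, and the data below are synthetic.

```python
# Manhattan-distance PAM (k-medoids), simplified alternating update.
import numpy as np

def manhattan(a, b):
    return np.abs(a - b).sum(axis=-1)

def pam(X, k, iters=20, seed=0):
    rng = np.random.default_rng(seed)
    medoids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin([manhattan(X, m) for m in medoids], axis=0)
        for j in range(k):
            cluster = X[labels == j]
            if len(cluster) == 0:
                continue  # keep the old medoid if a cluster empties out
            costs = [manhattan(cluster, p).sum() for p in cluster]
            medoids[j] = cluster[int(np.argmin(costs))]
    return medoids, labels

X = np.vstack([np.random.default_rng(1).normal(c, 0.3, (50, 2)) for c in (0, 3)])
medoids, labels = pam(X, k=2)
print(medoids)  # one medoid near (0, 0), one near (3, 3)
```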
The fitting of lifetime distributions to real-life data has been studied in various fields of research. As ever more complex data from real-world scenarios continue to emerge, many researchers have made commendable efforts to develop new lifetime distributions that can fit such data. In this paper, we utilize the KM-transformation technique to increase the flexibility of the power Lindley distribution, resulting in the Kavya-Manoharan Power Lindley (KMPL) distribution. We study the mathematical treatment of the KMPL distribution in detail and adapt the widely used method of maximum likelihood to estimate its unknown parameters. We carry out a Monte Carlo simulation study to investigate the performance of the maximum likelihood estimates (MLEs) of the parameters of the KMPL distribution. To demonstrate the effectiveness of the KMPL distribution for data fitting, we use a real dataset comprising the waiting times of 100 bank customers. We compare the KMPL distribution with other models that are extensions of the power Lindley distribution. Based on several statistical model selection criteria, the summary results of the analysis favor the KMPL distribution. We further investigate the density fit and probability-probability (p-p) plots to validate the superiority of the KMPL distribution over the competing distributions for fitting the waiting-time dataset.
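The two building blocks named here have standard closed forms in the literature: the Kavya-Manoharan transformation maps a base CDF F to (e/(e-1))(1 - e^(-F)), and the power Lindley CDF is F(x) = 1 - (1 + bx^a/(b+1))e^(-bx^a). A hedged sketch composing them follows; the paper's notation and parameterization may differ.

```python
# Hedged sketch of the KMPL CDF: KM transform applied to power Lindley.
import numpy as np

def power_lindley_cdf(x, a, b):
    t = b * x ** a
    return 1.0 - (1.0 + t / (b + 1.0)) * np.exp(-t)

def kmpl_cdf(x, a, b):
    F = power_lindley_cdf(x, a, b)
    return np.e / (np.e - 1.0) * (1.0 - np.exp(-F))   # KM transformation

x = np.linspace(0.0, 5.0, 6)
print(kmpl_cdf(x, a=1.2, b=0.8))   # monotone from 0 toward 1
```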
For a transaction processing system to operate effectively and efficiently in cloud environments, it is important to distribute huge amounts of data while guaranteeing the ACID (atomic, consistent, isolated, and durable) properties. Moreover, database partition and migration tools can help transplant conventional relational database systems to the cloud environment rather than rebuilding a new system. This paper proposes a database distribution management (DBDM) system, which partitions or replicates the data according to the transaction behaviors of the application system. The principal strategy of DBDM is to keep together the data used in a single transaction, thus avoiding massive transmission of records in join operations. The proposed system has been implemented successfully. Preliminary experiments show that DBDM performs database partitioning and migration effectively. The DBDM system is also modularly designed to adapt to different database management systems (DBMSs) and different partition algorithms.
Funding: supported by the Taiwan Ministry of Economic Affairs and the Institute for Information Industry under the project titled "Fundamental Industrial Technology Development Program (1/4)".
Consider the bivariate exponential distribution due to Marshall and Olkin [2], whose survival function is F̄(x, y) = exp[-λ1x - λ2y - λ12 max(x, y)] (x ≥ 0, y ≥ 0), with unknown parameters λ1 > 0, λ2 > 0, and λ12 ≥ 0. Based on grouped data, a new estimator for λ1, λ2, and λ12 is derived and its asymptotic properties are discussed. In addition, some test procedures for equal marginals and for independence are given. A simulation result is also presented.
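The classic construction behind this survival function: with independent exponentials Z1 ~ Exp(λ1), Z2 ~ Exp(λ2), Z12 ~ Exp(λ12), the pair X = min(Z1, Z12), Y = min(Z2, Z12) satisfies P(X > x, Y > y) = exp[-λ1x - λ2y - λ12 max(x, y)]. The hedged sketch below samples from the model and checks the survival function empirically; it is an illustration, not the paper's grouped-data estimator.

```python
# Marshall-Olkin bivariate exponential: sampling and an empirical check.
import numpy as np

def sample_mo(l1, l2, l12, n, seed=0):
    rng = np.random.default_rng(seed)
    z1 = rng.exponential(1 / l1, n)     # numpy takes the scale = 1/rate
    z2 = rng.exponential(1 / l2, n)
    z12 = rng.exponential(1 / l12, n)   # common shock shared by both margins
    return np.minimum(z1, z12), np.minimum(z2, z12)

x, y = sample_mo(1.0, 2.0, 0.5, 100_000)
emp = np.mean((x > 0.5) & (y > 0.5))            # empirical P(X>0.5, Y>0.5)
theo = np.exp(-(1.0 + 2.0 + 0.5) * 0.5)         # exp[-l1*x - l2*y - l12*max(x,y)]
print(emp, theo)                                 # the two should nearly agree
```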
Integrating marketing and distribution businesses is crucial for improving the coordination of equipment and the efficient management of multi-energy systems. New energy sources are continuously being connected to distribution grids; this, however, increases the complexity of the information structure of marketing and distribution businesses. The existing unified data model and the coordinated application of marketing and distribution suffer from various drawbacks. As a solution, this paper presents a data model of "one graph of marketing and distribution" and a framework for graph computing, developed by analyzing the current trends of business and data in the marketing and distribution fields and using graph data theory. Specifically, this work aims to determine the correlation between distribution transformers and marketing users, which is crucial for elucidating the connection between marketing and distribution. To this end, a novel identification algorithm is proposed based on the collected marketing and distribution data. A forecasting application is then developed based on the proposed algorithm to realize the coordinated prediction and consumption of distributed photovoltaic power generation and distribution loads. Furthermore, an operation and maintenance (O&M) knowledge graph reasoning application is developed to improve the intelligent O&M capability of marketing and distribution equipment.
Funding: supported by the National Key R&D Program of China (2020YFB0905900).
The main objective of this paper is to discuss a general family of distributions generated from the symmetrical arcsine distribution. The considered family includes various asymmetrical and symmetrical probability distributions as special cases. A particular symmetrical member of this family is the Arcsine-Gaussian distribution. Key statistical properties of this distribution, including the quantile function, mean residual life, order statistics, and moments, are derived. The Arcsine-Gaussian parameters are estimated using two classical estimation methods, the method of moments and maximum likelihood. A simulation study, which provides the asymptotic distributions of all considered point estimators together with 90% and 95% asymptotic confidence intervals, is performed to examine the estimation efficiency of the considered methods numerically. The simulation results show that both the biases and the variances of the estimators tend to zero as the sample size increases, i.e., the estimators are asymptotically consistent. Also, as the sample size increases, the coverage probabilities of the confidence intervals approach the nominal levels, while the corresponding lengths decrease toward zero. Two real data sets from the medical field are used to illustrate the flexibility of the Arcsine-Gaussian distribution compared with the normal, logistic, and Cauchy models. The proposed distribution is very versatile in fitting real applications and can be used as a good alternative to the traditional Gaussian distribution.