In this paper, by combining sampling methods for food statistics with years of sampling experience, various sampling points and their corresponding sampling methods are summarized. The aim is to help discover food safety risks and improve the level of food safety.
In this study, an optimal inventory method in terms of accuracy and cost was examined and introduced to obtain information about the Zagros forests, Iran. For this purpose, three distance sampling methods (compound, order distance and random-pairs) in 5 inventory networks (100 m × 100 m, 100 m × 150 m, 100 m × 200 m, 150 m × 150 m, 200 m × 200 m) were implemented in a GIS environment, and the related statistical analyses were carried out. The average tree density and canopy cover per hectare were compared with those from a 100% inventory. All the studied methods were implemented at 30 inventory points, and the implementation time of each was recorded. According to the results, the best inventory methods for estimating density and canopy cover were the compound 150 m × 150 m and 100 m × 100 m methods, respectively. The minimum product of inventory time in seconds (T) and the squared percent sampling error of the inventory ((E%)²) was 691.8 for the compound 150 m × 150 m method regarding density per hectare, and 12,089 ha for the compound 100 m × 100 m method regarding canopy. It can be concluded that the compound method is the best for estimating the density and canopy features of the forest area.
The accuracy of spatial interpolation of precipitation data is determined by the actual spatial variability of the precipitation, the interpolation method, and the distribution of observatories, whose selection is particularly important. In this paper, three spatial sampling programs (spatial random sampling, spatial stratified sampling, and spatial sandwich sampling) are used to analyze the data from meteorological stations of northwestern China. We compared the accuracy of ordinary Kriging interpolation on the basis of the sampling results. The error values of the regional annual precipitation interpolation based on spatial sandwich sampling, including ME (0.1513), RMSE (95.91), ASE (101.84), MSE (-0.0036), and RMSSE (1.0397), were optimal under the premise of abundant prior knowledge. The result of spatial stratified sampling was poor, and spatial random sampling was even worse. Spatial sandwich sampling was the best sampling method, which minimized the error of regional precipitation estimation. It had a higher degree of accuracy than the other two methods and a wider scope of application.
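The five error values quoted in the abstract above are not defined there. The sketch below assumes they follow the usual geostatistical cross-validation definitions (ME = mean error, RMSE = root mean square error, ASE = average standard error, MSE = mean standardized error, RMSSE = root mean square standardized error). The function name and the toy numbers are hypothetical and are not taken from the paper.

```python
import math

def cross_validation_errors(observed, predicted, std_errors):
    """Cross-validation summary for a kriging model (assumed definitions).

    observed   -- measured values at the validation stations
    predicted  -- kriging predictions at the same stations
    std_errors -- kriging standard errors of those predictions
    """
    n = len(observed)
    residuals = [p - o for o, p in zip(observed, predicted)]
    standardized = [r / s for r, s in zip(residuals, std_errors)]
    return {
        "ME": sum(residuals) / n,
        "RMSE": math.sqrt(sum(r * r for r in residuals) / n),
        "ASE": math.sqrt(sum(s * s for s in std_errors) / n),
        "MSE": sum(standardized) / n,  # mean standardized error, may be negative
        "RMSSE": math.sqrt(sum(z * z for z in standardized) / n),
    }

# Toy data only: two stations with symmetric errors.
stats = cross_validation_errors([100.0, 200.0], [110.0, 190.0], [10.0, 10.0])
print(stats)
```

Under these definitions a well-calibrated model has ME and MSE close to 0, RMSE close to ASE, and RMSSE close to 1, which is how the values reported for spatial sandwich sampling would typically be read.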
Graph convolutional networks (GCNs) have received significant attention from various research fields due to their excellent performance in learning graph representations. Although GCNs perform well compared with other methods, they still face challenges. Training a GCN model for large-scale graphs in a conventional way requires high computation and storage costs. Therefore, motivated by an urgent need for efficiency and scalability in training GCNs, sampling methods have been proposed and have achieved a significant effect. In this paper, we categorize sampling methods based on their sampling mechanisms and provide a comprehensive survey of sampling methods for efficient training of GCNs. To highlight the characteristics and differences of sampling methods, we present a detailed comparison within each category and further give an overall comparative analysis of the sampling methods in all categories. Finally, we discuss some challenges and future research directions of the sampling methods.
Uniform linear array (ULA) radars are widely used in the collision-avoidance radar systems of small unmanned aerial vehicles (UAVs). In practice, a ULA's multi-target direction of arrival (DOA) estimation performance degrades significantly owing to the limited number of physical elements. To improve the underdetermined DOA estimation performance of a ULA radar mounted on a small UAV platform, we propose a nonuniform linear motion sampling underdetermined DOA estimation method. Using the motion of the UAV platform, the echo signal is sampled at different positions. Then, according to the concept of the difference co-array, a virtual ULA with multiple array elements and a large aperture is synthesized to increase the degrees of freedom (DOFs). Through position analysis of the original and motion arrays, we propose a nonuniform linear motion sampling method based on a ULA for determining the optimal DOFs. Without any increase in the aperture of the physical array, the proposed method obtains a high DOF with fewer sampling runs and greatly improves the underdetermined DOA estimation performance of the ULA. The results of numerical simulations verify the superior performance of the proposed method.
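The difference co-array idea mentioned in the abstract above can be illustrated with a small sketch. It assumes element positions measured in units of the element spacing and a set of platform shifts chosen by hand; the function names and the shift values are hypothetical and do not reproduce the optimal motion design of the paper, and the signal-processing requirements for combining samples taken at different times are ignored.

```python
def synthesized_positions(num_elements, shifts):
    """Positions occupied when an N-element ULA is sampled at several shifts."""
    return sorted({p + s for s in shifts for p in range(num_elements)})

def difference_coarray(positions):
    """All pairwise position differences (the virtual array lags)."""
    return sorted({a - b for a in positions for b in positions})

# A 3-element ULA sampled once, then again after a uniform or a larger shift.
static = difference_coarray(synthesized_positions(3, [0]))
uniform = difference_coarray(synthesized_positions(3, [0, 3]))
nonuniform = difference_coarray(synthesized_positions(3, [0, 5]))

# The number of distinct lags is a common proxy for the available DOFs.
print(len(static), len(uniform), len(nonuniform))  # prints: 5 11 15
```

In this toy case the larger, nonuniform shift yields more distinct lags from the same two sampling runs, which is the qualitative effect the abstract describes.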
The laboratories in the bauxite processing industry are always under a heavy workload of sample collection, analysis, and compilation of the results. After size reduction in grinding mills, samples of bauxite are collected at intervals of 3 to 4 hours. Large bauxite processing plants producing 1 million tons of pure aluminium can have three grinding mills. Thus, the total number of samples to be tested in one day reaches 18 to 24. The sample of bauxite ore coming from the grinding mill is tested for its particle size and composition. For testing the composition, the bauxite ore sample is first prepared by fusing it with X-ray flux. Then the sample is sent for X-ray fluorescence analysis. Afterwards, the crucibles are washed in ultrasonic baths to be used for the next test. The whole procedure takes about 2 to 3 hours. With a large number of samples reaching the laboratory, the chances of error in composition analysis increase. In this study, we used a composite sampling methodology to reduce the number of samples reaching the laboratory without compromising their validity. The average composition of fifteen samples was measured against composite samples. The mean of the differences was calculated. The standard deviation and paired t-test values were evaluated against predetermined critical values obtained using a two-tailed test. The paired t-test values were much lower than the critical values, thus validating the composition obtained through composite sampling. The composite sampling approach reduced not only the number of samples but also the chemicals used in the laboratory. The objective of an improved analytical protocol that reduces the number of samples reaching the laboratory was achieved without compromising the quality of the analytical results.
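The validation step described in the abstract above, comparing a paired t statistic against a two-tailed critical value, can be sketched as follows. The function names and the example numbers are hypothetical; the critical value 2.145 is the standard two-tailed value for 14 degrees of freedom at the 5% level, which is an assumption about the significance level, since the abstract does not state it.

```python
import math
import statistics

def paired_t_statistic(individual_means, composite_values):
    """t statistic of the paired differences between the two measurement sets."""
    diffs = [a - b for a, b in zip(individual_means, composite_values)]
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    return mean_diff / (sd_diff / math.sqrt(len(diffs)))

def agrees_with_composite(t_value, critical_value):
    """Two-tailed check: no significant difference if |t| is below the critical value."""
    return abs(t_value) < critical_value

# Toy data only: three paired readings (the study compared fifteen).
t = paired_t_statistic([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
print(round(t, 4), agrees_with_composite(t, 2.145))
```

A |t| value well below the critical value, as the abstract reports, means the composite sample cannot be distinguished statistically from the average of the individual samples.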
We consider the interior inverse scattering problem of recovering the shape of a penetrable partially coated cavity with external obstacles from measured scattered waves due to point sources. In the first part, we obtain the well-posedness of the direct scattering problem by the variational method. In the second part, we establish the mathematical basis of the linear sampling method to recover both the shape of the cavity and the shape of the external obstacle. The exterior transmission eigenvalue problem also plays a key role in the discussion.
Accelerating materials discovery crucially relies on strategies that efficiently sample the search space to label a pool of unlabeled data. This is important if the available labeled data sets are small relative to the unlabeled data pool. Active learning with efficient sampling methods provides the means to guide decision making so as to minimize the number of experiments or iterations required to find targeted properties. We review different sampling strategies and show how they are utilized within an active learning loop in materials science.
Land use and cover change (LUCC) is the most direct manifestation of the interaction between anthropogenic activities and the natural environment on Earth's surface, with significant impacts on the environment and social economy. Rapid economic development and climate change have resulted in significant changes in land use and cover. The Shiyang River Basin, located in the eastern part of the Hexi Corridor in China, has undergone significant climate change and LUCC over the past few decades. In this study, we used random forest classification to obtain land use and cover datasets of the Shiyang River Basin for 1991, 1995, 2000, 2005, 2010, 2015, and 2020 based on Landsat images. We validated the 2015 land use and cover data from the random forest classification results (this study), the high-resolution dataset of annual global land cover from 2000 to 2015 (AGLC-2000-2015), the global 30 m land cover classification with a fine classification system (GLC_FCS30), and the first Landsat-derived annual China Land Cover Dataset (CLCD) against ground-truth classification results to evaluate the accuracy of the classification results in this study. Furthermore, we explored and compared the spatiotemporal patterns of LUCC in the upper, middle, and lower reaches of the Shiyang River Basin over the past 30 years, and employed the random forest importance ranking method to analyze the influencing factors of LUCC based on natural (evapotranspiration, precipitation, temperature, and surface soil moisture) and anthropogenic (nighttime light, gross domestic product (GDP), and population) factors. The results indicated that the random forest classification results for land use and cover in the Shiyang River Basin in 2015 outperformed the AGLC-2000-2015, GLC_FCS30, and CLCD datasets in both overall and partial validations. Moreover, the classification results in this study exhibited a high level of agreement with the ground-truth features. From 1991 to 2020, the area of bare land exhibited a decreasing trend, with changes primarily occurring in the middle and lower reaches of the basin. The area of grassland initially decreased and then increased, with changes occurring mainly in the upper and middle reaches of the basin. In contrast, the area of cropland initially increased and then decreased, with changes occurring in the middle and lower reaches. The LUCC was influenced by both natural and anthropogenic factors. Climatic factors and population contributed significantly to LUCC, and the importance values of evapotranspiration, precipitation, temperature, and population were 22.12%, 32.41%, 21.89%, and 19.65%, respectively. Moreover, policy interventions also played an important role. Land use and cover in the Shiyang River Basin exhibited fluctuating changes over the past 30 years, with the ecological environment improving in the last 10 years. This suggests that governance efforts in the study area have had some effect, and the government can continue to move in this direction in the future. The findings can provide crucial insights for related research and regional sustainable development in the Shiyang River Basin and other similar arid and semi-arid areas.
Based on the observation of importance sampling and second-order information about the failure surface of a structure, an importance sampling region is defined in V-space, which is obtained by rotating U-space at the point of maximum likelihood. The sampling region is a hyper-ellipsoid that consists of the sampling ellipse on each plane of main curvature in V-space. Thus, the sampling probability density function can be constructed from the center and axes of the sampling ellipsoid. Several examples show the efficiency and generality of this method.
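For readers unfamiliar with importance sampling in structural reliability, the sketch below shows the basic estimator that the abstract above builds on. It is not the hyper-ellipsoid method of the paper: it simply assumes a standard normal U-space and a unit-variance normal sampling density re-centered at a given design point. The function name, the linear limit state, and the design point are all hypothetical.

```python
import math
import random

def failure_probability_is(limit_state, design_point, n_samples=20000, seed=0):
    """Estimate P[g(U) <= 0] for standard normal U by importance sampling.

    Samples are drawn from a unit-variance normal density centered at
    design_point, and each failing sample is weighted by phi(u) / h(u).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        u = [rng.gauss(c, 1.0) for c in design_point]
        if limit_state(u) <= 0.0:
            # log of phi(u) / h(u) for two unit-variance normal densities
            log_w = sum(-0.5 * x * x + 0.5 * (x - c) ** 2
                        for x, c in zip(u, design_point))
            total += math.exp(log_w)
    return total / n_samples

# Linear limit state with reliability index 3: failure when u[0] >= 3.
estimate = failure_probability_is(lambda u: 3.0 - u[0], [3.0, 0.0])
exact = 0.5 * math.erfc(3.0 / math.sqrt(2.0))  # standard normal tail beyond 3
print(estimate, exact)
```

Centering the sampling density near the most likely failure point is what makes rare failures appear often enough to estimate; the paper's contribution, per the abstract, is a more refined choice of sampling region using curvature information.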
In this paper, we consider inverse scattering by a chiral obstacle in electromagnetic fields, and prove that the linear sampling method is also effective in determining the support of a chiral obstacle from noisy far-field data.
The quantitative characterization of the full-field stress and displacement is significant for analyzing the failure and instability of engineering materials. Various optical measurement techniques such as photoelasticity, moiré, and digital image correlation methods have been developed to achieve this goal. However, these methods are difficult to combine to determine the stress and displacement fields simultaneously, because the tested models must contain particles and gratings for displacement measurement, and these elements disturb the light passing through the tested models in photoelasticity. In this study, by combining photoelasticity and the sampling moiré method, we developed a method to determine the stress and displacement fields simultaneously in a three-dimensional (3D)-printed photoelastic model with an orthogonal grating. The full-field stress was determined by analyzing 10 photoelastic patterns, and the displacement fields were calculated using the sampling moiré method. The results indicate that the developed method can simultaneously determine the stress and displacement fields.
Based on a topological analysis of the three-phase matrix AC-to-AC conversion circuit, an AC-to-AC nine-switch matrix is shown to be equivalent to a rectification part and a conversion part, so the matrix converter can be viewed as an AC-DC-AC converter. The asymmetric regular sampling method of SPWM (sinusoidal pulse width modulation) is studied and applied to the three-phase matrix AC-to-AC converter. A simulation of the matrix converter with this strategy is carried out in Matlab/Simulink, and an inductive load simulation is carried out on the matrix converter prototype. The simulation results verify the workability of the asymmetric regular sampling SPWM strategy for the matrix converter.
Spacecraft equipment layout optimization design (SELOD) problems with complicated performance constraints and diversity are studied in this paper. The previous literature uses gradient-based algorithms to effectively obtain optimized non-overlapping layout schemes from randomly initialized cases. However, it is too difficult for these locally optimal solutions to escape their current relative geometric relationships, which significantly limits further improvement in their performance indicators. Therefore, considering the geometric diversity of layout schemes is proposed to alleviate this limitation. First, similarity measures, including modified cosine similarity and Gaussian kernel function similarity, are introduced into the layout optimization process. Then the optimization produces a set of feasible layout candidates with the greatest difference in geometric distribution, and the most representative schemes are sampled. Finally, these feasible geometric solutions are used as initial solutions to optimize the physical performance indicators of the spacecraft, and diversified layout schemes of spacecraft equipment are generated for engineering practice. The validity and effectiveness of the proposed methodology are demonstrated by two SELOD applications.
The quantitative determination of heavy metals in aquatic products is of great importance for food security. Laser-induced breakdown spectroscopy (LIBS) has been used in a variety of foodstuff analyses, but is still limited by its low sensitivity when targeting trace heavy metals. In this work, we compare three sample enrichment methods, namely drying, carbonization, and ashing, for increasing the detection sensitivity of LIBS analysis for Pb and Cr in oyster samples. The results demonstrate that carbonization can remove a significant amount of the contributions of the organic elements C, H, N and O; meanwhile, the signals of metallic elements such as Cu, Pb, Sr, Ca, Cr and Mg are enhanced by 3–6 times after carbonization, and further enhanced by 5–9 times after ashing. Such enhancement is due not only to the more concentrated metallic elements in the sample compared with the dried ones, but also to the more uniform matter in the carbonized and ashed samples, from which higher plasma temperature and electron density are observed. This condition favors the detection of trace elements. According to the calibration curves with univariate and multivariate analysis, the ashing method is considered to be the best choice. The limits of detection of the ashing method are 0.52 mg kg⁻¹ for Pb and 0.08 mg kg⁻¹ for Cr, which makes it possible to detect heavy metals in oysters exceeding the maximum limits of Pb and Cr required by the Chinese national standard. This method provides a promising application for heavy metal contamination monitoring in the aquatic product industry.
Motif-based graph local clustering (MGLC) algorithms are generally designed with a two-phase framework, which computes the motif weight for each edge beforehand and then runs the local clustering algorithm on the weighted graph to output the result. Although correct, this framework brings limitations in both practical and theoretical aspects and is less applicable in real interactive situations. This research develops a purely local and index-adaptive method, Index-adaptive Triangle-based Graph Local Clustering (TGLC+), to solve the MGLC problem w.r.t. triangles. TGLC+ adaptively combines the approximate Monte Carlo method Triangle-based Random Walk (TRW) and the deterministic brute-force method Triangle-based Forward Push (TFP) to estimate the Personalized PageRank (PPR) vector without calculating the exact triangle-weighted transition probability, and then outputs the clustering result by conducting the standard sweep procedure. This paper presents the efficiency of TGLC+ through theoretical analysis and demonstrates its effectiveness through extensive experiments. To our knowledge, TGLC+ is the first to solve the MGLC problem without computing the motif weight beforehand, thus achieving better efficiency with comparable effectiveness. TGLC+ is suitable for large-scale and interactive graph analysis tasks, including visualization, system optimization, and decision-making.
A new method, the solution sampling method (SMM), has recently been developed to quantify binary NR/BR rubber blends by pyrolysis/gas chromatography/mass spectrometry (PY/GC/MS). The rubbers were swelled in cyclohexane, and a hydrogen peroxide solution was used to destroy their cross-linked structure so that the rubbers could be dissolved in the cyclohexane. The rubber solution was applied to the spiral section of the injector. The solvent was evaporated, depositing the rubber on the spiral section, and the rubber was pyrolyzed at 550℃. The weight percentage was then calculated from the characteristic peak area ratio. One test rubber, an NR/BR blend in which the weight percentage of NR is 50%, was measured. The error was 5.64%.
Purpose: This paper aims to improve classification performance on imbalanced data by applying different sampling techniques available in machine learning. Design/methodology/approach: The medical appointment no-show dataset is imbalanced, and when classification algorithms are applied directly to the dataset, the result is biased towards the majority class, ignoring the minority class. To avoid this issue, multiple sampling techniques such as Random Over Sampling (ROS), Random Under Sampling (RUS), Synthetic Minority Oversampling TEchnique (SMOTE), ADAptive SYNthetic Sampling (ADASYN), Edited Nearest Neighbor (ENN), and Condensed Nearest Neighbor (CNN) are applied in order to balance the dataset. The performance is assessed with a Decision Tree classifier for each of the listed sampling techniques, and the best performance is identified. Findings: This study focuses on comparing the performance metrics of various widely used sampling methods. Compared with the other techniques, recall is high when ENN is applied; CNN and ADASYN performed equally well on the imbalanced data. Research limitations: The testing was carried out with a limited dataset and needs to be repeated with a larger dataset. Practical implications: This framework will be useful whenever the data is imbalanced in real-world scenarios, which ultimately improves performance. Originality/value: This paper uses the rebalancing framework on the medical appointment no-show dataset to predict no-shows and removes the bias towards the majority class.
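The two simplest techniques named in the abstract above, Random Over Sampling (ROS) and Random Under Sampling (RUS), can be sketched in a few lines. This is a minimal standard-library illustration, not the pipeline used in the paper; the function names and the toy labels are hypothetical.

```python
import random
from collections import Counter

def _group_by_class(rows, labels):
    """Collect the rows of each class, keeping first-seen class order."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups

def random_over_sample(rows, labels, seed=0):
    """Duplicate random minority rows until every class matches the largest one."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        extra = [rng.choice(members) for _ in range(target - len(members))]
        out_rows.extend(members + extra)
        out_labels.extend([label] * target)
    return out_rows, out_labels

def random_under_sample(rows, labels, seed=0):
    """Keep a random subset of each class, sized to the smallest class."""
    rng = random.Random(seed)
    groups = _group_by_class(rows, labels)
    target = min(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        out_rows.extend(rng.sample(members, target))
        out_labels.extend([label] * target)
    return out_rows, out_labels

# Toy data only: 8 "show" rows (label 0) and 2 "no-show" rows (label 1).
rows = [[i] for i in range(10)]
labels = [0] * 8 + [1] * 2
print(Counter(random_over_sample(rows, labels)[1]))
print(Counter(random_under_sample(rows, labels)[1]))
```

Resampling of this kind is normally applied to the training split only, so that the evaluation data keeps the original class ratio.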
Funding (precipitation interpolation study): conducted within the National Major Scientific Research Project (No. 2013CBA01806), the National Natural Science Foundation of China (No. 41271085), and the National Scientific and Technological Support Project (No. 2013BAB05B03).
Funding (GCN sampling survey): supported by the National Natural Science Foundation of China (61732018, 61872335, 61802367, 61876215), the Strategic Priority Research Program of Chinese Academy of Sciences (XDC05000000), Beijing Academy of Artificial Intelligence (BAAI), the Open Project Program of the State Key Laboratory of Mathematical Engineering and Advanced Computing (2019A07), the Open Project of Zhejiang Laboratory, and a grant from the Institute for Guo Qiang, Tsinghua University. Recommended by Associate Editor Long Chen.
Funding (ULA radar DOA estimation study): National Natural Science Foundation of China (61973037) and National 173 Program Project (2019-JCJQ-ZD-324).
Funding (interior inverse scattering study): supported by the Natural Science Foundation of Xinjiang Uygur Autonomous Region of China (2019D01A05) and by the NSFC (11571132).
Funding: Supported by the National Key Research and Development Program of China (Grant No. 2017YFB0702401) and the National Natural Science Foundation of China (Grant Nos. 51571156, 51671157, 51621063, and 51931004).
Abstract: Accelerating materials discovery relies crucially on strategies that efficiently sample the search space to label a pool of unlabeled data. This is important when the available labeled data sets are small relative to the unlabeled data pool. Active learning with efficient sampling methods provides the means to guide decision making so as to minimize the number of experiments or iterations required to find targeted properties. We review different sampling strategies and show how they are used within an active learning loop in materials science.
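A generic active learning loop of the kind this abstract refers to might look like the sketch below. Everything in it is an assumption made for illustration: the objective `hidden_property` stands in for an experiment, the surrogate is a plain 1-nearest-neighbour predictor, and the acquisition rule (prediction plus an exploration bonus weighted by `kappa`) is only one of many sampling strategies.

```python
import math
import random

def hidden_property(x):
    """Stand-in for an expensive experiment (hypothetical objective)."""
    return math.exp(-((x - 0.7) ** 2) / 0.02)

def predict(x, labeled):
    """1-nearest-neighbour surrogate: returns (prediction, uncertainty).
    The uncertainty is simply the distance to the closest labeled point."""
    nearest_x, nearest_y = min(labeled, key=lambda p: abs(p[0] - x))
    return nearest_y, abs(nearest_x - x)

def active_learning(pool, n_init=3, n_iter=10, kappa=2.0, seed=0):
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    # Start from a few randomly chosen labeled candidates.
    labeled = [(x, hidden_property(x)) for x in pool[:n_init]]
    unlabeled = pool[n_init:]
    for _ in range(n_iter):
        def score(x):
            mean, unc = predict(x, labeled)
            return mean + kappa * unc  # exploit + explore
        x_next = max(unlabeled, key=score)
        unlabeled.remove(x_next)
        # "Run the experiment" on the selected candidate and label it.
        labeled.append((x_next, hidden_property(x_next)))
    return labeled

pool = [i / 100 for i in range(101)]
labeled = active_learning(pool)
best_x, best_y = max(labeled, key=lambda p: p[1])
```

Only `n_init + n_iter` candidates are ever labeled, which is the point of the loop: the surrogate and acquisition rule decide where the limited experimental budget goes.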
Funding: Supported by the Central Government to Guide Local Technological Development (23ZYQH0298), the Science and Technology Project of Gansu Province (20JR10RA656, 22JR5RA416), and the Science and Technology Project of Wuwei City (WW2202YFS006).
Abstract: Land use and cover change (LUCC) is the most direct manifestation of the interaction between anthropogenic activities and the natural environment on Earth's surface, with significant impacts on the environment and the social economy. Rapid economic development and climate change have resulted in significant changes in land use and cover. The Shiyang River Basin, located in the eastern part of the Hexi Corridor in China, has undergone significant climate change and LUCC over the past few decades. In this study, we used random forest classification to obtain land use and cover datasets of the Shiyang River Basin for 1991, 1995, 2000, 2005, 2010, 2015, and 2020 based on Landsat images. To evaluate the accuracy of the classification results, we validated the 2015 land use and cover data from four sources against ground-truth classification results: the random forest classification (this study), the high-resolution dataset of annual global land cover from 2000 to 2015 (AGLC-2000-2015), the global 30 m land cover classification with a fine classification system (GLC_FCS30), and the first Landsat-derived annual China Land Cover Dataset (CLCD). Furthermore, we explored and compared the spatiotemporal patterns of LUCC in the upper, middle, and lower reaches of the Shiyang River Basin over the past 30 years, and employed the random forest importance ranking method to analyze the influencing factors of LUCC based on natural factors (evapotranspiration, precipitation, temperature, and surface soil moisture) and anthropogenic factors (nighttime light, gross domestic product (GDP), and population). The results indicated that the random forest classification results for land use and cover in the Shiyang River Basin in 2015 outperformed the AGLC-2000-2015, GLC_FCS30, and CLCD datasets in both overall and partial validations. Moreover, the classification results in this study exhibited a high level of agreement with the ground-truth features. From 1991 to 2020, the area of bare land exhibited a decreasing trend, with changes primarily occurring in the middle and lower reaches of the basin. The area of grassland initially decreased and then increased, with changes occurring mainly in the upper and middle reaches of the basin. In contrast, the area of cropland initially increased and then decreased, with changes occurring in the middle and lower reaches. LUCC was influenced by both natural and anthropogenic factors. Climatic factors and population contributed significantly to LUCC: the importance values of evapotranspiration, precipitation, temperature, and population were 22.12%, 32.41%, 21.89%, and 19.65%, respectively. Policy interventions also played an important role. Land use and cover in the Shiyang River Basin fluctuated over the past 30 years, with the ecological environment improving in the last 10 years. This suggests that governance efforts in the study area have had some effect and that the government can continue in this direction. The findings can provide crucial insights for related research and for regional sustainable development in the Shiyang River Basin and other similar arid and semi-arid areas.
Abstract: Based on observations about importance sampling and on second-order information about the failure surface of a structure, an importance sampling region is defined in V-space, which is obtained by rotating U-space at the point of maximum likelihood. The sampling region is a hyper-ellipsoid formed by the sampling ellipses on each plane of principal curvature in V-space. The sampling probability density function can thus be constructed from the center of the sampling region and the axes of the ellipsoid. Several examples demonstrate the efficiency and generality of the method.
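The general idea of importance sampling for a failure probability can be sketched as below. Note that this is not the hyper-ellipsoidal sampling region of the abstract: as a simpler stand-in, the sampling density is a unit-variance Gaussian centered at the point of maximum likelihood (the design point). The limit state `g` and the value `beta` are hypothetical, chosen because the exact answer is known.

```python
import math
import random

def std_normal_cdf(x):
    """Standard normal CDF, Phi(x)."""
    return 0.5 * math.erfc(-x / math.sqrt(2.0))

def failure_prob_is(g, design_point, n=20000, seed=1):
    """Estimate P[g(U) <= 0] for U ~ N(0, I) by sampling from
    N(design_point, I) and reweighting each failure sample."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        u = [rng.gauss(c, 1.0) for c in design_point]
        if g(u) <= 0.0:
            # log of the likelihood ratio phi(u) / h(u)
            log_w = sum(-0.5 * ui ** 2 + 0.5 * (ui - ci) ** 2
                        for ui, ci in zip(u, design_point))
            total += math.exp(log_w)
    return total / n

# Hypothetical linear limit state: failure when u[0] >= beta.
beta = 3.0
g = lambda u: beta - u[0]

estimate = failure_prob_is(g, design_point=[beta, 0.0])
exact = std_normal_cdf(-beta)  # known closed form for this limit state
```

Centering the sampling density on the design point places about half of the samples in the failure domain, whereas crude Monte Carlo would see a failure only about once in every 740 samples for this `beta`.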
Abstract: In this paper, we consider inverse scattering by a chiral obstacle in electromagnetic fields and prove that the linear sampling method is also effective for determining the support of a chiral obstacle from noisy far-field data.
Funding: Financial support from the National Natural Science Foundation of China (Nos. 52004137, 52121003, 51727807, 12032013, and 11972209) and the Fundamental Research Funds for the Central Universities (No. 2022XJAQ01).
Abstract: Quantitative characterization of full-field stress and displacement is important for analyzing the failure and instability of engineering materials. Various optical measurement techniques, such as photoelasticity, moiré, and digital image correlation, have been developed for this purpose. However, these methods are difficult to combine so as to determine the stress and displacement fields simultaneously: the tested models must contain particles and gratings for displacement measurement, and these elements disturb the light passing through the models in photoelasticity. In this study, by combining photoelasticity with the sampling moiré method, we developed a method to determine the stress and displacement fields simultaneously in a three-dimensional (3D)-printed photoelastic model with an orthogonal grating. The full-field stress was determined by analyzing 10 photoelastic patterns, and the displacement fields were calculated using the sampling moiré method. The results indicate that the developed method can determine the stress and displacement fields simultaneously.
Abstract: Based on a topological analysis of the three-phase matrix AC-to-AC conversion circuit, the nine-switch AC-to-AC matrix is shown to be equivalent to a rectification part and a conversion part, so the matrix converter can be viewed as an AC-DC-AC converter. The asymmetric regular sampling SPWM (Sine Pulse Width Modulation) method is studied and applied to the three-phase matrix AC-to-AC converter. The matrix converter with this strategy is simulated in Matlab/Simulink, and an inductive-load simulation is carried out on the matrix converter prototype. The simulation results verify the workability of the asymmetric regular sampling SPWM strategy for the matrix converter.
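For readers unfamiliar with asymmetric regular sampling, the sketch below computes the switching instants for a single phase leg. It is not the paper's matrix-converter modulation: the carrier model (a triangle in [-1, 1] with its positive peak at the start of each period) and all numeric parameters are assumptions chosen for illustration. The sinusoidal reference is sampled twice per carrier period, at the positive and the negative carrier peak, and each held sample sets one pulse edge.

```python
import math

def asymmetric_regular_sampling(m_index, f_ref, f_carrier, n_periods):
    """Return a list of (t_on, t_off) switching instants, one pair per
    carrier period, for a sinusoidal reference of modulation index m_index."""
    tc = 1.0 / f_carrier
    omega = 2.0 * math.pi * f_ref
    edges = []
    for k in range(n_periods):
        t_peak = k * tc               # first sample: positive carrier peak
        t_trough = t_peak + tc / 2.0  # second sample: negative carrier peak
        m1 = m_index * math.sin(omega * t_peak)
        m2 = m_index * math.sin(omega * t_trough)
        # The falling carrier crosses the held sample m1: leading edge.
        t_on = t_peak + (tc / 4.0) * (1.0 - m1)
        # The rising carrier crosses the held sample m2: trailing edge.
        t_off = t_trough + (tc / 4.0) * (1.0 + m2)
        edges.append((t_on, t_off))
    return edges

# Hypothetical operating point: 50 Hz reference, 1050 Hz carrier
# (21 carrier periods per fundamental cycle), modulation index 0.8.
F_CARRIER = 1050.0
edges = asymmetric_regular_sampling(0.8, 50.0, F_CARRIER, 21)
duties = [(t_off - t_on) * F_CARRIER for t_on, t_off in edges]
```

Under this carrier model the pulse width in each period is (Tc/4)(2 + m1 + m2), so the two edges are generally not symmetric about the carrier trough; symmetric regular sampling would use a single sample for both edges.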
Funding: Supported by the Aerospace Frontier Inspiration Project (Grant No. KY0505072113) from the College of Aerospace Science and Engineering, NUDT, which is gratefully acknowledged by the authors.
Abstract: Spacecraft equipment layout optimization design (SELOD) problems with complicated performance constraints and diversity are studied in this paper. Previous work uses gradient-based algorithms to obtain optimized non-overlapping layout schemes from randomly initialized cases effectively. However, it is too difficult for these locally optimal solutions to jump out of their current relative geometric relationships, which significantly limits further improvement of their performance indicators. Considering the geometric diversity of layout schemes is therefore proposed to alleviate this limitation. First, similarity measures, including modified cosine similarity and Gaussian kernel function similarity, are introduced into the layout optimization process. The optimization then produces a set of feasible layout candidates with the greatest difference in geometric distribution, and the most representative schemes are sampled. Finally, these feasible geometric solutions are used as initial solutions to optimize the physical performance indicators of the spacecraft, and diversified layout schemes of spacecraft equipment are generated for engineering practice. The validity and effectiveness of the proposed methodology are demonstrated by two SELOD applications.
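The two kinds of similarity measure named in this abstract can be illustrated with a small sketch. The abstract does not define its modified cosine similarity, so the standard cosine similarity is used here instead; the greedy selection rule `pick_diverse`, the bandwidth `sigma`, and the layout vectors are all hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Standard cosine similarity (not the paper's modified variant)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

def gaussian_kernel_similarity(a, b, sigma=1.0):
    """exp(-||a - b||^2 / (2 sigma^2)); equals 1 for identical layouts."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def pick_diverse(layouts, k, similarity):
    """Greedy selection: repeatedly add the layout whose highest
    similarity to the already chosen layouts is lowest."""
    chosen = [0]
    while len(chosen) < k:
        rest = [i for i in range(len(layouts)) if i not in chosen]
        nxt = min(rest, key=lambda i: max(
            similarity(layouts[i], layouts[j]) for j in chosen))
        chosen.append(nxt)
    return chosen

# Hypothetical layouts: flattened (x, y) centers of three components.
layouts = [
    [0.0, 0.0, 1.0, 0.0, 0.0, 1.0],
    [0.1, 0.0, 1.0, 0.1, 0.0, 1.0],
    [1.0, 1.0, 0.0, 1.0, 1.0, 0.0],
    [0.0, 0.1, 0.9, 0.0, 0.1, 1.0],
]
selected = pick_diverse(layouts, k=2,
                        similarity=gaussian_kernel_similarity)
```

Here the third layout is chosen as the second representative because it is geometrically farthest from the first one; the selected layouts could then serve as distinct starting points for a subsequent performance optimization.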
Funding: Supported by the National Key Research and Development Program of China (No. 2019YFD0901701), the National Natural Science Foundation of China (Nos. 12174359 and 61975190), and the Provincial Key Research and Development Program of Shandong, China (No. 2019GHZ010).
Abstract: The quantitative determination of heavy metals in aquatic products is of great importance for food safety. Laser-induced breakdown spectroscopy (LIBS) has been used in a variety of foodstuff analyses, but it is still limited by low sensitivity when targeting trace heavy metals. In this work, we compare three sample enrichment methods, namely drying, carbonization, and ashing, for increasing the detection sensitivity of LIBS analysis for Pb and Cr in oyster samples. The results demonstrate that carbonization removes a significant part of the contribution of the organic elements C, H, N, and O; meanwhile, the signals of metallic elements such as Cu, Pb, Sr, Ca, Cr, and Mg are enhanced by 3 to 6 times after carbonization and further enhanced by 5 to 9 times after ashing. This enhancement is due not only to the higher concentration of metallic elements compared with the dried samples, but also to the more uniform matter in the carbonized and ashed samples, in which higher plasma temperature and electron density are observed. This condition favors the detection of trace elements. According to the calibration curves obtained with univariate and multivariate analysis, ashing is considered the best choice. The limits of detection of the ashing method are 0.52 mg kg⁻¹ for Pb and 0.08 mg kg⁻¹ for Cr, which is sufficient to detect heavy metals in oysters exceeding the maximum limits for Pb and Cr required by the Chinese national standard. The method is promising for monitoring heavy metal contamination in the aquatic product industry.
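The abstract does not say how its limits of detection were computed. A common convention for a univariate calibration curve, assumed here, is LOD = 3σ/s, where σ is the standard deviation of the blank (background) signal and s is the slope of the calibration line. The sketch below applies that convention to hypothetical numbers that are not taken from the study.

```python
import statistics

def fit_line(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, my - slope * mx

def limit_of_detection(blank_signals, slope):
    """3-sigma criterion (an assumed convention): LOD = 3 * sigma / slope."""
    return 3.0 * statistics.stdev(blank_signals) / slope

# Hypothetical calibration standards: concentration (mg/kg) vs. line intensity.
conc = [0.0, 1.0, 2.0, 5.0, 10.0]
intensity = [12.0, 110.0, 215.0, 508.0, 1015.0]
# Hypothetical repeated background measurements.
blank = [11.0, 13.5, 12.2, 10.8, 12.9]

slope, intercept = fit_line(conc, intensity)
lod = limit_of_detection(blank, slope)  # in the same units as conc
```

A steeper calibration slope (higher sensitivity) or a quieter background both lower the limit of detection, which is consistent with the reported benefit of signal enhancement after ashing.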
Funding: Supported by the Fundamental Research Funds for the Central Universities (No. 2020JS005).
Abstract: Motif-based graph local clustering (MGLC) algorithms are generally designed with a two-phase framework, which computes the motif weight of each edge beforehand and then runs a local clustering algorithm on the weighted graph to output the result. Although correct, this framework has both practical and theoretical limitations and is less applicable in real interactive situations. This research develops a purely local and index-adaptive method, Index-adaptive Triangle-based Graph Local Clustering (TGLC+), to solve the MGLC problem with respect to triangles. TGLC+ adaptively combines the approximate Monte Carlo method Triangle-based Random Walk (TRW) and the deterministic brute-force method Triangle-based Forward Push (TFP) to estimate the Personalized PageRank (PPR) vector without calculating the exact triangle-weighted transition probability, and then outputs the clustering result by running the standard sweep procedure. This paper establishes the efficiency of TGLC+ through theoretical analysis and demonstrates its effectiveness through extensive experiments. To our knowledge, TGLC+ is the first method to solve the MGLC problem without computing the motif weight beforehand, thus achieving better efficiency with comparable effectiveness. TGLC+ is suitable for large-scale and interactive graph analysis tasks, including visualization, system optimization, and decision-making.
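As background for the forward push and sweep steps mentioned in this abstract, the sketch below shows the classical edge-based versions on an unweighted graph. It is not TGLC+, TRW, or TFP: no triangle weights are involved, and the toy graph, `alpha`, and `eps` are assumptions chosen for illustration.

```python
from collections import deque

def forward_push(adj, source, alpha=0.15, eps=1e-4):
    """Approximate Personalized PageRank from `source`.
    Returns (p, r): estimates p and leftover residuals r."""
    p = {u: 0.0 for u in adj}
    r = {u: 0.0 for u in adj}
    r[source] = 1.0
    queue = deque([source])
    while queue:
        u = queue.popleft()
        deg = len(adj[u])
        if r[u] < eps * deg:
            continue  # residual too small to be worth pushing
        mass, r[u] = r[u], 0.0
        p[u] += alpha * mass                  # keep a share at u
        share = (1.0 - alpha) * mass / deg    # spread the rest
        for v in adj[u]:
            r[v] += share
            if r[v] >= eps * len(adj[v]):
                queue.append(v)
    return p, r

def sweep_cut(adj, p):
    """Scan prefixes of nodes ordered by p[u]/deg(u); return the
    prefix with the lowest conductance and that conductance."""
    order = sorted(adj, key=lambda u: p[u] / len(adj[u]), reverse=True)
    total_vol = sum(len(nbrs) for nbrs in adj.values())
    best_set, best_phi = None, float("inf")
    inside, vol, cut = set(), 0, 0
    for u in order[:-1]:  # skip the full node set
        inside.add(u)
        vol += len(adj[u])
        internal = sum(1 for v in adj[u] if v in inside)
        cut += len(adj[u]) - 2 * internal
        phi = cut / min(vol, total_vol - vol)
        if phi < best_phi:
            best_set, best_phi = set(inside), phi
    return best_set, best_phi

# Toy graph (hypothetical): two triangles joined by the edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
p, r = forward_push(adj, source=0)
cluster, phi = sweep_cut(adj, p)
```

On this toy graph the sweep recovers the triangle containing the seed node. Forward push only touches nodes whose residual exceeds the threshold, which is what makes this family of methods local.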
Abstract: A new method, the solution sampling method (SMM), has recently been developed to quantify binary NR/BR rubber blends by pyrolysis/gas chromatography/mass spectrometry (PY/GC/MS). The rubbers were swelled in cyclohexane, and hydrogen peroxide solution was used to destroy their cross-linked structure so that they could be dissolved in the cyclohexane. The rubber solution was applied to the spiral section of the injector. The solvent was evaporated, depositing the rubber on the spiral section, and the rubber was pyrolyzed at 550℃. The weight percentage was then calculated from the characteristic peak area ratio. For one test rubber, an NR/BR blend with 50% NR by weight, the error of the calculated weight percentage was 5.64%.
Abstract: Purpose: This paper aims to improve classification performance on imbalanced data by applying different sampling techniques available in machine learning. Design/methodology/approach: The medical appointment no-show dataset is imbalanced, and when classification algorithms are applied directly to it, they are biased towards the majority class and ignore the minority class. To avoid this issue, multiple sampling techniques, namely Random Over Sampling (ROS), Random Under Sampling (RUS), Synthetic Minority Oversampling TEchnique (SMOTE), ADAptive SYNthetic Sampling (ADASYN), Edited Nearest Neighbor (ENN), and Condensed Nearest Neighbor (CNN), are applied to balance the dataset. Performance is assessed with a Decision Tree classifier combined with each of the listed sampling techniques, and the best performance is identified. Findings: This study compares the performance metrics of various widely used sampling methods. Compared with the other techniques, recall is high when ENN is applied; CNN and ADASYN performed equally well on the imbalanced data. Research limitations: The testing was carried out on a limited dataset, and the approach needs to be tested on a larger dataset. Practical implications: This framework will be useful whenever data are imbalanced in real-world scenarios, which ultimately improves performance. Originality/value: This paper applies the rebalancing framework to the medical appointment no-show dataset to predict no-shows and removes the bias against the minority class.
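The two simplest techniques listed in this abstract, ROS and RUS, can be sketched with the standard library alone. The toy rows and labels are hypothetical and the function names are introduced here for illustration; SMOTE, ADASYN, ENN, and CNN need neighbour searches and are not shown.

```python
import random
from collections import Counter

def random_over_sample(rows, labels, seed=0):
    """ROS: duplicate random minority rows until every class
    matches the size of the largest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        members = [r for r, y in zip(rows, labels) if y == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(members))
            out_labels.append(cls)
    return out_rows, out_labels

def random_under_sample(rows, labels, seed=0):
    """RUS: keep a random subset of each class, sized to the
    smallest class."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = min(counts.values())
    out_rows, out_labels = [], []
    for cls in counts:
        members = [r for r, y in zip(rows, labels) if y == cls]
        for r in rng.sample(members, target):
            out_rows.append(r)
            out_labels.append(cls)
    return out_rows, out_labels

# Hypothetical toy data: 8 attended appointments, 2 no-shows.
rows = [[i] for i in range(10)]
labels = ["show"] * 8 + ["no-show"] * 2

ros_rows, ros_labels = random_over_sample(rows, labels)
rus_rows, rus_labels = random_under_sample(rows, labels)
```

Resampling should be applied to the training split only, so that evaluation metrics such as recall are still measured on data with the original class distribution.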