The Tarq 1:100,000 geochemical sheet is located in Isfahan province and was investigated by Iran's Geological and Explorations Organization using stream sediment analyses. The area exposes Precambrian to Quaternary rocks and lies in the Central Iran zone. Because there are signs of gold mineralization in the area, it is necessary to identify its important mineralized zones. To determine the extent of the geochemical halos and to estimate grade, information is needed on how gold, arsenic, and antimony behave relative to one another in the area. The present study therefore uses the well-known K-means method, a clustering method based on minimizing the total Euclidean distance between each sample and the center of the class to which it is assigned. The clustering quality function and the utility rate of each sample in its assigned cluster, S(i), were used to determine the optimum number of clusters. Finally, from the cluster centers and the clustering results, equations were derived to predict the gold content from four parameters: arsenic grade, antimony grade, and the length and width of the sampling points.
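The K-means objective and the cluster-count selection described in this abstract can be illustrated with a minimal sketch. It assumes that S(i) denotes the standard silhouette value, which the abstract does not confirm. The function names, the deterministic farthest-point initialization, and the toy two-element samples are all illustrative assumptions, not the authors' data or code.

```python
import math


def kmeans(points, k, iters=100):
    """Lloyd's k-means with Euclidean distance.

    Assumption: centers are seeded by farthest-point selection so the
    sketch is deterministic; the study's initialization is not stated.
    """
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(math.dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        # Assignment step: each sample joins its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append(tuple(sum(col) / len(members) for col in zip(*members)))
            else:
                new_centers.append(centers[j])  # leave an empty cluster's center in place
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


def mean_silhouette(points, labels):
    """Average of s(i) = (b - a) / max(a, b) over all samples."""
    scores = []
    for i, p in enumerate(points):
        own = [math.dist(p, q) for j, q in enumerate(points) if labels[j] == labels[i] and j != i]
        if not own:
            scores.append(0.0)  # common convention for a single-member cluster
            continue
        a = sum(own) / len(own)  # mean distance to the sample's own cluster
        b = min(  # mean distance to the nearest other cluster
            sum(math.dist(p, q) for q, lab in zip(points, labels) if lab == other) / labels.count(other)
            for other in set(labels) if other != labels[i]
        )
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical (arsenic, antimony) values forming three tight groups.
samples = [(x + dx, y + dy)
           for x, y in [(0, 0), (10, 0), (0, 10)]
           for dx, dy in [(0, 0), (1, 0), (0, 1), (1, 1)]]
scores = {k: mean_silhouette(samples, kmeans(samples, k)[1]) for k in range(2, 6)}
best_k = max(scores, key=scores.get)
```

On this toy data the mean silhouette should peak at three clusters. Real stream sediment data would normally be standardized first, because Euclidean distance is sensitive to the scale of each element.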
The classification of Northeast China Cold Vortex (NCCV) activity paths is an important way to analyze the NCCV's characteristics in detail. Based on daily precipitation data for the northeastern China (NEC) region and six-hourly ERA-Interim atmospheric circulation and temperature field data, the NCCV processes during the early summer (June) seasons from 1979 to 2018 were objectively identified. The NCCV processes were then classified using a machine learning method (k-means) according to the characteristic parameters of the activity path information. The rationality of the classification results was verified from two aspects: (1) the atmospheric circulation configuration of the NCCV on the various paths; and (2) its influence on climate conditions in the NEC. The results showed that the activity paths of the NCCV could be divided into four types according to characteristics such as the generation origin, movement direction, and movement velocity of the NCCV: the generation-eastward movement type in the east of the Mongolia Plateau (eastward movement type, or type A); the generation-southeast long-distance movement type in the upstream of the Lena River (southeast long-distance movement type, or type B); the generation-eastward less-movement type near Lake Baikal (eastward less-movement type, or type C); and the generation-southward less-movement type in eastern Siberia (southward less-movement type, or type D). There were obvious differences in the atmospheric circulation configuration and the climate impact of the NCCV among these four path types, which indicated that the classification results were reasonable.
Grade estimation is an important phase of mining projects, and one that is considered a challenge due in part to the structural complexities of mineral ore deposits. Various techniques have been used in the past to overcome this challenge. This paper introduces an approach for estimating Au ore grades within a mining deposit using k-means and principal component analysis (PCA). The Khooni district was selected as the case study. This region is geologically interesting, in part because it is considered an important gold source. The study area is situated approximately 60 km northeast of the city of Anarak and 270 km from Esfahan. Through PCA, we sought to understand the relationship between gold, arsenic, and antimony. The behavior of these elements was then investigated by clustering. One of the best-known and most efficient clustering methods is k-means, which is based on minimizing the total Euclidean distance from each class center. Using the combined results and the characteristics of the cluster centers, the gold grade was determined with a correlation coefficient of 91%. An estimation equation for gold grade was derived from four parameters: arsenic content, antimony content, and the length and width of the sampling points. The results demonstrate that this approach is faster and more accurate than existing methodologies for ore grade estimation.
In Zhu, Wang and Gao (SIAM J. Sci. Comput., 43 (2021), pp. A3009–A3031), we proposed a new framework of troubled-cell indicator (TCI) using K-means clustering, and the numerical results demonstrated that it can detect the troubled cells accurately using the KXRCF indication variable. The main advantage of this TCI framework is its great potential for extension. In this follow-up work, we introduce three more indication variables, i.e., the TVB, Fu-Shu and cell-boundary jump indication variables, and show their good performance in numerical tests to demonstrate that the TCI framework offers great flexibility in the choice of indication variables. We also compare the three indication variables with the KXRCF one, and the numerical results favor the KXRCF and the cell-boundary jump indication variables.
In this paper, we propose a Fast Iteration Method for solving the mixture regression problem, which can be treated as model-based clustering. Compared to the EM algorithm, the proposed method is faster and more flexible, and it can solve mixture regression problems with different error distributions (i.e., Laplace and t distributions). Extensive numerical experiments show that the proposed method performs better on random simulations and on real data.
Internet services and web-based applications play pivotal roles in various sensitive domains, including e-commerce, e-learning, e-healthcare, and e-payment. Safeguarding these services poses a significant challenge, however, and the need for robust security measures is becoming increasingly pressing. This paper presents a method based on differential analyses to detect abrupt changes in network traffic characteristics. The core concept is to identify abrupt alterations in certain characteristics of the analyzed traffic, such as input/output volume, the number of TCP connections, or the number of DNS queries. The traffic is first segmented into distinct sequences of slices, and specific characteristics are quantified for each slice. The distance between successive values of these measured characteristics is then computed and clustered to detect sudden changes. The approach combines several techniques, including propositional logic, distance metrics (e.g., Kullback-Leibler divergence), and clustering algorithms (e.g., K-means). When applied to two distinct datasets, the proposed approach achieves detection rates of up to 100%.
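The slice, distance, and clustering steps in this abstract can be sketched as follows. This is only an illustration under stated assumptions: each slice is summarized as a hypothetical histogram of [TCP, UDP, DNS] event counts, the distance is the Kullback-Leibler divergence between successive slices, and the clustering is a one-dimensional two-cluster k-means. The propositional-logic component is not modeled, and none of the names below come from the paper.

```python
import math


def kl_divergence(p_counts, q_counts, eps=1e-9):
    """KL(P || Q) between two count histograms, with additive smoothing."""
    p_total = sum(p_counts) + eps * len(p_counts)
    q_total = sum(q_counts) + eps * len(q_counts)
    total = 0.0
    for p, q in zip(p_counts, q_counts):
        p_prob = (p + eps) / p_total
        q_prob = (q + eps) / q_total
        total += p_prob * math.log(p_prob / q_prob)
    return total


def two_means_1d(values, iters=50):
    """1-D k-means with k=2, seeded at the smallest and largest value."""
    lo, hi = min(values), max(values)
    for _ in range(iters):
        high = [v for v in values if abs(v - hi) < abs(v - lo)]
        low = [v for v in values if abs(v - hi) >= abs(v - lo)]
        new_lo = sum(low) / len(low)
        new_hi = sum(high) / len(high) if high else hi
        if (new_lo, new_hi) == (lo, hi):
            break
        lo, hi = new_lo, new_hi
    return lo, hi


def abrupt_changes(slices):
    """Indices i at which slice i differs sharply from slice i - 1."""
    # Distance between each slice and its predecessor.
    dists = [kl_divergence(cur, prev) for prev, cur in zip(slices, slices[1:])]
    lo, hi = two_means_1d(dists)
    # A transition is flagged when its distance falls in the high cluster.
    return [i + 1 for i, d in enumerate(dists) if abs(d - hi) < abs(d - lo)]


# Hypothetical per-slice counts of [TCP, UDP, DNS] events.
slices = [[90, 8, 2], [88, 10, 2], [91, 7, 2], [20, 10, 70], [22, 9, 69], [21, 10, 69]]
flagged = abrupt_changes(slices)
```

One limitation of forcing two clusters is that the largest distance is always flagged unless all distances are equal, so a practical detector would also need a minimum-distance threshold or a cluster-validity check.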
Clustering approaches are among the probabilistic load flow (PLF) methods for distribution networks that can be used to obtain output random variables with much less computational burden and time than the Monte Carlo simulation (MCS) method. However, a challenge of clustering methods is that the statistical characteristics of the output random variables are obtained with low accuracy. This paper presents a hybrid approach based on clustering and point estimate methods. In the proposed approach, the sample points are first clustered using the k-means method and the optimal agent of each cluster is determined. Then, for each member of the population of agents, deterministic load flow calculations are performed and the output variables are calculated. Afterward, a point estimate-based PLF is performed and the mean and standard deviation of the output variables are obtained. Finally, the statistical data of each output random variable are modified using the point estimate method. The proposed method makes it possible to obtain the statistical properties of the output random variables, such as the mean, standard deviation, and probabilistic functions, with high accuracy and without significantly increasing the computational burden. To confirm the consistency and efficiency of the proposed method, the 10-, 33-, 69-, 85-, and 118-bus standard distribution networks were simulated in the Python programming language. In the simulation studies, the results of the proposed method were compared with those of the clustering method and with those of the MCS method, which served as the reference.
As the largest manufacturing country, China is striving to improve the development quality of its power industry under the goals of Carbon Peaking and Carbon Neutrality, in order to sustain its high-quality economic growth. It is therefore important to reveal both the regional development level of China's power sector and its characteristics, so as to indicate the direction of the next improvements. Motivated by this purpose, this paper constructs a province-level evaluation indicator system with three dimensions, based on the connotation of high-quality development of the power industry (HDPI). It then calculates the HDPI indexes of 30 provinces and explores their development trend and spatial pattern. The results indicate that the overall performance of all regions generally improved over the past decade, but that the spatial distributions of the clean, low-carbon, safe, and efficient characteristics differ. As for future improvement, regions should not only actively improve the related management regimes and technical fields so as to raise the corresponding indicator values, but can also rely passively on macro-level developments such as China's rising urbanization level, technological progress, and industrial structure upgrading.
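This paper proposes an unsupervised distance learning algorithm built on the DI-FCM (double indices fuzzy C-means) framework: HDDI-FCM (double indices fuzzy C-means with hybrid distance), a double-indices fuzzy C-means algorithm based on hybrid distance learning. The unknown distance metric of a data set is expressed as a linear combination of several existing distances, and HDDI-FCM is then run to learn the distance while clustering the data set effectively. To ensure that the iterative algorithm converges, Steffensen's iteration method is introduced to improve the iterative formula for computing the cluster centers. The choice of the algorithm's parameters is discussed. Experimental results on UCI (University of California, Irvine) data sets show that the algorithm is effective.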
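The idea of expressing a distance as a linear combination of existing distances can be sketched as below. This is not the HDDI-FCM algorithm: the three component distances (Euclidean, Manhattan, Chebyshev) and the fixed weights are assumptions chosen for illustration, whereas the paper learns the weights during clustering. The membership formula shown is the standard fuzzy C-means update rather than the double-indices variant, and all function names are hypothetical.

```python
import math


def hybrid_distance(x, y, weights):
    """Weighted sum of Euclidean, Manhattan and Chebyshev distances.

    With non-negative weights (at least one positive), a linear
    combination of metrics is itself a metric.
    """
    diffs = [abs(a - b) for a, b in zip(x, y)]
    components = (
        math.sqrt(sum(d * d for d in diffs)),  # Euclidean
        sum(diffs),                            # Manhattan
        max(diffs),                            # Chebyshev
    )
    return sum(w * c for w, c in zip(weights, components))


def fcm_memberships(point, centers, weights, m=2.0):
    """Standard fuzzy C-means memberships of one point under the hybrid distance.

    u_j = 1 / sum_k (d_j / d_k) ** (2 / (m - 1)), with fuzzifier m > 1.
    """
    d = [hybrid_distance(point, c, weights) for c in centers]
    hits = [di == 0.0 for di in d]
    if any(hits):
        # The point coincides with one or more centers: share membership among them.
        return [h / sum(hits) for h in hits]
    power = 2.0 / (m - 1.0)
    return [1.0 / sum((di / dk) ** power for dk in d) for di in d]


# Pure Euclidean weighting: distances 1 and 3 to the two centers.
u = fcm_memberships((1.0, 0.0), [(0.0, 0.0), (4.0, 0.0)], weights=(1.0, 0.0, 0.0))
```

Changing the weights changes which points count as close, which is why learning them jointly with the memberships can adapt the clustering to a data set whose natural metric is unknown.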
Funding (NCCV activity-path classification study): This research was jointly supported by the National Natural Science Foundation of China (Grant No. 42005037), the Liaoning Provincial Natural Science Foundation Project (PhD Start-up Research Fund 2019-BS-214), the Special Scientific Research Project for the Forecaster (Grant No. CMAYBY2018-018), a Key Technical Project of Liaoning Meteorological Bureau (Grant No. LNGJ201903), the National Key Research and Development Project (Grant No. 2018YFC1505601), and the Open Foundation Project of the Institute of Atmospheric Environment, China Meteorological Administration (Grant Nos. 2020SYIAE08 and 2020SYIAEZD5).
Acknowledgments and funding (troubled-cell indicator study): We thank the anonymous reviewers and the editor for their valuable comments and suggestions. The research of Z. Gao is partially supported by the National Key R&D Program of China (No. 2021YFF0704002). The four authors, Z. Wang, Z. Gao, H. Wang and H. Zhu, acknowledge the funding support of NSFC grant No. 11871443. The research of Z. Wang and H. Zhu is also partially sponsored by NUPTSF (Grant No. NY220040), the Natural Science Foundation of Jiangsu Province of China (No. BK20191375), and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (Grant No. KYCX200787). The research of Q. Zhang is partially supported by NSFC grant No. 12071214.
Funding (power industry HDPI study): This work was supported by the National Natural Science Foundation of China [Grant number 71673034], the Postdoctoral Research Foundation of China [Grant number 2021M692654], the Natural Science Basic Research Program of Shaanxi Province [Grant number 2020JQ282], and the Social Science Foundation of Shaanxi Province [Grant number 2020R042].