Funding: National Natural Science Foundation of China (Nos. 61962054 and 62372353).
Abstract: Traditional clustering algorithms often struggle to produce satisfactory results on datasets with uneven density. They also incur substantial computational costs on high-dimensional data because they must calculate similarity matrices. To alleviate these issues, we employ a KD-Tree to partition the dataset and compute the K-nearest neighbors (KNN) density of each point, thereby avoiding the computation of similarity matrices. Moreover, we apply the rules of voting elections, treating each data point as a voter that casts a vote for the point with the highest density among its KNN. Using the vote count of each point, we develop a strategy for classifying noise points and potential cluster centers, which allows the algorithm to identify clusters with uneven density and complex shapes. We also define the concept of "adhesive points" between two clusters in order to merge adjacent clusters that have similar densities. This process identifies the optimal number of clusters automatically. Experimental results indicate that our algorithm improves both the efficiency and the accuracy of clustering.
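The density-and-voting step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the brute-force neighbor search stands in for the KD-Tree query, the density definition (inverse of the mean distance to the k nearest neighbors) is an assumption, and letting a point vote for itself is also an assumption. All function names are hypothetical.

```python
import math

def knn_search(points, k):
    """Brute-force k-nearest-neighbor search (stand-in for a KD-Tree query).

    Returns, for each point, a list of (distance, index) pairs.
    """
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result

def knn_density(neighbours):
    """Assumed density: inverse of the mean distance to the k nearest neighbors."""
    return [len(nb) / (sum(d for d, _ in nb) + 1e-12) for nb in neighbours]

def cast_votes(neighbours, density):
    """Each point votes for the densest point among itself and its neighbors."""
    votes = [0] * len(density)
    for i, nb in enumerate(neighbours):
        candidates = [i] + [j for _, j in nb]  # including i itself is an assumption
        winner = max(candidates, key=lambda j: density[j])
        votes[winner] += 1
    return votes

# Two small groups plus one isolated point.
points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5),
          (30, 30)]
neighbours = knn_search(points, k=3)
density = knn_density(neighbours)
votes = cast_votes(neighbours, density)
print(votes)
```

In this toy data the middle point of each group collects the votes of its group, while the isolated point receives none, which is the kind of signal the abstract uses to separate candidate centers from noise.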
Abstract: Taking a typical peak-cluster depression area in the karst region of Southwest China, we evaluated the effects of land use type and topographic factors on soil nutrients. Grid sampling was used in depressions and line sampling on slope lands, and classical statistical tools were applied to analyze the spatial variability of soil organic carbon (SOC), total nitrogen (TN), total phosphorus (TP), total potassium (TK), available nitrogen (AN), available phosphorus (AP), available potassium (AK), pH, and C/N. Land use type was the dominant factor affecting the spatial heterogeneity of SOC, TN, TP, TK, AN, and AP. The contents of SOC, TN, and AN decreased as land use intensity increased. Owing to high fertilizer input, TP and AP in tillage fields were higher than in the other land use types. TK showed no obvious trend across land use types. Topographic factors had a significant effect on SOC, TN, TP, AN, AP, AK, and pH. Habitat was the dominant factor affecting AK, and altitude was the dominant factor for pH. None of these factors had a significant effect on C/N. Tillage practices had an important effect on soil nutrient loss and soil degradation in the fragile karst ecosystem, and the input of organic manure should be increased in this region.
Funding: Financial support from IGCP 661; the CAGS Research Fund (Grant No. YYWF201725); the National Natural Science Foundation of China (Grant No. 41571203); the Ministry of Science and Technology of China (Grant No. 2016YFC0502403-2); and the Bureau of Science and Technology of Guangxi (Grant No. 2014GXNSFAA118280).
Abstract: Using water flow monitoring, rock scratching, soil wood piles, and radionuclide ^(137)Cs tracing at the Longhe karst ecological experimental site (hereinafter the "Longhe site"), Pingguo County, Guangxi Province, the features and values of soil erosion and soil leakage in different geomorphologic locations and land uses of karst peak-cluster depressions are clearly shown. There are four kinds of geomorphologic locations in a karst peak-cluster depression: peaks, strips, slopes, and depressions. The soil leakage modulus on peaks and strips accounts for 92.43% and 96.24%, respectively, of the total mean soil erosion modulus at the experimental sites. On slopes, soil leakage accounts for about 75%. At the bottom of depressions, surface water is the main factor in soil erosion, and most soil eventually leaks into underground rivers through sinkholes. The total soil erosion modulus and the relative contribution of surface soil erosion increase gradually from peaks to slopes to depressions. There are five major types of land use in the karst peak-cluster depressions: farmland, Kudingcha tea plantations, young Lignum Sappan fields, shrub-grassland, and pastures. Slope farmland has the highest soil erosion modulus, with an increasing trend year by year. The soil erosion modulus of the other four land use types decreased year by year, which shows that the "grain for green" program results in better soil protection. Through rocky desertification control and ecological rehabilitation at the Longhe site, the mean soil erosion modulus of the karst peak-cluster depression decreased by about 80% from 2003 to 2015.
Funding: Supported by the National Natural Science Foundation of China (61401475).
Abstract: The key challenge of the extended target probability hypothesis density (ET-PHD) filter is to reduce computational complexity by using a subset to approximate the full set of partitions. In this paper, the influence of different partitions on the tracking results is analyzed, and the form of the most informative partition is obtained. A fast density peak-based clustering (FDPC) partitioning algorithm is then applied to partition the measurement set. Since only one partition of the measurement set is used, the ET-PHD filter based on FDPC partitioning has lower computational complexity than other ET-PHD filters. Because FDPC partitioning can remove spatially close clutter-generated measurements, the ET-PHD filter based on FDPC partitioning tracks well in scenarios with more clutter-generated measurements. Simulation results show that the proposed algorithm obtains the most informative partition and markedly reduces the computational burden without losing tracking performance. As the number of clutter-generated measurements increases, the ET-PHD filter based on FDPC partitioning achieves better tracking performance than other ET-PHD filters. The FDPC algorithm is expected to play an important role in the engineering realization of multiple extended target tracking filters.
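Abstract: Density peaks clustering (clustering by fast search and find of density peaks, DPC) is a density-based clustering algorithm that can discover clusters of arbitrary shape and dimensionality, and it is regarded as a milestone among clustering algorithms. However, the local density definition in DPC is not suited to discovering the dense and sparse clusters of a dataset at the same time. In addition, its one-step assignment strategy means that once a single sample is misassigned, more samples are misassigned in turn, producing a "domino effect". To address these problems, a new local density definition is proposed that uses a local standard deviation index to define the local density of a sample, overcoming the flaw in DPC's density definition, and a two-step assignment strategy replaces DPC's one-step strategy to overcome the domino effect, yielding the ESDTS-DPC algorithm. Experimental comparisons with DPC, its improved variants KNN-DPC, FKNN-DPC, and DPC-CE, and the classical density-based algorithm DBSCAN show that the proposed ESDTS-DPC algorithm achieves better clustering accuracy.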
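For context, the baseline that this abstract criticizes can be sketched as follows. This is a minimal illustration of standard DPC with a cutoff-kernel density and the one-step assignment rule, not the proposed ESDTS-DPC; the function name `dpc`, the parameter names, and the choice of ranking centers by the product of density and distance are assumptions made for the example.

```python
import math

def dpc(points, d_c, n_clusters):
    """Baseline DPC sketch: cutoff density, delta distance, one-step assignment."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]

    # Local density: number of other points closer than the cutoff distance d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c) for i in range(n)]

    # Rank points by decreasing density (ties broken by index).
    order = sorted(range(n), key=lambda i: -rho[i])

    # delta: distance to the nearest point that ranks higher in density.
    delta = [0.0] * n
    parent = [-1] * n
    delta[order[0]] = max(dist[order[0]])  # convention for the densest point
    for rank in range(1, n):
        i = order[rank]
        parent[i] = min(order[:rank], key=lambda j: dist[i][j])
        delta[i] = dist[i][parent[i]]

    # Centers: the densest point plus the points with the largest rho * delta.
    rest = sorted(order[1:], key=lambda i: -rho[i] * delta[i])
    centers = [order[0]] + rest[:n_clusters - 1]

    # One-step assignment: each point inherits the label of its parent, so a
    # single wrong parent propagates to every point assigned after it.
    labels = [-1] * n
    for c, i in enumerate(centers):
        labels[i] = c
    for i in order:
        if labels[i] == -1:
            labels[i] = labels[parent[i]]
    return rho, delta, labels

points = [(0, 0), (0, 1), (1, 0), (1, 1), (0.5, 0.5),
          (10, 10), (10, 11), (11, 10), (11, 11), (10.5, 10.5)]
rho, delta, labels = dpc(points, d_c=1.1, n_clusters=2)
print(labels)
```

Because every non-center label is copied from a single parent, the sketch makes the "domino effect" easy to see: changing one parent link changes the labels of all points that depend on it.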
Funding: Supported by the Natural Science Foundation of China under Grants 61833016 and 61873293, the Shaanxi Outstanding Youth Science Foundation under Grant 2020JC-34, and the Shaanxi Science and Technology Innovation Team under Grant 2022TD-24.
Abstract: In industrial production and engineering operations, the health state of complex systems is critical, and predicting it helps ensure normal operation. Complex systems have many monitoring indicators, complex coupling structures, and nonlinear, time-varying characteristics, so establishing a reliable prediction model is a challenge. The belief rule base (BRB) can fuse observed data and expert knowledge to establish a nonlinear relationship between input and output, and it has good modeling capabilities. Since each indicator of a complex system reflects the health state to some extent, the BRB is built on the causal relationship between system indicators and the health state to achieve the prediction. This paper proposes a health state prediction model for complex systems based on BRB and long short-term memory (LSTM). First, the LSTM is introduced to predict the trend of the indicators in the system. Second, the Density Peak Clustering (DPC) algorithm is used to determine the referential values of the indicators for the BRB, which effectively offsets the lack of expert knowledge. Then, the predicted values and expert knowledge are fused to construct the BRB, which predicts the health state of the system by inference. Finally, the effectiveness of the model is verified by a case study of a vehicle hydraulic pump.
Abstract: Although data-independent acquisition (DIA) shows powerful potential for comprehensive acquisition of peptide information, the difficulty of determining the precursor m/z and distinguishing fragment ions poses challenges for DIA data analysis. A common approach is to recover the correspondence between precursor ions and fragment ions and then identify peptides using traditional data-dependent acquisition (DDA) database searching. In this study, we propose a cosine similarity-based deconvolution method that rapidly establishes the correspondence between the chromatographic profiles of precursor ions and fragment ions through matrix calculations. Experimental results demonstrate that our method, referred to as CosDIA, yields a peptide identification count close to that of DIA-umpire. Unlike DIA-umpire, however, it can establish the correspondence between original MS/MS spectra and pseudo-MS/MS spectra. Furthermore, compared with the CorrDIA method, our approach is more time-efficient, reducing the time cost of the analysis process. These results highlight the potential advantages of the CosDIA method in DIA data analysis, providing a powerful tool for large-scale proteomics research.
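The idea of matching chromatographic profiles by cosine similarity can be sketched as below. This is an illustration only, not the CosDIA implementation: the profiles are made-up intensity vectors over shared retention-time bins, the 0.9 threshold is a hypothetical parameter, and a production version would use a numerical library for the matrix product rather than nested Python loops.

```python
import math

def normalize_rows(matrix):
    """Scale each profile (row) to unit Euclidean length; zero rows stay zero."""
    out = []
    for row in matrix:
        norm = math.sqrt(sum(x * x for x in row))
        out.append([x / norm if norm else 0.0 for x in row])
    return out

def cosine_matrix(precursors, fragments):
    """sim[i][j] = cosine similarity of precursor profile i and fragment profile j.

    Equivalent to the matrix product P_hat * F_hat^T of the row-normalized inputs.
    """
    p_hat = normalize_rows(precursors)
    f_hat = normalize_rows(fragments)
    return [[sum(a * b for a, b in zip(p, f)) for f in f_hat] for p in p_hat]

def assign_fragments(sim, threshold=0.9):
    """For each fragment, pick the most similar precursor, or None below threshold."""
    n_precursors, n_fragments = len(sim), len(sim[0])
    assignment = []
    for j in range(n_fragments):
        best = max(range(n_precursors), key=lambda i: sim[i][j])
        assignment.append(best if sim[best][j] >= threshold else None)
    return assignment

# Rows are intensities over the same retention-time bins (toy values).
precursors = [[0, 1, 4, 1, 0, 0, 0],
              [0, 0, 0, 1, 4, 1, 0]]
fragments = [[0, 2, 8, 2, 0, 0, 0],    # co-elutes with precursor 0
             [0, 0, 0, 3, 12, 3, 0],   # co-elutes with precursor 1
             [1, 1, 1, 1, 1, 1, 1]]    # flat background, matches neither
sim = cosine_matrix(precursors, fragments)
assignment = assign_fragments(sim)
print(assignment)
```

Cosine similarity ignores absolute intensity, so a fragment whose elution profile has the same shape as a precursor's scores close to 1 even when its signal is much stronger or weaker.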
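Abstract: Tri-training uses unlabeled data for classification and can effectively improve a classifier's generalization ability, but it tends to mislabel unlabeled data, which introduces training noise. A Tri-training algorithm based on density peaks clustering (Tri-training with density peaks clustering, DPC-TT) is proposed. Using cluster centers and local densities, density peaks clustering can select the samples that best represent the spatial structure of the data. The DPC-TT algorithm uses density peaks clustering to obtain the cluster centers of the training data and the local density of each sample. Samples within the cutoff distance of a cluster center are regarded as representing the spatial structure well and are marked as core data. Updating the classifiers with the core data reduces the training noise during iteration and thereby improves classifier performance. Experimental results show that DPC-TT achieves better classification performance than the standard Tri-training algorithm and its improved variants.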
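The core-data selection rule in this abstract (keep samples that lie within the cutoff distance of a cluster center) can be sketched in a few lines. The function name, the toy coordinates, and the assumption that the cluster centers have already been found by density peaks clustering are all hypothetical; the Tri-training loop itself is not shown.

```python
import math

def select_core_samples(samples, centers, d_c):
    """Return indices of samples within cutoff distance d_c of any cluster center."""
    core = []
    for idx, sample in enumerate(samples):
        if any(math.dist(sample, center) <= d_c for center in centers):
            core.append(idx)
    return core

# Toy samples; the centers are assumed to come from a prior DPC run.
samples = [(0.1, 0.0), (0.9, 0.2), (3.0, 3.0), (5.1, 5.0), (5.0, 6.5)]
centers = [(0.0, 0.0), (5.0, 5.0)]
core = select_core_samples(samples, centers, d_c=1.0)
print(core)
```

Only the selected indices would then be passed on to update the classifiers, which is how the abstract describes limiting the noise from mislabeled samples.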