Abstract: To address the problem of judging which of several fuzzy dividing matrices in fuzzy dividing space is optimal, where each matrix is determined by a different choice of cluster samples from the total sample space, two algorithms are proposed on the basis of the data analysis method of rough set theory: an information system discretization algorithm (algorithm 1) and a sample representativeness judging algorithm (algorithm 2). Following the farthest-distance principle, algorithm 1 transforms continuous data into a discrete form that rough set theory can process. Taking approximate precision as the criterion, algorithm 2 chooses the sample space with good representativeness. The clustering sample set used to induce and compute the optimal dividing matrix can thus be obtained. Several theorems are given to provide a strict theoretical foundation for executing the algorithm model. An applied example based on the new algorithm model is presented, and its result verifies the feasibility of the model.
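The abstract does not define "approximate precision". As a rough illustration only, the sketch below assumes the standard rough-set accuracy of approximation, that is, the size of the lower approximation of a target set divided by the size of its upper approximation, computed over the equivalence classes induced by discrete attribute values. The function name, the toy table, and the target set are hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def approximation_accuracy(rows, target):
    """Accuracy of approximation |lower(X)| / |upper(X)| (assumed definition).

    rows   : dict mapping object id -> tuple of discrete attribute values
    target : set of object ids (the set X to approximate)
    """
    # Objects with identical attribute tuples are indiscernible.
    classes = defaultdict(set)
    for obj, attrs in rows.items():
        classes[attrs].add(obj)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # block lies entirely inside X
            lower |= block
        if block & target:    # block overlaps X
            upper |= block
    return len(lower) / len(upper) if upper else 1.0


# Hypothetical discretized information table with five objects.
rows = {1: (0, 1), 2: (0, 1), 3: (1, 0), 4: (1, 1), 5: (1, 1)}
target = {1, 2, 4}
accuracy = approximation_accuracy(rows, target)
```

A value closer to 1 means the chosen attributes describe the target set more precisely, which is the sense in which such a measure could serve as a selection criterion.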
Funding: Supported by the 211 Project of the Ministry of Education of China
Abstract: How to find these communities is an important research problem. Recent community discovery methods fall mainly into three categories: the HITS algorithm, the bipartite cores algorithm, and the maximum flow/minimum cut framework. In this paper, we propose a new method for extracting communities. The Markov Cluster Algorithm (MCL), a fast and scalable unsupervised clustering algorithm, is used to extract communities. By placing the mirror deletion procedure after graph clustering, we reduce the comparison cost considerably. After MCL and mirror deletion, we use a community member selection algorithm to produce the sets of community candidates. Experimental results show that the new method works effectively and properly.
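For readers unfamiliar with MCL, the following is a minimal sketch of its core loop (expansion by matrix squaring, then inflation by elementwise powering and column renormalization) on a toy undirected graph. It does not reproduce the paper's pipeline: the mirror deletion and community member selection steps are not specified in the abstract and are omitted. The inflation value, tolerance, threshold, and example graph are illustrative assumptions.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        s = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / s if s else 0.0
    return out


def matmul(a, b):
    """Plain square matrix product, adequate for small toy graphs."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-9):
    n = len(adjacency)
    # Add self-loops, then turn the graph into a random-walk matrix.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(iterations):
        expanded = matmul(m, m)                                  # expansion
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded]  # inflation
        )
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that keeps non-negligible mass lists the members of one cluster.
    clusters = set()
    for row in m:
        members = frozenset(j for j, v in enumerate(row) if v > 1e-6)
        if members:
            clusters.add(members)
    return sorted(sorted(c) for c in clusters)


# Hypothetical graph: two triangles (0,1,2) and (3,4,5) joined by edge 2-3.
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adjacency)
```

Higher inflation values tend to produce finer clusters. The dense matrix product used here is cubic in the number of nodes, so a graph of realistic size would need a sparse implementation with pruning.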
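Abstract: Objective: To establish an ultra performance liquid chromatography (UPLC) method for the rapid determination of 14 chemical constituents in propolis extracts, and to comprehensively evaluate the quality of propolis extracts from different manufacturers using multivariate statistical analysis. Methods: Seventeen batches of propolis extract samples were collected from different manufacturers. Chromatograms were acquired by UPLC with methanol and 0.2% aqueous phosphoric acid as the mobile phase under gradient elution, and the contents of caffeic acid, p-coumaric acid, ferulic acid, isoferulic acid, 3,4-dimethoxycinnamic acid, caffeic acid phenethyl ester, artepillin C, quercetin, kaempferide, apigenin, isorhamnetin, pinocembrin, chrysin, and galangin were determined simultaneously. Statistical software was used to perform principal component analysis (PCA), clustering analysis (CA), and partial least squares-discriminant analysis (PLS-DA) to screen for markers of quality differences. The weight of each indicator was calculated by the entropy weight method, and the results were applied to the technique for order preference by similarity to ideal solution (TOPSIS) and the rank sum ratio (RSR) method to build a comprehensive evaluation model for ranking the quality of different batches of propolis extract. Results: The 14 index components showed good linearity within their respective concentration ranges (r ≥ 0.9992); the average spike recoveries ranged from 96.37% to 102.21%, with relative standard deviations below 2%. Chemometric results showed that the 17 batches clustered into 4 groups, with samples from the same manufacturer clustering together and clear differences between manufacturers; 3,4-dimethoxycinnamic acid, isoferulic acid, quercetin, galangin, artepillin C, and caffeic acid phenethyl ester may be potential markers of quality differences between manufacturers. The comprehensive quality evaluation models built with entropy weight-TOPSIS, entropy weight-RSR, and the combination of the two gave largely consistent quality rankings for the different batches of propolis extract. Conclusion: The UPLC-based multi-index determination method is accurate and convenient, and the evaluation approach combining PCA, CA, PLS-DA, and TOPSIS-RSR can effectively analyze differences between manufacturers, providing a reference for the overall quality evaluation of propolis extracts.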
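The entropy weight and TOPSIS steps named above can be sketched with their textbook formulas, as follows. This is not the authors' implementation: it assumes every indicator is benefit-type (higher content is better), uses vector normalization for TOPSIS, and omits the RSR step. The content values in `contents` are invented purely for illustration.

```python
import math


def entropy_weights(matrix):
    """Entropy weight method: columns that vary more across samples weigh more."""
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        total = sum(row[j] for row in matrix)
        entropy = 0.0
        for row in matrix:
            p = row[j] / total
            if p > 0:                      # treat 0 * log(0) as 0
                entropy -= p * math.log(p)
        entropy /= math.log(n)             # scale entropy into [0, 1]
        divergences.append(1.0 - entropy)
    s = sum(divergences)
    if s == 0:                             # every column uniform: equal weights
        return [1.0 / m] * m
    return [d / s for d in divergences]


def topsis(matrix, weights):
    """Closeness of each row to the ideal solution (benefit criteria assumed).

    Assumes the rows are not all identical, otherwise both distances are zero.
    """
    m = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(m)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(m)]
                for row in matrix]
    best = [max(row[j] for row in weighted) for j in range(m)]
    worst = [min(row[j] for row in weighted) for j in range(m)]
    scores = []
    for row in weighted:
        d_best = math.dist(row, best)
        d_worst = math.dist(row, worst)
        scores.append(d_worst / (d_best + d_worst))
    return scores


# Hypothetical contents (rows: batches, columns: three index components).
contents = [
    [1.2, 0.8, 3.1],
    [0.9, 1.1, 2.7],
    [1.5, 0.7, 3.4],
]
weights = entropy_weights(contents)
scores = topsis(contents, weights)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

A closeness score near 1 places a batch near the ideal solution. If some indicators were cost-type, their ideal and anti-ideal values would need to be swapped.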
Funding: Sponsored by the Major Science and Technology Program for Water Pollution Control and Treatment (Grant No. 2012ZX07408001); State Key Laboratory of Urban Water Resource and Environment in China; Fundamental Research Funds for the Central Universities, China (Grant No. 5710006113, HIT.BRETIII.201417); Postdoctoral Science Foundation of China (Grant No. 2014T70324, LBH-Z12090)
Abstract: To investigate the effect of influent condition heterogeneity on the diversity of the bacterial community, the degree of microbial resolution, and effluent quality, biological treatment of micro-polluted source water is proposed. Scanning electron microscopy (SEM) analysis shows that influent conditions change the morphology of the biofilm. Denaturing gradient gel electrophoresis (DGGE) analysis shows that differences in H values are due to the succession of functional bacterial communities. Microbial resolution values and species identifications reveal that organic carbon is the main cause of community differentiation and bacterial migration.
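Abstract: Manifold data consist of arc-shaped or ring-shaped clusters and are characterized by large variation in the distances between samples of the same cluster. The density peaks clustering (DPC) algorithm cannot effectively identify the cluster centers of manifold clusters, and when it assigns the remaining samples it is prone to chains of consecutive misassignments. To address this, this paper proposes a density peaks clustering algorithm based on shared nearest neighbors for manifold datasets (DPC-SNN). A sample similarity based on shared nearest neighbors is defined so that the similarity between samples of the same manifold cluster is as high as possible. Local density is then defined from this similarity without ignoring the density contribution of samples far from the cluster center, which better distinguishes the centers of manifold clusters from other samples. The remaining samples are assigned according to sample similarity, which avoids consecutive misassignment. Comparative experiments against the DPC, FKNNDPC, FNDPC, DPCSA, and IDPC-FA algorithms show that DPC-SNN effectively finds the cluster centers of manifold data and clusters them accurately, and that it also performs well on real-world and face datasets.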
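The abstract does not give the formula for the shared-nearest-neighbor similarity. To illustrate the general idea only, the sketch below assumes the simplest form, namely the number of k-nearest neighbors that two samples have in common. The paper's actual DPC-SNN similarity and local density are likely more elaborate. The function names, the value of k, and the sample points are hypothetical.

```python
import math


def knn_sets(points, k):
    """For each point, the index set of its k nearest neighbors (itself excluded)."""
    neighbors = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: math.dist(p, points[j]))
        neighbors.append(set(order[:k]))
    return neighbors


def snn_similarity(points, k):
    """Assumed similarity: count of shared k-nearest neighbors between two samples."""
    nn = knn_sets(points, k)
    n = len(points)
    return [[len(nn[i] & nn[j]) if i != j else 0 for j in range(n)]
            for i in range(n)]


# Hypothetical 2-D samples forming two well-separated groups.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
similarity = snn_similarity(points, k=2)
```

Because the measure depends on neighborhood overlap rather than raw distance, two samples far apart along the same arc can still be linked through chains of shared neighbors, which is the property the abstract relies on for manifold clusters.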
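Abstract: To identify the key factors affecting the severity of pedestrian-vehicle conflicts at urban signalized intersections and to improve intersection safety management, this paper selects typical signalized intersections on urban roads, captures traffic flow video by unmanned aerial vehicle, and obtains the parameters and spatial distribution of conflict points through manual observation and processing with Tracker software. To quantify conflict severity, post-encroachment time, vehicle speed in the conflict area, and potential collision distance are used as severity indicators, and the K-means clustering algorithm iteratively classifies crossing conflicts by severity. Twenty-one explanatory variables are identified, covering pedestrians, vehicles, and roads. After screening by Pearson correlation analysis, a multivariate ordered logistic model is built, and validation with the receiver operating characteristic (ROC) curve gives an area under the curve (AUC) of 0.971 for the model's estimated classification probabilities of conflict severity level. The results show that the distance between the pedestrian and the conflict point (0.364), vehicle behavior before the conflict point (-4.22 for stopping to yield, -0.937 for slowing to yield), whether the pedestrian runs the red light (0.818), the number of motor vehicle lanes (0.29), the pedestrian's red-light waiting time (0.012), pedestrian age group (-0.869), and pedestrian clothing color (0.673) are significant factors affecting the severity of pedestrian-vehicle conflicts. The findings provide a reference for developing traffic strategies for pedestrian crossing safety.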
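The K-means step can be sketched with plain Lloyd iterations, as below. This only illustrates the technique: the three columns stand in for post-encroachment time, conflict-area speed, and potential collision distance, but the values, the number of severity levels, and the initial centers are all invented, and the abstract does not say how the indicators were scaled.

```python
import math


def kmeans(points, centers, iterations=100):
    """Lloyd's algorithm with caller-supplied initial centers."""
    centers = [tuple(c) for c in centers]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest center.
        labels = [min(range(len(centers)),
                      key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for c in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append(
                    tuple(sum(v) / len(members) for v in zip(*members)))
            else:
                new_centers.append(centers[c])   # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical, already-rescaled indicators: (PET, speed, collision distance).
conflicts = [
    (0.9, 0.2, 0.8), (0.8, 0.3, 0.9),   # long PET, low speed
    (0.2, 0.9, 0.1), (0.1, 0.8, 0.2),   # short PET, high speed
]
labels, centers = kmeans(conflicts, centers=[conflicts[0], conflicts[2]])
```

Because the three indicators have different units, they would normally be rescaled to a common range before clustering, since otherwise the indicator with the largest numeric range dominates the distance.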