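Abstract: Hyperspectral images (HSI) provide tens to hundreds of contiguous spectral bands, but these bands increase the complexity of data processing, and adjacent bands are highly redundant. To address these problems, a hyperspectral image dimensionality reduction method based on latent feature fusion and optimal clustering is proposed (Latent Features Fusion and Optimal Clustering Framework, LFFOCF). The method uses superpixel segmentation to divide the HSI into multiple regions so that the spatial information of the HSI is fully preserved. Prior information is obtained by constructing the corresponding Laplacian matrices, and a set of low-dimensional latent features is generated to further enhance the separability between bands. The region-aware latent features are then fused to obtain a shared latent feature representation of the HSI that effectively captures band redundancy. Finally, an optimal clustering framework searches for the optimal clustering structure in the HSI and, on the basis of a ranking strategy, obtains the optimal clustering result, yielding a band subset with low correlation and more discriminative information. The method makes full use of both spectral and spatial characteristics. Extensive experiments on two public datasets show that the proposed method outperforms Optimal Neighborhood Reconstruction (ONR), Optimal Clustering Framework (OCF), and Region-aware Latent Features Fusion based Clustering (RLFFC) on all three metrics: OA, MA, and the Kappa coefficient.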
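The three evaluation metrics named in the abstract can be computed from a classification confusion matrix. The sketch below is illustrative only and is not taken from the paper: the function name `classification_scores` is hypothetical, "MA" is assumed to mean the average of the per-class accuracies, and every class is assumed to have at least one sample.

```python
def classification_scores(confusion):
    """Return (OA, MA, kappa) from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    Assumes every true class has at least one sample.
    """
    n_classes = len(confusion)
    row_totals = [sum(row) for row in confusion]
    col_totals = [sum(confusion[i][j] for i in range(n_classes))
                  for j in range(n_classes)]
    total = sum(row_totals)

    # Overall accuracy: fraction of all samples that lie on the diagonal.
    oa = sum(confusion[i][i] for i in range(n_classes)) / total

    # Mean accuracy (assumed meaning of "MA"): average per-class accuracy.
    ma = sum(confusion[i][i] / row_totals[i] for i in range(n_classes)) / n_classes

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)
    kappa = (oa - expected) / (1 - expected)
    return oa, ma, kappa


if __name__ == "__main__":
    # Toy two-class confusion matrix, invented for illustration.
    print(classification_scores([[8, 2], [1, 9]]))
```

Kappa is reported alongside OA because it discounts the agreement a classifier would reach by chance given the class proportions.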
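Abstract: Unsupervised feature selection is an important dimensionality reduction technique in machine learning and data mining. However, current unsupervised feature selection methods focus on learning the manifold structure of the data from its adjacency matrix and ignore the associations between non-adjacent data pairs. In addition, these methods assume that data instances are independent and identically distributed, but real-world samples come from different sources, so this assumption does not hold. Moreover, measuring feature importance in the original data space is affected by noise in the data and features. To address these problems, this paper proposes a robust unsupervised feature selection method based on multi-step Markov probability and latent representation (unsupervised feature selection via multi-step Markov probability and latent representation, MMLRL). The idea is to learn the manifold structure of the data through the maximum multi-step Markov transition probability, then learn a latent representation of the data through a symmetric non-negative matrix factorization model, and finally select features in the latent representation space. The effectiveness of the proposed algorithm is verified on six datasets of different types.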
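To make the idea of a maximum multi-step Markov transition probability concrete, the sketch below row-normalizes a similarity matrix into a one-step transition matrix, raises it to successive powers, and keeps the element-wise maximum over all steps. This is only one plausible reading of the abstract and not the paper's formulation: the function names, the toy graph, and the choice of an element-wise maximum are all assumptions.

```python
def row_normalize(weights):
    """Turn a non-negative similarity matrix into a row-stochastic matrix."""
    return [[w / sum(row) for w in row] for row in weights]


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def max_multistep_probability(weights, steps):
    """Element-wise maximum of P, P^2, ..., P^steps (assumed formulation)."""
    one_step = row_normalize(weights)
    current = one_step
    best = [row[:] for row in one_step]
    for _ in range(steps - 1):
        current = matmul(current, one_step)  # next power of P
        best = [[max(b, c) for b, c in zip(best_row, cur_row)]
                for best_row, cur_row in zip(best, current)]
    return best


if __name__ == "__main__":
    # Toy chain graph 0 - 1 - 2: nodes 0 and 2 are not adjacent.
    chain = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
    print(max_multistep_probability(chain, 1)[0][2])  # one step only
    print(max_multistep_probability(chain, 2)[0][2])  # two steps allowed
```

In the toy chain graph, nodes 0 and 2 have zero one-step probability but a non-zero two-step probability, which is the kind of association between non-adjacent pairs that an adjacency matrix alone misses.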
Funding: This work was financially supported by the National Key Research and Development Program of China (No. 2019YFC1803601), the Fundamental Research Funds for the Central Universities of Central South University, China (No. 2023ZZTS0801), the Postgraduate Innovative Project of Central South University, China (No. 2023XQLH068), and the Postgraduate Scientific Research Innovation Project of Hunan Province, China (No. QL20230054).
Abstract: A general prediction model for seven heavy metals was established using the heavy metal contents of 207 soil samples measured by a portable X-ray fluorescence spectrometer (XRF) and six environmental factors as model correction coefficients. The eXtreme Gradient Boosting (XGBoost) model was used to fit the relationship between heavy metal content and environmental characteristics in order to evaluate the soil ecological risk of the smelting site. The results demonstrated that the generalized prediction model developed for Pb, Cd, and As was highly accurate, with coefficients of determination (R²) of 0.911, 0.950, and 0.835, respectively. Topsoil presented the highest ecological risk, and high potential ecological risk existed at some positions at different depths owing to the high mobility of Cd. Overall, the application of machine learning significantly increased the accuracy of pXRF measurements and identified key environmental factors. The adapted potential ecological risk assessment emphasized the need to focus on Pb, Cd, and As in future site remediation efforts.
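The R² values quoted above are coefficients of determination. As a reminder of what that number measures, the following minimal sketch computes it from observed and predicted values. The function name `r_squared` and the sample numbers are hypothetical and are not data from the study.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares: error left after the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Invented values for illustration only.
    observed = [1.0, 2.0, 3.0, 4.0]
    predicted = [1.1, 1.9, 3.2, 3.8]
    print(r_squared(observed, predicted))
```

A value near 1 means the predictions account for almost all of the variance in the measurements; a value of 0 means the model does no better than predicting the mean.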
Abstract: Using the CALL authoring software "Language Coach" as a field example, this case study looks into criteria for software evaluation. Theoretical and empirical analyses reveal that Chapelle's principles for evaluating CALL software, which focus on "Language Learning Potential," not only broaden the avenues for innovation in CALL software but also shed light on CALL pedagogy.
Abstract: We recently reported several driver genes of biliary tract carcinoma (BTC) that are known to play important roles in oncogenesis and disease progression. Although the need for novel therapeutic strategies is increasing, very few BTC cell lines and xenograft models are currently available for preclinical studies. Using a total of 88 surgical BTC specimens and 536 immunodeficient mice, 28 xenograft models and 13 new BTC cell lines, including subtypes, were established. Some of our cell lines were found to be resistant to gemcitabine, which is currently the first choice of treatment, thereby allowing highly practical preclinical studies to be conducted. Using these cell lines and xenograft models together with a clinicopathological database of patients undergoing BTC resection, we can establish a preclinical study system and appropriate parameters for drug efficacy studies to explore new biomarkers for practical applications in future studies.