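Abstract: With the advent of the information age, the large-scale, high-dimensional data generated on the Internet is growing geometrically. Spectral clustering of such data is bottlenecked by both computation time and memory use, especially in the eigendecomposition of the Laplacian matrix. Given the strengths of the Hadoop MapReduce parallel programming model for data-intensive processing, a parallel approximate spectral clustering algorithm is designed on Hadoop MapReduce, based on a Laplacian matrix that is approximated by sparsifying the similarity matrix with t nearest neighbors, with the aim of resolving these bottlenecks. Experiments on the UCI Bag of Words dataset verify the correctness and effectiveness of the designed algorithm, and the results show that the parallel design achieves the expected results to a certain extent in both clustering quality and performance.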
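The pipeline named in this abstract (t-nearest-neighbor sparsification, Laplacian eigendecomposition, clustering of the eigenvector rows) can be sketched on a single machine as follows. This is a minimal illustration only, not the authors' Hadoop MapReduce implementation: the Gaussian similarity, the normalized Laplacian, the farthest-point k-means initialization, and all function and parameter names (`knn_sparsified_similarity`, `sigma`, and so on) are assumptions made for the example.

```python
import numpy as np

def knn_sparsified_similarity(X, t, sigma=1.0):
    """Gaussian similarity matrix keeping only each point's t nearest neighbors."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    S = np.exp(-d2 / (2.0 * sigma ** 2))
    np.fill_diagonal(S, 0.0)
    # Indices of the t largest similarities in every row.
    keep = np.argsort(-S, axis=1)[:, :t]
    mask = np.zeros(S.shape, dtype=bool)
    mask[np.arange(S.shape[0])[:, None], keep] = True
    mask |= mask.T  # symmetrize: keep an edge if either endpoint selected it
    return np.where(mask, S, 0.0)

def spectral_embedding(S, k):
    """Rows of the k eigenvectors of the normalized Laplacian with smallest eigenvalues."""
    deg = S.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(deg, 1e-12))
    L = np.eye(S.shape[0]) - d_inv_sqrt[:, None] * S * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    U = vecs[:, :k]
    return U / np.maximum(np.linalg.norm(U, axis=1, keepdims=True), 1e-12)

def kmeans(U, k, iters=50):
    """Plain Lloyd iterations with deterministic farthest-point initialization."""
    centers = [U[0]]
    for _ in range(1, k):
        d = np.min([np.sum((U - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(U[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        d = ((U[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new = np.array([U[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels

def spectral_cluster(X, k, t, sigma=1.0):
    S = knn_sparsified_similarity(X, t, sigma)
    return kmeans(spectral_embedding(S, k), k)
```

The dense distance matrix and the full `eigh` call are exactly the time and memory bottlenecks the abstract refers to; a distributed version would have to partition both steps, which this sketch does not attempt.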
Funding: Projects (60873265, 60903222) supported by the National Natural Science Foundation of China; Project (IRT0661) supported by the Program for Changjiang Scholars and Innovative Research Team in University of China.
Abstract: The Circle algorithm is proposed for large datasets. Its idea is to find a set of vertices that are close to each other and far from all other vertices. The algorithm exploits the connection between clustering aggregation and the correlation clustering problem. The best deterministic approximation algorithm is provided for a variation of the correlation clustering problem, and sampling is shown to scale the algorithms to large datasets. An extensive empirical evaluation demonstrates the usefulness of the problem and the solutions. The results show that the method reduces running time by more than 50% without sacrificing clustering quality.
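The link between clustering aggregation and correlation clustering mentioned above can be illustrated with a small sketch. It is not the Circle algorithm, whose details the abstract does not give; it uses a simple pivot-style heuristic instead. The distance between two objects is taken to be the fraction of input clusterings that separate them, and the function names and the 0.5 threshold are assumptions made for this example.

```python
from itertools import combinations

def disagreement_distance(clusterings):
    """dist[u][v] = fraction of input clusterings that put u and v in different clusters."""
    m = len(clusterings)
    n = len(clusterings[0])
    dist = [[0.0] * n for _ in range(n)]
    for u, v in combinations(range(n), 2):
        separated = sum(1 for labels in clusterings if labels[u] != labels[v])
        dist[u][v] = dist[v][u] = separated / m
    return dist

def pivot_aggregate(clusterings, threshold=0.5):
    """Pivot-style heuristic: each unassigned object becomes a pivot and
    absorbs every unassigned object closer to it than the threshold."""
    dist = disagreement_distance(clusterings)
    n = len(dist)
    labels = [-1] * n
    next_label = 0
    for pivot in range(n):
        if labels[pivot] != -1:
            continue  # already absorbed by an earlier pivot
        labels[pivot] = next_label
        for v in range(pivot + 1, n):
            if labels[v] == -1 and dist[pivot][v] < threshold:
                labels[v] = next_label
        next_label += 1
    return labels

# Three input clusterings of six objects that disagree on objects 2 and 4.
inputs = [
    [0, 0, 1, 1, 2, 2],
    [0, 0, 1, 1, 1, 2],
    [0, 0, 0, 1, 2, 2],
]
print(pivot_aggregate(inputs))  # [0, 0, 1, 1, 2, 2]
```

Building the full distance matrix is quadratic in the number of objects, which is the cost that the sampling step described in the abstract is meant to reduce.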
Funding: Supported by the National Natural Science Foundation of China (No. 22073045) and the Fundamental Research Funds for the Central Universities.
Abstract: Photoinduced carrier dynamic processes are the main driving force behind the efficient performance of semiconductor nanomaterials in applications such as photoconversion and photonics. Nevertheless, establishing theoretical insight into these processes is computationally challenging because of the many factors involved, namely reaction rate, material surface area, material composition, etc. Photoinduced carrier dynamics can be modelled with nonadiabatic molecular dynamics (NA-MD) methods, which are specifically designed to solve the time-dependent Schrödinger equation with nonadiabatic couplings included. Among NA-MD methods, surface hopping methods have proven to be a powerful tool for mimicking the competing nonadiabatic processes in semiconductor nanomaterials, and they are notable for their exceptional balance between accuracy and computational cost. Consequently, surface hopping is the method of choice for modelling ultrafast dynamics and more complex phenomena such as charge separation in van der Waals heterojunction materials based on Janus transition metal dichalcogenides. Covering the latest state-of-the-art numerical simulations along with experimental results in the field, this review aims to provide a basic understanding of the close relation between semiconductor nanomaterials and the proper simulation of their properties via surface hopping methods. Special emphasis is placed on emerging state-of-the-art techniques. By highlighting the challenges posed by new materials, we outline emerging creative approaches, including high-level electronic structure methods and NA-MD methods, for modelling nonadiabatic systems of high complexity.
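As a rough illustration of what solving the time-dependent Schrödinger equation with nonadiabatic couplings involves, the equations below follow the fewest-switches form of surface hopping. The abstract does not say which variant the review covers, so this choice and the notation are assumptions: $c_k$ are the electronic amplitudes, $V_{kj}$ the electronic Hamiltonian matrix elements, $\mathbf{d}_{kj}$ the nonadiabatic coupling vectors, $\dot{\mathbf{R}}$ the nuclear velocities, and $\Delta t$ the time step.

```latex
% Electronic amplitudes propagated along the classical nuclear trajectory R(t)
i\hbar\,\dot{c}_k = \sum_j c_j \left( V_{kj} - i\hbar\,\dot{\mathbf{R}}\cdot\mathbf{d}_{kj} \right)

% Density-matrix elements and population flux between states k and l
a_{kl} = c_k c_l^{*}, \qquad
\dot{a}_{kk} = \sum_{l \neq k} b_{kl}, \qquad
b_{kl} = \frac{2}{\hbar}\,\operatorname{Im}\!\left( a_{kl}^{*} V_{kl} \right)
         - 2\,\operatorname{Re}\!\left( a_{kl}^{*}\,\dot{\mathbf{R}}\cdot\mathbf{d}_{kl} \right)

% Probability of hopping from the current state k to state j during one step
g_{kj} = \max\!\left( 0,\; \frac{\Delta t\, b_{jk}}{a_{kk}} \right)
```

In the adiabatic representation the off-diagonal $V_{kl}$ vanish, so hops are driven only by the term $\dot{\mathbf{R}}\cdot\mathbf{d}_{kl}$; each hop needs only quantities already available along the trajectory.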