Abstract
A language-like molecular descriptor proposed for organic compounds was used to describe 29,000 organic solar cell donor molecules from the Harvard Clean Energy Project Database (CEPDB). Inspired by the similarity between organic chemistry and natural language, each molecule was decomposed into fragments (words) based on nearest-neighbor subgraph theory, and the fragments were arranged into a sequence (a sentence) by the breadth-first search algorithm. After the information of each fragment was embedded into a numerical vector, each molecule could be represented as an information matrix. This matrix is a descriptor called g-FSI, which reflects the composition and structure of the molecule. A deep neural network then parsed the descriptor to extract the embedded information and correlate it with the power conversion efficiency (PCE) of the corresponding material. The resulting model achieved a coefficient of determination (R²) of 0.97 and a mean squared error (MSE) of 0.16. Comparison with existing methods shows that the model is competitive in prediction accuracy. An attention mechanism introduced during modeling identified several molecular fragments that are decisive for the PCE value, which can guide the inverse design of organic photovoltaic materials.
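The fragment-to-matrix pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' g-FSI implementation: the word format (a center atom plus its sorted nearest neighbors), the start atom, the embedding dimension, and the random embedding table are all assumptions made for illustration, and the function names `fragment_word`, `bfs_sentence`, and `embed` are hypothetical.

```python
from collections import deque
import random


def fragment_word(atom, symbols, adjacency):
    """Hypothetical 'word': the center atom's symbol plus its sorted neighbor symbols."""
    neighbours = sorted(symbols[n] for n in adjacency[atom])
    return symbols[atom] + "(" + "".join(neighbours) + ")"


def bfs_sentence(symbols, adjacency, start=0):
    """Order the fragment words by breadth-first traversal from `start` (assumed start atom)."""
    visited = {start}
    queue = deque([start])
    sentence = []
    while queue:
        atom = queue.popleft()
        sentence.append(fragment_word(atom, symbols, adjacency))
        for n in sorted(adjacency[atom]):
            if n not in visited:
                visited.add(n)
                queue.append(n)
    return sentence


def embed(sentence, dim=4, seed=0):
    """Map each distinct word to a fixed random vector (placeholder for a learned embedding)."""
    rng = random.Random(seed)
    table = {}
    for word in sorted(set(sentence)):
        table[word] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    # One row per fragment, in sentence order: the molecule's information matrix.
    return [table[w] for w in sentence]


# Toy heavy-atom graph of ethanol: C0 - C1 - O2
symbols = {0: "C", 1: "C", 2: "O"}
adjacency = {0: [1], 1: [0, 2], 2: [1]}

sentence = bfs_sentence(symbols, adjacency)
matrix = embed(sentence)
print(sentence)                     # ['C(C)', 'C(CO)', 'O(C)']
print(len(matrix), len(matrix[0]))  # 3 4
```

In a real model the random table would be replaced by trainable embeddings, and the matrix would be padded to a fixed length before being passed to the neural network.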
Authors
于程远
吴金奎
周利
吉旭
戴一阳
党亚固
YU Chengyuan; WU Jinkui; ZHOU Li; JI Xu; DAI Yiyang; DANG Yagu (School of Chemical Engineering, Sichuan University, Chengdu 610065, Sichuan, China)
Source
《化工学报》
EI
CAS
CSCD
PKU Core (Peking University Core Journals)
2021, Issue 3, pp. 1487-1495 (9 pages)
CIESC Journal
Funding
Fundamental Research Funds for the Central Universities (YJ201838)
National Natural Science Foundation of China (21776183, 21706220).
Keywords
organic compounds
solar energy
language-like descriptor
deep learning
prediction
power conversion efficiency