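Abstract: This study mines the medical literature on depression to identify potential antidepressant drugs and to elucidate their antidepressant mechanisms and pathways, providing direction for antidepressant drug development. Semantic triples related to depression were extracted with SemRep, and potential drugs were identified by restricting the semantic relations and semantic types. Targets of the potential drugs and of depression were retrieved from PubChem, GeneCards, and other databases; the two target sets were intersected, and a protein-protein interaction (PPI) network was built from the shared targets. The PPI network was analyzed in Cytoscape to determine the core targets. GO (Gene Ontology) and KEGG (Kyoto Encyclopedia of Genes and Genomes) analyses of the core targets were performed in R. Finally, molecular docking between the core targets and the potential drugs was carried out with AutodockTool. The results indicate that six drugs, Hydrocortisone, Benzodiazepine, Curcumin, Metformin, Nicotine, and Risperidone, are potential antidepressants. These drugs exert antidepressant effects through biological processes such as inflammation and the regulation of neurotransmitters and through signaling pathways such as MAPK (mitogen-activated protein kinase) and TNF (tumor necrosis factor), and they bind well to the core targets of depression. In addition, Benzodiazepine and Nicotine carry risks of addiction and abuse in clinical practice, so their role in treating depression may be limited. Discovering new knowledge about depression drugs from semantic triples and network pharmacology can therefore save time and cost and can suggest new directions for clinical drug use.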
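The target-intersection and core-target steps described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the gene names and PPI edges are hypothetical placeholders, and ranking by node degree is only one common way to pick hub nodes; the abstract does not say which Cytoscape metric was used.

```python
from collections import defaultdict

# Hypothetical placeholder target sets. In the study these would come from
# PubChem, GeneCards, and other databases; the names here are not real data.
drug_targets = {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"}
disease_targets = {"GENE_B", "GENE_C", "GENE_D", "GENE_E", "GENE_F"}

# Hypothetical PPI edges (undirected pairs of interacting proteins).
ppi_edges = [
    ("GENE_B", "GENE_C"),
    ("GENE_B", "GENE_D"),
    ("GENE_B", "GENE_E"),
    ("GENE_C", "GENE_D"),
    ("GENE_A", "GENE_B"),  # ignored below: GENE_A is not a shared target
]


def shared_targets(drug, disease):
    """Return the targets common to the drug and the disease."""
    return drug & disease


def degree_ranking(edges, nodes):
    """Rank nodes by degree within the subnetwork induced by `nodes`.

    Degree is an assumed stand-in for whatever centrality measure
    the original analysis applied in Cytoscape.
    """
    # Deduplicate undirected edges and drop self-loops and outside nodes.
    unique_edges = {
        tuple(sorted((u, v)))
        for u, v in edges
        if u != v and u in nodes and v in nodes
    }
    degree = defaultdict(int)
    for u, v in unique_edges:
        degree[u] += 1
        degree[v] += 1
    # Highest degree first; ties broken alphabetically for a stable order.
    return sorted(((n, degree[n]) for n in nodes), key=lambda item: (-item[1], item[0]))


shared = shared_targets(drug_targets, disease_targets)
ranking = degree_ranking(ppi_edges, shared)
for gene, deg in ranking:
    print(gene, deg)
```

In practice the top-ranked nodes would be taken as candidate core targets and passed on to enrichment analysis and molecular docking.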
Abstract: Word embeddings, grounded in distributional semantics, carry rich semantic information and, to some extent, mark the entry of natural language processing and computational linguistics into the era of large language models (LLMs). Because word embeddings are computable, a variety of embedding-based semantic computing tasks have emerged, among which semantic relation discrimination is an important one. This study uses the fastText Chinese word embeddings and the Tencent Chinese word embeddings to compute cosine similarity as a measure of the strength of semantic association between words, and reports three findings. First, the fastText and Tencent Chinese embeddings differ to some degree in distinguishing four types of semantic relation: synonymy, antonymy, hyponymy, and meronymy (part-whole). Second, a comparison of Spearman correlation coefficients shows that, on the experimental data, the fastText embeddings have acquired stronger semantic similarity features, whereas the Tencent embeddings have acquired stronger semantic relatedness features. Third, in antonym discrimination, both the fastText and the Tencent Chinese embeddings assign very high cosine similarity values to highly conventionalized antonym pairs.
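The two measurements named in the abstract, cosine similarity between word vectors and the Spearman correlation between model scores and reference ratings, can be sketched in plain Python. This is an illustrative sketch only: the three-dimensional vectors, the pair labels, and the reference ratings are hypothetical placeholders, not values from the fastText or Tencent embeddings or from the study's dataset.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy vectors for three word pairs (placeholders, not real embeddings).
pairs = {
    "pair_1": ([0.9, 0.1, 0.0], [0.8, 0.2, 0.1]),
    "pair_2": ([0.1, 0.9, 0.2], [0.7, 0.1, 0.3]),
    "pair_3": ([0.3, 0.3, 0.9], [0.2, 0.4, 0.8]),
}
# Hypothetical reference ratings for the same pairs, in the same order.
reference_ratings = [4.5, 1.5, 4.0]

model_scores = [cosine_similarity(u, v) for u, v in pairs.values()]
rho = spearman(model_scores, reference_ratings)
print(model_scores, rho)
```

Running the same comparison once per embedding model against a similarity-oriented and a relatedness-oriented rating set is one way to contrast which property each model captures more strongly.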