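Abstract: Most existing cross-domain recommendation (CDR) methods rely only on rating data and make insufficient use of review information. Reviews often contain several user opinions, and how to fully exploit these fine-grained opinions to better address the cold-start and data-sparsity problems has become both a focus and a difficulty of current CDR research. To this end, a cross-domain recommendation model based on fine-grained opinion from review (FGOR-CDRM) is designed. The model consists of three main modules: fine-grained opinion extraction from reviews, auxiliary review enhancement, and cross-domain correlation learning. First, a text convolutional neural network (TextCNN) is combined with a gating mechanism, and two global fine-grained opinion matrices are set up to guide the query, so that fine-grained opinions are extracted effectively from reviews. Second, an additional convolutional layer is placed on top of the text convolution, and auxiliary documents are built from the reviews of similar non-overlapping users, which increases the diversity of the training data while alleviating data sparsity. Third, the correlation between cross-domain fine-grained opinions is learned: a correlation matrix is built from static fine-grained opinions and used for semantic matching, which enables rating prediction on items for cold-start users in the target domain. Experiments are conducted on three domain pairs formed from three Amazon datasets (Book, Movies and TV, CDs and Vinyl). The results show that FGOR-CDRM outperforms the other baseline models on all three pairs; on the "Movie-Book" pair, for example, its mean absolute error (MAE) improves on the ANR baseline by 6.09% and on the CDLFM model by 3.58%.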
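The MAE comparison reported in the abstract can be illustrated with a minimal Python sketch. It assumes that "improves by X%" means the relative reduction in MAE with respect to a baseline; the abstract does not state the formula, so this reading is an assumption. The function names `mae` and `relative_improvement` and all ratings below are hypothetical and are not taken from the paper.

```python
def mae(true_ratings, predicted_ratings):
    """Mean absolute error between two equally long rating sequences."""
    if len(true_ratings) != len(predicted_ratings) or not true_ratings:
        raise ValueError("rating sequences must be non-empty and of equal length")
    total = sum(abs(t - p) for t, p in zip(true_ratings, predicted_ratings))
    return total / len(true_ratings)


def relative_improvement(baseline_mae, model_mae):
    """Relative MAE reduction versus a baseline (assumed definition, lower MAE is better)."""
    return (baseline_mae - model_mae) / baseline_mae


# Hypothetical ratings for illustration only; these are not the paper's data.
truth = [4.0, 3.0, 5.0, 2.0]
baseline_pred = [3.0, 3.5, 4.0, 3.0]
model_pred = [3.5, 3.0, 4.5, 2.5]

baseline_mae = mae(truth, baseline_pred)
model_mae = mae(truth, model_pred)
print(f"baseline MAE = {baseline_mae:.3f}, model MAE = {model_mae:.3f}")
print(f"relative improvement = {relative_improvement(baseline_mae, model_mae):.2%}")
```

Under this reading, a positive value means the model's error is lower than the baseline's.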
Funding: Sponsored by the National Natural Science Foundation of China (Grant Nos. 61271346, 61571163, 61532014, 91335112 and 61402132) and the Fundamental Research Funds for the Central Universities (Grant No. DB13AB02).
Abstract: Domain-domain interactions are important clues for inferring protein-protein interactions. Although about 8,000 domain-domain interactions have been discovered so far, they are just the tip of the iceberg. Because domains are conserved and commonplace in proteins, domain-domain interactions are discovered from pairs of domains that significantly co-exist in proteins. Three observations are made: (1) domain-domain interactions may exist within the same protein or across different proteins; (2) only domain-domain interactions across different proteins can mediate interactions between proteins; (3) domains are biased in which other domains they interact with. On this basis, a novel method is put forward to construct protein-protein interaction networks from domain-domain interactions. The method is validated by experiments and compared with state-of-the-art methods in the field. The experimental results suggest that the method is reasonable and effective for constructing protein-protein interaction networks.
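Observation (2) can be sketched in Python as follows. This is a simplified illustration, not the authors' method: it assumes a given set of known domain-domain interactions and links two proteins whenever one domain from each forms a known interacting pair. It ignores the significance test and the interaction biases mentioned in the abstract. The function name `predict_ppi` and the example protein and domain names are hypothetical.

```python
from itertools import product


def predict_ppi(protein_domains, ddi_pairs):
    """Return protein pairs connected by at least one cross-protein domain-domain interaction.

    protein_domains: dict mapping a protein name to the set of its domains.
    ddi_pairs: iterable of (domain_a, domain_b) tuples assumed to interact.
    """
    known = {frozenset(pair) for pair in ddi_pairs}
    names = sorted(protein_domains)
    edges = set()
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            # Only domain pairs spanning two different proteins are considered.
            cross_pairs = product(protein_domains[first], protein_domains[second])
            if any(frozenset(pair) in known for pair in cross_pairs):
                edges.add((first, second))
    return edges


# Hypothetical toy input for illustration only.
proteins = {
    "P1": {"SH3", "Kinase"},
    "P2": {"ProlineRich"},
    "P3": {"Zinc"},
}
ddis = [("SH3", "ProlineRich")]
print(predict_ppi(proteins, ddis))
```

Here `P1` and `P2` are linked because `SH3` and `ProlineRich` sit in different proteins; a pair of domains inside the same protein never creates an edge.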
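Abstract: The rapid development of network and information technology has produced a large volume of electronic text on the Web, and computing the similarity between texts is an important means of text processing. For large text collections, the vector space model (VSM) is commonly used for text representation, but it suffers from high-dimensional text vectors and from the difficulty of measuring semantic similarity between texts. An improved text similarity computation method is proposed. Representative metadata feature vector elements are selected from the large feature space to reduce the dimensionality of the vector space. A domain concept tree is then constructed, and a text similarity algorithm based on it is designed to handle the synonyms that are widespread among domain concepts, which improves the measurement of semantic similarity between texts. Experimental results show that dimensionality reduction and concept similarity computation improve the performance of text similarity computation.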
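The effect of synonym handling on VSM similarity can be sketched in Python as follows. This is a simplified illustration under stated assumptions: the domain concept tree is replaced by a flat synonym-to-concept dictionary, and term frequency is used as the vector weight. The abstract does not specify either choice. The names `concept_vector` and `cosine` and the example terms are hypothetical.

```python
import math
from collections import Counter


def concept_vector(tokens, synonyms):
    """Build a term-frequency vector after mapping each token to its canonical concept."""
    return Counter(synonyms.get(token, token) for token in tokens)


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(u[key] * v[key] for key in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical synonym map standing in for a domain concept tree.
synonyms = {"automobile": "car"}
doc_a = ["car", "engine"]
doc_b = ["automobile", "engine"]

plain = cosine(concept_vector(doc_a, {}), concept_vector(doc_b, {}))
merged = cosine(concept_vector(doc_a, synonyms), concept_vector(doc_b, synonyms))
print(f"without synonym mapping: {plain:.2f}")
print(f"with synonym mapping:    {merged:.2f}")
```

Merging synonyms also shrinks the vocabulary, so the same step contributes to the dimensionality reduction the abstract describes.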