Abstract
Gene over-expression and under-expression are closely associated with human diseases and contribute to phenotypic variation and diversity. To the best of our knowledge, no single open, dedicated resource provides association information between gene over- or under-expression and various diseases. In this study, we present a comprehensive database of disease-associated over- and under-expressed genes (OUGene), built from our proposed text mining pipeline and several open curated databases. It contains a total of 41,269 unique associations between 7,238 over- or under-expressed genes and 1,480 diseases, supported by 81,974 evidence sentences from 56,442 articles. OUGene is comprehensive and covers most important therapeutic areas. In addition, a new scoring system was designed to rank the associations, based on benchmarking against hand-curated data. OUGene provides an easy-to-use web interface that lets researchers analyze these data and visualize the associated networks, which can give insight into the complex relationships between over- and under-expressed genes and diseases at a system level. It is available at www.csbio.sjtu.edu.cn/bioinf/OUGene/.