Abstract
Traditional classifiers built on transcriptomic features, such as support vector machines and Bayesian classifiers, require between-sample normalization of gene expression profiles to remove experimental batch effects, which limits their application at the individual level. In this paper, we aim to build an individualized classifier that distinguishes lung cancer tissue from non-cancer lung tissue (pneumonia and normal lung tissue). We identified gene pairs as signatures based on the within-sample relative expression orderings of gene pairs in a particular type of tissue (cancer or non-cancer). Using multiple independent datasets as training data, comprising 197 lung cancer samples and 189 non-cancer samples, we identified three gene pairs. When samples were classified by the majority voting rule, the average accuracy reached 95.34% in the training data. In multiple independent validation datasets, comprising 251 lung cancer samples and 141 non-cancer samples without data normalization, the average accuracy was 96.78%. The rank-based signature is robust against experimental batch effects and can be used at the individual level to diagnose lung cancer in samples measured by different laboratories.
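The majority voting rule over within-sample relative expression orderings can be illustrated with a minimal Python sketch. The abstract does not name the three gene pairs or state the direction of each ordering, so the gene identifiers (`GENE_A1`, `GENE_B1`, and so on), the convention that a pair votes "cancer" when its first gene is expressed above its second, and the function names are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Dict, List, Tuple

# Hypothetical placeholder pairs; the real gene pairs are not given in the abstract.
# Assumed convention: pair (a, b) votes "cancer" when expr[a] > expr[b].
GENE_PAIRS: List[Tuple[str, str]] = [
    ("GENE_A1", "GENE_B1"),
    ("GENE_A2", "GENE_B2"),
    ("GENE_A3", "GENE_B3"),
]


def pair_votes(expr: Dict[str, float],
               pairs: List[Tuple[str, str]] = GENE_PAIRS) -> List[bool]:
    """Return one vote per pair, using only the ordering inside this sample."""
    return [expr[a] > expr[b] for a, b in pairs]


def classify_sample(expr: Dict[str, float],
                    pairs: List[Tuple[str, str]] = GENE_PAIRS) -> str:
    """Label a single sample by majority vote over the gene pairs."""
    votes = pair_votes(expr, pairs)
    return "cancer" if 2 * sum(votes) > len(votes) else "non-cancer"


# Made-up expression values for one sample (not data from the paper).
sample = {
    "GENE_A1": 8.2, "GENE_B1": 5.1,   # votes cancer
    "GENE_A2": 3.0, "GENE_B2": 6.4,   # votes non-cancer
    "GENE_A3": 7.7, "GENE_B3": 7.1,   # votes cancer
}

# A sample-wide increasing transform, used here as a simplified stand-in
# for a batch effect, leaves every within-sample ordering unchanged.
shifted = {gene: 2.0 * value + 1.5 for gene, value in sample.items()}

print(classify_sample(sample))
print(classify_sample(shifted))
```

Because each vote depends only on the order of two values measured in the same sample, no reference to other samples is needed. This is the property that lets such a rule be applied to one sample at a time. Real batch effects need not be strictly increasing transforms, so the `shifted` example shows the idea and does not demonstrate robustness in general.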
Authors
陈燕花
郑宝童
林云轻
朱慧敏
郑智军
关庆洲
郭政
严海丹
CHEN Yanhua, ZHENG Baotong, LIN Yunqing, ZHU Huimin, ZHENG Zhijun, GUAN Qingzhou, GUO Zheng, YAN Haidan (Key Laboratory of Ministry of Education for Gastrointestinal Cancer, Department of Bioinformatics, Fujian Medical University, Fuzhou 350108, P.R. China)
Source
《生物医学工程学杂志》
EI
CAS
CSCD
PKU Core (北大核心)
2017, Issue 1, pp. 129-133 (5 pages)
Journal of Biomedical Engineering
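Funding
National Natural Science Foundation of China (81572935, 81372213, 81501215, 81501829)
College Students' Innovation and Entrepreneurship Training Program (201510392017)
Fujian Medical University Miaopu Research Fund (2015MP005)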
Keywords
signature
classifier
lung cancer
data normalization
batch effect