Abstract
Cancer genomes frequently carry copy number aberrations (CNAs), some of which are believed to be tumorigenic. Copy numbers can be measured with several technologies, e.g., SNP arrays, but their relationship with gene expression has not been well characterized. Here, we describe an approach that uses expression microarrays to visualize the global relationship between copy number and gene expression. We mapped the expression signals detected by microarray probesets onto a reference human genome, the RefSeq, according to their annotated physical positions; we call the resulting landscape an expressogram. To study expressograms under various conditions and their relation to cytogenetic events such as CNAs, we obtained three classes of array samples: samples of a cancer (e.g., liver cancer), normal samples from the same tissue, and normal samples from other tissues. We developed a Bayesian algorithm that estimates a background signal for the cancer samples from the latter two classes. By subtracting the estimated background from the raw signals of the cancer samples and applying a kernel-based smoothing scheme to the differences, we produced an expressogram that is strongly consistent with the copy numbers. This indicates that copy number is, on average, positively correlated with gene expression and has a strong impact on it. To further explore the applicability of these findings, we submitted the expressograms to GISTIC, an algorithm for detecting significant CNAs. The results strongly indicate that expressograms can also be used to infer copy number events and to identify significant regions of CNA-affected dysregulation.
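The pipeline summarized above (background estimation, subtraction, kernel smoothing along the genome) can be illustrated with a minimal sketch. It is not the algorithm of this work: the abstract does not specify the Bayesian model or the kernel, so the sketch assumes a simple normal-normal shrinkage estimate for the background and a Gaussian Nadaraya-Watson smoother. All function names, the array layout (samples by probesets), and the default bandwidth are hypothetical choices made for illustration.

```python
import numpy as np


def estimate_background(same_tissue, other_tissue):
    """Stand-in background estimate per probeset (ASSUMPTION, not the paper's model).

    Treats the other-tissue normals as a prior and the same-tissue normals as
    data, and returns the normal-normal posterior mean for each probeset.
    Both inputs have shape (n_samples, n_probesets), with n_samples >= 2.
    """
    prior_mean = other_tissue.mean(axis=0)
    prior_var = other_tissue.var(axis=0, ddof=1) + 1e-8
    n = same_tissue.shape[0]
    data_mean = same_tissue.mean(axis=0)
    data_var = same_tissue.var(axis=0, ddof=1) + 1e-8
    # Precision-weighted average: more same-tissue samples -> weight closer to 1.
    w = (n / data_var) / (n / data_var + 1.0 / prior_var)
    return w * data_mean + (1.0 - w) * prior_mean


def gaussian_kernel_smooth(positions, values, bandwidth):
    """Nadaraya-Watson smoothing with a Gaussian kernel (ASSUMED kernel choice)."""
    positions = np.asarray(positions, dtype=float)
    values = np.asarray(values, dtype=float)
    # Pairwise scaled distances between probeset positions.
    scaled = (positions[:, None] - positions[None, :]) / bandwidth
    weights = np.exp(-0.5 * scaled ** 2)
    return (weights @ values) / weights.sum(axis=1)


def expressogram(cancer, same_tissue, other_tissue, positions, bandwidth=5e6):
    """Background-subtracted, smoothed expression along one chromosome.

    cancer: 1-D array of probeset signals for a single cancer sample.
    positions: annotated physical positions (base pairs) of the probesets.
    bandwidth: hypothetical default in base pairs.
    """
    positions = np.asarray(positions, dtype=float)
    order = np.argsort(positions)
    background = estimate_background(same_tissue, other_tissue)
    difference = np.asarray(cancer, dtype=float) - background
    smoothed = gaussian_kernel_smooth(positions[order], difference[order], bandwidth)
    return positions[order], smoothed
```

In this sketch, a contiguous run of probesets with elevated signal in the cancer sample appears as a raised plateau in the smoothed output, which is the kind of feature one would compare against copy number profiles. The pairwise weight matrix is quadratic in the number of probesets, so the smoother is intended to be applied one chromosome at a time.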