Funding: the National Natural Science Foundation of China (No. 30700161); the National High-Tech Research and Development Program (863 Program) of China (Nos. 2007AA01Z167 and 2006AA02Z309); the China Postdoctoral Science Foundation (No. 20070410223); and the Doctor Scientific Research Startup Foundation of Qufu Normal University (No. Bsqd2007036).
Abstract: We propose a three-step method for tumor classification from gene expression data. First, the original DNA microarray gene expression data are modeled by independent component analysis (ICA). Second, the most discriminant eigenassays extracted by ICA are selected by the sequential floating forward selection technique. Finally, a support vector machine is used to classify the modeled data. To validate the proposed method, we applied it to three DNA microarray datasets involving various human normal and tumor tissue samples. The experimental results show that the method is efficient and feasible.
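The selection step can be illustrated with a small, generic sketch of sequential floating forward selection (SFFS). This is not the authors' implementation: the function `sffs`, the `criterion` callable, and the toy score table are hypothetical names introduced here. In the setting above, the criterion would plausibly be something like a cross-validated classification accuracy computed on the chosen eigenassays, which the abstract does not specify.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A criterion maps a subset of feature indices to a score (higher is better).
Criterion = Callable[[Sequence[int]], float]


def sffs(n_features: int, k_target: int, criterion: Criterion) -> Tuple[List[int], float]:
    """Sequential floating forward selection over feature indices 0..n_features-1.

    Returns the best subset of size k_target found, together with its score.
    """
    if not 0 < k_target <= n_features:
        raise ValueError("k_target must be in 1..n_features")

    selected: List[int] = []
    best: Dict[int, Tuple[float, List[int]]] = {}  # subset size -> (score, subset)

    def record(subset: List[int]) -> bool:
        """Store subset if it strictly beats the best known subset of its size."""
        score = criterion(subset)
        size = len(subset)
        if size not in best or score > best[size][0]:
            best[size] = (score, list(subset))
            return True
        return False

    while len(selected) < k_target:
        # Inclusion: add the single feature that maximizes the criterion.
        candidates = [f for f in range(n_features) if f not in selected]
        add = max(candidates, key=lambda f: criterion(selected + [f]))
        selected = selected + [add]
        record(selected)

        # Conditional exclusion ("floating"): keep dropping the least useful
        # feature while that strictly improves on the best smaller subset.
        while len(selected) > 2:
            drop = max(selected, key=lambda f: criterion([g for g in selected if g != f]))
            reduced = [g for g in selected if g != drop]
            if not record(reduced):
                break
            selected = reduced

    score, subset = best[k_target]
    return sorted(subset), score


# Hypothetical toy scores for four features, chosen so that greedy forward
# selection gets stuck with feature 0 while the floating step discards it.
TOY_SCORES = {
    frozenset({0}): 0.60, frozenset({1}): 0.50,
    frozenset({2}): 0.50, frozenset({3}): 0.40,
    frozenset({0, 1}): 0.70, frozenset({0, 2}): 0.65, frozenset({0, 3}): 0.62,
    frozenset({1, 2}): 0.90, frozenset({1, 3}): 0.50, frozenset({2, 3}): 0.50,
    frozenset({0, 1, 2}): 0.92, frozenset({0, 1, 3}): 0.75,
    frozenset({0, 2, 3}): 0.70, frozenset({1, 2, 3}): 0.95,
}


def toy_criterion(subset: Sequence[int]) -> float:
    return TOY_SCORES[frozenset(subset)]


if __name__ == "__main__":
    print(sffs(4, 3, toy_criterion))
```

On this toy table, plain forward selection would keep feature 0 once it is chosen, whereas the exclusion step lets the search abandon it after a better pair is found. With real data the criterion is typically expensive to evaluate, so caching its values per subset may be worthwhile.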
Funding: the National Science Foundation under grant no. 1245847 and the National Institutes of Health under grant no. 1R43AI136357-01A1.
Abstract: With ever-growing data and the need to develop powerful machine learning models, data owners increasingly depend on various untrusted platforms (e.g., public clouds, edges, and machine learning service providers) for scalable processing or collaborative learning. Thus, sensitive data and models are in danger of unauthorized access, misuse, and privacy compromises. A relatively new body of research confidentially trains machine learning models on protected data to address these concerns. In this survey, we summarize notable studies in this emerging area of research. With a unified framework, we highlight the critical challenges and innovations in outsourcing machine learning confidentially. We focus on the cryptographic approaches for confidential machine learning (CML), primarily on model training, while also covering other directions such as perturbation-based approaches and CML in the hardware-assisted computing environment. The discussion takes a holistic view, considering the rich context of the related threat models, security assumptions, design principles, and associated trade-offs among data utility, cost, and confidentiality.