Predicting the response variables of the target dataset is one of the main problems in machine learning. Predictive models are desired to perform satisfactorily in a broad range of target domains. However, that may no...Predicting the response variables of the target dataset is one of the main problems in machine learning. Predictive models are desired to perform satisfactorily in a broad range of target domains. However, that may not be plausible if there is a mismatch between the source and target domain distributions. The goal of domain adaptation algorithms is to solve this issue and deploy a model across different target domains. We propose a method based on kernel distribution embedding and Hilbert-Schmidt independence criterion (HSIC) to address this problem. The proposed method embeds both source and target data into a new feature space with two properties: 1) the distributions of the source and the target datasets are as close as possible in the new feature space, and 2) the important structural information of the data is preserved. The embedded data can be in lower dimensional space while preserving the aforementioned properties and therefore the method can be considered as a dimensionality reduction method as well. Our proposed method has a closed-form solution and the experimental results show that it works well in practice.展开更多
文摘Predicting the response variables of the target dataset is one of the main problems in machine learning. Predictive models are desired to perform satisfactorily in a broad range of target domains. However, that may not be plausible if there is a mismatch between the source and target domain distributions. The goal of domain adaptation algorithms is to solve this issue and deploy a model across different target domains. We propose a method based on kernel distribution embedding and Hilbert-Schmidt independence criterion (HSIC) to address this problem. The proposed method embeds both source and target data into a new feature space with two properties: 1) the distributions of the source and the target datasets are as close as possible in the new feature space, and 2) the important structural information of the data is preserved. The embedded data can be in lower dimensional space while preserving the aforementioned properties and therefore the method can be considered as a dimensionality reduction method as well. Our proposed method has a closed-form solution and the experimental results show that it works well in practice.