Funding: Supported by the National Natural Science Foundation of China under Grant No. 62172056 and the Young Elite Scientists Sponsorship Program by CAST under Grant No. 2022QNRC001.
Abstract: Knowledge distillation, a pivotal technique in model compression, has been widely applied across various domains. However, inherent biases in the teacher model still limit student model performance during distillation. To address these biases, we propose a de-biased knowledge distillation framework tailored for binary classification tasks. For the pre-trained teacher model, biases in the soft labels are mitigated through knowledge infusion and label de-biasing techniques. Based on this, a de-biased distillation loss is introduced, in which the de-biased labels replace the soft labels as the fitting target for the student model. This approach enables the student model to learn from the corrected model information, achieving high-performance deployment on lightweight student models. Experiments on multiple real-world datasets demonstrate that deep learning models compressed under the de-biased knowledge distillation framework significantly outperform traditional response-based and feature-based knowledge distillation models across various evaluation metrics, highlighting the effectiveness of the framework in model compression.
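The abstract does not give the form of the de-biased distillation loss, so the following is only a minimal sketch of a generic response-based distillation loss for binary classification, written so that the fitting target can be swapped from the teacher's soft labels to de-biased labels. The function names, the `alpha` weighting, and the blend with a hard-label term are all assumptions made for illustration; the de-biasing step itself is not reproduced and is treated as an input.

```python
import math

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def bce(target, pred, eps=1e-12):
    # Binary cross-entropy of a predicted probability against a
    # (possibly soft) target probability in [0, 1].
    pred = min(max(pred, eps), 1.0 - eps)
    return -(target * math.log(pred) + (1.0 - target) * math.log(1.0 - pred))

def distillation_loss(student_logits, hard_labels, soft_targets, alpha=0.5):
    # Hypothetical blend (an assumption, not the paper's formula):
    #   alpha * BCE(soft target, student) + (1 - alpha) * BCE(hard label, student)
    # Passing de-biased labels as `soft_targets` instead of raw teacher
    # outputs gives the "replace the fitting target" behaviour described above.
    total = 0.0
    for z, y, t in zip(student_logits, hard_labels, soft_targets):
        q = sigmoid(z)
        total += alpha * bce(t, q) + (1.0 - alpha) * bce(float(y), q)
    return total / len(student_logits)
```

Under these assumptions, switching from ordinary distillation to the de-biased variant only changes what is passed as `soft_targets`; the student-side computation stays the same.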
Funding: Supported by the National Natural Science Foundation of China (71631004, 72033008), the National Science Foundation for Distinguished Young Scholars (71625001), and the Science Foundation of Ministry of Education of China (19YJA910003).
Abstract: The era of big data brings opportunities and challenges for developing new statistical methods and models to evaluate social programs, economic policies, and interventions. This paper provides a comprehensive review of some recent advances in statistical methodologies and models for evaluating programs with high-dimensional data. In particular, four kinds of methods for making valid statistical inferences about treatment effects in high dimensions are addressed. The first is the so-called doubly robust type of estimation, which models the outcome regression and propensity score functions simultaneously. The second is the covariate balance method for constructing treatment effect estimators. The third is the sufficient dimension reduction approach for causal inference. The last is the machine learning procedure, used directly or indirectly to make statistical inferences about treatment effects. Some of these methods and models are closely related to the de-biased Lasso type methods for high-dimensional regression models in the statistical literature. Finally, some future research topics are discussed.
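To illustrate the doubly robust idea mentioned above, here is a minimal sketch of the standard augmented inverse probability weighting (AIPW) estimator of the average treatment effect. It is one common doubly robust estimator and is not necessarily the specific variant reviewed in the paper. The function name and argument names are hypothetical, and the nuisance estimates (propensity scores and outcome regressions) are assumed to have been fitted elsewhere, for example with a high-dimensional method.

```python
def aipw_ate(y, t, e_hat, m1_hat, m0_hat):
    # y:      observed outcomes
    # t:      treatment indicators (1 = treated, 0 = control)
    # e_hat:  estimated propensity scores P(T = 1 | X), strictly in (0, 1)
    # m1_hat: estimated outcome regression E[Y | X, T = 1]
    # m0_hat: estimated outcome regression E[Y | X, T = 0]
    n = len(y)
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e_hat, m1_hat, m0_hat):
        # Outcome-model contrast plus inverse-probability-weighted residuals.
        total += (m1 - m0
                  + ti * (yi - m1) / ei
                  - (1 - ti) * (yi - m0) / (1.0 - ei))
    return total / n
```

The estimator combines both nuisance models: it remains consistent if either the propensity score model or the outcome regression model is correctly specified, which is the sense in which it is doubly robust.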