Treating sepsis to reduce mortality in hospital intensive care units is highly challenging because the response to treatment varies from patient to patient. Tailored treatment recommendations are desired to help doctors make decisions efficiently and accurately. In this work, we apply a self-supervised method based on reinforcement learning (RL) to recommend treatments for individual patients. An uncertainty evaluation method is proposed to separate patient samples into two domains according to their responses to treatment and the state value of the chosen policy. Examples from the two domains are then reconstructed with an auxiliary transfer learning task. A distillation method of privilege learning is tied to a variational auto-encoder framework for the transfer learning task between the low- and high-quality domains. Combined with the self-supervised approach for better state and action representations, we propose a deep RL method called high-risk uncertainty (HRU) control, which provides flexibility in the trade-off between effectiveness and accuracy on ambiguous samples and reduces the expected mortality. Experiments on the large-scale, publicly available real-world dataset MIMIC-III demonstrate that our model reduces the estimated mortality rate by up to 2.3% in total, and that the estimated mortality rate in the majority of cases is reduced to 9.5%.
Funding: the National Natural Science Foundation of China (No. 61702186).