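Objective: To predict the development trend of the coronavirus disease 2019 (COVID-19) epidemic in Xi'an using the infectious disease dynamics SEAIQR (susceptible-exposed-asymptomatic-infected-quarantined-removed) model and the Dropout-LSTM (Dropout long short-term memory network) model, and to provide a scientific basis for evaluating the effectiveness of the "dynamic zero-COVID" strategy. Methods: Given the large number of asymptomatic infections, the time-varying parameters, and the control measures adopted in this wave of the Xi'an epidemic, a time-varying SEAIQR model with staged prevention and control measures was constructed. Given the temporal nature of COVID-19 epidemic data and the nonlinear relationships within them, a deep learning Dropout-LSTM model was constructed. Data on newly confirmed cases in Xi'an from December 9, 2021 to January 31, 2022 were used for fitting, and data from February 1 to February 7, 2022 were used to evaluate predictive performance. The effective reproduction number (R_t) was calculated, and the effects of different parameters on the course of the epidemic were evaluated. Results: The SEAIQR model predicted that the inflection point of newly confirmed cases would occur on December 26, 2021, at about 176 cases, and that the epidemic would reach "dynamic zero-COVID" on January 24, 2022; the model R^2 was 0.849. The Dropout-LSTM model captured the temporal and nonlinear characteristics of the data, and its predicted numbers of newly confirmed cases agreed closely with the observed numbers (R^2 = 0.937). Both the MAE and the RMSE of the Dropout-LSTM model were lower than those of the SEAIQR model, indicating better predictions. In the early phase of the outbreak, R_0 was 5.63; after comprehensive control measures were implemented, R_t declined gradually and fell below 1.0 on December 27, 2021. As the effective contact rate shrinks, control measures are implemented earlier, and the immunity threshold rises, the number of newly confirmed cases at the inflection point keeps decreasing. Conclusion: The Dropout-LSTM model achieved relatively accurate epidemic prediction and can inform "dynamic zero-COVID" prevention and control decisions for COVID-19.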
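The abstract above compares the two models by R^2, MAE, and RMSE. As a minimal sketch of how these three metrics are conventionally computed, the following uses the standard textbook definitions; the function name and the sample numbers are hypothetical and are not data from the study.

```python
import math

def regression_metrics(actual, predicted):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and the same length")
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n           # mean absolute error
    ss_res = sum(e * e for e in errors)             # residual sum of squares
    rmse = math.sqrt(ss_res / n)                    # root mean squared error
    mean_actual = sum(actual) / n
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all actual values are equal")
    r2 = 1.0 - ss_res / ss_tot                      # coefficient of determination
    return mae, rmse, r2

# Made-up daily case counts for illustration only.
actual = [10, 20, 30, 40]
predicted = [12, 18, 33, 37]
mae, rmse, r2 = regression_metrics(actual, predicted)
```

Lower MAE and RMSE and a higher R^2 indicate a closer fit, which is the sense in which the abstract ranks Dropout-LSTM above SEAIQR.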
In this paper, we introduce survival modelling methodology to identify factors that may influence university dropout. Using the database provided by the Fundación Universidad Autónoma de Colombia and the semiparametric Cox proportional hazards model, we were able to identify these risk factors.
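The abstract does not say how the Cox model was fitted. The usual approach maximizes the partial likelihood, in which each observed dropout contributes its own linear score minus the log of the summed exponentiated scores of everyone still at risk. A minimal sketch, assuming no tied event times; the function name, covariates, and toy data are hypothetical.

```python
import math

def cox_partial_log_likelihood(times, events, covariates, beta):
    """Cox partial log-likelihood, assuming no tied event times.

    times: follow-up time per student
    events: 1 if dropout was observed, 0 if the record is censored
    covariates: one list of covariate values per student
    beta: one coefficient per covariate
    """
    # Linear predictor beta . x for every student.
    scores = [sum(b * x for b, x in zip(beta, row)) for row in covariates]
    total = 0.0
    for i, (t_i, d_i) in enumerate(zip(times, events)):
        if not d_i:
            continue  # censored students only appear in risk sets
        # Risk set: everyone still under observation at time t_i.
        risk = sum(math.exp(scores[j]) for j, t_j in enumerate(times) if t_j >= t_i)
        total += scores[i] - math.log(risk)
    return total

# Hypothetical toy data: three students, one binary covariate, no censoring.
times = [1.0, 2.0, 3.0]
events = [1, 1, 1]
covariates = [[0.0], [1.0], [0.0]]
ll_null = cox_partial_log_likelihood(times, events, covariates, [0.0])
```

A positive fitted coefficient means the covariate raises the dropout hazard, which is how such a model flags a risk factor.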
Since a complete DNA chain contains a large amount of data (usually billions of nucleotides), it is challenging to determine the function of each sequence segment. Several powerful predictive models for the function of DNA sequences have been proposed, including CNN (convolutional neural network), RNN (recurrent neural network), and LSTM [1] (long short-term memory). However, all of them have flaws. For example, a plain RNN struggles to retain long-term memory. Here, we build on one of these models, DanQ, which uses a CNN and an LSTM together. We extend DanQ by developing an improved DanQ model and applying it to predict the function of DNA sequences more efficiently. In the original DanQ model, the convolution layer captures regulatory motifs and the recurrent layer captures long-term dependencies between the motifs, so that the model learns a regulatory grammar and prediction accuracy increases. In tests against several models, DanQ shows large improvements on some metrics. For regulatory markers, DanQ achieves improvements of more than 50% in the area under the precision-recall curve.
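To illustrate what "regulatory motifs captured by the convolution layer" means, the sketch below slides a single filter over a DNA string and applies a ReLU. It assumes one-hot encoded input, which the abstract does not specify, and it uses hand-set weights for a hypothetical "TATA" motif, whereas a trained network would learn its filters.

```python
BASES = "ACGT"

def one_hot(sequence):
    """Encode a DNA string as rows of four 0/1 values in A, C, G, T order."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in sequence.upper()]

def scan_filter(encoded, weights):
    """Slide one filter over the sequence (valid 1D convolution, then ReLU)."""
    width = len(weights)
    outputs = []
    for start in range(len(encoded) - width + 1):
        window = encoded[start:start + width]
        score = sum(w * x
                    for w_row, x_row in zip(weights, window)
                    for w, x in zip(w_row, x_row))
        outputs.append(max(0.0, score))  # ReLU activation
    return outputs

# Hypothetical filter: the one-hot pattern of "TATA" used directly as weights.
tata_filter = one_hot("TATA")
activations = scan_filter(one_hot("GGTATACC"), tata_filter)
```

The activation peaks at the position where the motif occurs. In a CNN plus LSTM design such as the one the abstract describes, a recurrent layer would then read these per-position activations in order to model dependencies between motifs.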
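To address the mixed long- and short-term characteristics of student behavior data in MOOCs and the dynamic class imbalance problem in dropout prediction, a deep learning-based dropout prediction strategy is proposed. First, a new student behavior time series is constructed with the day as the time step and the week as the learning period, in order to capture both the short-term dependencies in the data at each time step and the long-term patterns and trends between adjacent learning periods. Next, two different formulations of the dropout definition are combined to reveal the dynamic class imbalance in MOOC dropout prediction. A cost-sensitive deep learning model for long- and short-term time series is then introduced to accurately predict students at high risk of dropping out. Finally, experiments on the KDD Cup 2015 dataset show that the proposed strategy can effectively help MOOC instructors and teaching administrators track students' learning status at different time steps and thereby dynamically monitor dropout behavior at different learning stages.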
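The abstract does not describe its cost-sensitive scheme. One common way to make a classifier cost-sensitive is to weight each class's term in the cross-entropy loss, and recomputing the weights per time step is one way to follow an imbalance that changes over time. A minimal sketch, assuming inverse class frequency as the cost; both function names and the sample labels are hypothetical.

```python
import math

def inverse_frequency_costs(labels):
    """Return (cost_dropout, cost_retained), each inversely proportional to class size."""
    n = len(labels)
    n_dropout = sum(labels)
    n_retained = n - n_dropout
    if n_dropout == 0 or n_retained == 0:
        raise ValueError("both classes must be present to compute costs")
    return n / (2.0 * n_dropout), n / (2.0 * n_retained)

def weighted_log_loss(labels, probs, cost_dropout=1.0, cost_retained=1.0):
    """Mean binary cross-entropy with a separate cost per class.

    labels: 1 for dropout, 0 for retained
    probs: predicted dropout probability for each student
    """
    eps = 1e-12  # keeps log() finite when a probability is exactly 0 or 1
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        if y == 1:
            total += -cost_dropout * math.log(p)
        else:
            total += -cost_retained * math.log(1.0 - p)
    return total / len(labels)

# Made-up labels for one time step: three dropouts, one retained student.
labels = [1, 1, 1, 0]
cost_dropout, cost_retained = inverse_frequency_costs(labels)
loss = weighted_log_loss(labels, [0.5, 0.5, 0.5, 0.5], cost_dropout, cost_retained)
```

With these costs, errors on the rarer class at a given time step count for more, so the model is not rewarded for simply predicting the majority class.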