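Abstract: To address the inaccurate definition of timing requirements caused by the complexity of those requirements in aerospace embedded software (AES), a data flow timing model based on MARTE (modeling and analysis of real-time and embedded systems), called DFT-MARTE, is proposed, together with three algorithms built on it: a processing-point buffer calculation algorithm, a timing-deviation probability detection algorithm, and a timing-sequence analysis algorithm. The buffer calculation algorithm dynamically updates the buffer space so that subsequent timing checks can run normally. The timing-deviation probability detection algorithm uses concurrent threads to simulate timing behavior and detect timing deviations in the requirements. The timing-sequence analysis algorithm fits the timing sequence by gradient descent to guide users in optimizing the requirements. Compared with traditional data flow models, the proposed model is better suited to aerospace embedded software, eases subsequent development and maintenance, and has high practical value.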
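The abstract says only that the timing-sequence analysis algorithm fits the sequence by gradient descent. As a minimal sketch of that general idea, assuming a linear model, a mean-squared-error loss, and made-up sample data (none of which come from the paper), the fit could look like this:

```python
def fit_line(ts, ys, lr=0.01, epochs=5000):
    """Fit y ~ w * t + b by batch gradient descent on mean squared error.

    The linear model, learning rate and epoch count are illustrative
    assumptions; the paper's actual model is not given in the abstract.
    """
    w, b = 0.0, 0.0
    n = len(ts)
    for _ in range(epochs):
        # Gradients of (1/n) * sum((w*t + b - y)^2) with respect to w and b.
        grad_w = sum(2 * (w * t + b - y) * t for t, y in zip(ts, ys)) / n
        grad_b = sum(2 * (w * t + b - y) for t, y in zip(ts, ys)) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical data: latency (ms) observed at successive processing points.
steps = [0, 1, 2, 3, 4]
latency_ms = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(steps, latency_ms)
print(f"slope={w:.4f} ms/step, intercept={b:.4f} ms")
```

A fitted slope of this kind indicates how quickly latency accumulates along the data flow, which is one plausible way such a fit could inform a revision of the timing requirements.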
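Abstract: A time-series forecasting Web service was designed and implemented on the basis of the CRISP-DM (cross-industry standard process for data mining) model to forecast download demand for website resources. The paper focuses on the design ideas and key implementation techniques involved in applying the CRISP-DM model to a time-series forecasting task. Test results show that the service achieves relatively high forecasting accuracy, is quick to deploy and easy to use, and can serve as a demonstration and reference for solving similar problems.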
Abstract: The characteristics of marine data, such as multiple sources, polymorphism, diversity and large volume, set them apart from other data. How to store and manage marine data rationally and effectively, so as to provide strong data support for the construction of marine management information systems and the "Digital Ocean" prototype system, is an urgent problem. Systematically planning the different types of data, such as marine resource, marine environment, marine economy and marine management data, and establishing a marine data architecture framework with a uniform standard make it possible to manage marine data effectively at every level, including national and provincial (municipal) marine data, and to meet the needs of building a fundamental information platform.
Funding: The National Social Science Foundation of China (No. 16BGL183).
Abstract: Many high-quality studies have emerged from public databases such as Surveillance, Epidemiology, and End Results (SEER), the National Health and Nutrition Examination Survey (NHANES), The Cancer Genome Atlas (TCGA), and the Medical Information Mart for Intensive Care (MIMIC); however, these data are often characterized by a high degree of dimensional heterogeneity, timeliness, scarcity, irregularity, and other characteristics, so their value is not fully utilized. Data-mining technology has been a frontier field in medical research, as it demonstrates excellent performance in evaluating patient risks and assisting clinical decision-making when building disease-prediction models. Data mining therefore has unique advantages in clinical big-data research, especially in large-scale public medical databases. This article introduces the main public medical databases and describes the steps, tasks, and models of data mining in simple language. It also describes data-mining methods along with their practical applications. The goal of this work is to help clinical researchers gain a clear and intuitive understanding of how data-mining technology is applied to clinical big data, in order to promote research results that benefit doctors and patients.
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 41772309 and 51908431) and the Outstanding Youth Foundation of Hubei Province, China (Grant No. 2019CFA074).
Abstract: Real-time perception of rock mass information is of great importance to efficient tunneling and hazard prevention in tunnel boring machines (TBMs). In this study, a TBM-rock mutual feedback perception method based on data mining (DM) is proposed, which takes 10 tunneling parameters related to surrounding rock conditions as input features. For implementation, a database of TBM tunneling parameters was first established, containing 10,807 tunneling cycles from the Songhua River water conveyance tunnel. Then, the spectral clustering (SC) algorithm based on graph theory was introduced to cluster the TBM tunneling data. According to the clustering results and the rock mass boreability index, the rock mass conditions were classified into four classes, and reasonable distribution intervals of the main tunneling parameters were presented for each class. Meanwhile, a real-time prediction model for the different rock conditions was established based on a deep neural network (DNN). Finally, the rationality and adaptability of the proposed method were validated by analyzing the tunneling specific energy, feature importance, and training dataset size. The proposed TBM-rock mutual feedback perception method enables the automatic identification of rock mass conditions and the dynamic adjustment of tunneling parameters during TBM driving. Furthermore, in terms of prediction performance, the method can predict the rock mass conditions ahead of the tunnel face in real time more accurately than traditional machine learning prediction methods.
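Abstract: Research on the standardization of data mining languages is key to developing the next generation of data mining systems. DMX (Data Mining Extensions) is the data mining query language supported by the OLE DB for DM specification. It allows a data mining system to mine relational databases directly and marks a breakthrough in the standardization of data mining primitives. This paper introduces the main steps of data mining under the OLE DB for DM specification and presents a DMX-based implementation in Microsoft SQL Server Analysis Services.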
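The abstract does not enumerate the steps it covers. As a rough illustration, DMX work is commonly organized as defining a model, training it from a relational source, and querying it for predictions. The sketch below holds one DMX statement per step as Python strings. The model name `[BuyerModel]`, the data source `[CustomerDB]`, and the tables and columns are all hypothetical, and running the statements would require a connection to an Analysis Services instance, which is not shown.

```python
# All object names below ([BuyerModel], [CustomerDB], dbo.Customers,
# dbo.Prospects, column names) are hypothetical placeholders.

# Step 1: define the mining model and choose an algorithm.
CREATE_MODEL = """
CREATE MINING MODEL [BuyerModel] (
    [CustomerKey] LONG KEY,
    [Age]         LONG CONTINUOUS,
    [Buyer]       LONG DISCRETE PREDICT
) USING Microsoft_Decision_Trees
"""

# Step 2: train the model directly from a relational query.
TRAIN_MODEL = """
INSERT INTO [BuyerModel] ([CustomerKey], [Age], [Buyer])
OPENQUERY([CustomerDB],
    'SELECT CustomerKey, Age, Buyer FROM dbo.Customers')
"""

# Step 3: predict for new cases by joining them to the trained model.
PREDICT = """
SELECT t.[CustomerKey], Predict([Buyer]) AS PredictedBuyer
FROM [BuyerModel]
PREDICTION JOIN
OPENQUERY([CustomerDB],
    'SELECT CustomerKey, Age FROM dbo.Prospects') AS t
ON [BuyerModel].[Age] = t.[Age]
"""

DMX_STEPS = [("define", CREATE_MODEL), ("train", TRAIN_MODEL), ("predict", PREDICT)]

for name, statement in DMX_STEPS:
    print(f"-- {name}\n{statement.strip()}\n")
```

Keeping the statements as plain strings leaves the choice of client library open. The point is only that each step is a SQL-like statement issued against the relational data, which is the sense in which DMX mines a relational database directly.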