Abstract: The Moon-based Ultraviolet Telescope (MUVT) is one of the payloads on the Chang'e-3 (CE-3) lunar lander. Because the Moon has no atmospheric disturbance and rotates slowly, the MUVT can make long-term continuous observations of a series of important celestial objects in the near-ultraviolet band (245-340 nm) and perform a sky survey of selected areas, neither of which can be done from Earth. By analyzing image data from the MUVT, we can find characteristic changes in celestial brightness over time and, after comparison with a physical model, deduce the radiation mechanisms and physical properties of these objects. To explain the scientific purposes of the MUVT, this article analyzes the preprocessing of MUVT image data and makes a preliminary evaluation of data quality. The results demonstrate that the methods used for data collection and preprocessing are effective, and that the Level 2A and 2B image data satisfy the requirements of follow-up scientific research.
Funding: Funded by eVIDA Research group IT-905-16 from the Basque Government.
Abstract: Quantum Machine Learning (QML) techniques have recently attracted massive interest. However, reported applications usually employ synthetic or well-known datasets. One of these techniques, based on a hybrid approach combining quantum and classical devices, is the Variational Quantum Classifier (VQC), whose development seems promising. Although VQC has been widely studied, implementations for “real-world” datasets are still challenging on Noisy Intermediate-Scale Quantum (NISQ) devices. In this paper we propose a preprocessing pipeline based on Stokes parameters for data mapping. This pipeline enhances the prediction rates when applying VQC techniques, improving the feasibility of solving classification problems on NISQ devices. By including feature selection techniques and geometrical transformations, enhanced quantum state preparation is achieved. In addition, a representation based on the Stokes parameters on the Poincaré sphere makes it possible to visualize the data. Our results show that the proposed techniques improve the classification score for the incidence of acute comorbid diseases in Type 2 Diabetes Mellitus patients. We used the VQC implementation available in IBM’s Qiskit framework and obtained accuracies of 70% and 72% with two and three qubits, respectively.
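The abstract does not spell out how data are mapped to Stokes parameters, so the following is only a minimal sketch of the textbook definition, assuming a sample has already been encoded as a two-component complex amplitude (a, b). The function names `stokes_parameters` and `poincare_point` are hypothetical, the sign convention chosen for S3 is an assumption (conventions differ between references), and none of this is taken from the authors' pipeline.

```python
def stokes_parameters(a, b):
    """Return (S0, S1, S2, S3) for a two-component complex amplitude (a, b).

    Assumed convention: S2 and S3 are twice the real and imaginary parts
    of conj(a) * b. Other references flip the sign of S3.
    """
    a, b = complex(a), complex(b)
    s0 = abs(a) ** 2 + abs(b) ** 2      # total intensity
    s1 = abs(a) ** 2 - abs(b) ** 2      # imbalance between the two components
    cross = a.conjugate() * b           # relative-phase term
    s2 = 2.0 * cross.real
    s3 = 2.0 * cross.imag
    return s0, s1, s2, s3


def poincare_point(a, b):
    """Normalize (S1, S2, S3) by S0 to get a point on the unit Poincare sphere."""
    s0, s1, s2, s3 = stokes_parameters(a, b)
    if s0 == 0.0:
        raise ValueError("zero amplitude has no point on the sphere")
    return s1 / s0, s2 / s0, s3 / s0
```

For a pure (fully polarized) state the normalized point lies on the surface of the sphere, which is what makes the representation convenient for visualizing how samples are distributed before state preparation.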
Funding: Supported by the National Science Foundation (No. IIS-9988642) and the Multidisciplinary Research Program.
Abstract: Many classifiers and methods have been proposed for the letter recognition problem. Among them, clustering is widely used, but a single round of clustering is not adequate. Here, we adopt data preprocessing and a re kernel clustering method to tackle the letter recognition problem. To validate the effectiveness and efficiency of the proposed method, we introduce re kernel clustering into Kernel Nearest Neighbor classification (KNN), the Radial Basis Function Neural Network (RBFNN), and the Support Vector Machine (SVM). Furthermore, we compare re kernel clustering with one-time kernel clustering, denoted kernel clustering for short. Experimental results show that re kernel clustering forms fewer and more feasible kernels and attains higher classification accuracy.
Funding: National Natural Science Foundation of China (Nos. 51875199 and 51905165); Hunan Natural Science Fund Project (2019JJ50186); the Key Research and Development Program of Hunan Province (No. 2018GK2073).
Abstract: Because wind speed and wind direction change frequently, wind turbine (WT) power prediction based on traditional data preprocessing methods has low accuracy. This paper proposes a data preprocessing method that combines POT with DBSCAN (POT-DBSCAN) to improve the prediction efficiency of the wind power prediction model. First, from WT data recorded under normal operating conditions, a WT power prediction model is established based on the Particle Swarm Optimization (PSO) algorithm combined with the BP Neural Network (PSO-BP). Second, the wind-power data obtained from the supervisory control and data acquisition (SCADA) system are preprocessed by the POT-DBSCAN method. Then, power prediction is carried out on the preprocessed data with the PSO-BP model. Finally, the necessity of preprocessing is verified by the evaluation indexes. The case analysis shows that the prediction result with POT-DBSCAN preprocessing is better than that with the Quartile method. Therefore, this method can improve the accuracy of both the data and the prediction model.
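The abstract names the Quartile method as the baseline without defining it. As a rough illustration only, the sketch below applies the common interquartile-range rule to a list of power readings, assuming the usual 1.5 × IQR fences and a single batch of readings (for instance, one wind-speed bin). The function name `quartile_filter` and the multiplier `k` are assumptions, and POT-DBSCAN itself is not reproduced here.

```python
import statistics


def quartile_filter(values, k=1.5):
    """Keep readings inside [Q1 - k*IQR, Q3 + k*IQR]; drop the rest as outliers.

    `k = 1.5` is the conventional choice, assumed here rather than taken
    from the paper. Requires at least two readings.
    """
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]
```

A call such as `quartile_filter([10, 11, 12, 13, 14, 15, 100])` drops the isolated reading of 100 and keeps the rest. The rule only looks at the marginal distribution of one variable, which is one reason a density-based method may behave differently on scattered wind-power data.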
Abstract: In this study, we propose a data preprocessing algorithm called D-IMPACT, inspired by the IMPACT clustering algorithm. D-IMPACT iteratively moves data points based on attraction and density in order to detect and remove noise and outliers and to separate clusters. Our experimental results on two-dimensional datasets and practical datasets show that the algorithm can produce new datasets on which the performance of the clustering algorithm is improved.
Funding: National Natural Science Foundation of China (31371515, 31671226).
Abstract: Liquid chromatography–mass spectrometry (LC–MS) has enabled the detection of thousands of metabolite features from a single biological sample, producing large and complex datasets. One of the key issues in LC–MS-based metabolomics is the comprehensive and accurate analysis of this enormous amount of data. Many free data preprocessing tools, such as XCMS, MZmine, MAVEN, and MetaboAnalyst, as well as commercial software, have been developed to facilitate data processing. However, researchers are challenged by the inevitable yield of numerous false-positive peaks and by human errors made while manually removing such false peaks. Even with continuous improvements in data processing tools, many mistakes can still be generated during data preprocessing. In addition, many data preprocessing software packages exist, and every tool has its own advantages and disadvantages. Therefore, a researcher needs to judge which software or tool best suits their vendor proprietary formats and the goal of downstream analysis. Here, we provide a brief introduction to the general steps of raw MS data processing and the properties of automated data processing tools. The characteristics of mainly free data preprocessing software are then summarized for researchers to consider when conducting metabolomics studies.
Funding: The author extends his appreciation to the Deputyship for Research & Innovation, Ministry of Education, and Qassim University, Saudi Arabia, for funding this research work through Project Number (QU-IF-4-3-3-30013).
Abstract: The trend toward more sustainable and green buildings has turned several passive buildings into more dynamic ones. Mosques are a type of building with a unique energy usage pattern. Nevertheless, they have received minimal consideration in ongoing energy efficiency applications, because the unpredictability of their electrical consumption affects the stability of the distribution networks. This study addresses the issue by developing a framework for short-term electricity load forecasting for a mosque located in Riyadh, Saudi Arabia. Using the load consumption of the mosque and meteorological datasets, the performance of four forecasting algorithms is investigated, namely an Artificial Neural Network and Support Vector Regression (SVR) with three kernel functions: Radial Basis (RB), Polynomial, and Linear. In addition, this work examines the impact of 13 different combinations of input attributes, since selecting the optimal features has a major influence on the precision of the forecast. For the mosque load, SVR-RB with eleven features proved to be the best forecasting model, with the lowest forecasting error metrics: RMSE, nRMSE, MAE, and nMAE values of 4.207 kW, 2.522%, 2.938 kW, and 1.761%, respectively.
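The abstract reports RMSE, nRMSE, MAE, and nMAE but does not state how the normalized versions are defined. The sketch below assumes the common choice of dividing by a single reference load `scale` (for example the peak or the range of the observed load) and expressing the result in percent. The function name `forecast_errors` and the `scale` parameter are hypothetical. The reported pairs (4.207 kW with 2.522%, and 2.938 kW with 1.761%) are at least consistent with one shared normalizer of roughly 167 kW.

```python
import math


def forecast_errors(actual, predicted, scale):
    """Return RMSE and MAE in the units of the data, plus percent-normalized versions.

    `scale` is the normalizing load (an assumption: peak, range, or mean of
    the observed load are all used in practice).
    """
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    if scale <= 0:
        raise ValueError("scale must be positive")
    errors = [p - a for a, p in zip(actual, predicted)]
    n = len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    return {
        "RMSE": rmse,
        "nRMSE": 100.0 * rmse / scale,
        "MAE": mae,
        "nMAE": 100.0 * mae / scale,
    }
```

Because RMSE squares each error before averaging, it is never smaller than MAE on the same data, which matches the ordering of the values reported above.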
Funding: Sponsored by the Institute of Information Technology (Vietnam Academy of Science and Technology) under Project Code “CS24.01”.
Abstract: Cancer is one of the most dangerous diseases, with high mortality. One of the principal treatments is radiotherapy, which uses radiation beams to destroy cancer cells; this workflow requires a lot of experience and skill from doctors and technicians. In our study, we focus on the 3D dose prediction problem in radiotherapy by applying a deep learning approach to computed tomography (CT) images of cancer patients. Medical image data have more complex characteristics than ordinary image data, and this research aims to explore the effectiveness of data preprocessing and augmentation in the context of 3D dose prediction. We propose four strategies to examine our hypothesis from different aspects of applying data preprocessing and augmentation. In each strategy, we train a custom convolutional neural network whose structure is inspired by U-Net, with residual blocks added to the architecture. A rectified linear unit (ReLU) function is applied to the network output for each pixel to ensure that there are no negative values, which are meaningless for radiation doses. Our experiments were conducted on the dataset of the Open Knowledge-Based Planning Challenge, which was collected from head and neck cancer patients treated with radiation therapy. The results of the four strategies, evaluated in terms of the Dose-score and the Dose-volume histogram score (DVH-score), show that our hypothesis is reasonable. In the best training cases, the Dose-score is 3.08 and the DVH-score is 1.78. We also compare our results with those of another study in the same context of using the loss function.
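Abstract: Accurately predicting the remaining useful life (RUL) of rolling bearings is of practical importance and application value for keeping construction machinery running stably and ensuring production safety. To improve the accuracy of rolling bearing RUL prediction, a new method is proposed based on a normalized least mean square (NLMS) adaptive filter and the Autoformer long-sequence forecasting model. The NLMS adaptive filter is used to denoise the raw vibration signals of the rolling bearing; initial time-domain features are extracted segment by segment from the denoised signals, screened using the Spearman correlation coefficient, and normalized to form a multidimensional feature set. The series decomposition module and auto-correlation mechanism of the Autoformer model are then used to establish a piecewise nonlinear mapping between the multidimensional feature set and the rolling bearing RUL, yielding the RUL prediction. Comparative experiments on the PHM 2012 and XJTU-SY datasets show that the method achieves the lowest prediction error among the compared existing methods, with the root mean squared error (RMSE) and mean absolute error (MAE) improved by 24.4% and 47.2%, respectively, demonstrating the effectiveness of the method for rolling bearing RUL prediction.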
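The abstract does not give the filter order, step size, or the choice of reference signal used for denoising, so the following is only a sketch of the textbook NLMS update under assumed defaults. The function name `nlms`, its parameters `order`, `mu`, and `eps`, and the zero-padding at the start of the signal are all assumptions, not details of the proposed method.

```python
def nlms(reference, desired, order=4, mu=0.5, eps=1e-8):
    """Textbook NLMS adaptive filter (a sketch, not the paper's configuration).

    At each step the filter predicts desired[n] from the latest `order`
    reference samples, then nudges its weights along that window, with the
    step size divided by the window's energy.
    Returns (outputs, errors, final_weights).
    """
    weights = [0.0] * order
    outputs, errors = [], []
    for n in range(len(desired)):
        # Newest sample first; samples before the start are treated as zero.
        window = [reference[n - k] if n - k >= 0 else 0.0 for k in range(order)]
        y = sum(w * x for w, x in zip(weights, window))
        e = desired[n] - y
        energy = eps + sum(x * x for x in window)   # eps avoids division by zero
        weights = [w + mu * e * x / energy for w, x in zip(weights, window)]
        outputs.append(y)
        errors.append(e)
    return outputs, errors, weights
```

Whether the filter output or the error sequence serves as the denoised signal depends on how the reference input is chosen, which the abstract leaves open. Normalizing by the window energy is what distinguishes NLMS from plain LMS and makes the step size less sensitive to the amplitude of the vibration signal.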