Journal articles: 1,322 results found
Hybrid 1DCNN-Attention with Enhanced Data Preprocessing for Loan Approval Prediction
1
Authors: Yaru Liu, Huifang Feng. Journal of Computer and Communications, 2024, No. 8, pp. 224-241 (18 pages)
To reduce the risk of non-performing loans and losses and to improve loan approval efficiency, it is necessary to establish an intelligent loan risk and approval prediction system. A hybrid deep learning model combining a 1DCNN-attention network with enhanced preprocessing techniques is proposed for loan approval prediction. The proposed model consists of enhanced data preprocessing followed by a stack of multiple hybrid modules. First, the enhanced data preprocessing combines methods such as standardization, SMOTE oversampling, feature construction, recursive feature elimination (RFE), information value (IV), and principal component analysis (PCA), which not only eliminates the effects of data jitter and class imbalance but also removes redundant features while improving feature representation. Next, a hybrid module that combines a 1DCNN with an attention mechanism is proposed to extract local and global spatio-temporal features. Finally, comprehensive experiments validate that the proposed model surpasses state-of-the-art baseline models across various performance metrics, including accuracy, precision, recall, F1 score, and AUC. The proposed model helps automate the loan approval process and provides scientific guidance to financial institutions for loan risk control.
Keywords: Loan Approval Prediction; Deep Learning; One-Dimensional Convolutional Neural Network; Attention Mechanism; Data Preprocessing
Data preprocessing and preliminary results of the Moon-based Ultraviolet Telescope on the CE-3 lander (cited by: 4)
2
Authors: Wei-Bin Wen, Fang Wang, Chun-Lai Li, Jing Wang, Li Cao, Jian-Jun Liu, Xu Tan, Yuan Xiao, Qiang Fu, Yan Su, Wei Zuo. Research in Astronomy and Astrophysics (SCIE, CAS, CSCD), 2014, No. 12, pp. 1674-1681 (8 pages)
The Moon-based Ultraviolet Telescope (MUVT) is one of the payloads on the Chang'e-3 (CE-3) lunar lander. Because there are no atmospheric disturbances and the Moon rotates slowly, we can make long-term continuous observations of a series of important celestial objects in the near-ultraviolet band (245-340 nm) and perform a sky survey of selected areas, neither of which can be done from Earth. By analyzing image data from the MUVT, we can find characteristic changes in celestial brightness over time, and by comparing them with a physical model we can deduce the radiation mechanism and physical properties of these celestial objects. To explain the scientific purposes of the MUVT, this article analyzes the preprocessing of MUVT image data and makes a preliminary evaluation of data quality. The results demonstrate that the methods used for data collection and preprocessing are effective, and that the Level 2A and 2B image data satisfy the requirements of follow-up scientific research.
Keywords: Chang'e-3 mission; Moon-based Ultraviolet Telescope; data preprocessing; near-ultraviolet band
Diabetes Type 2: Poincaré Data Preprocessing for Quantum Machine Learning (cited by: 1)
3
Authors: Daniel Sierra-Sosa, Juan D. Arcila-Moreno, Begonya Garcia-Zapirain, Adel Elmaghraby. Computers, Materials & Continua (SCIE, EI), 2021, No. 5, pp. 1849-1861 (13 pages)
Quantum Machine Learning (QML) techniques have recently attracted massive interest. However, reported applications usually employ synthetic or well-known datasets. One of these techniques, based on a hybrid approach combining quantum and classical devices, is the Variational Quantum Classifier (VQC), whose development seems promising. Although VQC has been widely studied, implementations for "real-world" datasets are still challenging on Noisy Intermediate-Scale Quantum (NISQ) devices. In this paper, we propose a preprocessing pipeline based on Stokes parameters for data mapping. This pipeline enhances the prediction rates when applying VQC techniques, improving the feasibility of solving classification problems on NISQ devices. By including feature selection techniques and geometrical transformations, enhanced quantum state preparation is achieved. A representation based on the Stokes parameters on the Poincaré sphere also makes it possible to visualize the data. Our results show that the proposed techniques improve the classification score for the incidence of acute comorbid diseases in Type 2 Diabetes Mellitus patients. We used the VQC implementation available in IBM's Qiskit framework and obtained accuracies of 70% and 72% with two and three qubits, respectively.
Keywords: Quantum machine learning; data preprocessing; Stokes parameters; Poincaré sphere
DATA PREPROCESSING AND RE KERNEL CLUSTERING FOR LETTER
4
Authors: Zhu Changming, Gao Daqi. Journal of Electronics (China), 2014, No. 6, pp. 552-564 (13 pages)
Many classifiers and methods have been proposed for the letter recognition problem. Among them, clustering is widely used. However, clustering only once is not adequate. Here, we adopt data preprocessing and a re kernel clustering method to tackle the letter recognition problem. To validate the effectiveness and efficiency of the proposed method, we introduce re kernel clustering into Kernel Nearest Neighbor (KNN) classification, the Radial Basis Function Neural Network (RBFNN), and the Support Vector Machine (SVM). Furthermore, we compare re kernel clustering with one-time kernel clustering, denoted kernel clustering for short. Experimental results validate that re kernel clustering forms fewer and more feasible kernels and attains higher classification accuracy.
Keywords: data preprocessing; Kernel clustering; Kernel Nearest Neighbor (KNN); Re kernel clustering
Power Data Preprocessing Method of Mountain Wind Farm Based on POT-DBSCAN
5
Authors: Anfeng Zhu, Zhao Xiao, Qiancheng Zhao. Energy Engineering (EI), 2021, No. 3, pp. 549-563 (15 pages)
Because wind speed and wind direction change frequently, wind turbine (WT) power prediction based on traditional data preprocessing methods has low accuracy. This paper proposes a data preprocessing method that combines POT with DBSCAN (POT-DBSCAN) to improve the prediction efficiency of the wind power prediction model. First, using WT data from normal operating conditions, a WT power prediction model is established based on Particle Swarm Optimization (PSO) combined with a BP neural network (PSO-BP). Second, the wind-power data obtained from the supervisory control and data acquisition (SCADA) system are preprocessed with the POT-DBSCAN method. Then, power prediction on the preprocessed data is carried out with the PSO-BP model. Finally, the necessity of preprocessing is verified using evaluation indexes. The case analysis shows that prediction after POT-DBSCAN preprocessing is better than after the Quartile method. Therefore, the accuracy of both the data and the prediction model can be improved by using this method.
Keywords: Wind turbine; SCADA data; data preprocessing method; power prediction
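The abstract above uses the Quartile method as its baseline for cleaning wind-power data. As a rough illustration only, the following sketch filters readings with the common 1.5 × IQR fence. The fence multiplier, the function name `iqr_filter`, and the sample readings are assumptions, not taken from the paper, and the sketch does not bin readings by wind speed as a real SCADA workflow might.

```python
from statistics import quantiles


def iqr_filter(values, k=1.5):
    """Keep values inside [Q1 - k*IQR, Q3 + k*IQR] (k=1.5 is an assumed default)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if low <= v <= high]


# Hypothetical power readings (kW) at a similar wind speed; 30 is an outlier.
power_kw = [480, 495, 500, 505, 510, 515, 520, 30]
cleaned = iqr_filter(power_kw)
print(cleaned)
```

The filter preserves the original order of the readings it keeps, which matters when the data are a time series.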
D-IMPACT: A Data Preprocessing Algorithm to Improve the Performance of Clustering
6
Authors: Vu Anh Tran, Osamu Hirose, Thammakorn Saethang, Lan Anh T. Nguyen, Xuan Tho Dang, Tu Kien T. Le, Duc Luu Ngo, Gavrilov Sergey, Mamoru Kubo, Yoichi Yamada, Kenji Satou. Journal of Software Engineering and Applications, 2014, No. 8, pp. 639-654 (16 pages)
In this study, we propose a data preprocessing algorithm called D-IMPACT, inspired by the IMPACT clustering algorithm. D-IMPACT iteratively moves data points based on attraction and density to detect and remove noise and outliers and to separate clusters. Our experimental results on two-dimensional datasets and practical datasets show that the algorithm can produce new datasets on which the performance of the clustering algorithm is improved.
Keywords: Attraction; Clustering; Data Preprocessing; Density; Shrinking
Untargeted LC–MS Data Preprocessing in Metabolomics
7
Authors: He Tian, Bowen Li, Guanghou Shui. Journal of Analysis and Testing (EI), 2017, No. 3, pp. 187-192 (6 pages)
Liquid chromatography–mass spectrometry (LC–MS) has enabled the detection of thousands of metabolite features from a single biological sample, producing large and complex datasets. One of the key issues in LC–MS-based metabolomics is the comprehensive and accurate analysis of enormous amounts of data. Many free data preprocessing tools, such as XCMS, MZmine, MAVEN, and MetaboAnalyst, as well as commercial software, have been developed to facilitate data processing. However, researchers are challenged by the inevitable generation of numerous false-positive peaks and by human errors made while manually removing such false peaks. Even with continuous improvements in data processing tools, many mistakes can still be generated during data preprocessing. In addition, many data preprocessing software packages exist, and every tool has its own advantages and disadvantages. Therefore, researchers need to judge which software or tools best suit their vendor's proprietary formats and the goals of their downstream analysis. Here, we provide a brief introduction to the general steps of raw MS data processing and the properties of automated data processing tools. We then summarize the characteristics of mainly free data preprocessing software for researchers to consider when conducting metabolomics studies.
Keywords: Metabolomics; data preprocessing; LC-MS; Free software/tools
Short-Term Mosques Load Forecast Using Machine Learning and Meteorological Data
8
Authors: Musaed Alrashidi. Computer Systems Science & Engineering (SCIE, EI), 2023, No. 7, pp. 371-387 (17 pages)
The trend toward more sustainable and green buildings has turned several passive buildings into more dynamic ones. Mosques are a type of building with a unique energy usage pattern. Nevertheless, these buildings receive minimal consideration in ongoing energy efficiency applications. This is due to the unpredictability of mosques' electrical consumption, which affects the stability of distribution networks. This study therefore addresses the issue by developing a framework for short-term electricity load forecasting for a mosque located in Riyadh, Saudi Arabia. Using the mosque's load consumption and meteorological datasets, the study investigates the performance of four forecasting algorithms, namely an Artificial Neural Network and Support Vector Regression (SVR) with three kernel functions: Radial Basis (RB), Polynomial, and Linear. In addition, the work examines the impact of 13 different combinations of input attributes, since selecting the optimal features has a major influence on forecasting precision. For the mosque load, SVR-RB with eleven features was the best forecasting model, with the lowest error metrics: RMSE, nRMSE, MAE, and nMAE values of 4.207 kW, 2.522%, 2.938 kW, and 1.761%, respectively.
Keywords: Big data harvesting; mosque load forecast; data preprocessing; machine learning; optimal features selection
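The four error metrics reported in the abstract above can be computed as in the following sketch. The abstract does not say how nRMSE and nMAE are normalized; dividing by the maximum observed load is an assumption made here for illustration, and the function name `forecast_errors` and the sample loads are hypothetical.

```python
from math import sqrt


def forecast_errors(actual, predicted):
    """Return RMSE and MAE in the input unit, plus percentages normalized by
    max(actual). The normalization choice is an assumption, not the paper's."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    rmse = sqrt(sum(e * e for e in errors) / n)
    mae = sum(abs(e) for e in errors) / n
    scale = max(actual)
    return {
        "RMSE": rmse,
        "nRMSE_pct": 100.0 * rmse / scale,
        "MAE": mae,
        "nMAE_pct": 100.0 * mae / scale,
    }


# Hypothetical hourly loads in kW.
actual_kw = [100.0, 120.0, 160.0, 200.0]
predicted_kw = [104.0, 117.0, 160.0, 195.0]
metrics = forecast_errors(actual_kw, predicted_kw)
print(metrics)
```

Other normalizers, such as the mean load or the load range, are also in common use, so normalized values are only comparable across studies that share the same convention.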
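Crack State Assessment of Wind Turbine Blades Based on CWT-RES34
9
Authors: 李练兵, 肖亚泽, 张萍, 张国峰, 吴伟强, 陈程. 《噪声与振动控制》 (CSCD, PKU Core), 2024, No. 2, pp. 143-148, 293 (7 pages)
To effectively assess the crack state of wind turbine blades during operation, a blade crack state assessment method combining the continuous wavelet transform (CWT) and residual networks (ResNet) is proposed. First, CWT is applied to the blade acceleration vibration signals to generate two-dimensional color time-frequency images. These images are then split into training and test sets, and a 34-layer ResNet is used for training and diagnosis. Finally, a 1.5 MW wind turbine provided by a wind farm in Tianjin is taken as the study object. Based on its sample data, blade fault severity is divided by crack length and width into five states: healthy, slight, moderate, severe, and dangerous. The average assessment accuracy reaches 98.23%, verifying the effectiveness and feasibility of the method.
Keywords: fault diagnosis; wind turbine; state assessment; wavelet transform; residual neural network; data preprocessing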
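Airflow Angle Estimation Method for Large Passenger Aircraft Based on a GA-BP Neural Network
10
Authors: 张伟, 张喆, 龚孝懿, 王昕楠. 《计算机仿真》, 2024, No. 1, pp. 53-57, 102 (6 pages)
To address common-cause failures of airflow angle sensors, which hardware redundancy struggles to overcome, and to further improve the reliability of aircraft airflow angle signals, an airflow angle estimation method based on a GA-BP neural network is studied. A BP neural network fuses parameters such as attitude angles, acceleration, and wind speed to estimate airflow angles without relying on airflow angle sensors. A genetic algorithm is introduced to globally optimize the network's weights and thresholds, improving estimation accuracy. Flight test data from a large passenger aircraft are preprocessed and used for model training and testing. Simulation results show that the airflow angle estimates of the trained GA-BP neural network model are close to the actual values, with stability and accuracy clearly higher than those of the BP neural network. The method adds one redundant airflow angle signal to the aircraft and can provide a reliable airflow angle signal when sensors fail.
Keywords: airflow angle estimation; neural network; genetic algorithm; flight test data preprocessing; large passenger aircraft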
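Key Data Query Model Introducing a Neural Network Extreme Learning Machine
11
Authors: 张勇飞, 陈艳君, 赵世忠. 《计算机仿真》, 2024, No. 3, pp. 519-523 (5 pages)
Data in cyberspace are structurally highly similar, and the continuous incremental updating of massive data leads to redundancy and deviation in key data query results. A key data query method based on a neural network extreme learning machine is therefore proposed. The key data query problem is first modeled and described. On this basis, a neural network extreme learning machine is introduced to build a key data query model. Useless and duplicate data in the database are preprocessed, and the least-squares solution with the minimum output-weight norm is used to keep the algorithm from falling into local optima. The query model is trained with the output matrix, and its output is the key data query result. Comparative experiments designed to demonstrate the method's performance show that, when applied to key data query, the root mean square error does not exceed 1.2, the mean absolute percentage error is at most 4.1%, the relation coefficient F reaches 0.6, and network node utilization is below 20%. These results verify that the method offers high query accuracy and stronger applicability.
Keywords: neural network extreme learning machine; key data; output weights; least-squares solution; data preprocessing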
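Simulated Estimation of Nitrite Content in Water from Transmission Spectra Based on an Artificial Neural Network
12
Authors: 王彩玲, 张国浩, 闫晶晶. 《中国无机分析化学》 (CAS, PKU Core), 2024, No. 7, pp. 857-865 (9 pages)
Nitrite is an important water quality indicator and is significant for assessing water quality. Transmission hyperspectral measurements are combined with an artificial neural network (ANN) to build an estimation model for nitrite content in water. First, standard nitrite nitrogen solutions at 10 concentrations (0.02, 0.04, 0.06, 0.08, 0.10, 0.12, 0.14, 0.16, 0.18, and 0.20 mg/L) were prepared from reagents, and an OCEAN-HDX-XR miniature fiber-optic spectrometer scanned the transmission spectrum of each solution 10 times over 181.1-1023.1 nm; the mean was taken as the raw transmission spectrum for each concentration. Four spectral preprocessing methods were applied: min-max normalization (MMN), standard normal variate (SNV), multiplicative scatter correction (MSC), and second-order difference (SOD). Each was combined with the ANN to build an estimation model, and the model accuracies were compared to select the best model for estimating nitrite content. The results show that the BP-ANN prediction model with second-order difference preprocessing achieves a root mean square error (RMSE) of 0.032367, a mean absolute error (MAE) of 0.016895, and a coefficient of determination (R^2) of 0.987403. Compared with the rational quadratic Gaussian process regression (QR-GPR) and quadratic support vector machine (Q-SVM) prediction models, it fits better and is more accurate. A back-propagation artificial neural network (BP-ANN) inversion method for hyperspectral estimation of nitrite in water is thus proposed, providing a new approach to dynamic monitoring of nitrite in water.
Keywords: hyperspectral; artificial neural network; nitrite; data preprocessing; estimation model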
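Three of the four spectral preprocessing steps named in the abstract above are simple per-spectrum transforms. The sketch below shows one plausible form of min-max normalization, SNV, and second-order difference on a single spectrum. The function names and the sample transmittance values are hypothetical, and the exact variants used in the paper are not stated in the abstract. MSC is omitted because it needs a reference spectrum, typically the mean over all samples.

```python
from statistics import mean, stdev


def min_max_normalize(spectrum):
    """Scale a spectrum linearly so its minimum is 0 and its maximum is 1."""
    lo, hi = min(spectrum), max(spectrum)
    return [(x - lo) / (hi - lo) for x in spectrum]


def standard_normal_variate(spectrum):
    """Center a spectrum on its own mean and scale by its own sample std."""
    mu, sigma = mean(spectrum), stdev(spectrum)
    return [(x - mu) / sigma for x in spectrum]


def second_order_difference(spectrum):
    """Discrete second difference; the output is two points shorter."""
    return [
        spectrum[i + 2] - 2 * spectrum[i + 1] + spectrum[i]
        for i in range(len(spectrum) - 2)
    ]


# Hypothetical transmittance values at five adjacent wavelengths.
transmittance = [0.90, 0.80, 0.60, 0.70, 0.85]
mmn = min_max_normalize(transmittance)
snv = standard_normal_variate(transmittance)
sod = second_order_difference(transmittance)
print(mmn, snv, sod)
```

Differencing amplifies noise, so in practice it is often paired with smoothing; whether the paper does so is not stated in the abstract.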
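A Review of Evolutionary Computation Applications in Large-Scale High-Dimensional Feature Selection
13
Authors: 叶志伟, 王巧, 周雯, 王明威, 蔡婷, 何其祎. 《北方工业大学学报》, 2024, No. 2, pp. 8-19 (12 pages)
With the arrival of the big data era, data scale and feature dimensionality have grown explosively, posing unprecedented challenges for data processing. Feature selection, a key step in data preprocessing, is especially important when handling large-scale high-dimensional data. Because of their strong global search capability and efficient optimization performance, evolutionary computation methods have attracted growing research attention and are widely applied to large-scale high-dimensional feature selection. This paper first explains the importance of large-scale high-dimensional data processing. It then briefly introduces several classic and more recent evolutionary computation methods and describes in detail their application to large-scale high-dimensional feature selection. Finally, it summarizes the current problems of evolutionary computation in large-scale high-dimensional feature selection and discusses future directions.
Keywords: feature selection; evolutionary computation; global search; data preprocessing; machine learning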
Predicting 3D Radiotherapy Dose-Volume Based on Deep Learning
14
Authors: Do Nang Toan, Lam Thanh Hien, Ha Manh Toan, Nguyen Trong Vinh, Pham Trung Hieu. Intelligent Automation & Soft Computing, 2024, No. 2, pp. 319-335 (17 pages)
Cancer is one of the most dangerous diseases, with high mortality. One of the principal treatments is radiotherapy, which uses radiation beams to destroy cancer cells; this workflow requires a great deal of experience and skill from doctors and technicians. In our study, we focused on the 3D dose prediction problem in radiotherapy by applying a deep learning approach to computed tomography (CT) images of cancer patients. Medical image data have more complex characteristics than ordinary image data, and this research aims to explore the effectiveness of data preprocessing and augmentation in the context of 3D dose prediction. We proposed four strategies to examine our hypothesis from different aspects of applying data preprocessing and augmentation. In each strategy, we trained a custom convolutional neural network whose structure is inspired by U-Net and which also incorporates residual blocks. A rectified linear unit (ReLU) function is applied to each pixel of the network output to ensure there are no negative values, which are physically meaningless for radiation doses. Our experiments were conducted on the dataset of the Open Knowledge-Based Planning Challenge, collected from head and neck cancer patients treated with radiation therapy. The results of the four strategies, evaluated in terms of the Dose-score and the dose-volume histogram score (DVH-score), support our hypothesis. In the best training cases, the Dose-score is 3.08 and the DVH-score is 1.78. In addition, we compared our results with those of another study that used the loss function in the same context.
Keywords: CT image; 3D dose prediction; data preprocessing; augmentation
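Research on a Machine Learning-Based Personal Credit Score Prediction Model for Imbalanced Data
15
Authors: 费振华. 《长江信息通信》, 2024, No. 4, pp. 112-114 (3 pages)
This article introduces the basic concepts of personal credit scoring, imbalanced data and methods for handling it, and the application of machine learning algorithms to credit scoring. Data preprocessing, covering data sources and characteristics, data cleaning and organization, imbalance analysis, data augmentation methods, and effect evaluation, then provides the basis for model construction. Finally, the model is trained and tested on a real dataset and its performance is evaluated. The experimental results show that the machine learning-based personal credit score prediction model for imbalanced data can effectively predict personal credit risk, which is significant for the risk management and credit decisions of financial institutions.
Keywords: personal credit scoring; imbalanced data; machine learning; data preprocessing; model research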
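Urban POI Sensing Method Based on Cluster Analysis of LBSN Data
16
Authors: 杨桂松, 郭东升, 何杏宇, 卢海军. 《智能计算机与应用》, 2024, No. 7, pp. 43-49 (7 pages)
The distribution of urban POIs objectively reflects the development of a city's various industries. Traditional surveying methods for obtaining POIs are costly, have long update cycles, and lack timeliness, whereas the growth of Location-Based Social Network (LBSN) platforms offers a new way to sense urban POIs. This paper proposes an urban POI sensing method based on cluster analysis of LBSN data. First, the LBSN data are preprocessed, including removing duplicate data, deleting invalid data, and pre-classifying the data, to improve data validity. Second, an improved DBSCAN algorithm is proposed to cluster the processed data and obtain a relatively accurate distribution of each class of urban POI. Experimental results show that, compared with the traditional DBSCAN and K-means algorithms, the proposed algorithm clusters better, with a larger CH index value and a smaller DBI index value.
Keywords: urban POI sensing; location-based social network; data preprocessing; improved DBSCAN algorithm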
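For orientation, the following is a minimal sketch of the standard DBSCAN algorithm that the abstract above takes as its baseline. It is not the paper's improved variant, whose details the abstract does not give. The check-in coordinates are hypothetical and treated as planar; real latitude/longitude data would need a geodesic distance such as haversine.

```python
from math import dist


def dbscan(points, eps, min_pts):
    """Plain DBSCAN: return one cluster label per point, with -1 for noise."""
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)

    def neighbors(i):
        # Brute-force neighborhood query; includes the point itself.
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = 0
    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        seeds = neighbors(i)
        if len(seeds) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = neighbors(j)
            if len(j_neighbors) >= min_pts:
                queue.extend(j_neighbors)  # j is a core point: keep expanding
        cluster += 1
    return labels


# Hypothetical check-in coordinates in planar units.
checkins = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (50, 50)]
labels = dbscan(checkins, eps=1.5, min_pts=3)
print(labels)
```

The brute-force neighborhood query makes this O(n²); a spatial index would be the usual choice for city-scale check-in data.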
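Research on Accelerometer Data Preprocessing for Gravity Satellites
17
Authors: 潘宗鹏, 肖云, 刘晓刚, 明锋. 《地球物理学报》 (SCIE, EI, CAS, CSCD, PKU Core), 2024, No. 10, pp. 3697-3706 (10 pages)
The electrostatic suspension accelerometer mainly measures the non-conservative forces acting at the center of mass of a gravity satellite and is the satellite's core payload. Various disturbances from the satellite platform and environment affect accelerometer measurements. This paper focuses on preprocessing methods for the raw ACC1A accelerometer data of the GRACE Follow-On satellites, presents a detailed accelerometer data preprocessing workflow, and analyzes the magnitude and treatment of accelerometer data anomalies caused by platform and environmental disturbances (including attitude-control thruster thrust deviations, magnetic torquer interference, thermal-control switching, and damped oscillations). ACT1A and ACT1B products are generated from the raw ACC1A data and compared with the corresponding products released by JPL. The results show that, for the ACT1A product, the RMS of the three-axis linear acceleration residuals is better than 10^(-17) m·s^(-2) and the residual amplitude spectral density is better than 10^(-18) m·s^(-2)/Hz^(1/2); for the ACT1B product, the RMS of the three-axis linear acceleration residuals is better than 10^(-11) m·s^(-2) and the residual amplitude spectral density is better than 10^(-12) m·s^(-2)/Hz^(1/2). All are below the accelerometer measurement accuracy of 10^(-10) m·s^(-2), so the generated products can be used for subsequent gravity field inversion.
Keywords: gravity satellite; GRACE Follow-On; accelerometer; data preprocessing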
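Identification of Flue-Cured Tobacco Leaf Stalk Position Based on Hyperspectral Imaging
18
Authors: 梅吉帆, 郭文孟, 李智慧, 薛宇毅, 杨忠泮, 李嘉康, 苏子淇, 张雷, 堵劲松, 徐大勇, 李辉. 《中国烟草学报》 (CAS, CSCD, PKU Core), 2024, No. 3, pp. 51-60 (10 pages)
[Objective] To build a model that identifies the stalk position (upper, middle, lower) of flue-cured tobacco leaves using hyperspectral imaging combined with machine learning. [Methods] First, by analyzing the intensity distribution of tobacco leaves in water- and nitrogen-sensitive bands, a dual-threshold region-of-interest (ROI) selection method combining the OTSU and Sauvola image segmentation algorithms was adopted. The effects of different preprocessing methods on data modeling were then compared. Support vector machine (SVM) and extreme gradient boosting (XGBoost) algorithms were used to build discrimination models, which were optimized by parameter search. A genetic algorithm (GA) and a genetic algorithm combined with the successive projections algorithm (GA-SPA) were used to select characteristic wavelengths and build simplified models. [Results] (1) The dual-threshold ROI selection method accurately and efficiently selects the normal leaf-surface region of flue-cured tobacco leaves. (2) Different preprocessing methods significantly affect the identification model. The XGBoost stalk position model built on spectra preprocessed with the first derivative and Savitzky-Golay smoothing (1Der+SG), using characteristic wavelengths selected by GA, gave the best classification performance, with an accuracy of 97.78%. [Conclusion] The stalk position model based on hyperspectral imaging combined with machine learning enables efficient and accurate identification of flue-cured tobacco stalk position.
Keywords: hyperspectral imaging; stalk position; data preprocessing; machine learning; dual-threshold segmentation; qualitative discrimination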
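Prediction for Fused Measurement of the FAST Feed Cabin Based on a BP Neural Network
19
Authors: 卢朝茂, 李明辉, 宋本宁, 彭帅, 冯禹, 于东俊, 骆亚波. 《天文学进展》 (CSCD, PKU Core), 2024, No. 3, pp. 519-528 (10 pages)
Tracking observations with the Five-hundred-meter Aperture Spherical Radio Telescope (FAST) require coordinated spatial motion of the feed. The feed cabin is mainly used for fine positioning of the feed, so high-precision measurement of the feed cabin position is important for the efficient operation of FAST. However, when the total station equipment fails, the GPS/IMU fused measurement results produced by the Kalman algorithm cannot be corrected, and feed cabin measurement accuracy declines. To solve this problem, a prediction model based on a BP (back propagation) neural network was designed, covering data preprocessing, model design, and model training and validation. The training data are real FAST measurements, about 40 GB in volume. To verify the model's generalization ability, data from three motion trajectories were selected to test prediction accuracy. The results show that the accuracy meets the 15 mm requirement for all three trajectories.
Keywords: FAST; feed cabin fused measurement prediction; data preprocessing; BP neural network; time series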
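Exploring Research-Oriented Preprocessing Methods for Hospital Anesthesia Information System Data
20
Authors: 向茹梅, 魏星, 戴维, 张丽君, 徐玮, 田杰, 张宏伟, 孙佳昕, 石丘玲. 《中国医院统计》, 2024, No. 3, pp. 219-229 (11 pages)
Objective: Accurate, standardized data are the foundation of reliable research results. Using lung surgery as an example, this paper analyzes the data characteristics of an anesthesia information system and applies preprocessing (cleaning, transformation, integration, and reduction) to build a dataset usable for research analysis. Methods: Data from the anesthesia information system were collected for patients who underwent lung surgery at a cancer hospital in Sichuan Province from April 2021 to November 2022. The characteristics of the source data were analyzed, and a preprocessing workflow and macro code were developed based on Python and SAS. Text data were converted into numeric data suitable for data mining using Python's SPLIT statement and SAS macros and functions. Data cleaning and dimensionality reduction were used to fill in missing values, correct abnormal and inconsistent data, and remove redundant data. Data integration was implemented with NOUNIQUEKEY, SQL, and LAG statements to enlarge the data volume. Results: Two Excel sheets were exported from the anesthesia information system and the hospital information system, totaling 1835 anesthesia records and 46612 medical order records. Source data analysis found that the anesthesia information system had non-standard medical terminology, varied semantic expression, multiple units for the same drug, and some drugs carrying the suffix "备用" (standby). Based on these characteristics and the semi-structured data layout, three macros were written to clean and check all drug names, standardize medical terms, and unify units, ultimately extracting 12, 24, and 12 drugs for the pre-anesthesia, intraoperative, and analgesia pump stages, respectively. Missing data were supplemented a second time, noise was smoothed, and inconsistent data were cleaned. A total of 48 (2.62%) anesthesia records for non-lung surgery were excluded, and 10 fields irrelevant to the mining task were removed. After data integration, 1748 (97.82%) anesthesia records were matched with medical order data. Through this preprocessing workflow, the final structured dataset contained 1748 patients and 99 variables. Conclusion: By analyzing the source data and developing a tailored anesthesia data preprocessing workflow, standardized and accurate anesthesia medication data were obtained. This provides a methodological reference for making anesthesia information research-ready at other institutions and a reliable data basis for studies that need high-quality anesthesia medication data.
Keywords: anesthesia information system; preprocessing; data cleaning; data structuring; SAS software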
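Two of the cleaning problems listed in the abstract above, mixed dose units and the trailing "备用" (standby) suffix, can be illustrated with a short parser. This is a sketch under assumed conditions: the "name dose unit" record layout, the drug entries, and the helper `parse_drug_entry` are all hypothetical, and the paper's own SAS macros are not reproduced here.

```python
import re

# Conversion factors to milligrams (assumed target unit).
UNIT_TO_MG = {"g": 1000.0, "mg": 1.0, "ug": 0.001, "μg": 0.001}
STANDBY_SUFFIX = "备用"
ENTRY = re.compile(
    r"^(?P<name>.+?)\s*(?P<dose>\d+(?:\.\d+)?)\s*(?P<unit>g|mg|ug|μg)$"
)


def parse_drug_entry(text):
    """Split a free-text entry into name, dose in mg, and a standby flag.

    Returns None when the entry does not match the assumed layout.
    """
    text = text.strip()
    standby = text.endswith(STANDBY_SUFFIX)
    if standby:
        text = text[: -len(STANDBY_SUFFIX)].strip()
    match = ENTRY.match(text)
    if match is None:
        return None
    dose_mg = float(match["dose"]) * UNIT_TO_MG[match["unit"]]
    return {"name": match["name"], "dose_mg": dose_mg, "standby": standby}


# Hypothetical entries showing unit variety and the standby suffix.
records = ["Propofol 200mg", "Propofol 0.2g", "Sufentanil 50ug 备用"]
parsed = [parse_drug_entry(r) for r in records]
print(parsed)
```

Entries that fail to parse return `None` rather than a guess, so they can be routed to manual review.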