237 articles found
Random Forests Algorithm Based Duplicate Detection in On-Site Programming Big Data Environment (cited by: 1)
1
Authors: Qianqian Li, Meng Li, Lei Guo, Zhen Zhang. Journal of Information Hiding and Privacy Protection, 2020, No. 4, pp. 199-205 (7 pages)
On-site programming big data refers to the massive data generated in the process of software development, characterized by real-time generation, complexity, and high processing difficulty. Data cleaning is therefore essential for on-site programming big data. Duplicate data detection is an important step in data cleaning that can save storage resources and enhance data consistency. To address the shortcomings of the traditional Sorted Neighborhood Method (SNM) and the difficulty of detecting duplicates in high-dimensional data, an optimized algorithm based on random forests with a dynamic, adaptive window size is proposed. The efficiency of the algorithm is improved by refining the key-selection method, reducing the dimensionality of the data set, and using an adaptive variable-size sliding window. Experimental results show that the improved SNM algorithm performs better and achieves higher accuracy.
Keywords: on-site programming big data; duplicate record detection; random forests; adaptive sliding window
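The Sorted Neighborhood Method named in this abstract can be illustrated with a minimal sketch. It shows only the classic fixed-window variant, not the paper's random-forest-based key selection or adaptive window. The function name `sorted_neighborhood`, the string-similarity measure, the threshold, and the sample records are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def sorted_neighborhood(records, key, window=3, threshold=0.85):
    """Return index pairs of likely duplicates using a fixed-size sliding window."""
    # Sort record indices by a blocking key so that similar records end up adjacent.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    pairs = set()
    for pos, i in enumerate(order):
        # Compare each record only with the next (window - 1) records in sorted order.
        for j in order[pos + 1 : pos + window]:
            similarity = SequenceMatcher(None, records[i], records[j]).ratio()
            if similarity >= threshold:
                pairs.add((min(i, j), max(i, j)))
    return pairs


# Hypothetical sample data; the blocking key is simply the first four characters.
records = [
    "john smith, 12 oak st",
    "jon smith, 12 oak st",
    "mary jones, 4 elm rd",
    "john smith, 12 oak street",
]
dupes = sorted_neighborhood(records, key=lambda r: r[:4])
```

A fixed window trades recall for speed: duplicates that sort further apart than `window` positions are never compared, which is presumably the limitation an adaptive window size is meant to reduce.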
A Missing Power Data Filling Method Based on Improved Random Forest Algorithm (cited by: 3)
2
Authors: Wei Deng, Yixiu Guo, Jie Liu, Yong Li, Dingguo Liu, Liang Zhu. Chinese Journal of Electrical Engineering [CSCD], 2019, No. 4, pp. 33-39 (7 pages)
Missing data filling is a key step in power big data preprocessing, which helps to improve the quality and utilization of electric power data. Because of the limitations of traditional methods for filling missing data, an improved random forest filling algorithm is proposed. Since electric power data exhibit time-series characteristics in both the horizontal and vertical directions, the improved random forest method combines linear interpolation, matrix combination, and matrix transposition to solve the problem of filling large amounts of missing electric power data. The filling results show that the improved random forest filling algorithm is applicable to electric power data with various forms of missing values. Moreover, the accuracy of the filling results is high and the model is stable, which helps improve the quality of electric power data.
Keywords: big data cleaning; missing data filling; data preprocessing; random forest; data quality
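Of the building blocks this abstract lists, linear interpolation is the simplest to show in isolation. The sketch below covers only that step, under the assumption that missing readings are marked as `None`; it is not the paper's combined random forest procedure, and `fill_linear` and the sample load values are hypothetical.

```python
def fill_linear(series):
    """Fill None gaps by linear interpolation between the nearest known neighbours."""
    filled = list(series)
    known = [i for i, value in enumerate(filled) if value is not None]
    for left, right in zip(known, known[1:]):
        # Constant slope between two consecutive known readings.
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    # Gaps before the first or after the last known reading are left as None.
    return filled


load = [10.0, None, None, 16.0, None, 20.0]  # hypothetical hourly load readings
print(fill_linear(load))
```

Interpolation alone cannot handle long gaps or gaps at the edges of a series, which may be why the abstract pairs it with a learned model.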
Random Forest Based Very Fast Decision Tree Algorithm for Data Stream
3
Authors: DONG Zhenjiang, LUO Shengmei, WEN Tao, ZHANG Fayang, LI Lingjuan. ZTE Communications, 2017, No. B12, pp. 52-57 (6 pages)
The Very Fast Decision Tree (VFDT) algorithm is a classification algorithm for data streams. When processing large amounts of data, VFDT requires less time than traditional decision tree algorithms. However, when training samples become fewer, the label values of VFDT leaf nodes contain more errors, and the classification ability of a single VFDT decision tree is limited. The Random Forest algorithm is an ensemble classifier with high prediction accuracy and noise tolerance. It consists of multiple decision trees and can make up for the shortcomings of a single decision tree. In this paper, to improve classification accuracy on data streams, the Random Forest algorithm is integrated into the tree-building process of the VFDT algorithm, and a new Random Forest Based Very Fast Decision Tree algorithm, named RFVFDT, is designed. The RFVFDT algorithm adopts the decision tree building criterion of a Random Forest classifier and augments the Random Forest algorithm with a sliding window to cope with the unbounded nature of data streams and to avoid processing delay and data loss. Experimental results on the KDD CUP data sets show that the classification accuracy of RFVFDT is higher than that of VFDT. The fewer the samples, the more pronounced the advantage. RFVFDT is fast when running in multithreaded mode.
Keywords: data stream; data classification; random forest algorithm; VFDT algorithm
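As background to this entry, standard VFDT is usually described as deciding when to split a leaf by means of the Hoeffding bound, which shrinks as more samples arrive. The abstract does not spell this out, so the sketch below illustrates the general VFDT idea and not the RFVFDT method. The function names and the numeric values are illustrative assumptions.

```python
import math


def hoeffding_bound(value_range, delta, n):
    """Hoeffding bound: epsilon = sqrt(R^2 * ln(1/delta) / (2 * n))."""
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def should_split(best_gain, second_gain, value_range, delta, n):
    """Split once the gap between the two best attributes exceeds the bound."""
    return (best_gain - second_gain) > hoeffding_bound(value_range, delta, n)


# With few samples the bound is wide, so the same gain gap does not justify a split.
few = should_split(0.30, 0.10, value_range=1.0, delta=1e-7, n=50)
many = should_split(0.30, 0.10, value_range=1.0, delta=1e-7, n=1000)
```

The wide bound at small `n` is consistent with the abstract's remark that a single VFDT tree is less reliable when training samples are scarce.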
Using machine learning algorithms to estimate stand volume growth of Larix and Quercus forests based on national-scale Forest Inventory data in China
4
Authors: Huiling Tian, Jianhua Zhu, Xiao He, Xinyun Chen, Zunji Jian, Chenyu Li, Qiangxin Ou, Qi Li, Guosheng Huang, Changfu Liu, Wenfa Xiao. Forest Ecosystems [SCIE, CSCD], 2022, No. 3, pp. 396-406 (11 pages)
Accurately estimating the volume growth of forest ecosystems is important for understanding carbon sequestration and achieving carbon neutrality goals. However, the key environmental factors affecting volume growth differ across scales and plant functional types. This study therefore used random forest algorithms to estimate the volume growth of Larix and Quercus forests, and its influencing factors, based on national-scale forest inventory data in China. The results showed that model performance for volume growth in natural forests (R² = 0.65 for Larix and 0.66 for Quercus) was better than in planted forests (R² = 0.44 for Larix and 0.40 for Quercus). In both natural and planted forests, stand age showed strong relative importance for volume growth (8.6%–66.2%), while edaphic and climatic variables had limited relative importance (<6.0%). The relationship between stand age and volume growth was unimodal in natural forests and linearly increasing in planted Quercus forests. The specific locations (i.e., altitude and aspect) of sampling plots exhibited high relative importance for volume growth in planted forests (4.1%–18.2%). Altitude positively affected volume growth in planted Larix forests but negatively affected it in planted Quercus forests. Similarly, the effects of other environmental factors on volume growth differed between stand origins (planted versus natural) and between plant functional types (Larix versus Quercus). These results highlight that stand age was the most important predictor of volume growth and that the effects of environmental factors varied among stand origins and plant functional types. Our findings provide a good framework for site-specific recommendations regarding the management practices necessary to maintain volume growth in China's forest ecosystems.
Keywords: stand volume growth; stand origin; plant functional type; national forest inventory data; random forest algorithms
Improved Random Forest Algorithm Based on Adaptive Step Size Artificial Bee Colony Optimization
5
Authors: Jiuyuan Huo, Xuan Qin, Hamzah Murad Mohammed Al-Neshmi, Lin Mu, Tao Ju. 《国际计算机前沿大会会议论文集》, 2020, No. 2, pp. 216-233 (18 pages)
When applied to unbalanced data, the traditional random forest algorithm cannot achieve satisfactory prediction results for the minority class, and it suffers from the parameter selection dilemma. To address this problem, this paper proposes an unbalanced accuracy weighted random forest algorithm (UAW_RF) based on adaptive step size artificial bee colony optimization. It combines the ideas of decision tree optimization, sampling selection, and weighted voting to improve the ability of the random forest algorithm to classify biased data. An adaptive step size and the optimal solution are introduced to improve the position updating formula of the artificial bee colony algorithm, and the improved algorithm is then used to iteratively optimize the parameter combination of the random forest algorithm. Experimental results show satisfactory accuracy and demonstrate that the method can effectively improve the classification accuracy of the random forest algorithm.
Keywords: random forest algorithm; artificial bee colony algorithm; unbalanced data; classification problem
Resource prediction and assessment based on 3D/4D big data modeling and deep integration in key ore districts of North China (cited by: 2)
6
Authors: Gongwen WANG, Zhiqiang ZHANG, Ruixi LI, Junjian LI, Deming SHA, Qingdong ZENG, Zhenshan PANG, Dapeng LI, Leilei HUANG. Science China Earth Sciences [SCIE, EI, CSCD], 2021, No. 9, pp. 1590-1606 (17 pages)
The North China district has been the subject of significant research on the ore-forming dynamics, processes, and quantitative forecasting of gold deposits; it accounts for the largest gold reserves and annual production in China. Based on the top-level design of geoscience theory and the method adopted by the National Key R&D Project (deep process and metallogenic mechanism of the North China Craton (NCC) metallogenic system), this paper systematically collects and constructs geoscience data (at district, camp, and deposit scales) for four key gold districts of North China (Jiaojia-Sanshandao, Southern Zhaoping, Wulong, and Qingchengzi). The settings associated with the geological dynamics of gold deposits were quantitatively and synthetically analyzed, namely NCC destruction, metallogenic events, genetic models, and exploration models. Three-dimensional (3D) and four-dimensional (4D) geological modeling was performed using the big data on the districts, and the district-scale 3D exploration criteria were integrated to construct a quantitative exploration model. FLAC3D modelling and the GeoCube software (version 3.0) were used to implement the numerical simulation of the 3D geological models and the constraints of the fluid saturation parameters of the Jiaojia fault, in order to reconstruct the 4D fault structure models of the Jiaojia fault (to a depth of 5000 m). Using GeoCube 3.0, multiple integration modules (general weights of evidence (WofE), Boost WofE, Fuzzy WofE, Logistic Regression, Information Entropy, and Random Forest) and exploration criteria were integrated, and the C-V fractal classification of class A, B, and C targets in the four districts was carried out. The research results are summarized in the following four areas: (1) Four gold districts in the study area have more than three targets (at a depth of 3000 m), and the class A, B, and C targets exhibit a good spatial correlation with gold bodies that are controlled by mining engineering at depths greater than 1000 m. (2) The Boost WofE method was used to identify the target optimization in 3D spaces (at depths of 3000–5000 m) of the Jiaojia-Sanshandao, Southern Zhaoping, and Wulong districts. (3) The general WofE method is based on Bayesian theory in 3D space and provides robust integration and target optimization that are suitable for the Jiaojia-Sanshandao and Southern Zhaoping districts in the Jiaodong area; it can also be applied to the Wulong district in the Liaodong area using a quantitative genetic model and an exploration model. Random forest is a multi-objective integration and target optimization method for 3D spaces, and it is suitable for the complex exploration model in the Qingchengzi district of the Liaodong area. The genetic model and exploration criteria associated with the exploration model of the Qingchengzi district were constrained by the common characteristics of the gold fault structure, magmatic rock emplacement in North China, and the strata fold and interlayer detachment structure. (4) Based on the gold reserves and the 3D block unit model of the Sanshandao gold deposit in the Jiaojia-Sanshandao district, the gold contents of the 3D block units in class A and B targets of the ore concentration were estimated to be 65.5% and 25.1%, respectively. The total Au resources of the optimized targets below a depth of 3000 m were 3908 t (including 1700 t of reserves), and the total Au resources of the targets at depths from 3000 to 5000 m were 936 t. The study shows that the deep gold deposits in the four gold districts of North China exhibit a strong "transport-deposition" spatial correlation with potential targets. These "transport-deposition" spatial models represent the tectonic-magmatic-hydrothermal activities of the metallogenic system associated with the NCC destruction events and indicate the Au enrichment zones.
Keywords: geoscience big data; 3D/4D modeling; weights of evidence; random forest; target optimization and resources assessment; gold district in North China
An Anomaly Detection Algorithm for Smart Grid Time-Series Data Based on Combining Isolation Forest and Random Forest (cited by: 8)
7
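Authors: 杨永娇, 肖建毅, 赵创业, 周开东. 《计算机与现代化》, 2020, No. 3, pp. 99-102, 126 (5 pages)
The information system of the smart grid is the foundation for the normal operation of the power industry, and the analysis results of the various time-series data in the smart grid are an important basis for judging whether the information system is running stably. Traditional time-series anomaly detection algorithms struggle to achieve both accuracy and real-time performance. This paper introduces a smart grid time-series anomaly detection algorithm that combines Isolation Forest and Random Forest. It draws on the strengths of both unsupervised and supervised learning, with the machine labeling data and learning thresholds automatically while a small number of feature values are labeled manually. This improves, to some extent, the accuracy and real-time performance of time-series anomaly detection, meets the anomaly detection needs of smart grid time-series data, and thereby helps improve smart grid information security.
Keywords: Isolation Forest algorithm; Random Forest algorithm; anomaly detection algorithm; time-series data; smart grid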
Classifying forest inventory data into species-based forest community types at broad extents: exploring tradeoffs among supervised and unsupervised approaches
8
Authors: Jennifer K. Costanza, Don Faber-Langendoen, John W. Coulston, David N. Wear. Forest Ecosystems [SCIE, CSCD], 2018, No. 1, pp. 91-107 (17 pages)
Background: Knowledge of the different kinds of tree communities that currently exist can provide a baseline for assessing the ecological attributes of forests and monitoring future changes. Forest inventory data can facilitate the development of this baseline knowledge across broad extents, but they first must be classified into forest community types. Here, we compared three alternative classifications across the United States using data from over 117,000 U.S. Department of Agriculture Forest Service Forest Inventory and Analysis (FIA) plots. Methods: Each plot had three forest community type labels: (1) "FIA" types were assigned by the FIA program using a supervised method; (2) "USNVC" types were assigned via a key based on the U.S. National Vegetation Classification; (3) "empirical" types resulted from unsupervised clustering of tree species information. We assessed the degree to which analog classes occurred among classifications, compared indicator species values, and used random forest models to determine how well the classifications could be predicted using environmental variables. Results: The classifications generated groups of classes that had broadly similar distributions, but often there was no one-to-one analog across the classifications. The longleaf pine forest community type stood out as the exception: it was the only class with strong analogs across all classifications. Analogs were most lacking for forest community types with species that occurred across a range of geographic and environmental conditions, such as loblolly pine types. Indicator species metrics were generally high for the USNVC, suggesting that USNVC classes are floristically well-defined. The empirical classification was best predicted by environmental variables. The most important predictors differed slightly but were broadly similar across all classifications, and included slope, amount of forest in the surrounding landscape, average minimum temperature, and other climate variables. Conclusions: The classifications have similarities and differences that reflect their differing approaches and objectives. They are most consistent for forest community types that occur in a relatively narrow range of environmental conditions, and differ most for types with wide-ranging tree species. Environmental variables at a variety of scales were important for predicting all classifications, though strongest for the empirical and FIA, suggesting that each is useful for studying how forest communities respond to multi-scale environmental processes, including global change drivers.
Keywords: big data; correspondence analysis; dominant species; forest communities; global change; hierarchical classification; indicator species; random forests; species assemblages
A Hadoop Performance Prediction Model Based on Random Forest
9
Authors: Zhendong Bei, Zhibin Yu, Huiling Zhang, Chengzhong Xu, Shenzhong Feng, Zhenjiang Dong, Hengsheng Zhang. ZTE Communications, 2013, No. 2, pp. 38-44 (7 pages)
MapReduce is a programming model for processing large data sets, and Hadoop is the most popular open-source implementation of MapReduce. To achieve high performance, up to 190 Hadoop configuration parameters must be manually tuned. This is not only time-consuming but also error-prone. In this paper, we propose a new performance model based on random forest, a recently developed machine-learning algorithm. The model, called RFMS, is used to predict the performance of a Hadoop system according to the system's configuration parameters. RFMS is created from 2000 distinct fine-grained performance observations with different Hadoop configurations. We test RFMS against the measured performance of representative workloads from the Hadoop Micro-benchmark suite. The results show that the prediction accuracy of RFMS reaches 95% on average and up to 99%. This new, highly accurate prediction model can be used to automatically optimize the performance of Hadoop systems.
Keywords: big data; cloud computing; MapReduce; Hadoop; random forest; micro-benchmark
A Dam Monitoring Data Quality Evaluation Algorithm Based on Improved Random Forest
10
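Authors: 潘宇, 李登华, 丁勇. 《人民长江》 [PKU Core], 2024, No. 2, pp. 231-237 (7 pages)
To address the low efficiency and limited intelligence of dam safety monitoring data quality evaluation, and to meet the need for real-time quality evaluation of data collected automatically at high frequency, six evaluation factors are proposed covering four aspects (accuracy, completeness, timeliness, and continuity), together with a quality evaluation standard for historical safety monitoring data built from the relevant evaluation specifications. A quality evaluation algorithm for historical dam safety monitoring data is established using a random forest algorithm improved on the basis of AUC values, and the algorithm is applied to evaluating multiple years of historical safety monitoring data from the Liushugou concrete-faced rockfill dam in Xinjiang. The results show that the AUC-improved random forest algorithm outperforms the original algorithm. It performs best when the number of feature attributes is set to 3: the minimum generalization error on the test set is only 0.0195, the average accuracy is stable at around 96.97%, and the average accuracy under 10-fold cross-validation reaches 97.77%, demonstrating the feasibility of the algorithm.
Keywords: dam safety monitoring; data quality evaluation; random forest algorithm; evaluation factors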
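The AUC value that this abstract builds on can be computed directly from labels and scores. The sketch below shows only that calculation for a binary problem; how the paper feeds AUC back into the random forest is not stated in the abstract and is not reproduced here. The function name `auc` and the sample values are assumptions.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positive and negative scores."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    # A positive ranked above a negative counts 1, a tie counts 0.5.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical labels (1 = record judged low quality) and classifier scores.
value = auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of samples, which is fine for a sketch but would normally be replaced by a rank-based computation on large data sets.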
Modeling and Simulation of Predictive Analysis for Power Engineering Data Based on the Random Forest Algorithm
11
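Authors: 周云浩, 杨宝杰, 刘丹, 李海峰, 杨鹏飞. 《电子设计工程》, 2024, No. 4, pp. 103-106, 111 (5 pages)
To address the large volume and variety of power engineering data and the poor performance of traditional analysis models, this paper builds a predictive analysis model for power engineering data based on the random forest algorithm. The model obtains engineering data through an acquisition layer; in the data analysis layer, a random forest algorithm improved by the grey wolf optimization algorithm mines and learns from the data in depth to produce predictions for power engineering data, thereby meeting the business needs of the application layer. Numerical experiments on the Matlab simulation platform show that the proposed model achieves a mean absolute percentage error of 4.15% and a root mean square error of 341,900 yuan, both better than those of the comparison models.
Keywords: data predictive analysis; random forest algorithm; grey wolf optimization algorithm; mean absolute percentage error; root mean square error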
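The two error measures reported in this abstract have standard definitions, shown in the minimal sketch below. The function names and the sample cost figures are assumptions for illustration and are not taken from the paper.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent (actual values must be non-zero)."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error, in the same unit as the data."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Hypothetical project costs and model predictions.
actual = [100.0, 200.0]
predicted = [110.0, 180.0]
print(mape(actual, predicted), rmse(actual, predicted))
```

MAPE is scale-free while RMSE keeps the unit of the target, which is why the abstract can quote one as a percentage and the other in yuan.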
Simulation of Random Big Data Mining Based on an Improved Fuzzy Clustering Algorithm
12
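Authors: 李萍, 刘金金. 《计算机仿真》, 2024, No. 2, pp. 496-499, 521 (5 pages)
Big data mining is the process of extracting valuable information from large volumes of noisy, random, and fuzzy data. Because massive data sets are multidimensional, sparse, and dynamic, it is difficult to capture their distribution characteristics accurately, and random mining is hard to carry out directly. A random big data mining method based on an improved fuzzy clustering algorithm is therefore proposed. A semantic concept tree model is built to obtain the feature distribution relationships of the data, fuzzy semantic analysis is used to derive the semantic similarity and correlation conditions, and the data features are extracted. The optimal number of clusters is determined first, and the improved fuzzy clustering algorithm is then used to cluster the features, achieving random big data mining based on the improved fuzzy algorithm. Experimental results show that the method achieves good fuzzy clustering of big data, with random mining accuracy exceeding 95%, which confirms its effectiveness in application.
Keywords: improved fuzzy clustering algorithm; random big data mining; semantic concept tree; feature extraction; feature clustering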
Identifying Winter Wheat Planting Areas Based on Feature Optimization of Sentinel-1/2 Data
13
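Authors: 解毅, 王佳楠, 刘钰. 《农业机械学报》 [EI, CAS, CSCD, PKU Core], 2024, No. 2, pp. 231-241 (11 pages)
To improve the accuracy of identifying winter wheat planting areas, this study uses the Google Earth Engine (GEE) platform and the random forest algorithm to compare how well radar and optical remote sensing data extract winter wheat, analyzes the importance of multiple classes of feature variables, and examines the effect of feature optimization on identification accuracy. Sentinel-1 and Sentinel-2 images from March to May 2019, the key growth period of winter wheat, were used as data sources to construct five classes of feature variables: polarization and texture features from Sentinel-1, and spectral, vegetation index, and vegetation index change-rate features from Sentinel-2. Extraction schemes with different data sources and feature combinations were set up, the feature variables in each scheme were optimized to obtain the best feature combination, and that combination was used to extract the winter wheat planting areas of Zhumadian City, Henan Province. The results show that, with or without feature optimization, identification accuracy based on multi-source remote sensing data is higher than that based on optical or radar data alone. After feature optimization, the classification accuracy of every scheme improved to varying degrees, indicating that both multi-source feature combination and feature optimization can improve classification accuracy. Feature variables of different months and types contribute differently to classification accuracy: by month, the contribution ranks April, March, then May; by type, it ranks polarization features, vegetation index change-rate features, vegetation index features, spectral features, then texture features. The spatial distribution of winter wheat in Zhumadian in 2019 extracted with multi-source feature optimization was the best, with an overall accuracy of 95.60% and a Kappa coefficient of 0.93; the extracted winter wheat area has a relative error of 2.23% compared with statistical yearbook data. This study can serve as a theoretical reference for research on extracting crop planting areas from multi-source optical and radar remote sensing imagery.
Keywords: winter wheat; planting area identification; feature optimization; Sentinel data; GEE; random forest algorithm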
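Overall accuracy and the Kappa coefficient, the two figures quoted in this abstract, are both derived from a confusion matrix. The sketch below uses the standard Cohen's kappa definition. The function name and the two-class matrix are hypothetical and are not the study's data.

```python
def accuracy_and_kappa(matrix):
    """Overall accuracy and Cohen's kappa from a square confusion matrix
    (rows = reference classes, columns = predicted classes)."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(n)) / total
    # Chance agreement: product of row and column marginals, summed over classes.
    expected = sum(
        sum(matrix[i]) * sum(row[i] for row in matrix) for i in range(n)
    ) / (total * total)
    return observed, (observed - expected) / (1.0 - expected)


# Hypothetical validation counts: wheat vs. non-wheat.
confusion = [[45, 5], [5, 45]]
oa, kappa = accuracy_and_kappa(confusion)
```

Kappa discounts the agreement expected by chance, so it is always lower than overall accuracy when the classifier is imperfect, as in the 95.60% versus 0.93 pair reported above.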
Comprehensive Processing Technology for Power Grid Data Assets Based on a Fused RF Algorithm under Cloud-Edge Collaboration
14
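Authors: 陈浩敏, 梁锦照, 马赟, 李晋伟. 《沈阳工业大学学报》 [CAS, PKU Core], 2024, No. 1, pp. 54-59 (6 pages)
Most existing methods struggle to fully mine the potential value of power grid data. To address this, a comprehensive processing technology for power grid data assets is proposed that combines the random forest algorithm with a BP neural network under cloud-edge collaboration. Edge computing nodes are deployed close to the grid data sources to build a digital asset management system for the power grid in a cloud-edge collaborative environment. A classifier designed with the random forest algorithm divides the grid data by type, and each type of data is fed into the BP neural network for learning; the corresponding comprehensive processing results are output through continuous iterative optimization. Experiments on the Python platform show that the classification accuracy of the proposed technology exceeds 90% in all cases and that it can effectively improve the processing efficiency of power grid data assets.
Keywords: cloud-edge collaboration; random forest algorithm; BP neural network; power grid data assets; power grid digitalization; classifier; data processing; load forecasting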
An Intelligent Data Analysis Algorithm Based on Improved GRU and the MVC Design Pattern
15
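Authors: 牛洁. 《电子设计工程》, 2024, No. 10, pp. 25-29 (5 pages)
Traditional methods for detecting abnormal financial data suffer from low efficiency, poor accuracy, and a high bad-debt rate. To address this, an abnormal data detection method based on an improved artificial intelligence algorithm is proposed. Because high-dimensional abnormal data are difficult to analyze, they are first removed with the isolation forest algorithm; the processed data are then used to train a bidirectional GRU, which mines the temporal features of the data. To address the low classification accuracy after training, an attention mechanism ranks the feature weights of the data to produce the final classification result. A software architecture based on MVC was designed for experimental testing. The total training time of the algorithm is clearly lower than that of the comparison algorithms, its RMSE and MAPE are 0.2% and 0.15% lower than those of the Bi-LSTM algorithm, and its accuracy, recall, and F1 score are also the best among the algorithms compared.
Keywords: abnormal data analysis; isolation forest algorithm; bidirectional GRU; attention mechanism; MVC design; big data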
A Big Data Method for Identifying Zombie Enterprises Based on the Random Forest Algorithm
16
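Authors: 贺元启, 江乾坤. 《浙江水利水电学院学报》, 2024, No. 1, pp. 79-85 (7 pages)
Zombie enterprises seriously waste social resources and should be dealt with promptly. However, the lack of clear criteria for identifying them is a major obstacle to handling zombie enterprises in China. To solve this problem, a big data identification method based on the random forest algorithm is applied. It finds that indicators such as net profit and its change, total tax paid and its change, the minimum interest coverage ratio, and the degree of dependence on government subsidies serve better as early warnings of zombie enterprises. Big data identification methods such as the random forest algorithm thus offer a new path for early warning of zombie enterprises in China and support their timely disposal.
Keywords: zombie enterprises; big data identification method; bank credit; government subsidies; random forest algorithm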
Building a Predictive Model for Planning Optimization of Oil Refining Units
17
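Authors: 陈铮. 《石油化工技术与经济》 [CAS], 2024, No. 2, pp. 27-29 (3 pages)
As refining and chemical enterprises raise their requirements year by year for the accuracy of production plans and the precision of scheduling, demand for intelligent scheduling software keeps growing. Unlike the traditional approach of building reaction-mechanism models of the units, this article presents a method for building predictive models of refining units based on the random forest algorithm, which has achieved good results in practice.
Keywords: planning optimization; big data; machine learning; random forest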
Intelligent Prediction of Enterprise Management Costs Based on Big Data Analysis Technology
18
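Authors: 董文杰. 《吉林农业科技学院学报》, 2024, No. 2, pp. 57-61 (5 pages)
In a market economy, intelligent prediction of enterprise management costs provides data support for fast, efficient decision-making and contributes to the healthy development of enterprises. To address the many factors that influence management costs and the low level of intelligence of traditional prediction techniques, the fireworks algorithm is used to optimize the parameters of the random forest algorithm, and Informatica software is used to manage and integrate the factors that affect management costs, which are then fed in turn to the prediction model as inputs. The resulting intelligent management cost system is applied to retail and manufacturing enterprises, the causes of high management costs are analyzed, and the findings offer a reference for enterprise development decisions.
Keywords: big data technology; random forest; fireworks algorithm; intelligent prediction of management costs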
Fault Analysis of Industrial Big Data Based on the Random Forest Algorithm
19
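Authors: 张艳敏, 董坤行. 《无线互联科技》, 2024, No. 6, pp. 20-22 (3 pages)
With the development of information technology, industrial internet technology has been applied to every stage of industrial big data production, and big-data-based modules for data acquisition, storage, processing, analysis, and visualization are becoming increasingly mature and sophisticated. However, the risks that data anomalies bring to the production process remain a problem that enterprises cannot ignore. This article performs feature extraction and data processing on real-time industrial big data, trains a model on the data with the random forest algorithm, feeds real-time data into the model, dynamically updates the parameters to improve classification accuracy, and outputs the classification results, ultimately providing fault warning and fault analysis for industrial big data during production.
Keywords: industrial big data; random forest; fault warning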
A Machine Learning Model of Filter Rod Circumference
20
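Authors: 陈慧媛, 李敏娴, 张羽茜, 华鸣宇, 李依璇. 《价值工程》, 2024, No. 18, pp. 92-94 (3 pages)
Cigarette circumference is one of the important physical indicators of a filter rod, and it directly affects the consumer's experience of the cigarette. To better ensure the consistency of filter rod circumference in production, this paper uses industrial big data and, considering the possible nonlinear relationships among variables, chooses the random forest machine learning method to study the factors that affect filter rod circumference. The results show that after variable selection the model's accuracy increased from 77.92% to 84.4%, and that filter rod circumference is mainly affected by the cooling bar position, the level of draw resistance, the stability of draw resistance, the tow filling amount, and the opening roller speed ratio.
Keywords: filter rod circumference; industrial big data; variable selection; machine learning; random forest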