493 articles found
Optimal Estimation of High-Dimensional Covariance Matrices with Missing and Noisy Data
1
Authors: Meiyin Wang, Wanzhou Ye. Advances in Pure Mathematics, 2024, No. 4, pp. 214-227 (14 pages)
The estimation of covariance matrices is very important in many fields, such as statistics. In real applications, data are frequently influenced by high dimensions and noise. However, most relevant studies are based on complete data. This paper studies the optimal estimation of high-dimensional covariance matrices based on missing and noisy samples under the norm. First, the model with sub-Gaussian additive noise is presented. The generalized sample covariance is then modified to define a hard thresholding estimator, and the minimax upper bound is derived. After that, the minimax lower bound is derived, and it is concluded that the estimator presented in this article is rate-optimal. Finally, numerical simulation analysis is performed. The result shows that for missing samples with sub-Gaussian noise, if the true covariance matrix is sparse, the hard thresholding estimator outperforms the traditional estimation method.
Keywords: high-dimensional covariance matrix; missing data; sub-Gaussian noise; optimal estimation
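As a rough illustration of the hard thresholding idea in entry 1, the following Python sketch builds a covariance estimate from pairwise-available observations and then zeroes small off-diagonal entries. It is a minimal sketch under stated assumptions, not the authors' estimator: the function names, the `None` encoding of missing values, the pairwise means, the choice to keep the diagonal, and the threshold `tau` are all hypothetical, and the paper's correction for sub-Gaussian noise is not reproduced.

```python
def pairwise_covariance(rows):
    """Covariance estimate for data with missing values (None).

    Assumption: entry (i, j) is computed only from the rows in which
    both coordinate i and coordinate j are observed.
    """
    p = len(rows[0])
    cov = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i, p):
            pairs = [(r[i], r[j]) for r in rows
                     if r[i] is not None and r[j] is not None]
            if not pairs:
                continue  # no jointly observed rows: leave the entry at 0.0
            n = len(pairs)
            mean_i = sum(a for a, _ in pairs) / n
            mean_j = sum(b for _, b in pairs) / n
            c = sum((a - mean_i) * (b - mean_j) for a, b in pairs) / n
            cov[i][j] = cov[j][i] = c
    return cov


def hard_threshold(cov, tau):
    """Set off-diagonal entries with absolute value below tau to zero.

    Assumption: the diagonal is kept untouched; tau is a user-chosen level.
    """
    p = len(cov)
    return [[cov[i][j] if (i == j or abs(cov[i][j]) >= tau) else 0.0
             for j in range(p)]
            for i in range(p)]
```

With a sparse true covariance matrix, most off-diagonal entries of the raw estimate are noise, which is why discarding the small ones can help; how to choose `tau` is left open here.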
Observation points classifier ensemble for high-dimensional imbalanced classification [Cited by: 1]
2
Authors: Yulin He, Xu Li, Philippe Fournier-Viger, Joshua Zhexue Huang, Mianjie Li, Salman Salloum. CAAI Transactions on Intelligence Technology (SCIE, EI), 2023, No. 2, pp. 500-517 (18 pages)
In this paper, an Observation Points Classifier Ensemble (OPCE) algorithm is proposed to deal with High-Dimensional Imbalanced Classification (HDIC) problems based on data processed using the Multi-Dimensional Scaling (MDS) feature extraction technique. First, the dimensionality of the original imbalanced data is reduced using MDS so that distances between any two different samples are preserved as well as possible. Second, a novel OPCE algorithm is applied to classify imbalanced samples by placing optimised observation points in a low-dimensional data space. Third, optimization of the observation point mappings is carried out to obtain a reliable assessment of the unknown samples. Exhaustive experiments have been conducted to evaluate the feasibility, rationality, and effectiveness of the proposed OPCE algorithm using seven benchmark HDIC data sets. Experimental results show that (1) the OPCE algorithm can be trained faster on low-dimensional imbalanced data than on high-dimensional data; (2) the OPCE algorithm can correctly identify samples as the number of optimised observation points is increased; and (3) statistical analysis reveals that OPCE yields better HDIC performances on the selected data sets in comparison with eight other HDIC algorithms. This demonstrates that OPCE is a viable algorithm to deal with HDIC problems.
Keywords: classifier ensemble; feature transformation; high-dimensional data classification; imbalanced learning; observation point mechanism
A Length-Adaptive Non-Dominated Sorting Genetic Algorithm for Bi-Objective High-Dimensional Feature Selection
3
Authors: Yanlu Gong, Junhai Zhou, Quanwang Wu, MengChu Zhou, Junhao Wen. IEEE/CAA Journal of Automatica Sinica (SCIE, EI, CSCD), 2023, No. 9, pp. 1834-1844 (11 pages)
As a crucial data preprocessing method in data mining, feature selection (FS) can be regarded as a bi-objective optimization problem that aims to maximize classification accuracy and minimize the number of selected features. Evolutionary computing (EC) is promising for FS owing to its powerful search capability. However, in traditional EC-based methods, feature subsets are represented via a length-fixed individual encoding. It is ineffective for high-dimensional data, because it results in a huge search space and prohibitive training time. This work proposes a length-adaptive non-dominated sorting genetic algorithm (LA-NSGA) with a length-variable individual encoding and a length-adaptive evolution mechanism for bi-objective high-dimensional FS. In LA-NSGA, an initialization method based on correlation and redundancy is devised to initialize individuals of diverse lengths, and a Pareto dominance-based length change operator is introduced to guide individuals to explore the promising search space adaptively. Moreover, a dominance-based local search method is employed for further improvement. The experimental results based on 12 high-dimensional gene datasets show that the Pareto front of feature subsets produced by LA-NSGA is superior to those of existing algorithms.
Keywords: bi-objective optimization; feature selection (FS); genetic algorithm; high-dimensional data; length-adaptive
Similarity measurement method of high-dimensional data based on normalized net lattice subspace [Cited by: 4]
4
Authors: 李文法, Wang Gongming, Li Ke, Huang Su. High Technology Letters (EI, CAS), 2017, No. 2, pp. 179-184 (6 pages)
The performance of conventional similarity measurement methods is affected seriously by the curse of dimensionality of high-dimensional data. The reason is that data difference between sparse and noisy dimensionalities occupies a large proportion of the similarity, leading to the dissimilarities between any results. A similarity measurement method of high-dimensional data based on normalized net lattice subspace is proposed. The data range of each dimension is divided into several intervals, and the components in different dimensions are mapped onto the corresponding interval. Only the components in the same or adjacent intervals are used to calculate the similarity. To validate this method, three data types are used, and seven common similarity measurement methods are compared. The experimental result indicates that the relative difference of the method increases with the dimensionality and is approximately two or three orders of magnitude higher than that of the conventional methods. In addition, the similarity range of this method in different dimensions is [0,1], which is fit for similarity analysis after dimensionality reduction.
Keywords: high-dimensional data; curse of dimensionality; similarity; normalization; subspace; NPsim
Energy-balanced clustering protocol for data gathering in wireless sensor networks with unbalanced traffic load [Cited by: 1]
5
Authors: 奎晓燕, 王建新, 张士庚. Journal of Central South University (SCIE, EI, CAS), 2012, No. 11, pp. 3180-3187 (8 pages)
Energy-efficient data gathering in multi-hop wireless sensor networks was studied, considering that different nodes produce different amounts of data in realistic environments. A novel dominating set based clustering protocol (DSCP) was proposed to solve the data gathering problem in this scenario. In DSCP, a node evaluates the potential lifetime of the network (from its local point of view) assuming that it acts as the cluster head, and claims to be a tentative cluster head if it maximizes the potential lifetime. When evaluating the potential lifetime of the network, a node considers not only its remaining energy, but also other factors including its traffic load, the number of its neighbors, and the traffic loads of its neighbors. A tentative cluster head becomes a final cluster head with a probability inversely proportional to the number of tentative cluster heads that cover its neighbors. The protocol can terminate in O(n/lg n) steps, and its total message complexity is O(n^2/lg n). Simulation results show that DSCP can effectively prolong the lifetime of the network in multi-hop networks with unbalanced traffic load. Compared with EECT, the network lifetime is prolonged by 56.6% on average.
Keywords: wireless sensor networks; unbalanced load; data gathering; data traffic; protocol; energy balancing; lifetime prolonging; clustering
Optimized Modeling Method for Unbalanced Data in High-Level Visual Semantic Concept Classification
6
Authors: 谭励, 曹元大, 杨明华, 贺巧艳. Journal of Beijing Institute of Technology (EI, CAS), 2009, No. 2, pp. 186-191 (6 pages)
To solve the unbalanced data problems of learning models for semantic concepts, an optimized modeling method based on the posterior probability support vector machine (PPSVM) is presented. A neighbor-based posterior probability estimator for visual concepts is provided. The proposed method has been applied in a high-level visual semantic concept classification system, and the experimental results show that it yields enhanced performance over the baseline SVM models, as well as improved robustness with respect to high-level visual semantic concept classification.
Keywords: visual concept modeling; posterior probability support vector machine; unbalanced data
Dimensionality Reduction of High-Dimensional Highly Correlated Multivariate Grapevine Dataset
7
Authors: Uday Kant Jha, Peter Bajorski, Ernest Fokoue, Justine Vanden Heuvel, Jan van Aardt, Grant Anderson. Open Journal of Statistics, 2017, No. 4, pp. 702-717 (16 pages)
Viticulturists traditionally have a keen interest in studying the relationship between the biochemistry of grapevines' leaves/petioles and their associated spectral reflectance in order to understand the fruit ripening rate, water status, nutrient levels, and disease risk. In this paper, we implement imaging spectroscopy (hyperspectral) reflectance data, for the reflective 330 - 2510 nm wavelength region (986 total spectral bands), to assess vineyard nutrient status; this constitutes a high-dimensional dataset with a covariance matrix that is ill-conditioned. The identification of the variables (wavelength bands) that contribute useful information for nutrient assessment and prediction plays a pivotal role in multivariate statistical modeling. In recent years, researchers have successfully developed many continuous, nearly unbiased, sparse and accurate variable selection methods to overcome this problem. This paper compares four regularized and one functional regression methods: Elastic Net, Multi-Step Adaptive Elastic Net, Minimax Concave Penalty, iterative Sure Independence Screening, and Functional Data Analysis for wavelength variable selection. Thereafter, the predictive performance of these regularized sparse models is enhanced using stepwise regression. This comparative study of regression methods using a high-dimensional and highly correlated grapevine hyperspectral dataset revealed that Elastic Net, used for variable selection, yields the best predictive ability.
Keywords: high-dimensional data; Multi-Step Adaptive Elastic Net; Minimax Concave Penalty; Sure Independence Screening; Functional Data Analysis
Adaptive Optimization Swarm Algorithm Ensemble Model Applied to the Classification of Unbalanced Data
8
Authors: Qingqing He, Chao Qin. Intelligent Information Management, 2021, No. 5, pp. 251-267 (17 pages)
The hyper-parameters of existing random-forest-based classification prediction models depend on empirical settings, which leads to unsatisfactory model performance. To solve this problem, we propose a random forest model based on an adaptive particle swarm optimization algorithm, which optimizes the hyper-parameters of the random forest so that the model can better predict unbalanced data. To address the premature convergence problem in the particle swarm optimization algorithm, the population is adaptively divided according to its fitness, and an adaptive update strategy is introduced to enhance the ability of particles to jump out of local optima. The main steps of the model are as follows: normalize the data set, initialize the model on the training set, and then use the particle swarm optimization algorithm to optimize the modeling process to establish a classification model. Experimental results show that the proposed algorithm is better than traditional algorithms, especially in terms of the F1-Measure and ACC evaluation standards. The results on the six-keel imbalanced data set demonstrate the advantages of the proposed algorithm.
Keywords: Random Forest; APSO; unbalanced data; parameter optimization
Making Short-term High-dimensional Data Predictable
9
Authors: CHEN Luonan. Bulletin of the Chinese Academy of Sciences, 2018, No. 4, pp. 243-244 (2 pages)
Making accurate forecasts or predictions is a challenging task in the big data era, in particular for those datasets involving high-dimensional variables but short-term time series points, which are generally available from real-world systems. To address this issue, Prof.
Keywords: RDE; making short-term high-dimensional data predictable
Switched-mode AGC circuits with internally created reset module for burst-mode unbalanced data optical receiver
10
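Authors: Wang Rong, Wang Zhigong, Wang Weibai, Xu Jian, Guan Zhiqiang. High Technology Letters (EI, CAS), 2011, No. 3, pp. 317-324 (8 pages)
Keywords: AGC circuit; reset circuit; unbalanced data; switched mode; burst mode; module; optical receiver; automatic gain control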
Data-driven Surrogate-assisted Method for High-dimensional Multi-area Combined Economic/Emission Dispatch
11
Authors: Chenhao Lin, Huijun Liang, Aokang Pang, Jianwei Zhong, Yongchao Yang. Journal of Modern Power Systems and Clean Energy (SCIE, EI, CSCD), 2024, No. 1, pp. 52-64 (13 pages)
Multi-area combined economic/emission dispatch (MACEED) problems are generally studied using analytical functions. However, as the scale of power systems increases, existing solutions become time-consuming and may not meet operational constraints. To overcome excessive computational expense in high-dimensional MACEED problems, a novel data-driven surrogate-assisted method is proposed. First, a cosine-similarity-based deep belief network combined with a back-propagation (DBN+BP) neural network is utilized to replace cost and emission functions. Second, transfer learning is applied with a pretraining and fine-tuning method to improve DBN+BP regression surrogate models, thus realizing fast construction of surrogate models between different regional power systems. Third, a multi-objective antlion optimizer with a novel general single-dimension retention bi-objective optimization policy is proposed to execute MACEED optimization to obtain scheduling decisions. The proposed method not only ensures the convergence, uniformity, and extensibility of the Pareto front, but also greatly reduces the computational time. Finally, a 4-area 40-unit test system with different constraints is employed to demonstrate the effectiveness of the proposed method.
Keywords: multi-area combined economic/emission dispatch; high-dimensional power system; deep belief network; data driven; transfer learning
Randomized Latent Factor Model for High-dimensional and Sparse Matrices from Industrial Applications [Cited by: 13]
12
Authors: Mingsheng Shang, Xin Luo, Zhigang Liu, Jia Chen, Ye Yuan, MengChu Zhou. IEEE/CAA Journal of Automatica Sinica (EI, CSCD), 2019, No. 1, pp. 131-141 (11 pages)
Latent factor (LF) models are highly effective in extracting useful knowledge from High-Dimensional and Sparse (HiDS) matrices, which are commonly seen in various industrial applications. An LF model usually adopts iterative optimizers, which may consume many iterations to achieve a local optimum, resulting in considerable time cost. Hence, determining how to accelerate the training process for LF models has become a significant issue. To address this, this work proposes a randomized latent factor (RLF) model. It incorporates the principle of randomized learning techniques from neural networks into the LF analysis of HiDS matrices, thereby greatly alleviating the computational burden. It also extends a standard learning process for randomized neural networks in the context of LF analysis to make the resulting model represent an HiDS matrix correctly. Experimental results on three HiDS matrices from industrial applications demonstrate that, compared with state-of-the-art LF models, RLF is able to achieve significantly higher computational efficiency and comparable prediction accuracy for missing data. It provides an important alternative approach to LF analysis of HiDS matrices, which is especially desired for industrial applications demanding highly efficient models.
Keywords: big data; high-dimensional and sparse matrix; latent factor analysis; latent factor model; randomized learning
CSFW-SC: Cuckoo Search Fuzzy-Weighting Algorithm for Subspace Clustering Applying to High-Dimensional Clustering [Cited by: 1]
13
Authors: WANG Jindong, HE Jiajing, ZHANG Hengwei, YU Zhiyong. China Communications (SCIE, CSCD), 2015, No. S2, pp. 55-63 (9 pages)
Because traditional clustering methods are not appropriate for high-dimensional data, a cuckoo search fuzzy-weighting algorithm for subspace clustering is presented on the basis of the existing soft subspace clustering algorithm. In the proposed algorithm, a novel objective function is first designed by considering the fuzzy weighting within-cluster compactness and the between-cluster separation, and loosening the constraints of the dimension weight matrix. Then gradual membership and improved cuckoo search, a global search strategy, are introduced to optimize the objective function and search subspace clusters, giving novel learning rules for clustering. Finally, the performance of the proposed algorithm on the clustering analysis of various low- and high-dimensional datasets is experimentally compared with that of several competitive subspace clustering algorithms. Experimental studies demonstrate that the proposed algorithm can obtain better performance than most of the existing soft subspace clustering algorithms.
Keywords: high-dimensional data; clustering; soft subspace; cuckoo search; fuzzy clustering
Analysis of Variance in an Unbalanced Two-Way Mixed Effect Interactive Model [Cited by: 1]
14
Authors: F. C. Eze, E. U. Nwankwo. Open Journal of Statistics, 2016, No. 2, pp. 310-319 (10 pages)
The expected mean squares for the unbalanced mixed effect interactive model were derived using the Brute Force Method. From the expected mean squares, there are no obvious denominators for testing for the main effects when the factors are mixed. An expression for the F-test for the main effects was derived and proved to be unbiased.
Keywords: mixed model; expected mean squares; unbalanced data
Variance Estimation for High-Dimensional Varying Index Coefficient Models
15
Authors: Miao Wang, Hao Lv, Yicun Wang. Open Journal of Statistics, 2019, No. 5, pp. 555-570 (16 pages)
This paper studies the re-adjusted cross-validation method and a semiparametric regression model called the varying index coefficient model. We use the profile spline modal estimator method to estimate the coefficients of the parametric part of the Varying Index Coefficient Model (VICM), while the unknown function part is expanded using B-splines. Moreover, we combine the above two estimation methods under the assumption of high-dimensional data. The results of data simulation and empirical analysis show that for the varying index coefficient model, the re-adjusted cross-validation method is better in terms of accuracy and stability than traditional methods based on ordinary least squares.
Keywords: high-dimensional data; refitted cross-validation; varying index coefficient models; variance estimation
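A Skin Cancer Detection Framework Based on a Dual-Branch Attention Neural Network
16
Authors: 王玉峰, 成昊沅, 万承北, 张博, 石爱菊. 《中国生物医学工程学报》 (CAS, CSCD, PKU Core), 2024, No. 2, pp. 153-161 (9 pages)
Skin cancer is a major type of cancer whose incidence has grown rapidly over the past few decades, and early detection can greatly improve the cure rate. In recent years, deep learning models (especially various convolutional neural networks) have been widely applied to recognize and classify skin cancer from dermoscopic images. Unlike conventional image recognition and classification, however, skin disease detection faces the challenges of imbalanced data, small inter-class differences, and lesions that occupy only a small proportion of the image area. To this end, this study proposes a skin cancer classification framework based on a dual-branch attention convolutional neural network (DACNN). In the data preprocessing stage, the dataset is decomposed according to finer-grained skin disease categories to reduce the degree of data imbalance. In terms of network structure, the upper branch uses attention residual learning (ARL) modules to extract features of potential lesion regions and then uses a lesion localization network (LLN) module to localize the lesion region. The localized region is cropped, enlarged, and fed into the lower branch, which is also built from ARL modules, to extract features of local details; the features of the two branches are then combined for recognition. Finally, to further alleviate data imbalance, a weighted loss function is used during training. On a dataset of 10015 dermoscopic images, the proposed DACNN model was experimentally validated and compared with several typical skin lesion detection frameworks. The results show that the Sensitivity, Accuracy, and F1_score of the DACNN framework reached 0.922, 0.942, and 0.933, respectively; compared with the existing recurrent attention convolutional neural network model RACNN, these three metrics improved by 3.48%, 2.95%, and 3.44%, respectively. In summary, for dermoscopic images with imbalanced class sizes, small inter-class differences, and small lesion areas, appropriate class decomposition combined with a dual-branch attention network that first localizes and enlarges potential lesion regions and then extracts local detail features can greatly improve skin cancer detection accuracy.
Keywords: skin cancer; dual-branch neural network; attention mechanism; data imbalance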
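Malicious Web Page Identification Based on Undersampling and Multi-Layer Ensemble Learning
17
Authors: 王法玉, 于晓文, 陈洪涛. 《计算机工程与设计》 (PKU Core), 2024, No. 3, pp. 669-675 (7 pages)
In practice, the ratio of malicious to benign web pages is severely imbalanced, so traditional machine learning classifiers do not perform well. To address this, a malicious web page detection model based on undersampling and multi-layer ensemble learning is proposed. Undersampling is used to achieve local data balance; a first layer of ensemble learning based on weights and thresholds ensures the accuracy of the model; and a second layer of voting-based ensemble learning preserves the completeness of global information. Experimental results show that the proposed model outperforms traditional machine learning models at identifying malicious web pages on imbalanced datasets.
Keywords: malicious web page identification; imbalanced data; multi-layer classifier; undersampling; machine learning; ensemble learning; detection performance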
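The three ingredients named in entry 17 (undersampling, a weight-and-threshold combiner, and a voting combiner) can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' model: all function names, the 0/1 label encoding (1 = malicious), the default threshold of 0.5, and the tie rule in the vote are hypothetical, and the base classifiers themselves are left out.

```python
import random


def balanced_subsets(majority, minority, n_subsets, seed=0):
    """Random undersampling: pair every minority sample with an
    equally sized random draw (without replacement) from the majority."""
    rng = random.Random(seed)
    k = len(minority)
    return [rng.sample(majority, k) + list(minority) for _ in range(n_subsets)]


def weighted_threshold_vote(scores, weights, threshold=0.5):
    """First-layer combiner (assumed form): weighted average of base
    classifier scores in [0, 1], compared against a threshold."""
    combined = sum(s * w for s, w in zip(scores, weights)) / sum(weights)
    return 1 if combined >= threshold else 0


def majority_vote(labels):
    """Second-layer combiner (assumed form): strict majority of 0/1 labels;
    a tie is resolved as 0 (benign)."""
    return 1 if 2 * sum(labels) > len(labels) else 0
```

In such a design each balanced subset would train one first-layer ensemble, and `majority_vote` would merge the per-subset decisions so that no part of the majority class is ignored entirely.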
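Intrusion Detection Method Fusing Multi-Scale Convolution and a Dual Attention Mechanism
18
Authors: 陈虹, 李泓绪, 金海波. 《辽宁工程技术大学学报(自然科学版)》 (CAS, PKU Core), 2024, No. 1, pp. 93-100 (8 pages)
To improve the accuracy of Internet intrusion detection, an intrusion detection method combining a convolutional neural network with an attention mechanism is proposed. The data are preprocessed with the Borderline-SMOTE oversampling algorithm and MinMax normalization, which mitigates the large differences in the amount of data across intrusion types and improves detection performance on imbalanced data. The Inception structure of convolutional neural networks is used to extract features at multiple scales, together with an attention mechanism that updates the feature dimensions, improving the accuracy of feature representation when the model processes massive data. The results show that the method achieves an average accuracy of 99.57%; compared with the SVM, CNN, RNN, and BLS-GMM methods, accuracy improves by 4.48%, 1.35%, 1.62%, and 0.04%, and recall improves by 4.48%, 1.36%, 1.62%, and 0.14%, respectively.
Keywords: intrusion detection; convolutional neural network; attention mechanism; oversampling algorithm; imbalanced data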
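Imbalanced-Data-Driven Identification Model of Risky Car-Following Behavior under Moving Truck Blockage on Mountainous Highways
19
Authors: 戢晓峰, 薛唯, 卢梦媛, 覃文文, 李太峰. 《安全与环境学报》 (CAS, CSCD, PKU Core), 2024, No. 8, pp. 3015-3027 (13 pages)
To identify risky car-following behavior of passenger cars under moving blockage by trucks on two-lane mountainous highways, vehicle trajectories were extracted using unmanned aerial vehicle filming and video trajectory extraction. The Synthetic Minority Oversampling Technique (SMOTE) was used to oversample the imbalanced trajectory data, and driving behavior was clustered so that car-following behavior was labeled as either dangerous or safe. Based on three causes of risky car-following (tailgating, excessive lateral offset, and large speed variation), the Measures of Driving Risk (MOR) were defined as the inverse time-to-collision, the relative lateral offset, and the coefficient of variation of speed; the MOR and the cluster labels were used as the input variables of the identification model. A risky car-following identification model was built with the Light Gradient Boosting Machine (LGBM), and its effectiveness was verified against Support Vector Machines (SVM), Random Forest (RF), and Adaptive Boosting (AdaBoost). In a case study on a two-lane mountainous highway in Yunnan Province, 543 pairs of trajectories of passenger cars following trucks were extracted, and 467 valid pairs remained after preprocessing. After oversampling and cluster labeling, the results show that more than 30% of passenger cars following a truck are in a risky car-following state. The precision of the identification models for straight and curved sections reached 95.49% and 95.48%, respectively; LGBM was the most stable, whereas RF and AdaBoost were less stable and had lower precision. The LGBM-based identification model offers high accuracy and stability and has application prospects in vehicle-infrastructure cooperation and autonomous driving.
Keywords: safety engineering; risky car-following behavior identification; Light Gradient Boosting Machine (LGBM) algorithm; two-lane mountainous highway; imbalanced data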
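The three risk measures listed in entry 19 can be illustrated with textbook-style definitions. The sketch below is an assumption-laden illustration rather than the paper's formulas: the function names, the units, and in particular the normalization of the lateral offset by a lane width are hypothetical choices made here for the example.

```python
import statistics


def inverse_ttc(gap_m, v_follow_mps, v_lead_mps):
    """Inverse time-to-collision (1/s): closing speed divided by the gap.
    Positive values mean the follower is closing in on the leader."""
    if gap_m <= 0:
        raise ValueError("gap must be positive")
    return (v_follow_mps - v_lead_mps) / gap_m


def relative_lateral_offset(y_follow_m, y_lead_m, lane_width_m):
    """Lateral offset between the two vehicles as a fraction of lane width
    (assumed normalization; the abstract does not define it)."""
    return abs(y_follow_m - y_lead_m) / lane_width_m


def speed_cv(speeds_mps):
    """Coefficient of variation of speed: population std divided by mean."""
    mean = statistics.mean(speeds_mps)
    if mean == 0:
        raise ValueError("mean speed must be non-zero")
    return statistics.pstdev(speeds_mps) / mean
```

For instance, a car travelling 5 m/s faster than a truck 20 m ahead has an inverse time-to-collision of 0.25 per second, that is, four seconds to impact if neither vehicle changes speed.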
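Research on an Intrusion Detection Method for Imbalanced Data Based on Knowledge Distillation
20
Authors: 董国芳, 刘兵, 鲁烨堃. 《云南民族大学学报(自然科学版)》 (CAS), 2024, No. 2, pp. 219-224 (6 pages)
Deep-learning-based network intrusion detection models suffer from complex model structures, low deployment efficiency, and class imbalance in traffic data. To address these problems, a network intrusion detection method combining knowledge distillation with a class-weighted focal loss is proposed. The method takes a high-accuracy intrusion detection model with many parameters as the teacher model and computes a distillation loss against a small student model; a focal loss with added class weights is introduced as the student loss; and the distillation loss and the student loss are combined into a total loss function that is used to optimize the student model. Experimental results show that the method improves on the non-distilled model across all metrics.
Keywords: intrusion detection; deep learning; knowledge distillation; imbalanced data; focal loss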
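The loss structure described in entry 20 (a distillation loss plus a class-weighted focal loss) can be sketched for a single sample as below. This is a minimal sketch using the commonly cited forms of both losses; the temperature, the focusing parameter `gamma`, the mixing weight `alpha`, and all function names are assumptions made here, and the abstract does not state which variants or values the authors used.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_focal_loss(logits, label, class_weights, gamma=2.0):
    """Class-weighted focal loss: -w_c * (1 - p_t)^gamma * log(p_t)."""
    p_t = softmax(logits)[label]
    return -class_weights[label] * (1.0 - p_t) ** gamma * math.log(p_t)


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence from softened teacher to softened student outputs,
    scaled by temperature squared (a common convention)."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return temperature ** 2 * kl


def total_loss(student_logits, teacher_logits, label, class_weights,
               alpha=0.5, gamma=2.0, temperature=2.0):
    """Convex combination of the distillation loss and the student loss."""
    distill = distillation_loss(student_logits, teacher_logits, temperature)
    student = weighted_focal_loss(student_logits, label, class_weights, gamma)
    return alpha * distill + (1.0 - alpha) * student
```

Raising the weight of a rare attack class in `class_weights` scales its contribution linearly, while `gamma` down-weights samples the student already classifies confidently.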