13 articles found
A Study of EM Algorithm as an Imputation Method: A Model-Based Simulation Study with Application to a Synthetic Compositional Data
1
Authors: Yisa Adeniyi Abolade, Yichuan Zhao 《Open Journal of Modelling and Simulation》 2024, No. 2, pp. 33-42 (10 pages)
Compositional data, such as relative information, is a crucial aspect of machine learning and other related fields. It is typically recorded as closed data or sums to a constant, like 100%. The statistical linear model is the most used technique for identifying hidden relationships between underlying random variables of interest. However, data quality is a significant challenge in machine learning, especially when missing data is present. The linear regression model is a commonly used statistical modeling technique used in various applications to find relationships between variables of interest. When estimating linear regression parameters which are useful for things like future prediction and partial effects analysis of independent variables, maximum likelihood estimation (MLE) is the method of choice. However, many datasets contain missing observations, which can lead to costly and time-consuming data recovery. To address this issue, the expectation-maximization (EM) algorithm has been suggested as a solution for situations including missing data. The EM algorithm repeatedly finds the best estimates of parameters in statistical models that depend on variables or data that have not been observed. This is called maximum likelihood or maximum a posteriori (MAP). Using the present estimate as input, the expectation (E) step constructs a log-likelihood function. Finding the parameters that maximize the anticipated log-likelihood, as determined in the E step, is the job of the maximization (M) phase. This study looked at how well the EM algorithm worked on a made-up compositional dataset with missing observations. It used both the robust least square version and ordinary least square regression techniques. The efficacy of the EM algorithm was compared with two alternative imputation techniques, k-Nearest Neighbor (k-NN) and mean imputation, in terms of Aitchison distances and covariance.
Keywords: Compositional Data; Linear Regression Model; Least Square Method; Robust Least Square Method; Synthetic Data; Aitchison Distance; Maximum Likelihood Estimation; Expectation-Maximization Algorithm; k-Nearest Neighbor; Mean Imputation
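The abstract above scores imputation quality by Aitchison distance but does not spell the measure out. As a minimal sketch, assuming the standard definition (Euclidean distance between centered log-ratio transforms), it could be computed as follows. The function names and the two sample rows are hypothetical and are not taken from the paper.

```python
import math

def clr(composition):
    """Centered log-ratio transform of a strictly positive composition."""
    logs = [math.log(part) for part in composition]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]

def aitchison_distance(x, y):
    """Euclidean distance between the clr transforms of x and y."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(clr(x), clr(y))))

# Hypothetical rows: a complete composition and an imputed version of it.
true_row = [0.2, 0.3, 0.5]
imputed_row = [0.25, 0.25, 0.5]
print(round(aitchison_distance(true_row, imputed_row), 4))
```

Because the clr transform removes any common scale factor, the distance is the same whether the parts are recorded as proportions or as percentages.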
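An Alpha-shape Surface Reconstruction Algorithm with Adaptive Step Size (Cited by: 8)
2
Authors: 李世林, 李红军 《数据采集与处理》 CSCD, PKU Core (北大核心) 2019, No. 3, pp. 491-499 (9 pages)
Surface reconstruction of 3D objects has important applications in modern clinical medicine, scene modeling, and forestry measurement. To better understand the surface shape of 3D objects, this paper first introduces the concepts related to the alpha shape of a discrete point set in 3D space. Building on an analysis of the Alpha-shape algorithm for surface reconstruction, it proposes an Alpha-shape algorithm with adaptive step size. The α value is updated dynamically using a kd-tree and the k-nearest-neighbor mean distance, so that the algorithm can reconstruct the surface with fewer traversals even in regions where the point set is dense, which improves both the reconstruction quality and the running efficiency. Experiments on a large amount of random data and real 3D sampled data show that the improved algorithm runs substantially faster than the original.
Keywords: surface reconstruction; alpha shape; k-nearest-neighbor mean distance; Alpha-shape algorithm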
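The k-nearest-neighbor mean distance that drives the adaptive α can be illustrated with a small sketch. This version uses a brute-force search in place of the kd-tree named in the abstract, and it only computes the local density measure; how the paper maps that value to α is not stated in the abstract and is not reproduced here. The function name and sample points are hypothetical.

```python
import math

def knn_mean_distances(points, k):
    """Mean distance from each point to its k nearest neighbors (brute force)."""
    result = []
    for i, p in enumerate(points):
        # Distances to every other point, smallest first.
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(sum(dists[:k]) / k)
    return result

# Three closely spaced points and one isolated point.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0)]
print(knn_mean_distances(points, k=2))  # → [1.5, 1.0, 1.5, 8.5]
```

Small values mark dense regions and large values mark sparse ones, which is the signal an adaptive scheme would need to shrink or grow α locally.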
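A New 8PSK Constellation Mapping for BICM-ID
3
Authors: 田心记, 李亚, 张延良 《计算机工程》 CAS CSCD 2012, No. 23, pp. 263-265 (3 pages)
Building on the bit-interleaved coded modulation with iterative decoding (BICM-ID) system, a constellation mapping method for 8PSK modulation is designed. Using the average neighbor Hamming distance between mapped symbols as the design criterion, the method divides the 8PSK symbols into two groups of QPSK-mapped symbols with different radii and phases, and reduces the average neighbor Hamming distance between constellation symbols by adjusting the radii. Performance analysis and simulation results show that, compared with other 8PSK constellation mappings for BICM-ID, the method achieves a lower bit error rate at low signal-to-noise ratio.
Keywords: bit-interleaved coded modulation; iterative decoding; average neighbor Hamming distance; low signal-to-noise ratio; constellation mapping; bit error rate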
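A TSP Solving Algorithm Based on the Minimum Distance Balance Coefficient
4
Author: 郏宣耀 《洛阳大学学报》 2005, No. 2, pp. 16-19 (4 pages)
A TSP solving algorithm based on the minimum distance balance coefficient is proposed. It improves on the nearest neighbor algorithm by introducing the concept of a distance balance coefficient, which shifts the optimization from a local optimum toward a global one; that is, the shortest-path problem is converted into a minimum distance balance coefficient problem. Simulation results show that the algorithm weakens the factors that degrade the performance of the nearest neighbor method and similar algorithms, and thus remains highly effective under different conditions.
Keywords: traveling salesman problem; nearest neighbor algorithm; distance balance coefficient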
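For context, the baseline that this paper improves on, the nearest neighbor heuristic for the TSP, can be sketched in a few lines. This is only the classic greedy baseline; the distance balance coefficient itself is not defined in the abstract and is not implemented here. The function names and city coordinates are hypothetical.

```python
import math

def nearest_neighbor_tour(cities, start=0):
    """Greedy tour: from the current city, always move to the closest unvisited one."""
    unvisited = set(range(len(cities))) - {start}
    tour = [start]
    while unvisited:
        last = cities[tour[-1]]
        # Ties are broken by the smaller city index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, cities[i]), i))
        tour.append(nxt)
        unvisited.remove(nxt)
    return tour

def tour_length(cities, tour):
    """Length of the closed tour, including the edge back to the start."""
    return sum(math.dist(cities[a], cities[b]) for a, b in zip(tour, tour[1:] + tour[:1]))

cities = [(0.0, 0.0), (3.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
tour = nearest_neighbor_tour(cities)
print(tour, tour_length(cities, tour))  # → [0, 2, 3, 1] 6.0
```

Each step is locally optimal, which is why the heuristic can end with long closing edges on less regular layouts; that weakness is what the abstract says the balance coefficient is meant to reduce.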
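An Improved K-Nearest-Neighbor Classification Algorithm Based on Mean Distance
5
Author: 许燕青 《电脑编程技巧与维护》 2010, No. 24, pp. 41-42 (2 pages)
An improved K-nearest-neighbor classification algorithm based on mean distance is proposed. It addresses two problems that lower the accuracy of K-nearest-neighbor classification: first, when every class has the same number of neighbors, the class of the test sample cannot be determined; second, even when one class has more neighbors, the test sample may be wrongly assigned to that class if its neighbors are all only weakly similar to the test sample.
Keywords: classification; mean distance; K-nearest-neighbor classification algorithm; attribute value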
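The abstract does not give the decision rule, so the following is only one plausible reading, offered as a sketch: among the k nearest neighbors, pick the class whose members are closest to the query on average, instead of the class with the most members. The function name, the rule itself, and the sample data are assumptions, not the author's published method.

```python
import math
from collections import defaultdict

def classify_by_mean_distance(train, query, k):
    """train is a list of (feature_tuple, label) pairs."""
    # k nearest neighbors as (distance, label), closest first.
    neighbors = sorted((math.dist(x, query), label) for x, label in train)[:k]
    by_class = defaultdict(list)
    for dist, label in neighbors:
        by_class[label].append(dist)
    # Choose the class with the smallest mean neighbor distance.
    return min(by_class, key=lambda c: (sum(by_class[c]) / len(by_class[c]), c))

# Tie in neighbor counts (2 vs 2): class "a" is closer on average.
tied = [((1.0,), "a"), ((3.0,), "a"), ((2.0,), "b"), ((2.5,), "b")]
print(classify_by_mean_distance(tied, (0.0,), k=4))  # → a

# Class "b" has more neighbors, but its neighbors are far from the query.
skewed = [((0.1,), "a"), ((2.0,), "b"), ((2.1,), "b"), ((9.0,), "a")]
print(classify_by_mean_distance(skewed, (0.0,), k=3))  # → a
```

The two calls correspond to the two failure cases listed in the abstract: a tied vote, and a majority class made up of weakly similar neighbors.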
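A Point Cloud Simplification Algorithm Based on K-Nearest-Neighbor Plane Fitting (Cited by: 6)
6
Authors: 付永健, 李宗春, 何华, 阮焕立 《北京测绘》 2017, No. 01S, pp. 86-90 (5 pages)
Massive point cloud data are easy to acquire, and point cloud simplification has become a popular research topic. This paper proposes a point cloud simplification algorithm based on K-nearest-neighbor plane fitting. A KD-tree index is built to find the K nearest neighbors of each point; a plane is then fitted to those neighbors and non-feature points are removed to simplify the cloud. Experiments show that the algorithm reaches a simplification rate above 80% while clearly preserving feature information, that it is widely applicable and stable, and that simplification does not harm the reconstruction result.
Keywords: point cloud simplification; KD-tree; K-nearest neighbor; plane fitting; mean distance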
Precipitation Retrieval from Himawari-8 Satellite Infrared Data Based on Dictionary Learning Method and Regular Term Constraint (Cited by: 2)
7
Authors: Wang Gen, Ding Conghui, Liu Huilan 《Meteorological and Environmental Research》 CAS 2019, No. 3, pp. 61-65, 68 (6 pages)
In this paper, the application of an algorithm for precipitation retrieval based on Himawari-8 (H8) satellite infrared data is studied. Based on GPM precipitation data and H8 infrared spectrum channel brightness temperature data, corresponding "precipitation field dictionary" and "channel brightness temperature dictionary" are formed. The retrieval of precipitation field based on brightness temperature data is studied through the classification rule of k-nearest neighbor domain (KNN) and regularization constraint. Firstly, the corresponding "dictionary" is constructed according to the training sample database of the matched GPM precipitation data and H8 brightness temperature data. Secondly, according to the fact that precipitation characteristics in small organizations in different storm environments are often repeated, KNN is used to identify the spectral brightness temperature signal of "precipitation" and "non-precipitation" based on "the dictionary". Finally, the precipitation field retrieval is carried out in the precipitation signal "subspace" based on the regular term constraint method. In the process of retrieval, the contribution rate of brightness temperature retrieval of different channels was determined by Bayesian model averaging (BMA) model. The preliminary experimental results based on the "quantitative" evaluation indexes show that the precipitation of H8 retrieval has a good correlation with the GPM truth value, with a small error and similar structure.
Keywords: Himawari-8 (H8); retrieval of precipitation; k-nearest neighbor (KNN); regular term constraints; dictionary method; Bayesian model average (BMA)
Large Scale Fish Images Classification and Localization using Transfer Learning and Localization Aware CNN Architecture (Cited by: 1)
8
Authors: Usman Ahmad, Muhammad Junaid Ali, Faizan Ahmed Khan, Arfat Ahmad Khan, Arif Ur Rehman, Malik Muhammad Ali Shahid, Mohd Anul Haq, Ilyas Khan, Zamil S. Alzamil, Ahmed Alhussen 《Computer Systems Science & Engineering》 SCIE EI 2023, No. 5, pp. 2125-2140 (16 pages)
Building an automatic fish recognition and detection system for large-scale fish classes is helpful for marine researchers and marine scientists because there are large numbers of fish species. However, it is quite difficult to build such systems owing to the lack of data imbalance problems and large number of classes. To solve these issues, we propose a transfer learning-based technique in which we use Efficient-Net, which is pre-trained on ImageNet dataset and fine-tuned on QuT Fish Database, which is a large scale dataset. Furthermore, prior to the activation layer, we use Global Average Pooling (GAP) instead of dense layer with the aim of averaging the results of predictions along with having more information compared to the dense layer. To check the validity of our model, we validate our model on the validation set which achieves satisfactory results. Also, for the localization task, we propose an architecture that consists of localization aware block, which captures localization information for better prediction and residual connections to handle the over-fitting problem. Actually, the residual connections help the layer to combine missing information with the relevant one. In addition, we use class weights and Focal Loss (FL) to handle class imbalance problems along with reducing false predictions. Actually, class weights assign less weights to classes having fewer instances and large weights to classes having more number of instances. During the localization, the qualitative assessment shows that we achieve 57% Mean Intersection Over Union (IoU) on testing data, and the classification results show 75% precision, 70% recall, 78% accuracy and 74% F1-Score for 468 fish species.
Keywords: Underwater species; transfer learning; k-nearest neighbors; global average pooling; EfficientNet
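The abstract names Focal Loss as one of its tools for class imbalance. As a minimal sketch, assuming the commonly cited per-sample form FL = -α (1 - p)^γ log p, where p is the predicted probability of the true class, it could look like this. The default values of `gamma` and `alpha` are illustrative only; the settings used in the paper are not given in the abstract.

```python
import math

def focal_loss(p_true, gamma=2.0, alpha=1.0):
    """Focal loss for one sample, given the predicted probability of its true class."""
    # The (1 - p)^gamma factor shrinks the loss of well-classified samples.
    return -alpha * (1.0 - p_true) ** gamma * math.log(p_true)

for p in (0.3, 0.6, 0.9):
    cross_entropy = focal_loss(p, gamma=0.0)  # gamma = 0 reduces to cross-entropy
    print(p, round(cross_entropy, 4), round(focal_loss(p), 4))
```

With γ = 0 the expression is ordinary cross-entropy; raising γ reduces the contribution of confidently correct predictions, so training concentrates on hard examples, which often come from rare classes.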
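An Evaluation Index for Node Importance in Complex Networks (Cited by: 1)
9
Authors: 蒋丰景, 陈玥琪 《西安工程大学学报》 CAS 2014, No. 1, pp. 140-142 (3 pages)
To evaluate the importance of each node in a network topology simply and effectively, a node importance function is defined based on node degree and local connectivity. This index is essentially consistent with the average shortest distance index of the network, and ranking nodes by its value yields an importance ordering of the nodes. Theoretical analysis and examples show that, for small networks, the method is computationally simple, intuitive, effective, and reasonable.
Keywords: node importance; neighbor nodes; node deletion; average shortest distance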
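Optimization of Public Service Facilities in She County, Hebei Province, Based on Multi-Source Data (Cited by: 3)
10
Authors: 田芳, 张政, 杨紫琼 《石家庄学院学报》 CAS 2021, No. 3, pp. 23-29 (7 pages)
Urban public service facilities are the spatial carriers of public goods and services, and their rational distribution matters for improving residents' quality of life. The sudden COVID-19 outbreak severely tested all kinds of urban public service facilities. Using multi-source data, including Amap (高德地图) POI data and government records, average nearest neighbor analysis, multi-distance spatial cluster analysis, and kernel density analysis were carried out in ArcGIS on the public service facilities in the main urban area of She County, Handan. The results show that shopping, catering, financial, educational, medical, and public transport facilities are spatially clustered, whereas cultural and sports, elderly care, and leisure and recreation facilities follow a random pattern. Commercial facilities are generally more clustered than public-welfare facilities; medical facilities show the greatest clustering intensity and scale, and public transport facilities the smallest. Facility density is higher in the western old town than in the eastern new town, and medical, leisure and recreation, and social welfare facilities are spatially mismatched with residential communities. The findings can inform the optimization of public service facilities and the improvement of residents' quality of life.
Keywords: multi-source data; urban public service facilities; average nearest neighbor analysis; multi-distance spatial cluster analysis; kernel density analysis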
An up-to-date comparative analysis of the KNN classifier distance metrics for text categorization
11
Author: Onder Coban 《Data Science and Informetrics》 2023, No. 2, pp. 67-78 (12 pages)
Text categorization (TC) is one of the widely studied branches of text mining and has many applications in different domains. It tries to automatically assign a text document to one of the predefined categories often by using machine learning (ML) techniques. Choosing the best classifier in this task is the most important step in which k-Nearest Neighbor (KNN) is widely employed as a classifier as well as several other well-known ones such as Support Vector Machine, Multinomial Naive Bayes, Logistic Regression, and so on. The KNN has been extensively used for TC tasks and is one of the oldest and simplest methods for pattern classification. Its performance crucially relies on the distance metric used to identify nearest neighbors such that the most frequently observed label among these neighbors is used to classify an unseen test instance. Hence, in this paper, a comparative analysis of the KNN classifier is performed on a subset (i.e., R8) of the Reuters-21578 benchmark dataset for TC. Experimental results are obtained by using different distance metrics as well as recently proposed distance learning metrics under different cases where the feature model and term weighting scheme are different. Our comparative evaluation of the results shows that Bray-Curtis and Linear Discriminant Analysis (LDA) are often superior to the other metrics and work well with raw term frequency weights.
Keywords: Text categorization; k-nearest neighbor; distance metric; distance learning algorithms
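To make the role of the distance metric concrete, here is a minimal sketch of KNN voting with the Bray-Curtis distance on raw term-frequency vectors, the combination the abstract reports as working well. The three-term vocabulary, the training documents, and the category labels are hypothetical toy data, not drawn from Reuters-21578, and the code is not the author's experimental setup.

```python
from collections import Counter

def bray_curtis(u, v):
    """Bray-Curtis distance: sum|u_i - v_i| / sum|u_i + v_i|."""
    denom = sum(abs(a + b) for a, b in zip(u, v))
    if denom == 0:
        return 0.0  # both vectors are all zeros
    return sum(abs(a - b) for a, b in zip(u, v)) / denom

def knn_predict(train, query, k, metric=bray_curtis):
    """Majority label among the k training vectors closest to the query."""
    nearest = sorted(train, key=lambda item: metric(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Raw term-frequency vectors over a hypothetical 3-term vocabulary.
train = [
    ([3, 0, 1], "grain"),
    ([2, 0, 0], "grain"),
    ([0, 4, 1], "trade"),
    ([0, 2, 2], "trade"),
]
print(knn_predict(train, [1, 0, 0], k=3))  # → grain
```

Swapping `metric` for another distance function is the only change needed to compare metrics under the same voting rule, which mirrors the comparison the abstract describes.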
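Spatial Point Pattern Analysis of Severe Fever with Thrombocytopenia Syndrome Cases in Weifang City
12
Authors: 范俊杰, 王怡, 霍锡元, 李东英, 王锐 《实用预防医学》 CAS 2024, No. 1, pp. 28-30 (3 pages)
Objective: To explore the spatial pattern of severe fever with thrombocytopenia syndrome (SFTS) cases in Weifang City and provide a basis for disease prevention and control. Methods: The epidemiological characteristics of SFTS were described, and the average nearest neighbor index, multi-distance spatial cluster analysis, and kernel density estimation were used to identify the spatial point pattern of SFTS cases and reveal their spatial distribution. Results: From 2011 to 2021, 387 SFTS cases were reported in Weifang City, an average annual incidence of 0.38 per 100,000, with a single peak from May to August. Qingzhou City, Linqu County, and Anqiu City were the main high-incidence areas; incidence rose gradually up to age 60 and did not decline thereafter. ANNI = 0.648, Z = -12.640 (P < 0.01); the L(d) of Ripley's K function exceeded the expected value at distances below 48.00 km; four high-density areas were found: the border of Linqu County and Qingzhou City, the border of Kuiwen District and Weicheng District, Wangfu Subdistrict in Qingzhou City, and Huiqu Town in Anqiu City. Conclusions: The point pattern of SFTS cases in Weifang City shows clear spatial clustering, with the degree of clustering peaking within a range of 100.00 km, and the spatial pattern largely coincides with the distribution of mountains and hills. Spatial pattern analysis of point data has certain advantages over autocorrelation analysis.
Keywords: severe fever with thrombocytopenia syndrome; spatial pattern; average nearest neighbor index; multi-distance spatial cluster analysis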
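The ANNI and Z values reported above can be read with the help of a small sketch. It assumes the formulas usually documented for the average nearest neighbor statistic: observed mean nearest-neighbor distance divided by the expectation 0.5 / sqrt(n / A) under complete spatial randomness, with standard error 0.26136 / sqrt(n² / A). Whether the paper used exactly this form, and what study area A it used, is not stated in the abstract; the points below are hypothetical.

```python
import math

def average_nearest_neighbor(points, area):
    """Return (index, z_score) for a point pattern inside a study area."""
    n = len(points)
    # Observed mean distance from each point to its nearest neighbor.
    observed = sum(
        min(math.dist(p, q) for j, q in enumerate(points) if j != i)
        for i, p in enumerate(points)
    ) / n
    expected = 0.5 / math.sqrt(n / area)           # mean under a random pattern
    std_error = 0.26136 / math.sqrt(n * n / area)
    return observed / expected, (observed - expected) / std_error

# Two tight pairs inside a 10 x 10 study area (hypothetical coordinates).
clustered = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
index, z = average_nearest_neighbor(clustered, area=100.0)
print(round(index, 3), round(z, 3))
```

An index below 1 with a strongly negative z-score indicates clustering, which is how the reported ANNI = 0.648 and Z = -12.640 are interpreted. The result depends on the chosen study area, so the same points can score differently under a different boundary.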
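Industrial Spatial Pattern of the Guangzhou Metropolitan Area Based on Point Data and GIS (Cited by: 25)
13
Authors: 田光进, 沙默泉 《地理科学进展》 CSCD, PKU Core (北大核心) 2010, No. 4, pp. 387-395 (9 pages)
Using 2004 digital city data, this study examines the spatial relationships within and between industries in the Guangzhou metropolitan area and compares the spatial patterns of industries in its central city and new urban districts. Industries were grouped into 10 categories, including manufacturing, wholesale and transport, retail, producer services, real estate, management services, education, health care and social assistance, and entertainment facilities. Point density for each industry was mapped on a 1 km² grid, and the spatial distribution of firms was analyzed using district-level industry shares and location quotients: the dominant industries in the central city are services such as management services, real estate, retail, and finance and insurance, whereas the new urban districts are dominated by manufacturing, wholesale and transport, and producer services. Average nearest neighbor distance was used to analyze the spatial relationships among firms within each industry in the central city and the new districts: firms in every industry are clustered, with finance the most concentrated in the central city, followed by real estate, producer services, entertainment, and management services. A proximity index was used to analyze the spatial relationships between industries, showing strong proximity between producer services and management services, between education and health care and social assistance, and between entertainment and retail.
Keywords: Guangzhou metropolitan area; industrial spatial pattern; location quotient; average nearest neighbor distance; proximity index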
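The location quotient used in this study compares an industry's share in one district with its share in the whole region. A minimal sketch of the standard definition follows; the function name and the firm counts are hypothetical and are not data from the paper.

```python
def location_quotient(local_industry, local_total, region_industry, region_total):
    """Share of an industry in a district divided by its share in the region."""
    return (local_industry / local_total) / (region_industry / region_total)

# Hypothetical counts: 50 of 200 firms in a district are in finance,
# against 125 of 1000 firms across the whole metropolitan area.
print(location_quotient(50, 200, 125, 1000))  # → 2.0
```

A value above 1 means the industry is over-represented in the district relative to the region, which is the sense in which the abstract identifies the dominant industries of the central city and the new districts.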