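In the medical field, pervasive missing data makes clinical prediction models harder to build. To address the problem that some features of high medical value are discarded because of their high missing rates, a mutual-information-weighted K-nearest-neighbor imputation algorithm (Weighted KNN Imputation Algorithm Based on Mutual Information, MIW-KNN) is proposed. First, on a dataset of patients with heart failure complicated by Clostridioides difficile infection, the proposed method is compared with multiple imputation, K-nearest neighbor (KNN) imputation, mean imputation, and other methods to verify its effectiveness. Second, the mortality-risk prediction performance of different models is compared to verify the performance advantage of the proposed method. Using 20 features selected by univariate analysis, prediction models are built with each of 9 machine learning algorithms. AUC (Area Under the Receiver Operating Characteristic Curve) and accuracy serve as the main metrics for evaluating model performance, and SHAP (Shapley Additive Explanations) is used to interpret the influence of different clinical features on the model. The results show that MIW-KNN achieves the highest imputation accuracy, and that the random forest model built on the dataset imputed with it achieves the best predictive performance, with an AUC of 0.841 and an accuracy of 0.821. SHAP shows that red cell distribution width, crystalloid infusion, and white blood cell count are the three most influential features.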
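The idea of weighting a KNN imputer by mutual information can be illustrated with a minimal Python sketch. This is not the MIW-KNN algorithm itself, whose details the abstract does not give. It assumes that only one column (`target`) has missing values, that features are on comparable scales, and that mutual information is estimated with a simple 2-D histogram. The function names, the bin count, and the choice of a plain neighbor average are all illustrative assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information (in nats) of two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))

def mi_weighted_knn_impute(X, target, k=5, bins=8):
    """Fill NaNs in column `target` with the mean of the k nearest complete rows.

    Distances use per-feature weights proportional to each feature's estimated
    mutual information with the target column (an illustrative assumption).
    """
    X = np.asarray(X, dtype=float)
    others = [j for j in range(X.shape[1]) if j != target]
    donors = X[~np.isnan(X).any(axis=1)]  # rows with no missing values
    w = np.array([mutual_information(donors[:, j], donors[:, target], bins)
                  for j in others])
    # Normalize the weights; fall back to uniform weights if all MI are zero.
    w = w / w.sum() if w.sum() > 0 else np.full(len(others), 1.0 / len(others))
    filled = X.copy()
    for i in np.where(np.isnan(X[:, target]))[0]:
        diff = donors[:, others] - X[i, others]
        dist = np.sqrt((w * diff ** 2).sum(axis=1))  # weighted Euclidean distance
        nearest = np.argsort(dist)[:k]
        filled[i, target] = donors[nearest, target].mean()
    return filled
```

Features that carry little information about the target column contribute little to the distance, so neighbors are chosen mainly along informative features.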
Mining from ambiguous data is very important in data mining. This paper discusses one of the tasks for mining from ambiguous data, known as the multi-instance problem. In the multi-instance problem, each pattern is a labeled bag that consists of a number of unlabeled instances. A bag is negative if all instances in it are negative. A bag is positive if it has at least one positive instance. Because the instances in a positive bag are not labeled, each positive bag is ambiguous. The mining aim is to classify unseen bags. The main idea of existing multi-instance algorithms is to find the true positive instances in positive bags, convert the multi-instance problem to a supervised problem, and obtain the labels of test bags by predicting the labels of unknown instances. In this paper, we aim at mining multi-instance data from another point of view, i.e., excluding the false positive instances in positive bags and predicting the label of an entire unknown bag. We propose an algorithm called Multi-Instance Covering kNN (MICkNN) for mining from multi-instance data. Briefly, the constructive covering algorithm is first used to restructure the original multi-instance data. Then, the kNN algorithm is applied to discriminate the false positive instances. In the test stage, we label the tested bag directly according to the similarity between the unseen bag and the sphere neighbors obtained from the previous two steps. Experimental results demonstrate that the proposed algorithm is competitive with most state-of-the-art multi-instance methods in both classification accuracy and running time.
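The bag-labeling rule stated above, and the idea of predicting a label for a whole bag rather than for single instances, can be sketched in Python as follows. This is a generic bag-level kNN and not MICkNN: the constructive covering step and the false-positive filtering are omitted, and the minimum instance-to-instance distance used as bag similarity, like all function names here, is an assumption made for illustration.

```python
import numpy as np

def bag_label(instance_labels):
    """A bag is positive (1) if at least one instance is positive, else negative (0)."""
    return int(any(label == 1 for label in instance_labels))

def min_bag_distance(bag_a, bag_b):
    """Smallest Euclidean distance between any instance of bag_a and any of bag_b."""
    a = np.asarray(bag_a, dtype=float)
    b = np.asarray(bag_b, dtype=float)
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)  # all pairwise distances
    return float(d.min())

def knn_bag_predict(train_bags, train_labels, test_bag, k=3):
    """Label a whole bag by majority vote over its k closest training bags."""
    dists = np.array([min_bag_distance(test_bag, bag) for bag in train_bags])
    nearest = np.argsort(dists)[:k]
    votes = np.asarray(train_labels)[nearest]
    return int(votes.sum() * 2 > len(votes))
```

With a minimum-distance similarity, negative instances inside a positive bag can pull that bag close to negative bags, which is the kind of false-positive ambiguity the abstract describes.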
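To address the long running time of linear K-nearest neighbor (KNN) search on 3D laser point clouds, a method is proposed that uses a multi-processor system on chip (MPSoC) field-programmable gate array (FPGA) to perform fast KNN search on 3D laser point clouds. First, an MPSoC FPGA implementation framework for the 3D laser point cloud KNN algorithm is presented. Then, the design rationale and implementation of each module are described in detail. Finally, a test platform is built from an MZU15A development board and a 天眸 (Tianmou) 16-line rotating mechanical LiDAR, and the MPSoC FPGA acceleration of the 3D laser point cloud KNN algorithm is tested and verified. Experimental results show that the MPSoC FPGA implementation of the 3D laser point cloud KNN algorithm reduces neighbor search time while preserving neighbor search accuracy.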
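For reference, the linear (brute-force) KNN search that such hardware aims to accelerate can be sketched in a few lines of Python. This is only a software baseline under assumed conventions (points as an N×3 array, Euclidean distance, and a hypothetical function name `linear_knn`); it says nothing about the FPGA design itself.

```python
import numpy as np

def linear_knn(points, query, k):
    """Brute-force KNN: return indices and distances of the k points nearest to query.

    Every point is compared with the query, so the cost grows linearly with the
    number of points. Requires 1 <= k <= len(points).
    """
    pts = np.asarray(points, dtype=float)
    d2 = ((pts - np.asarray(query, dtype=float)) ** 2).sum(axis=1)  # squared distances
    idx = np.argpartition(d2, k - 1)[:k]   # the k smallest, in arbitrary order
    idx = idx[np.argsort(d2[idx])]         # sort those k, nearest first
    return idx, np.sqrt(d2[idx])
```

Because each query scans the whole cloud, the work per query is proportional to the number of points, which is why the search becomes slow on dense LiDAR frames.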
The study aims to investigate the financial technology (FinTech) factors influencing Chinese banking performance. Financial expectations and global realities may be changed by FinTech’s multidimensional scope, which is lacking in the traditional financial sector. The use of technology to automate financial services is becoming more important for economic organizations and industries because the digital age has seen a period of transition in terms of consumers and personalization. The future of FinTech will be shaped by technologies like the Internet of Things, blockchain, and artificial intelligence. The involvement of these platforms in financial services is a major concern for global business growth. FinTech is becoming more popular with customers because of such benefits. FinTech has driven a fundamental change within the financial services industry, placing the client at the center of everything. Protection has become a primary focus since data are a component of FinTech transactions. The task of consolidating research reports for consensus is very manual, as there is no standardized format. Although existing research has proposed certain methods, they have certain drawbacks in FinTech payment systems (including cryptocurrencies), credit markets (including peer-to-peer lending), and insurance systems. This paper implements blockchain-based financial technology for the banking sector to overcome these transition issues. In this study, we have proposed an adaptive neuro-fuzzy-based K-nearest neighbors algorithm. The chaotic improved foraging optimization algorithm is used to optimize the proposed method. The rolling window autoregressive lag modeling approach analyzes FinTech growth. The proposed algorithm is compared with existing approaches to demonstrate its efficiency. The findings showed that it achieved 91% accuracy, 90% privacy, 96% robustness, and 25% cyber-risk performance. Compared with traditional approaches, the recommended strategy will be more convenient, safe, and effective in the transition period.
Funding: the National Natural Science Foundation of China (Nos. 61073117 and 61175046), the Provincial Natural Science Research Program of Higher Education Institutions of Anhui Province (No. KJ2013A016), the Academic Innovative Research Projects of Anhui University Graduate Students (No. 10117700183), and the 211 Project of Anhui University.
Funding: from funding agencies in the public, commercial, or not-for-profit sectors.