242 articles found
Support vector machine for predicting protein interactions using domain scores
1
Authors: 彭新俊, 王翼飞. 《Journal of Shanghai University (English Edition)》, CAS, 2009, No. 3, pp. 207-212 (6 pages)
Protein-protein interactions play a crucial role in cellular processes such as metabolic pathways and immunological recognition. This paper presents a new domain score-based support vector machine (SVM) to infer protein interactions, which can be used not only to explore all possible domain interactions by the kernel method, but also to reflect the evolutionary conservation of domains in proteins by using the domain scores of proteins. The experimental result on the Saccharomyces cerevisiae dataset demonstrates that this approach can predict protein-protein interactions with higher performance than the existing approaches.
Keywords: protein-protein interactions; domains; support vector machine (SVM); domain score
A machine learning model for diagnosing acute pulmonary embolism and comparison with Wells score, revised Geneva score, and Years algorithm
2
Authors: Linfeng Xi, Han Kang, Mei Deng, Wenqing Xu, Feiya Xu, Qian Gao, Wanmu Xie, Rongguo Zhang, Min Liu, Zhenguo Zhai, Chen Wang. 《Chinese Medical Journal》, SCIE, CAS, CSCD, 2024, No. 6, pp. 676-682 (7 pages)
Background: Acute pulmonary embolism (APE) is a fatal cardiovascular disease, yet missed diagnosis and misdiagnosis often occur due to non-specific symptoms and signs. A simple, objective technique will help clinicians make a quick and precise diagnosis. In population studies, machine learning (ML) plays a critical role in characterizing cardiovascular risks, predicting outcomes, and identifying biomarkers. This work sought to develop an ML model for helping APE diagnosis and compare it against current clinical probability assessment models. Methods: This is a single-center retrospective study. Patients with suspected APE were continuously enrolled and randomly divided into training and testing sets. A total of 8 ML models, including random forest (RF), Naïve Bayes, decision tree, K-nearest neighbors, logistic regression, multi-layer perceptron, support vector machine, and gradient boosting decision tree, were developed based on the training set to diagnose APE. Thereafter, the model with the best diagnostic performance was selected and evaluated against the current clinical assessment strategies, including the Wells score, revised Geneva score, and Years algorithm. Eventually, the ML model was internally validated to assess the diagnostic performance using receiver operating characteristic (ROC) analysis. Results: The ML models were constructed using eight clinical features, including D-dimer, cardiac troponin T (cTNT), arterial oxygen saturation, heart rate, chest pain, lower limb pain, hemoptysis, and chronic heart failure. Among the eight ML models, the RF model achieved the best performance with the highest area under the curve (AUC = 0.774). Compared to the current clinical assessment strategies, the RF model outperformed the Wells score (P = 0.030) and was not inferior to any other clinical probability assessment strategy. The AUC of the RF model for diagnosing APE onset in the internal validation set was 0.726. Conclusions: Based on the RF algorithm, a novel prediction model was constructed for APE diagnosis. When compared to the current clinical assessment strategies, the RF model achieved better diagnostic efficacy and accuracy. Therefore, the ML algorithm can be a useful tool in assisting with the diagnosis of APE.
Keywords: acute pulmonary embolism; machine learning; Wells score; revised Geneva score; Years algorithm
Computational prediction of RNA tertiary structures using machine learning methods (Cited by: 1)
3
Authors: Bin Huang, Yuanyang Du, Shuai Zhang, Wenfei Li, Jun Wang, Jian Zhang. 《Chinese Physics B》, SCIE, EI, CAS, CSCD, 2020, No. 10, pp. 17-23 (7 pages)
RNAs play crucial and versatile roles in biological processes. Computational prediction approaches can help to understand RNA structures and their stabilizing factors, thus providing information on their functions and facilitating the design of new RNAs. Machine learning (ML) techniques have made tremendous progress in many fields in the past few years. Although their usage in protein-related fields has a long history, the use of ML methods in predicting RNA tertiary structures is new and rare. Here, we review the recent advances in using ML methods for RNA structure prediction and discuss the advantages and limitations, and the difficulties and potential, of these approaches when applied in the field.
Keywords: RNA structure prediction; RNA scoring function; knowledge-based potentials; machine learning; convolutional neural networks
Using the Support Vector Machine Algorithm to Predict β-Turn Types in Proteins
4
Authors: Xiaobo Shi, Xiuzhen Hu. 《Engineering(科研)》, 2013, No. 10, pp. 386-390 (5 pages)
The structure and function of proteins are closely related: protein structure determines function, so protein structure prediction is important. β-turns are important components of protein secondary structure, so an accurate method for predicting β-turn types is needed. In this paper, we used a composite vector combining the position conservation scoring function, increment of diversity, and predicted secondary structure information as the input to a support vector machine algorithm for predicting β-turn types in a database of 426 protein chains. The overall prediction accuracies were 95.6%, 97.8%, 97.0%, 98.9%, 99.2%, 91.8%, 99.4% and 83.9%, with Matthews correlation coefficient values of 0.74, 0.68, 0.20, 0.49, 0.23, 0.47, 0.49 and 0.53, for types I, II, VIII, I’, II’, IV, VI and non-turn, respectively, which the authors report as better than other prediction methods.
Keywords: support vector machine algorithm; increment of diversity value; position conservation scoring function value; secondary structure information
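The abstract above reports high accuracies next to much lower Matthews correlation coefficients (for example 97.0% and 0.20 for type VIII). A minimal sketch of why the two can diverge on a rare class follows; the confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for a rare class: 25 true positives among 1000 samples.
tp, tn, fp, fn = 5, 970, 5, 20
acc = accuracy(tp, tn, fp, fn)            # dominated by the many true negatives
mcc = matthews_corrcoef(tp, tn, fp, fn)   # penalizes the 20 missed positives
print(f"accuracy={acc:.3f}, MCC={mcc:.3f}")
```

Because the coefficient uses all four cells of the confusion matrix, it stays low when most members of a rare class are missed, even if overall accuracy is high.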
Protein domain boundary prediction by combining support vector machine and domain guess by size algorithm
5
Authors: 董启文, Wang Xiaolong, Lin Lei. 《High Technology Letters》, EI, CAS, 2007, No. 1, pp. 74-78 (5 pages)
Successful prediction of protein domain boundaries provides valuable information not only for the computational structure prediction of multi-domain proteins but also for experimental structure determination. A novel method for domain boundary prediction is presented, which combines the support vector machine with the domain guess by size algorithm. Since the evolutionary information of multiple domains can be detected by the position specific score matrix, the support vector machine is trained and tested using the values of the position specific score matrix generated by PSI-BLAST. The candidate domain boundaries are selected from the output of the support vector machine and are then input to the domain guess by size algorithm to give the final domain boundary prediction. The experimental results show that the combined method outperforms both the support vector machine and domain guess by size used individually.
Keywords: domain boundary prediction; support vector machine; domain guess by size; position specific score matrix
The Comparison between Random Forest and Support Vector Machine Algorithm for Predicting β-Hairpin Motifs in Proteins
6
Authors: Shaochun Jia, Xiuzhen Hu, Lixia Sun. 《Engineering(科研)》, 2013, No. 10, pp. 391-395 (5 pages)
Building on earlier research on predicting β-hairpin motifs in proteins, we apply the Random Forest and Support Vector Machine algorithms to predict β-hairpin motifs in the ArchDB40 dataset. Motifs with loop lengths of 2 to 8 amino acid residues are extracted as the research object, and a fixed-length pattern of 12 amino acids is selected. With the same characteristic parameters and the same test method, the Random Forest algorithm is more effective than the Support Vector Machine. In addition, because the authors find that Random Forest does not overfit when the dimension of the characteristic parameters is higher, Random Forest with higher-dimensional characteristic parameters is used to predict β-hairpin motifs. Better prediction results are obtained: the overall accuracy and Matthews correlation coefficient under 5-fold cross-validation reach 83.3% and 0.59, respectively.
Keywords: random forest algorithm; support vector machine algorithm; β-hairpin motif; increment of diversity; scoring function; predicted secondary structure information
Machine Learning Technology for Evaluation of Liver Fibrosis, Inflammation Activity and Steatosis (LIVERFASt™)
7
Authors: Abhishek Aravind, Avinash G. Bahirvani, Ronald Quiambao, Teresa Gonzalo. 《Journal of Intelligent Learning Systems and Applications》, 2020, No. 2, pp. 31-49 (19 pages)
Using the latest available artificial intelligence (AI) technology, an advanced algorithm, LIVERFASt™, has been used to evaluate the diagnostic accuracy of machine learning (ML) biomarker algorithms to assess liver damage. The prevalence of NAFLD (nonalcoholic fatty liver disease) and the resulting NASH (nonalcoholic steatohepatitis) is constantly increasing worldwide, creating challenges for screening, as the diagnosis of NASH requires invasive liver biopsy. Key issues in NAFLD patients are the differentiation of NASH from simple steatosis and the identification of advanced hepatic fibrosis. In this prospective study, the staging of three different lesions of the liver to diagnose fatty liver was analyzed using a proprietary ML algorithm, LIVERFASt™, developed with a database of 2862 unique medical assessments of biomarkers, where 1027 assessments were used to train the algorithm and 1835 constituted the validation set. Data of 13,068 patients who underwent the LIVERFASt™ test for evaluation of fatty liver disease were analysed. Data evaluation revealed that 11% of the patients exhibited significant fibrosis, with fibrosis scores of 0.6 - 1.00. Approximately 7% of the population had severe hepatic inflammation. Steatosis was observed in most patients (63%), whereas severe steatosis (S3) was observed in 20%. Using modified SAF (Steatosis, Activity and Fibrosis) scores obtained using the LIVERFASt™ algorithm, NAFLD was detected in 13.41% of the patients (Sx > 0, Ay 0). Approximately 1.91% (Sx > 0, Ay = 2, Fz > 0) of the patients showed NAFLD or NASH scorings, while 1.08% had confirmed NASH (Sx > 0, Ay > 2, Fz = 1 - 2) and 1.49% had advanced NASH (Sx > 0, Ay > 2, Fz = 3 - 4). The modified SAF scoring system generated by LIVERFASt™ provides a simple and convenient evaluation of NAFLD and NASH in a cohort of Southeast Asians. This system may lead to the use of noninvasive liver tests in extended populations for more accurate diagnosis of liver pathology, prediction of the clinical path of individuals at all stages of liver disease, and provision of an efficient system for therapeutic interventions.
Keywords: machine learning (ML); artificial intelligence (AI); neural networks (NNs); steatosis, inflammation activity, fibrosis (SAF score); nonalcoholic fatty liver disease (NAFLD); non-alcoholic steatohepatitis (NASH)
Leveraging Geospatial Technology for Smallholder Farmer Credit Scoring
8
Authors: Susan A. Okeyo, Galcano C. Mulaku, Collins M. Mwange. 《Journal of Geographic Information System》, 2023, No. 5, pp. 524-539 (16 pages)
According to the Food and Agriculture Organization of the United Nations (FAO), there are about 500 million smallholder farmers in the world, and in developing countries such farmers produce about 80% of the food consumed there; their farming activities are therefore critical to the economies of their countries and to global food security. However, these farmers have limited access to credit. Many of them farm on unregistered land that cannot be offered as collateral to lending institutions. Even when they are on registered land, the fear of losing it should they default on loan payments often prevents them from applying for farm credit, and if they do apply, they are still disadvantaged by low credit scores (a measure of creditworthiness). As a result, they are often unable to use optimal farm inputs such as fertilizer and good seeds. This depresses their yields and, in turn, has negative implications for food security in their communities and in the world, making it difficult for the UN to achieve its Sustainable Development Goal no. 2 (zero hunger). This study aimed to demonstrate how geospatial technology can be used to leverage farm credit scoring for the benefit of smallholder farmers. A survey was conducted within the study area to identify the smallholder farms and farmers. A sample of surveyed farmers was then subjected to credit scoring by machine learning. In the first instance, the traditional financial data approach was used, and the results showed that over 40% of the farmers could not qualify for credit. When non-financial geospatial data, i.e. the Normalized Difference Vegetation Index (NDVI), was introduced into the scoring model, the share of farmers not qualifying for credit fell significantly, to 24%. It is concluded that introducing the NDVI variable into the traditional scoring model could significantly improve smallholder farmers’ chances of accessing credit, enabling such farmers to be evaluated for credit on the basis of the health of their crops rather than on a traditional form of collateral.
Keywords: credit scoring; machine learning; geospatial technology; Migori
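For readers unfamiliar with the NDVI variable mentioned in the abstract above, the standard definition is NDVI = (NIR - Red) / (NIR + Red), computed from near-infrared and red reflectance. The sketch below shows that formula and one hypothetical way to reduce a farm's pixels to a single feature; the abstract does not say how the authors aggregated NDVI, so the per-farm mean and the sample values are assumptions.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel's reflectances."""
    total = nir + red
    if total == 0:
        return 0.0  # no signal in either band; treat as neutral
    return (nir - red) / total


def farm_mean_ndvi(pixels):
    """Average NDVI over (nir, red) pixel pairs; an assumed aggregation."""
    values = [ndvi(nir, red) for nir, red in pixels]
    return sum(values) / len(values)


# Hypothetical reflectance pairs (nir, red) for pixels inside one farm boundary.
farm_pixels = [(0.5, 0.1), (0.6, 0.2), (0.4, 0.2)]
print(round(farm_mean_ndvi(farm_pixels), 3))
```

Values near 1 indicate dense, healthy vegetation, while values near 0 or below indicate bare soil or water, which is why the index can serve as a proxy for crop health.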
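Protein-ligand binding affinity prediction based on LSTM and attention mechanism
9
Authors: 王伟, 吴世玉, 刘栋, 梁慧茹, 史进玲, 周运, 张红军, 王鲜芳. 《陕西师范大学学报(自然科学版)》, CAS, CSCD, PKU Core, 2024, No. 3, pp. 76-84 (9 pages)
Predicting protein-ligand binding affinity is a challenging regression task in drug repositioning. Deep learning methods can effectively predict the binding affinity of protein-ligand interactions, reducing the time and cost of drug discovery. To this end, a deep convolutional neural network model (DLLSA) is proposed based on a long short-term memory (LSTM) module and an attention module. The model is built from parallel convolutional network modules that embed LSTM and spatial-attention modules: the LSTM module targets the long-sequence information in protein-ligand contact features, and the spatial-attention module aggregates local information from the contact features. The model was trained on the PDBbind (v.2020) dataset and validated on the CASF-2013 and CASF-2016 datasets. Its Pearson correlation coefficient improved by 0.6% and 3%, respectively, over the PLEC model, and the experimental results are significantly better than those of other related methods.
Keywords: binding affinity; convolutional neural network; attention mechanism; scoring function; machine learning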
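Research on a machine learning-based personal credit scoring prediction model under imbalanced data
10
Author: 费振华. 《长江信息通信》, 2024, No. 4, pp. 112-114 (3 pages)
The article introduces the basic concept of personal credit scoring, imbalanced data and methods for handling it, and the application of machine learning algorithms in credit scoring. Data preprocessing, covering data sources and characteristics, data cleaning and organization, analysis of data imbalance, data augmentation methods, and evaluation of their effect, then provides the basis for subsequent model building. Finally, a real dataset is used to train and test the model and to evaluate its performance. The experimental results show that the machine learning-based personal credit scoring prediction model under imbalanced data can effectively predict personal credit risk, which is significant for risk management and credit decisions at financial institutions.
Keywords: personal credit scoring; imbalanced data; machine learning; data preprocessing; model research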
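Application of RCMNAAPE in rotating machinery fault diagnosis
11
Authors: 储祥冬, 戴礼军, 涂金洲, 罗震寰, 于震, 秦磊. 《机电工程》, CAS, PKU Core, 2024, No. 6, pp. 1039-1049 (11 pages)
Refined composite multiscale permutation entropy (RCMPE) cannot fully extract fault information from rotating machinery vibration signals, which makes fault recognition accuracy unstable. To address this defect, a rotating machinery fault diagnosis method is proposed based on refined composite multiscale normalized amplitude-aware permutation entropy (RCMNAAPE), Laplacian score (LS), and a support vector machine optimized by the grey wolf algorithm (GWO-SVM). First, the permutation entropy in RCMPE was replaced with amplitude-aware permutation entropy to obtain RCMNAAPE, which was used to extract fault features from rotating machinery vibration signals and generate feature samples. Next, LS was used to select, from the original high-dimensional fault feature vectors, a smaller set of features that describe the fault state more accurately, forming sensitive feature samples. Finally, the low-dimensional fault feature vectors were fed into the GWO-optimized SVM for training and testing to identify and classify the faults of the rotating machinery samples. RCMNAAPE-LS-GWO-SVM was compared with other fault diagnosis methods on rolling bearing and gearbox fault datasets and evaluated. The results show that the RCMNAAPE-LS-GWO-SVM method can effectively identify the various faults of rotating machinery, with recognition accuracy higher than that of the other methods compared: 99.33% for rolling bearing faults and 98.67% for gearbox faults. Its feature extraction efficiency is poor, with average feature extraction times of 153.02 s and 163.98 s, respectively, better only than refined composite multiscale fuzzy entropy (RCMFE), but its overall performance is superior.
Keywords: fault recognition accuracy; rolling bearing; gearbox; refined composite multiscale normalized amplitude-aware permutation entropy; Laplacian score; grey wolf optimized support vector machine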
Synergistic application of molecular docking and machine learning for improved binding pose (Cited by: 1)
12
Authors: Yaqi Li, Hongrui Lin, He Yang, Yannan Yuan, Rongfeng Zou, Gengmo Zhou, Linfeng Zhang, Hang Zheng. 《National Science Open》, 2024, No. 2, pp. 36-45 (10 pages)
Accurate prediction of protein-ligand complex structures is a crucial step in structure-based drug design. Traditional molecular docking methods exhibit limitations in terms of accuracy and sampling space, while relying on machine-learning approaches may lead to invalid conformations. In this study, we propose a novel strategy that combines molecular docking and machine learning methods. Firstly, the protein-ligand binding poses are predicted using a deep learning model. Subsequently, position-restricted docking on the predicted binding poses is performed using Uni-Dock, generating physically constrained and valid binding poses. Finally, the binding poses are re-scored and ranked using machine learning scoring functions. This strategy harnesses the predictive power of machine learning and the physical-constraint advantage of molecular docking. Evaluation experiments on multiple datasets demonstrate that, compared to using molecular docking or machine learning methods alone, our proposed strategy can significantly improve the success rate and accuracy of protein-ligand complex structure predictions.
Keywords: binding pose; molecular docking; machine learning; machine learning scoring function
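Hybrid feature selection methods based on D-score and support vector machine (Cited by: 5)
13
Authors: 谢娟英, 雷金虎, 谢维信, 高新波. 《计算机应用》, CSCD, PKU Core, 2011, No. 12, pp. 3292-3296 (5 pages)
When F-score is used as a feature evaluation criterion, it does not account for the effect that the different measurement scales of different features have on feature importance. A new feature evaluation criterion, D-score, is therefore proposed. It measures the discriminative power of a feature between two or more classes and is unaffected by the features' measurement scales. With D-score as the feature importance criterion, combined with three feature search strategies (sequential forward search, sequential forward floating search, and sequential backward floating search), and with support vector machine classification accuracy used to evaluate the classification performance of feature subsets, three hybrid feature selection methods are obtained. These methods combine the respective advantages of the Filter and Wrapper approaches. Experiments on 9 standard datasets from the UCI machine learning repository, and comparison with hybrid feature selection methods based on the improved F-score and support vector machine, show that D-score is an effective criterion for measuring feature importance, that is, feature discriminative power. The hybrid feature selection methods based on this criterion and the support vector machine achieve effective feature selection, reducing dimensionality while preserving the discriminability of the dataset.
Keywords: D-score; F-score; support vector machine; feature selection; evaluation criterion; dimension reduction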
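Feature selection method based on improved F-score and support vector machine (Cited by: 31)
14
Authors: 谢娟英, 王春霞, 蒋帅, 张琰. 《计算机应用》, CSCD, PKU Core, 2010, No. 4, pp. 993-996 (4 pages)
The traditional F-score, which measures the discriminative power of a feature between two classes, is generalized into an improved F-score that can evaluate a feature's discriminative power not only between two classes but also among multiple classes. With the improved F-score as the feature selection criterion and a support vector machine (SVM) used to evaluate the effectiveness of the selected feature subset, effective feature selection is achieved. Experiments on six datasets from the UCI machine learning repository, with comparison against SVM and PCA+SVM, show that the feature selection method based on the improved F-score and SVM not only improves classification accuracy but also generalizes well, and that its training time is shorter than that of PCA+SVM.
Keywords: F-score; support vector machine; feature selection; principal component analysis; kernel principal component analysis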
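As an illustration of the kind of criterion the abstract above describes, the sketch below computes a multi-class F-score for a single feature as between-class scatter divided by the sum of within-class sample variances. This is one common formulation and reduces to the usual two-class F-score when there are two classes; the abstract does not give the paper's exact definition, so treat the formula and the toy data as assumptions.

```python
from statistics import mean, variance


def f_score(values, labels):
    """Multi-class F-score of one feature (assumed formulation).

    Numerator: squared distances of each class mean from the overall mean.
    Denominator: sum of per-class sample variances (each class needs >= 2 samples).
    """
    overall = mean(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    between = sum((mean(g) - overall) ** 2 for g in groups.values())
    within = sum(variance(g) for g in groups.values())
    return between / within if within > 0 else float("inf")


# Toy data: the first feature separates the classes, the second does not.
labels = [0, 0, 0, 1, 1, 1]
separating = [1, 2, 3, 7, 8, 9]
noisy = [1, 8, 3, 7, 2, 9]
print(f_score(separating, labels), f_score(noisy, labels))
```

Features would then be ranked by this score, and candidate subsets evaluated with a classifier such as an SVM, which is the filter-plus-wrapper pattern the abstract outlines.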
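P300 recognition algorithm based on F-score feature selection and support vector machine (Cited by: 5)
15
Authors: 杨立才, 李金亮, 姚玉翠, 吴晓晴. 《生物医学工程学杂志》, EI, CAS, CSCD, PKU Core, 2008, No. 1, pp. 23-26, 52 (5 pages)
How to identify the P300 component in EEG signals quickly and accurately is a hot topic in brain-computer interface research. For the P300 recognition problem, we propose a discrimination method that combines F-score feature selection with a support vector machine. The method uses F-score feature selection to reduce the dimensionality of the input features, overcoming the slow classification speed of the support vector machine algorithm, and then relies on the good classification performance of the support vector machine to recognize P300. The method was validated on the P300 dataset of BCI Competition 2003. The results show that, with 5 repetitions, the recognition accuracy reached 100%, and the classification speed increased nearly twofold compared with a conventional support vector machine without feature selection.
Keywords: brain-computer interface; P300; F-score feature selection; support vector machine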
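Research progress of in-hospital clinical early warning systems: from traditional models to artificial intelligence
16
Authors: 吴昌德, 袁世鑫, 黄力维, 杨毅, 刘松桥. 《实用医院临床杂志》, 2024, No. 4, pp. 48-52 (5 pages)
Identifying high-risk patients early and intervening promptly can prevent serious adverse events such as cardiopulmonary arrest during hospitalization. Before an in-hospital cardiopulmonary arrest, most patients show abnormalities in vital signs or physiological indicators. Clinical early warning systems monitor these key indicators to enable early identification of, and intervention for, high-risk patients, thereby reducing the incidence of adverse events. This article reviews the development of in-hospital clinical early warning systems, from traditional single-parameter systems, multi-parameter systems, and aggregate weighted warning systems, to information technology-based automated warning systems, and on to new clinical warning systems based on machine learning and artificial intelligence. It also assesses the effectiveness of these systems in real clinical settings and their potential to improve patient safety and outcomes.
Keywords: rapid response system; early warning score; clinical deterioration; automated warning; machine learning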
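Application of an improved F-score algorithm in speech emotion recognition (Cited by: 8)
17
Authors: 叶吉祥, 王聪慧. 《计算机工程与应用》, CSCD, 2013, No. 16, pp. 137-141 (5 pages)
The F-score feature selection algorithm cannot reveal mutual information among features and therefore cannot reduce dimensionality effectively. To address this problem, F-score is improved using a decorrelation method. Using the German emotional speech database EMO-DB, speech emotion features are extracted, and the feature subset with the best classification performance is selected according to the classification accuracy of a support vector machine (SVM). Compared with the F-score feature selection algorithm, the improved algorithm substantially reduces the dimensionality of the candidate feature set, selects an effective feature subset, and achieves good speech emotion recognition results.
Keywords: feature selection; F-score; mutual information; support vector machine; speech emotion recognition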
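Identification of carotid atherosclerosis in populations at intermediate-to-high cardiovascular disease risk: machine learning-based prediction models and validation
18
Authors: 刘忠典, 许琪, 陈伊静, 覃玲巧, 陈淑萍, 唐薇婷, 钟秋安. 《中国全科医学》, CAS, PKU Core, 2024, No. 30, pp. 3763-3771 (9 pages)
Background: Carotid atherosclerosis (CAS) is often regarded as a warning sign of cardiovascular disease (CVD). Its diagnostic technique, carotid Doppler ultrasound, is not included in public health service programs, and the Framingham Risk Score (FRS) is insufficiently accurate for assessing CAS risk, which hinders primary care staff in identifying CAS. Studies using machine learning to identify CAS in people at intermediate-to-high FRS risk are still lacking. Objective: To build machine learning prediction models for CAS in people at intermediate-to-high FRS risk, compare their discriminative performance, and select the best-performing model, in order to help primary care staff identify CAS more simply and accurately. Methods: Using convenience sampling, 674 local residents of two townships in Liuzhou, Guangxi Zhuang Autonomous Region, were enrolled in 2019-2021 and 2023. Relevant information was collected, and fasting blood and urine samples were taken for biochemical testing. FRS was used to assess CVD risk, and carotid ultrasound was used to diagnose CAS. The 517 participants from 2019-2021 were randomly split 8:2 into a training set and a validation set. The training set was used to build Logistic regression, random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost), and gradient boosting decision tree (GBDT) models, and the validation set was used for internal validation. The 157 participants from 2023 served as the test set for external validation. Feature variables were selected by Lasso regression. Discriminative performance was evaluated by sensitivity, specificity, accuracy, F1 score, and area under the curve (AUC); external validation used AUC to assess the generalization ability of the best model; and the Shapley Additive exPlanation (SHAP) method was used to explore the important variables affecting the best model's identification of CAS. Results: Lasso regression selected 15 non-zero feature variables: age, BMI, systolic blood pressure (SBP), smoking, alcohol consumption, hypertension, total cholesterol, high-density lipoprotein cholesterol, C-reactive protein (CRP), fasting blood glucose, apolipoprotein B (ApoB), lipoprotein a (LPA), aspartate aminotransferase (AST), AST/alanine aminotransferase ratio, and urinary microalbumin-to-creatinine ratio. The Logistic regression, RF, SVM, XGBoost, and GBDT models all had high AUC values. The GBDT model performed best, with sensitivity, specificity, accuracy, F1 score, and AUC of 0.7551, 0.8364, 0.7981, 0.7789, and 0.8349, respectively, and an external validation AUC of 0.7940. The SHAP method identified age, SBP, CRP, LPA, and ApoB as the top 5 factors affecting the GBDT model's identification of CAS. Conclusion: The machine learning-based Logistic regression, RF, SVM, XGBoost, and GBDT models for identifying CAS all showed high discriminative performance; the GBDT model had the best overall discriminative performance and also showed strong generalization ability.
Keywords: cardiovascular disease; carotid atherosclerosis; machine learning; Framingham Risk Score; identification; prediction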
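Automated scoring of open-ended situational judgment tests (Cited by: 1)
19
Authors: 徐静, 骆方, 马彦珍, 胡路明, 田雪涛. 《心理学报》, CSSCI, CSCD, PKU Core, 2024, No. 6, pp. 831-844 (14 pages)
Because of scoring costs, open-ended situational judgment tests are difficult to use widely. Taking teacher competency assessment as an example, this study explored the application of automated scoring. An open-ended situational judgment test was developed around typical problem scenarios in teaching, and response texts were collected from primary and secondary school teachers. A supervised learning strategy was used to apply deep neural networks to identify response categories at the document level and at the sentence level. The convolutional neural network (CNN) performed well: scoring accuracy for individual items was 70%-88%, agreement with human scoring was high, the correlation coefficient r between human and machine scores was 0.95, and the quadratic weighted kappa (QWK) was 0.82. The results show that machine scoring can achieve stable performance and that automated scoring research can support the wider use of open-ended situational judgment tests.
Keywords: situational judgment test; automated scoring; teacher competency; open-ended test; machine learning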
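The quadratic weighted kappa (QWK) reported in the abstract above measures agreement between two raters on an ordinal scale, penalizing disagreements by the squared distance between the assigned categories. A minimal sketch of the standard computation follows; the function name and the ratings are illustrative and are not data from the study.

```python
def quadratic_weighted_kappa(rater_a, rater_b, n_categories):
    """QWK for two equal-length lists of integer ratings in 0..n_categories-1."""
    n = len(rater_a)
    # Observed confusion matrix between the two raters.
    observed = [[0] * n_categories for _ in range(n_categories)]
    for i, j in zip(rater_a, rater_b):
        observed[i][j] += 1
    hist_a = [sum(row) for row in observed]
    hist_b = [sum(observed[i][j] for i in range(n_categories))
              for j in range(n_categories)]
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_categories):
        for j in range(n_categories):
            weight = (i - j) ** 2 / (n_categories - 1) ** 2
            weighted_observed += weight * observed[i][j]
            # Expected count under independence of the two raters.
            weighted_expected += weight * hist_a[i] * hist_b[j] / n
    if weighted_expected == 0:
        return 1.0  # both raters used a single identical category
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical human and machine scores on a 0-2 scale.
human = [0, 1, 2, 2, 1, 0]
machine = [0, 1, 2, 1, 1, 0]
print(round(quadratic_weighted_kappa(human, machine, 3), 3))
```

A value of 1 means perfect agreement, 0 means agreement no better than chance, and negative values mean systematic disagreement.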
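Ensemble feature selection algorithm combining F-score with kernel extreme learning machine (Cited by: 9)
20
Authors: 谢娟英, 郑清泉, 吉新媛. 《陕西师范大学学报(自然科学版)》, CAS, CSCD, PKU Core, 2020, No. 2, pp. 1-8 (8 pages)
Feature selection is the first and key step in analyzing high-dimensional, small-sample cancer gene data, but with existing feature selection algorithms the selected feature subset depends on the training samples and changes as they change. To address this instability, an ensemble feature selection method based on the kernel extreme learning machine is proposed. The original data are partitioned by 5-fold cross-validation. Each training set is further partitioned by 5-fold cross-validation, feature selection is performed on each partition, and the union of the resulting 5 feature subsets is taken as the feature subset of that training set. A kernel extreme learning machine is built to evaluate the classification performance of that feature subset, and the average Jaccard coefficient of the feature subsets obtained from the 5-fold cross-validation on the original dataset is used to evaluate the stability of the selected feature subsets. Experiments on 5 gene datasets, and comparison with the classic feature selection algorithms SVM-RFE, LLE Score, ARCO, DRJMIM, Random Forest, and mRMR, show that the proposed algorithm not only selects stable feature subsets but also that the selected subsets generalize well.
Keywords: F-score; feature selection; extreme learning machine; ensemble feature selection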
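The stability measure mentioned in the abstract above can be sketched as the mean Jaccard coefficient over all pairs of feature subsets selected in different cross-validation folds. The abstract does not specify how the average is taken, so the pairwise averaging below is an assumption, and the gene names are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two feature subsets."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pairwise_jaccard(subsets):
    """Average Jaccard coefficient over all pairs of subsets (needs >= 2 subsets)."""
    pairs = list(combinations(subsets, 2))
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical feature subsets selected in three folds.
fold_subsets = [
    {"gene1", "gene2", "gene3"},
    {"gene1", "gene2", "gene4"},
    {"gene1", "gene2", "gene3"},
]
print(round(mean_pairwise_jaccard(fold_subsets), 3))
```

A score of 1 means every fold selected the same features, and lower scores mean the selection depends more heavily on which samples were used for training.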