Abstract: Protein structure prediction is one of the central objectives of theoretical chemistry and bioinformatics, as it is of vital importance in medicine, biotechnology, and other fields. Protein secondary structure prediction (PSSP) plays a significant role in predicting protein tertiary structure, as it bridges the gap between protein primary sequences and tertiary structure prediction. Protein secondary structures are classified under two schemes: a 3-state scheme and an 8-state scheme. Predicting the 3 states and the 8 states of secondary structure from protein sequences are called the Q3 and Q8 prediction problems, respectively. The 8 classes reveal more precise structural information for a variety of applications than the 3 classes; however, Q8 prediction has been found to be very challenging, which is why previous work on PSSP has focused on Q3 prediction. In this paper, we develop an ensemble machine learning (ML) approach for Q8 PSSP to compare the performance of ensemble learning algorithms with that of individual ML algorithms. The ensemble members are well-known classifiers, namely Support Vector Machines (SVM), K-Nearest Neighbor (KNN), Decision Tree (DT), Random Forest (RF), and Naïve Bayes (NB), combined with two feature extraction techniques, namely Linear Discriminant Analysis (LDA) and Principal Component Analysis (PCA). Experiments were conducted to evaluate the performance of single models and ensemble models, with PCA and LDA, in Q8 PSSP. The novelty of this paper lies in the introduction of ensemble learning to the Q8 PSSP problem. The experimental results confirmed that ensemble ML models are more accurate than individual ML models. They also indicated that features extracted by LDA are more effective than those extracted by PCA.
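The relationship between the Q8 and Q3 problems can be illustrated with a minimal sketch. It assumes the eight states are written with the usual DSSP letters and uses one commonly used reduction to three classes; the abstract does not state which reduction or scoring the authors used, so both the mapping and the per-residue accuracy below are assumptions, and the label strings are made up for illustration.

```python
# One commonly used reduction of the eight DSSP states to three classes.
# Other reductions exist; this mapping is an assumption, not taken from the paper.
Q8_TO_Q3 = {
    "H": "H", "G": "H", "I": "H",   # alpha-, 3-10-, and pi-helix -> helix
    "E": "E", "B": "E",             # extended strand, isolated bridge -> strand
    "T": "C", "S": "C", "C": "C",   # turn, bend, coil/other -> coil
}


def reduce_q8_to_q3(labels: str) -> str:
    """Map a string of 8-state labels to 3-state labels, residue by residue."""
    return "".join(Q8_TO_Q3[state] for state in labels)


def per_residue_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state equals the observed state."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


# Hypothetical labels for a 10-residue fragment (illustration only).
observed_q8 = "CHHHGTEEBS"
predicted_q8 = "CHHHHTEEES"

q8_accuracy = per_residue_accuracy(predicted_q8, observed_q8)
q3_accuracy = per_residue_accuracy(reduce_q8_to_q3(predicted_q8),
                                   reduce_q8_to_q3(observed_q8))
print(q8_accuracy, q3_accuracy)
```

In this toy case the two Q8 errors (G predicted as H, B predicted as E) vanish after reduction, which shows why the same prediction can score lower on Q8 than on Q3.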
Funding: Supported by the Fujian Provincial Science and Technology Major Project (No. 2020HZ02014), by grants from the National Natural Science Foundation of Fujian (2021J01133, 2021J011404), and by the Quanzhou Scientific and Technological Planning Projects (Nos. 2018C113R, 2019C028R, 2019C029R, 2019C076R, and 2019C099R).
Abstract: Congenital heart defects, which account for about 30% of congenital defects, are the most common type. Data show that congenital heart defects have seriously affected the birth rate of healthy newborns. In fetal and neonatal cardiology, medical imaging technology (2D ultrasound, MRI) has been shown to help detect congenital defects of the fetal heart and to assist sonographers in prenatal diagnosis. Recognizing the 2D fetal heart ultrasonic standard plane (FHUSP) manually is a highly complex task. Compared with manual identification, automatic identification through artificial intelligence can save considerable time, ensure the efficiency of diagnosis, and improve its accuracy. In this study, a feature extraction method based on texture features (Local Binary Pattern, LBP, and Histogram of Oriented Gradient, HOG) combined with the Bag of Words (BOW) model is applied, and feature fusion is then performed. Finally, a Support Vector Machine (SVM) is used to automatically recognize and classify FHUSP. The data include 788 standard-plane data sets and 448 normal and abnormal plane data sets. Compared with several other methods and with single-method models, the classification accuracy of our model is clearly improved, with the highest accuracy reaching 87.35%. We also verify the performance of the model on normal and abnormal planes, where the average classification accuracy is 84.92%. The experimental results show that this method can effectively classify and predict different FHUSP and can assist sonographers in diagnosing fetal congenital heart disease.
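As a rough illustration of the texture-feature step, the sketch below computes the basic 3x3 Local Binary Pattern code and a normalised LBP histogram, then fuses feature vectors by concatenation. The abstract does not specify the LBP variant, the bit order, or the fusion rule, so all three are assumptions made here; HOG, BOW, and the SVM stage are omitted.

```python
from typing import List

# Offsets of the 8 neighbours, clockwise from the top-left pixel.
# The bit order is a convention chosen for this sketch.
NEIGHBOURS = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
              (1, 1), (1, 0), (1, -1), (0, -1)]


def lbp_code(image: List[List[int]], row: int, col: int) -> int:
    """Basic 3x3 LBP code of an interior pixel (neighbour >= centre sets a bit)."""
    centre = image[row][col]
    code = 0
    for bit, (dr, dc) in enumerate(NEIGHBOURS):
        if image[row + dr][col + dc] >= centre:
            code |= 1 << bit
    return code


def lbp_histogram(image: List[List[int]]) -> List[float]:
    """Normalised 256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 256


def fuse(*feature_vectors: List[float]) -> List[float]:
    """Feature fusion by plain concatenation (one possible fusion strategy)."""
    return [value for vector in feature_vectors for value in vector]


# Hypothetical 3x3 grey-level patch; only the centre pixel is interior.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
centre_code = lbp_code(patch, 1, 1)
texture_features = lbp_histogram(patch)
fused = fuse(texture_features, [0.5, 0.5])  # second vector stands in for another descriptor
```

The fused vector is what a classifier such as an SVM would receive as input in a pipeline of this kind.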
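Abstract: Traditional support vector machine (SVM) ensemble learning (EL) methods cannot handle high-dimensional, complex data, and the low diversity among their sub-learners yields little ensemble gain. To address these problems, an SVM learning method that performs Bagging over multiple feature selection methods (Support Vector Machine Based on Multiple Feature Selection Bagging, MFSB_SVM) is proposed. The method first builds sub-learners with different feature selection methods to increase the diversity among them, and it evaluates the importance of sample features directly from the training data, without requiring feedback from the learning algorithm. Experiments show that the proposed MFSB_SVM method both handles high-dimensional data effectively and avoids the weak ensemble effect of traditional SVM ensemble methods, thereby further improving the generalization performance of the learned model.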
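The idea of giving each sub-learner its own feature subset can be sketched as follows. The abstract does not name the feature selection methods used in MFSB_SVM, so the two filter scores here (variance and absolute Pearson correlation with the label) are stand-ins chosen for illustration; they share the stated property of scoring features directly from the training data without feedback from a learner. Training the SVM sub-learners and combining their votes is left out.

```python
import math
import random
from typing import Callable, List, Sequence

Score = Callable[[Sequence[float], Sequence[int]], float]


def variance_score(column: Sequence[float], labels: Sequence[int]) -> float:
    """Unsupervised filter score: population variance of the feature."""
    mean = sum(column) / len(column)
    return sum((x - mean) ** 2 for x in column) / len(column)


def correlation_score(column: Sequence[float], labels: Sequence[int]) -> float:
    """Supervised filter score: absolute Pearson correlation with the label."""
    n = len(column)
    mx, my = sum(column) / n, sum(labels) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(column, labels))
    sx = math.sqrt(sum((x - mx) ** 2 for x in column))
    sy = math.sqrt(sum((y - my) ** 2 for y in labels))
    return abs(cov / (sx * sy)) if sx and sy else 0.0


def top_k_features(rows: Sequence[Sequence[float]], labels: Sequence[int],
                   score: Score, k: int) -> List[int]:
    """Indices of the k highest-scoring features, best first."""
    columns = list(zip(*rows))
    ranked = sorted(range(len(columns)),
                    key=lambda j: score(columns[j], labels), reverse=True)
    return ranked[:k]


def bootstrap_indices(n: int, rng: random.Random) -> List[int]:
    """Sample n row indices with replacement, as in Bagging."""
    return [rng.randrange(n) for _ in range(n)]


# Hypothetical toy data: feature 0 is noisy, feature 1 tracks the label,
# feature 2 is constant.
rows = [[10.0, 0.1, 1.0], [-10.0, 0.2, 1.0], [-10.0, 0.1, 1.0], [10.0, 0.2, 1.0]]
labels = [0, 1, 0, 1]

by_variance = top_k_features(rows, labels, variance_score, k=1)
by_correlation = top_k_features(rows, labels, correlation_score, k=1)
sample = bootstrap_indices(len(rows), random.Random(0))
```

Because the two scores rank the features differently, sub-learners built on them see different inputs, which is the source of diversity the abstract describes.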
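Abstract: In vision-based hand gesture recognition, recognition performance is easily affected by gesture rotation and illumination. To address this problem, the Bag of Features algorithm, borrowed from object recognition and image retrieval, is applied to gesture recognition. Feature descriptors of the gesture image are extracted with the SURF (Speeded-Up Robust Features) algorithm, which makes the representation highly robust to scale, rotation, and illumination. The Bag of Features algorithm then maps the SURF descriptors to a vector of fixed dimension, the Bag of Features feature vector, and a support vector machine is trained on these vectors for classification. Experimental results show that the method is time-efficient enough to meet the real-time requirements of gesture recognition and that it still achieves a high recognition rate under large rotations and brightness changes.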
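The mapping from a variable number of descriptors to a fixed-dimension vector can be sketched as nearest-codeword quantisation followed by a normalised histogram. This is a minimal sketch under assumptions: the codebook is taken as given (in practice it is commonly built by clustering training descriptors, which the abstract does not detail), and toy 2-D points stand in for real SURF descriptors, which are typically 64- or 128-dimensional.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest_codeword(descriptor: Sequence[float],
                     codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codebook entry closest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda i: squared_distance(descriptor, codebook[i]))


def bag_of_features(descriptors: Sequence[Sequence[float]],
                    codebook: Sequence[Sequence[float]]) -> List[float]:
    """Normalised histogram of codeword assignments; length equals codebook size."""
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        counts[nearest_codeword(descriptor, codebook)] += 1
    total = len(descriptors)
    return [c / total for c in counts] if total else [0.0] * len(codebook)


# Hypothetical codebook with three "visual words".
codebook = [[0.0, 0.0], [10.0, 10.0], [0.0, 10.0]]

# Two images with different numbers of descriptors.
image_a = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0], [1.0, 9.0]]
image_b = [[10.0, 11.0], [9.0, 10.0]]

vector_a = bag_of_features(image_a, codebook)
vector_b = bag_of_features(image_b, codebook)
```

Both images end up as vectors of the same length regardless of how many descriptors they produced, which is what lets a standard classifier consume them.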
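Abstract: The Bag of Visual Words (BOV) algorithm is an effective object recognition algorithm based on semantic feature representation. To address the shortcomings of the traditional BOV model, a BOV algorithm based on targets of interest (TOI) is proposed that jointly uses the gray-level and texture features of SAR images. First, TOIs are selected from the training images, and their texture features are extracted with the gray-level co-occurrence matrix model. These are combined with gray-level features to form a set of multi-dimensional feature vectors, from which the visual vocabulary is generated under the criteria of highest intra-cluster similarity and greatest data distribution density. Second, for test images, a support vector machine (SVM) classifier is applied on the basis of the generated visual vocabulary to classify the targets of interest in SAR images. Experimental results show that, compared with the traditional BOV construction algorithm, the proposed algorithm improves classification accuracy and achieves good classification results even with few training images.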
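A gray-level co-occurrence matrix (GLCM) can be sketched in a few lines. The abstract does not say which offsets or texture statistics the authors derive from the GLCM, so the horizontal offset and the two statistics below (contrast and energy, both standard Haralick-style measures) are assumptions, and the 4x4 image is a toy example rather than SAR data.

```python
from typing import List


def glcm(image: List[List[int]], levels: int,
         dr: int = 0, dc: int = 1) -> List[List[int]]:
    """Count co-occurrences of gray levels (i, j) at pixel offset (dr, dc)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dr, c + dc
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def normalise(matrix: List[List[int]]) -> List[List[float]]:
    """Turn counts into joint probabilities."""
    total = sum(sum(row) for row in matrix)
    return [[value / total for value in row] for row in matrix]


def contrast(p: List[List[float]]) -> float:
    """Sum of (i - j)^2 * p(i, j); larger for rougher textures."""
    return sum((i - j) ** 2 * value
               for i, row in enumerate(p) for j, value in enumerate(row))


def energy(p: List[List[float]]) -> float:
    """Sum of p(i, j)^2; larger for more uniform textures."""
    return sum(value * value for row in p for value in row)


# Hypothetical 4-level image patch.
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
counts = glcm(patch, levels=4)           # horizontal neighbour, non-symmetric
probabilities = normalise(counts)
texture_vector = [contrast(probabilities), energy(probabilities)]
```

A vector like `texture_vector`, concatenated with gray-level features, is the kind of multi-dimensional feature that would then be clustered into visual words.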
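Abstract: Because traditional physical analysis methods cannot solve the problem of predicting conductor galloping, several machine learning algorithms are applied together. Existing historical galloping data are filtered and preprocessed, and useful information is mined from them. The one-class SVM algorithm is used to address the lack of negative samples in the galloping data. The Bagging algorithm from ensemble learning is used to build the classifier: the data are randomly sampled into different groups that are trained independently of one another, which avoids overfitting the galloping data and improves the noise robustness and generalization ability of the learning algorithm. k-fold cross-validation is used to validate the model, and the F1-score is used to describe the performance of the conductor galloping early-warning model, verifying the effectiveness of the method for galloping prediction.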
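The two evaluation tools named in the abstract, k-fold cross-validation and the F1-score, can be sketched as below. This is an illustrative sketch only: the folds are contiguous and unshuffled for simplicity (real pipelines usually shuffle first), the value of k is not given in the abstract, and the labels are invented.

```python
from typing import Iterator, List, Sequence, Tuple


def k_fold_indices(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous, near-equal folds."""
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def f1_score(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = galloping event, 0 = no event.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 1]
score = f1_score(y_true, y_pred)

folds = list(k_fold_indices(n_samples=7, k=3))
```

In a full pipeline, a model would be fitted on each `train` split and scored with `f1_score` on the matching `test` split, with the per-fold scores then averaged.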