Abstract: The recognition of pathological voice is considered a difficult task in speech analysis. Otolaryngologists have had to rely on oral communication with patients to discover traces of voice pathologies such as dysphonia, which is caused by alterations of the vocal folds, and their accuracy is only between 60% and 70%. To enhance detection accuracy and increase the processing speed of dysphonia detection, a novel approach is proposed in this paper. We leverage Linear Discriminant Analysis (LDA) to train multiple Machine Learning (ML) models for dysphonia detection. Several ML models, such as Support Vector Machine (SVM), Logistic Regression, and K-Nearest Neighbor (K-NN), are used to predict voice pathologies from features such as Mel-Frequency Cepstral Coefficients (MFCC), Fundamental Frequency (F0), Shimmer (%), Jitter (%), and Harmonic-to-Noise Ratio (HNR). The experiments were performed on the Saarbrucken Voice Database (SVD) and a privately collected dataset. K-fold cross-validation was used to increase the robustness and stability of the ML models. According to the experimental results, the proposed approach yields a 70% increase in processing speed over Principal Component Analysis (PCA) and performs remarkably well, with a recognition accuracy of 95.24% on the SVD dataset, surpassing the previous best accuracy of 82.37%. On the private dataset, the proposed method achieved an accuracy of 93.37%. It can serve as an effective non-invasive method for detecting dysphonia.
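A minimal sketch of this kind of pipeline is given below; it is not the authors' implementation. It assumes a feature matrix `X` already holds per-recording acoustic features (MFCC statistics, F0, jitter, shimmer, HNR) and `y` the healthy/dysphonic labels, and shows LDA-based dimensionality reduction feeding SVM, logistic regression, and K-NN classifiers under k-fold cross-validation.

```python
# Minimal sketch (not the paper's code): LDA-reduced acoustic features feeding
# several classifiers with k-fold cross-validation. X and y are placeholders.
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 18))      # placeholder features (e.g. 13 MFCCs + F0, jitter, shimmer, HNR, ...)
y = rng.integers(0, 2, size=200)    # placeholder labels: 0 = healthy, 1 = dysphonic

classifiers = {
    "SVM": SVC(kernel="rbf"),
    "LogReg": LogisticRegression(max_iter=1000),
    "KNN": KNeighborsClassifier(n_neighbors=5),
}
for name, clf in classifiers.items():
    # LDA projects the scaled features onto at most (n_classes - 1) discriminant axes.
    pipe = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis(), clf)
    scores = cross_val_score(pipe, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```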
Abstract: We revisit a comparison of two discriminant analysis procedures, namely the linear combination classifier of Chung and Han (2000) and the maximum likelihood estimation (MLE) substitution classifier, for the problem of classifying unlabeled multivariate normal observations with equal covariance matrices into one of two classes. Both classes have matching block monotone missing training data. We demonstrate that, for intra-class covariance structures with at least small correlation between the variables with missing data and the variables without block missing data, the MLE substitution classifier outperforms the Chung and Han (2000) classifier regardless of the percentage of missing observations. Specifically, we examine the differences in the estimated expected error rates of the two classifiers using a Monte Carlo simulation, and we compare them on two real data sets with monotone missing data via parametric bootstrap simulations. Our results contradict the conclusion of Chung and Han (2000) that their linear combination classifier is superior to the MLE classifier for block monotone missing multivariate normal data.
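For orientation, the toy Monte Carlo sketch below estimates the expected error rate of a plug-in (estimation-substitution) linear classifier for two equal-covariance multivariate normal classes. The block monotone missing-data structure and the Chung and Han (2000) linear combination classifier are not reproduced, and all dimensions and parameters are illustrative.

```python
# Toy Monte Carlo sketch: expected error rate of a plug-in linear classifier
# for two equal-covariance normal classes (missing-data handling not shown).
import numpy as np

rng = np.random.default_rng(1)
p, n_train, n_test, n_reps = 4, 30, 2000, 200
mu1, mu2 = np.zeros(p), np.full(p, 1.0)
cov = 0.5 * np.ones((p, p)) + 0.5 * np.eye(p)   # common covariance for both classes

errors = []
for _ in range(n_reps):
    X1 = rng.multivariate_normal(mu1, cov, n_train)
    X2 = rng.multivariate_normal(mu2, cov, n_train)
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    S = (np.cov(X1.T) + np.cov(X2.T)) / 2        # pooled covariance estimate (equal sample sizes)
    w = np.linalg.solve(S, m1 - m2)              # plug-in Fisher discriminant direction
    c = w @ (m1 + m2) / 2                        # midpoint threshold
    T1 = rng.multivariate_normal(mu1, cov, n_test)
    T2 = rng.multivariate_normal(mu2, cov, n_test)
    err = ((T1 @ w < c).mean() + (T2 @ w > c).mean()) / 2
    errors.append(err)
print("estimated expected error rate:", np.mean(errors))
```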
Funding: The National Natural Science Foundation of China (No. 61374194).
Abstract: A direct linear discriminant analysis algorithm based on economic singular value decomposition (DLDA/ESVD) is proposed to address the high computational complexity of the conventional DLDA algorithm; it directly uses ESVD to reduce dimensionality and extract the eigenvectors corresponding to nonzero eigenvalues. A DLDA algorithm based on column-pivoting orthogonal triangular (QR) decomposition and ESVD (DLDA/QR-ESVD) is then proposed to improve the performance of DLDA/ESVD when processing high-dimensional low-rank matrices; it uses column-pivoting QR decomposition to reduce dimensionality and ESVD to extract the eigenvectors corresponding to nonzero eigenvalues. Experimental results on the ORL, FERET, and YALE face databases show that the two proposed algorithms achieve almost the same recognition performance and outperform the conventional DLDA algorithm in computational complexity and training time. In addition, experimental results on random data matrices show that DLDA/QR-ESVD outperforms DLDA/ESVD when processing high-dimensional low-rank matrices.
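The sketch below is illustrative only (not the paper's DLDA code): it applies an economy-size SVD and a column-pivoting QR factorization to a synthetic high-dimensional low-rank matrix, the two building blocks that DLDA/ESVD and DLDA/QR-ESVD respectively use for dimension reduction, and checks that both yield a basis for the same column space. The matrix sizes are arbitrary.

```python
import numpy as np
from scipy.linalg import qr

rng = np.random.default_rng(0)
# Synthetic high-dimensional, low-rank matrix: 1024 "pixel" rows, 40 samples, true rank 10.
A = rng.normal(size=(1024, 10)) @ rng.normal(size=(10, 40))

# Economy-size SVD keeps only min(m, n) singular triplets.
U, s, Vt = np.linalg.svd(A, full_matrices=False)
r = int(np.sum(s > 1e-10 * s[0]))             # numerical rank
basis_svd = U[:, :r]                          # eigenvectors of A A^T with nonzero eigenvalues

# Column-pivoting QR: a cheaper orthonormal basis for the same column space.
Q, R, piv = qr(A, mode="economic", pivoting=True)
basis_qr = Q[:, :r]

# Both bases span the same subspace: the residual of one basis projected onto
# the other is numerically zero.
resid = basis_svd - basis_qr @ (basis_qr.T @ basis_svd)
print("numerical rank:", r, " max residual:", np.abs(resid).max())
```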
Funding: Sponsored by the National Science Foundation of China (Grant Nos. 61201370, 61100103) and the Independent Innovation Foundation of Shandong University (Grant No. 2012DX07).
Abstract: This paper presents two novel feature extraction algorithms: Subpattern Complete Two-Dimensional Linear Discriminant Principal Component Analysis (SpC2DLDPCA) and Subpattern Complete Two-Dimensional Locality Preserving Principal Component Analysis (SpC2DLPPCA). Compared with their non-subpattern versions and with Subpattern Complete Two-Dimensional Principal Component Analysis (SpC2DPCA), the two algorithms offer four main benefits: (1) SpC2DLDPCA and SpC2DLPPCA avoid the heavy time cost of computing the eigenvalues and eigenvectors of large-dimensional matrices; (2) they extract local information for recognition; (3) the subblock idea is introduced into Two-Dimensional Principal Component Analysis (2DPCA) and Two-Dimensional Linear Discriminant Analysis (2DLDA), so SpC2DLDPCA combines discriminant analysis with a compression technique that has low energy loss; (4) the same idea is also introduced into 2DPCA and Two-Dimensional Locality Preserving Projections (2DLPP), so SpC2DLPPCA preserves the local neighbor graph structure and yields compact feature representations. Experiments on the CASIA(B) gait database show that SpC2DLDPCA and SpC2DLPPCA achieve higher recognition accuracy than their non-subpattern versions and SpC2DPCA.
Funding: Supported by the Science Foundation of the Fujian Province of China (No. 2010J05099).
Abstract: Marginal Fisher analysis (MFA) aims not only to maintain the original relations among neighboring data points of the same class but also to keep apart neighboring data points of different classes. MFA can effectively overcome the limitations of linear discriminant analysis (LDA) arising from its data-distribution assumption and its limited number of available projection directions. However, MFA suffers from the undersampled problem. Generalized marginal Fisher analysis (GMFA), based on a new optimization criterion applicable to undersampled problems, is presented. The solutions to the proposed GMFA criterion are derived and can be characterized in closed form. Among these solutions, two specific algorithms, normal MFA (NMFA) and orthogonal MFA (OMFA), are studied, and methods to implement them are proposed. A comparative study on the undersampled problem of face recognition evaluates NMFA and OMFA in terms of classification accuracy and demonstrates the effectiveness of the proposed algorithms.
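As a rough illustration of the graph construction that MFA and GMFA build on (not the paper's GMFA criterion itself), the sketch below forms an intrinsic graph over same-class k1 nearest neighbors and a penalty graph over different-class k2 nearest neighbors, then solves a regularized generalized eigenproblem. The regularization term merely stands in for a proper undersampled-case treatment, and the function name, parameter values, and data are all illustrative assumptions.

```python
# Rough MFA-style sketch: intrinsic (same-class) and penalty (different-class)
# neighbor graphs, then a generalized eigenproblem for the projection directions.
import numpy as np
from scipy.linalg import eigh
from scipy.spatial.distance import cdist

def mfa_projection(X, y, k1=3, k2=5, dim=2, reg=1e-3):
    n = X.shape[0]
    D = cdist(X, X)
    W_in = np.zeros((n, n))    # intrinsic graph: same-class k1 nearest neighbors
    W_pen = np.zeros((n, n))   # penalty graph: different-class k2 nearest neighbors
    for i in range(n):
        same = np.where(y == y[i])[0]
        same = same[same != i]
        diff = np.where(y != y[i])[0]
        for j in same[np.argsort(D[i, same])[:k1]]:
            W_in[i, j] = W_in[j, i] = 1.0
        for j in diff[np.argsort(D[i, diff])[:k2]]:
            W_pen[i, j] = W_pen[j, i] = 1.0
    L_in = np.diag(W_in.sum(1)) - W_in
    L_pen = np.diag(W_pen.sum(1)) - W_pen
    A = X.T @ L_pen @ X                              # marginal (between-class) separability
    B = X.T @ L_in @ X + reg * np.eye(X.shape[1])    # within-class compactness, regularized
    vals, vecs = eigh(A, B)                          # generalized eigenvectors, ascending
    return vecs[:, ::-1][:, :dim]                    # directions maximizing the MFA ratio

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (30, 10)), rng.normal(2, 1, (30, 10))])
y = np.array([0] * 30 + [1] * 30)
print(mfa_projection(X, y).shape)   # (10, 2)
```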
Abstract: Several dimensionality reduction (DR) approaches based on the support vector machine (SVM) have been proposed, but they derive the projection matrix by considering only the between-class margin from the SVM while ignoring the within-class information in the data. This paper presents a new DR approach, called dimensionality reduction based on SVM and LDA (DRSL). DRSL obtains the projection matrix from the between-class margins of SVM and LDA together with the within-class compactness from LDA. As a result, DRSL combines between-class and within-class information and fits both the between-class and within-class structures in the data. Hence, the obtained projection matrix increases the generalization ability of subsequent classification techniques. Experiments with classification techniques show the effectiveness of the proposed method.
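The DRSL construction itself is not spelled out in the abstract, so the toy sketch below only shows the two ingredients it is said to combine: a between-class margin direction from a linear SVM and the within-class scatter matrix used by LDA. The final whitening step is just one illustrative way such information could be merged, not the paper's method, and all data and parameters are placeholders.

```python
# Ingredients of an SVM+LDA dimensionality reduction (illustrative only).
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-1, 1, (50, 5)), rng.normal(1, 1, (50, 5))])
y = np.array([0] * 50 + [1] * 50)

# Between-class margin direction from a linear SVM.
w_svm = LinearSVC(C=1.0, dual=False).fit(X, y).coef_.ravel()

# Within-class scatter S_w = sum over classes of sum_x (x - m_c)(x - m_c)^T.
S_w = sum(np.cov(X[y == c].T, bias=True) * (y == c).sum() for c in np.unique(y))

# One hedged way to trade the two off: whiten the SVM direction by S_w,
# analogous to how LDA whitens its between-class direction.
w = np.linalg.solve(S_w + 1e-6 * np.eye(5), w_svm)
print(w / np.linalg.norm(w))
```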
Abstract: In machine learning and pattern recognition, dimensionality reduction can significantly improve the discriminative performance and efficiency of a classifier. Ratio sum (RS) is a recent variant of linear discriminant analysis (LDA) that tries to make the projection matrix optimal in every dimension. However, RS does not take the local geometric structure of the data into account, which may prevent it from finding the optimal solution. To overcome this drawback, an adaptive neighbor local ratio sum linear discriminant analysis (ANLRSLDA) algorithm is proposed. The algorithm builds the adjacency matrix using an adaptive-neighbor graph-construction method, so that the between-class and within-class matrices preserve the local geometric structure of the data and a better optimal representation of the data can be found. Moreover, the method adopts an effective kernel-parameter-free neighborhood assignment strategy to construct the adjacency matrix, avoiding the need to tune a heat-kernel parameter. Comparative experiments on UCI datasets and face datasets verify the effectiveness of the algorithm.
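Since the abstract highlights the kernel-parameter-free neighborhood assignment, the sketch below shows one well-known adaptive-neighbor adjacency construction of that kind, where weights come from local distance gaps rather than a heat-kernel width. It is not claimed to be the exact ANLRSLDA construction; the function name, the choice of k, and the data are assumptions. In an LDA-style method, such a graph would then be symmetrized and folded into locality-aware within-class and between-class scatter matrices.

```python
# Parameter-free adaptive-neighbor adjacency matrix (illustrative sketch).
import numpy as np
from scipy.spatial.distance import cdist

def adaptive_neighbor_graph(X, k=5):
    """Row-stochastic adjacency matrix with adaptively weighted k nearest neighbors."""
    n = X.shape[0]
    D = cdist(X, X, metric="sqeuclidean")
    W = np.zeros((n, n))
    for i in range(n):
        idx = np.argsort(D[i])[1:k + 2]        # k nearest neighbors plus one extra (self excluded)
        d = D[i, idx]
        # Closed-form weights: larger for nearer neighbors, zero beyond the k-th,
        # with no bandwidth parameter to tune.
        W[i, idx[:k]] = (d[k] - d[:k]) / (k * d[k] - d[:k].sum() + 1e-12)
    return W

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 4))
W = adaptive_neighbor_graph(X)
print(W.sum(axis=1))                           # each row sums to (approximately) 1
```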
Abstract: This study combines near-infrared spectroscopy with chemometric methods for the non-destructive identification of shellfish toxins. Taking fresh green mussels as the research object, reflectance spectra of healthy mussels and mussels contaminated with diarrhetic shellfish toxins were collected with a near-infrared spectrometer. Savitzky-Golay convolution smoothing with derivation, combined with standard normal variate transformation, was used as the spectral preprocessing to remove interfering factors from the spectra. Margin influence analysis (MIA) combined with the successive projections algorithm (SPA) was used to reduce the dimensionality of the data, and a partial least squares linear discriminant analysis (PLS-LDA) model was built to identify the toxins; the model was compared with support vector machine and random forest models. The results show that the MIA-SPA-PLS-LDA method achieves 100% identification of the shellfish toxins. The MIA-SPA-PLS-LDA method can therefore be used to build an accurate shellfish-toxin identification model, providing a new route for the rapid identification of shellfish toxins and a reference for subsequent toxin identification in various shellfish products.
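A minimal sketch of the preprocessing and PLS-LDA chain described above is given below, assuming `spectra` is an (n_samples, n_wavelengths) reflectance matrix and `labels` marks healthy versus toxin-contaminated mussels. The MIA-SPA wavelength-selection step is not reproduced, and the synthetic data and parameter choices (window length, number of latent variables) are placeholders only.

```python
# Sketch of SG-derivative + SNV preprocessing followed by PLS-LDA classification.
import numpy as np
from scipy.signal import savgol_filter
from sklearn.cross_decomposition import PLSRegression
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
spectra = rng.normal(size=(120, 256))          # placeholder NIR reflectance spectra
labels = rng.integers(0, 2, size=120)          # 0 = healthy, 1 = toxin-contaminated

# Savitzky-Golay smoothing with a first derivative, then standard normal variate
# (per-spectrum centring and scaling) to suppress baseline and scatter effects.
X = savgol_filter(spectra, window_length=11, polyorder=2, deriv=1, axis=1)
X = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)

# PLS-LDA: project onto a few PLS latent variables, then discriminate with LDA.
Xtr, Xte, ytr, yte = train_test_split(X, labels, test_size=0.3, random_state=0)
pls = PLSRegression(n_components=8).fit(Xtr, ytr)
lda = LinearDiscriminantAnalysis().fit(pls.transform(Xtr), ytr)
print("held-out accuracy:", lda.score(pls.transform(Xte), yte))
```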
Abstract: A modular 2DPCA (two-dimensional principal component analysis) method for face recognition is proposed. The modular 2DPCA method first partitions the image matrix into blocks, uses the resulting sub-image matrices directly to construct the total scatter matrix, and then uses the eigenvectors of the total scatter matrix for image feature extraction. Compared with discriminant methods based on image vectors (such as PCA), this method does not need to convert the sub-image matrices into image vectors before feature extraction, quickly reduces the dimensionality of the discriminant features, completely avoids the singular value decomposition of matrices, and makes feature extraction convenient. Moreover, modular 2DPCA is a generalization of 2DPCA. Experimental results on the ORL and NUST603 face databases show that the modular 2DPCA method outperforms PCA in recognition performance and is more robust than 2DPCA.
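A rough sketch of the modular 2DPCA idea follows (not the paper's exact code): each image is split into sub-blocks, one total scatter matrix is built directly from the sub-image matrices, and every block is projected onto its leading eigenvectors. The block grid, projection dimension, and image sizes are illustrative assumptions.

```python
# Modular 2DPCA sketch: block partition, block-level total scatter, per-block projection.
import numpy as np

def blocks(img, p, q):
    """Split an (m, n) image into p*q non-overlapping sub-images."""
    m, n = img.shape
    return [img[i * m // p:(i + 1) * m // p, j * n // q:(j + 1) * n // q]
            for i in range(p) for j in range(q)]

def modular_2dpca(images, p=2, q=2, d=4):
    sub = [b for img in images for b in blocks(img, p, q)]
    mean = np.mean(sub, axis=0)
    # Total scatter built from the sub-image matrices themselves, so there is
    # no image-to-vector flattening and no SVD of a huge covariance matrix.
    G = sum((b - mean).T @ (b - mean) for b in sub) / len(sub)
    eigvals, eigvecs = np.linalg.eigh(G)
    W = eigvecs[:, ::-1][:, :d]                 # top-d projection axes
    # Each image is represented by its projected sub-blocks.
    return [np.hstack([b @ W for b in blocks(img, p, q)]) for img in images], W

rng = np.random.default_rng(0)
faces = [rng.normal(size=(56, 46)) for _ in range(10)]   # placeholder "face" images
features, W = modular_2dpca(faces)
print(features[0].shape)   # (28, 16): four 28x23 blocks, each projected to 4 columns
```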
Abstract: The fusion and recognition of multiple biometric traits can improve the overall performance of an identity recognition system. Building on feature-level fusion and combining it with two-dimensional Fisher linear discriminant analysis (2-Dimensional Fisher Linear Discriminant Analysis, 2DFLD), this paper proposes a model for fusing and recognizing face and iris features. First, the face and iris images are separately compressed and dimension-reduced to obtain the corresponding initial feature matrices. The face and iris feature matrices are then combined into a joint feature matrix, and the 2DFLD algorithm is applied to the joint matrix to obtain the fused face-iris features. Finally, a minimum-distance classifier is used for recognition. Experimental results on the ORL (Olivetti Research Laboratory) face database and the CASIA (Chinese Academy of Sciences, Institute of Automation) iris database show that the model achieves feature-level fusion, overcomes the "small sample size" problem, and effectively improves the correct recognition rate, providing an effective model for multi-biometric identity recognition.
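A hedged sketch of this fusion idea is given below: a per-subject face feature matrix and iris feature matrix are stacked into one combined matrix, a 2D Fisher discriminant is computed from the combined matrices, and new samples are classified by minimum distance to the class means in the projected space. The matrix sizes, the simple row-stacking fusion, and the regularization are illustrative assumptions, not the paper's exact formulation.

```python
# Feature-level face+iris fusion with a 2D Fisher discriminant (illustrative sketch).
import numpy as np

def twod_fld(samples, labels, d=3, reg=1e-3):
    """2DFLD: leading eigenvectors of Sw^{-1} Sb built from feature matrices."""
    classes = np.unique(labels)
    overall = np.mean(samples, axis=0)
    n_cols = samples.shape[2]
    Sb = np.zeros((n_cols, n_cols))
    Sw = np.zeros((n_cols, n_cols))
    for c in classes:
        Xc = samples[labels == c]
        Mc = Xc.mean(axis=0)
        Sb += len(Xc) * (Mc - overall).T @ (Mc - overall)
        Sw += sum((A - Mc).T @ (A - Mc) for A in Xc)
    vals, vecs = np.linalg.eig(np.linalg.solve(Sw + reg * np.eye(n_cols), Sb))
    order = np.argsort(vals.real)[::-1]
    return vecs[:, order[:d]].real

rng = np.random.default_rng(0)
n_classes, per_class = 5, 6
face = rng.normal(size=(n_classes * per_class, 20, 12))   # compressed face feature matrices
iris = rng.normal(size=(n_classes * per_class, 20, 12))   # compressed iris feature matrices
labels = np.repeat(np.arange(n_classes), per_class)
combined = np.concatenate([face, iris], axis=1)           # one (40, 12) fused matrix per sample

W = twod_fld(combined, labels)
proj = combined @ W                                       # projected fused features
means = np.stack([proj[labels == c].mean(axis=0) for c in range(n_classes)])
# Minimum-distance classification of the first sample against the class means.
pred = np.argmin(np.linalg.norm(proj[0] - means, axis=(1, 2)))
print("predicted class:", pred, "true class:", labels[0])
```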