The axial orientation selected for a tunnel constructed in an interlayered soft-hard rock mass affects stability and safety during construction. Previous optimization has relied primarily on experience or on comparing and selecting among alternative values under specific geological conditions. In this work, an intelligent optimization framework is proposed that combines numerical analysis, machine learning (ML), and an optimization algorithm. An automatic and intelligent numerical analysis process was proposed and coded to reduce redundant manual intervention. The conventional optimization algorithm was developed in two respects and applied to hyperparameter estimation for the support vector machine (SVM) model and to optimization of the tunnel's axial orientation. Finally, the comprehensive framework was applied to a numerical case study, and the results were compared with those of other studies. The results indicate that the determination coefficients between the predicted and the numerical stability evaluation indices (STIs) on the training and testing datasets are 0.998 and 0.997, respectively. For a given geological condition, the STI first decreases and then increases as the axial orientation changes, and the optimal tunnel axial orientation is estimated to be 87. This method provides an alternative and quick approach to the overall design of tunnels.
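For readers unfamiliar with the determination coefficient quoted above, the following is a minimal sketch. It assumes the common definition 1 - SS_res / SS_tot; the abstract does not say which definition the authors used, and the function name is illustrative.

```python
def r_squared(actual, predicted):
    """Coefficient of determination, assuming the 1 - SS_res / SS_tot form."""
    mean = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

# Hypothetical values: a perfect fit scores 1.0, predicting the mean scores 0.0.
perfect = r_squared([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
mean_only = r_squared([1.0, 2.0, 3.0], [2.0, 2.0, 2.0])
```

Under this definition, values near 1 on both the training and testing sets would mean the SVM surrogate reproduces the numerically computed STIs closely.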
Abstract: Support vector machines have met with significant success in information retrieval, especially in text classification. Although various performance estimators for SVMs have been proposed, they focus only on accuracy, estimated through the leave-one-out cross-validation procedure. Performance measures specific to information retrieval are generally neglected in kernel learning methodology. In this paper, we propose a set of information-retrieval-oriented performance estimators for SVMs, based on the span bound of the leave-one-out procedure. Experiments show that the proposed estimators are both effective and stable.
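The difference between accuracy and retrieval-oriented measures can be illustrated with an explicit leave-one-out loop. This is only a sketch under stated assumptions: it retrains on every fold, whereas the span bound is meant to estimate leave-one-out behaviour without that cost; a nearest-centroid rule stands in for the SVM; and the data and all names are hypothetical.

```python
def nearest_centroid_predict(train_x, train_y, x):
    """Stand-in classifier (not an SVM): label of the closest class centroid."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        members = [v for v, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(members) for col in zip(*members)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_precision_recall_f1(xs, ys, positive=1):
    """Hold out each example once, then score the held-out predictions."""
    tp = fp = fn = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        pred = nearest_centroid_predict(train_x, train_y, xs[i])
        if pred == positive and ys[i] == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif ys[i] == positive:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical two-feature documents; the last positive lies among the negatives.
xs = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [1.0, 0.9], [0.9, 1.1], [0.3, 0.2]]
ys = [0, 0, 0, 1, 1, 1]
precision, recall, f1 = loo_precision_recall_f1(xs, ys)
```

On this toy set one relevant document is missed, so recall and F1 drop while precision stays perfect. An accuracy-only estimator would not expose that asymmetry.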
Funding: This work is partially supported by the National Natural Science Foundation of China (No. 29877016).
Abstract: This paper studies various classifiers for identifying primary, secondary, or tertiary alcohols by feeding segmental spectra and their combinations to support vector machines (SVMs). The results showed that the O-H in-plane bending absorption contributed most to identifying the degree of substitution. This conclusion disagrees with related known research results.
Abstract: The structure and function of proteins are closely related: protein structure determines function, so protein structure prediction is important. β-turns are important components of protein secondary structure, so an accurate method for predicting β-turn types is needed. In this paper, we used a composite vector of the position conservation scoring function, increment of diversity, and predicted secondary structure information as the input to a support vector machine for predicting β-turn types in a database of 426 protein chains. We obtained overall prediction accuracies of 95.6%, 97.8%, 97.0%, 98.9%, 99.2%, 91.8%, 99.4% and 83.9%, with Matthews correlation coefficient values of 0.74, 0.68, 0.20, 0.49, 0.23, 0.47, 0.49 and 0.53, for types I, II, VIII, I’, II’, IV, VI and non-turn, respectively, which is better than other prediction methods.
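Several turn types above pair a very high accuracy with a modest Matthews correlation coefficient. The sketch below shows how that can happen for a rare class. The confusion-matrix counts are hypothetical and are not taken from the paper.

```python
import math

def matthews_corrcoef(tp, fp, fn, tn):
    """Matthews correlation coefficient from binary confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom

def accuracy(tp, fp, fn, tn):
    return (tp + tn) / (tp + fp + fn + tn)

# Hypothetical rare class: 10 positives among 1000 residues, most of them missed.
counts = dict(tp=2, fp=1, fn=8, tn=989)
acc = accuracy(**counts)
mcc = matthews_corrcoef(**counts)
```

Because true negatives dominate the accuracy, a classifier that misses most positives can still score above 99%, while the MCC stays well below 1.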
Abstract: Building on research into predicting β-hairpin motifs in proteins, we apply the Random Forest and Support Vector Machine algorithms to predict β-hairpin motifs in the ArchDB40 dataset. Motifs with a loop length of 2 to 8 amino acid residues are extracted as the research object, and a fixed-length pattern of 12 amino acids is selected. With the same characteristic parameters and the same test method, the Random Forest algorithm is more effective than the Support Vector Machine. In addition, because the Random Forest algorithm does not overfit when the dimension of the characteristic parameters is higher, we use Random Forest with higher-dimensional characteristic parameters to predict β-hairpin motifs. Better prediction results are obtained; the overall accuracy and Matthews correlation coefficient under 5-fold cross-validation reach 83.3% and 0.59, respectively.
Funding: Supported by the National Natural Science Foundation of China (Grant No. 61672386); the Humanities and Social Sciences Planning Project of the Ministry of Education, China (Grant No. 16YJAZH071); the Anhui Provincial Natural Science Foundation of China (Grant No. 1708085MF142); and the Natural Science Research Key Project of Anhui Colleges, China (Grant No. KJ2014A266).
Abstract: It remains a great challenge to achieve sufficient cancer classification accuracy with the entire set of genes, owing to the high dimensionality, small sample size, and high noise of gene expression data. We therefore propose a hybrid gene selection method, Information Gain-Support Vector Machine (IG-SVM). IG was first employed to filter out irrelevant and redundant genes. Then, redundant genes were further removed using SVM to eliminate noise in the datasets more effectively. Finally, the informative genes selected by IG-SVM served as the input to the LIBSVM classifier. Compared with other related algorithms, IG-SVM showed the highest classification accuracy and superior performance on five cancer gene expression datasets while using only a few selected genes. For example, IG-SVM achieved a classification accuracy of 90.32% for colon cancer, which is difficult to classify accurately, based on only three genes: CSRP1, MYLg, and GUCA2B.
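The information-gain filtering step can be sketched as follows. The abstract does not say how expression values are discretized, so splitting each gene at its mean is an assumption here, and the gene names and values are hypothetical toy data.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels, threshold):
    """Reduction in label entropy from splitting samples at `threshold`."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    conditional = sum(len(part) / n * entropy(part) for part in (left, right) if part)
    return entropy(labels) - conditional

def rank_genes(expr, labels):
    """expr maps gene name -> expression per sample; split each gene at its mean."""
    scores = {}
    for gene, values in expr.items():
        threshold = sum(values) / len(values)  # assumed discretization rule
        scores[gene] = information_gain(values, labels, threshold)
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

labels = ["tumor", "tumor", "tumor", "normal", "normal", "normal"]
expr = {
    "gene_a": [5.1, 4.8, 5.5, 1.0, 1.2, 0.9],  # separates the classes cleanly
    "gene_b": [2.0, 3.0, 2.5, 2.1, 2.9, 2.6],  # carries little class information
}
ranked = rank_genes(expr, labels)
```

Genes at the bottom of such a ranking would be discarded before the SVM-based stage, which the abstract describes as removing the remaining redundancy.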
Funding: Supported by the National Natural Science Foundation of China (10901169) and the Innovation Ability Training Foundation of Chongqing University (CDCX008).
Abstract: This paper offers a new combined approach to predicting and characterizing β-turns in proteins. The approach includes two key steps: representing the features of β-turns and developing a predictor. The first step uses factor analysis scales of generalized amino acid information (FASGAI), involving hydrophobicity, alpha and turn propensities, bulky properties, compositional characteristics, local flexibility, and electronic properties, to represent the features of β-turns in proteins. The second step constructs a support vector machine (SVM) predictor of β-turns based on 426 training proteins, using a sevenfold cross-validation test. The SVM predictor was then used to predict β-turns on 547 and 823 proteins, separately, in external validation tests. Our results are compared with the previously best known β-turn prediction methods and show comparable performance. Most significantly, the SVM model provides some information related to β-turn residues in proteins. The results demonstrate that the present combined approach may be used in the prediction of protein structures.
Abstract: In this paper, we propose a novel Residuals-Based Deep Least Squares Support Vector Machine (RBD-LSSVM). In the RBD-LSSVM, multiple LSSVMs are connected sequentially. The original time series is the input of the first LSSVM. The second LSSVM uses the fitting residuals of the first LSSVM as its input time series, the third LSSVM is trained on the residuals of the second, and so on. Additionally, to obtain the best hyper-parameters for the RBD-LSSVM, we propose a model validation method based on a redundancy test using the Omni-Directional Correlation Function (ODCF). This method rests on the fact that when a model is appropriate for a given time series, there should be no information or correlation left in the residuals. We propose the use of ODCF as a statistic to detect nonlinear correlation between two random variables. Thus, we can select hyper-parameters without encountering the overfitting that cannot be avoided by cross-validation on a validation set alone. We conducted experiments on two time series: the annual sunspot number series and the monthly Total Column Ozone (TCO) series in New Delhi. Analysis of the prediction results and comparisons with recent and past studies demonstrate the promising performance of the proposed RBD-LSSVM approach, with redundancy-test-based model selection, for modeling and predicting nonlinear time series.
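The sequential residual structure can be sketched with a much simpler base learner. This is not an LSSVM: a one-lag least-squares fit stands in for each stage. Summing the stage outputs to form a forecast is also an assumption, since the abstract does not say how stage predictions are combined. All names and the toy series are hypothetical.

```python
import math

def fit_ar1(series):
    """Least-squares fit of x[t] ~ a * x[t-1] + b (stand-in for one LSSVM)."""
    xs, ys = series[:-1], series[1:]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return a, my - a * mx

def residuals_of(series, a, b):
    """One-step-ahead fitting residuals of a stage on its own input series."""
    return [y - (a * x + b) for x, y in zip(series[:-1], series[1:])]

def fit_residual_chain(series, n_stages=3):
    """Stage 1 sees the original series; each later stage sees the previous residuals."""
    stages, sse, current = [], [], list(series)
    for _ in range(n_stages):
        a, b = fit_ar1(current)
        current = residuals_of(current, a, b)
        stages.append((a, b))
        sse.append(sum(r * r for r in current))
    return stages, sse

def forecast_next(series, stages):
    """Assumed combination rule: add each stage's one-step forecast."""
    total, current = 0.0, list(series)
    for a, b in stages:
        total += a * current[-1] + b
        current = residuals_of(current, a, b)
    return total

toy_series = [math.sin(0.3 * t) + 0.5 * math.sin(1.1 * t) for t in range(60)]
stages, sse = fit_residual_chain(toy_series)
next_value = forecast_next(toy_series, stages)
```

The `sse` list records how much residual energy remains after each stage. The redundancy test described in the abstract goes further and checks the residuals for remaining correlation, which this sketch does not attempt.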
Funding: Funded by the Guangdong Power Grid Co., Ltd. Technology Project (GDKJXM20180019).
Abstract: Private and public power communication networks lack coverage, especially for collecting information from hydropower stations in remote areas, so communication coverage is a significant issue. Satellite communication provides a large coverage area suitable for a variety of services and is less affected by geographical factors; moreover, its costs are independent of the communication distance. This study investigates information acquisition technology for small hydropower stations in remote areas using high- and low-orbit satellites. The information collection needs of small hydropower stations in remote areas are analyzed, and an information acquisition system is designed using high- and low-orbit satellites. For network security protection, a network anomaly detection technology based on a support vector machine algorithm is proposed. The effectiveness of information collection was evaluated and verified for small hydropower plants in remote areas. The system provides technical support for “full coverage, full collection, and full monitoring” of the measurement automation information acquisition system.
Funding: National Natural Science Foundation of China (No. 61663021); Science and Technology Support Project of Gansu Province (No. 1304GKCA023); Scientific Research Project in University of Gansu Province (No. 2017A-025).
Abstract: To obtain the trend of urban rail transit traffic flow and better grasp the fluctuation range of passenger flow, this paper proposes a combined forecasting model of the passenger flow fluctuation range based on fuzzy information granulation and a least squares support vector machine (LS-SVM) optimized by chaos particle swarm optimization (CPSO). Because passenger flow is nonlinear and fluctuating, fuzzy information granulation is first used to extract the valid data from each window as required. Second, CPSO, which has strong global search ability, is applied to optimize the parameters of the LS-SVM forecasting model. Finally, the combined model is used to forecast the fluctuation range of morning peak passenger flow at Tiyu Xilu Station of Guangzhou Metro Line 3 in 2014, and the results are compared with those of other models. Simulation results demonstrate that the combined forecasting model can effectively track the fluctuation of passenger flow, providing an effective method for predicting the fluctuation range of short-term passenger flow.
Funding: Supported by the High Technology Research and Development Program of China (No. 2006AA01Z150); the Key Project of the National Natural Science Foundation of China (No. 60373101); the Natural Science Foundation of Heilongjiang Province (No. F2007-14); and the Project of Heilongjiang Outstanding Young University Teacher (No. 1151G037).
Abstract: This letter presents a new discriminative model for Information Retrieval (IR), referred to as the Ordinal Regression Model (ORM). ORM differs from most existing models in that it views IR as an ordinal regression problem (i.e., a ranking problem) instead of binary classification. Because the task of IR is to rank documents according to the user's information need, IR can be viewed as an ordinal regression problem. Two parameter learning algorithms for ORM are presented: a perceptron-based algorithm and the ranking Support Vector Machine (SVM). The effectiveness of the proposed approach has been evaluated on the task of ad hoc retrieval using three English Text REtrieval Conference (TREC) sets and two Chinese TREC sets. Results show that ORM significantly outperforms state-of-the-art language model approaches and the OKAPI system on all test sets, and that it is more appropriate to view IR as ordinal regression rather than binary classification.
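The idea of learning from preference pairs instead of per-document labels can be sketched with a generic pairwise perceptron. The letter's own perceptron-based algorithm is not described in the abstract, so this is a textbook-style stand-in. The two features and all data are hypothetical.

```python
def train_pairwise_perceptron(queries, n_features, epochs=20, lr=1.0):
    """queries: one list per query of (feature_vector, relevance_grade) pairs."""
    w = [0.0] * n_features
    for _ in range(epochs):
        updated = False
        for docs in queries:
            for xi, yi in docs:
                for xj, yj in docs:
                    if yi <= yj:
                        continue  # keep only pairs where doc i should outrank doc j
                    diff = [a - b for a, b in zip(xi, xj)]
                    if sum(wk * dk for wk, dk in zip(w, diff)) <= 0:
                        # Misordered pair: move w toward the preferred document.
                        w = [wk + lr * dk for wk, dk in zip(w, diff)]
                        updated = True
        if not updated:
            break  # every preference pair is ordered correctly
    return w

def rank(w, docs):
    """Sort one query's documents by descending linear score."""
    return sorted(docs, key=lambda d: sum(wk * xk for wk, xk in zip(w, d[0])), reverse=True)

# Hypothetical features per document: [query-term overlap, length penalty].
queries = [
    [([0.9, 0.2], 2), ([0.5, 0.4], 1), ([0.1, 0.9], 0)],
    [([0.8, 0.5], 1), ([0.2, 0.1], 0)],
]
w = train_pairwise_perceptron(queries, n_features=2)
ranked_grades = [grade for _, grade in rank(w, queries[0])]
```

Pairs are formed only within a query, which is what distinguishes this ranking view from classifying each document as relevant or not in isolation.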
Funding: Supported by the National Natural Science Foundation of China (No. 50575159); the Science Foundation of the Ministry of Education of China (No. 106049); the Doctoral Foundation of the Ministry of Education of China (No. 20060056058); and the Tianjin Municipal Natural Science Foundation of China (No. 06YFJMJC03400).
Abstract: A novel support vector machine (SVM) detection method based on the fractal dimension of signals is presented. SVM models are built for nugget size defects in spot welding, and the trained models are used to classify spot welding signals. Comparing the SVM models shows that their performance depends on their inputs: in defect detection, models whose inputs include the sound signal achieve high accuracy, whereas the detection accuracy of models whose inputs include the voltage signal is lower. The SVM models based on the fractal dimensions of sound are therefore the preferred nondestructive detection models. Finally, a comparison between the SVM detection model and an artificial neural network (ANN) detection model indicates that SVM is more effective than ANN in detecting nugget size defects during spot welding.
Abstract: Computer-aided detection and diagnosis (CAD) systems are increasingly used by clinicians as an aid in the detection and interpretation of diseases. In general, a CAD system employs a classifier to detect or distinguish between abnormal and normal tissues on images. In the classification phase, a set of image features and/or texture features extracted from the images is commonly used. In this article, we investigated the characteristics of the output entropy of an image and demonstrated the usefulness of the output entropy as a texture feature in CAD systems. To validate the effectiveness and superiority of the output-entropy-based texture feature, two well-known texture features, mean and standard deviation, were used for comparison. The database used in this study comprised 50 CT images obtained from 10 patients with pulmonary nodules and 50 CT images obtained from 5 normal subjects. We used a support vector machine for classification. A leave-one-out method was employed for training and classification. Three combinations of texture features (mean and entropy, standard deviation and entropy, and standard deviation and mean) were used as inputs to the classifier. Three region-of-interest (ROI) sizes (11 × 11, 9 × 9 and 7 × 7 pixels) were selected from the database for computing the feature values. Our experimental results show that the combination of entropy and standard deviation is significantly better than both the combination of mean and entropy and that of standard deviation and mean for the ROI size of 11 × 11 pixels (p < 0.05). These results suggest that the information entropy of an image can be used as an effective feature for CAD applications.
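The three texture features compared above can be sketched for a single ROI as follows. The article's "output entropy" may be defined differently; this sketch assumes the Shannon entropy of the gray-level histogram. The helper names and the synthetic image are hypothetical.

```python
import math
from collections import Counter

def roi_features(roi):
    """roi: 2-D list of integer gray levels. Returns (mean, std, entropy in bits)."""
    pixels = [p for row in roi for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / n)
    # Assumed definition: Shannon entropy of the gray-level histogram.
    entropy = -sum((c / n) * math.log2(c / n) for c in Counter(pixels).values())
    return mean, std, entropy

def extract_roi(image, row, col, size):
    """Square ROI of odd `size` centred on (row, col); no bounds checking."""
    half = size // 2
    return [r[col - half: col + half + 1] for r in image[row - half: row + half + 1]]

# Synthetic 11 x 11 image in which every pixel has a distinct gray level.
image = [[r * 11 + c for c in range(11)] for r in range(11)]
roi = extract_roi(image, 5, 5, 7)
mean, std, entropy = roi_features(roi)
flat = roi_features([[100] * 7 for _ in range(7)])
```

A uniform patch yields zero entropy and zero standard deviation, while a patch with many distinct gray levels yields high entropy. Pairs of these values would then form the classifier input described in the abstract.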
Abstract: Information access concerns the rich data available for information retrieval and has evolved to provide principled approaches and strategies for searching. In building a successful web retrieval search engine model, a number of prospects arise at different levels, where techniques such as Usenet and support vector machines are employed to significant effect. The present investigation explores a number of problems, identified by level, that relate to finding information on the web. The authors examine these issues and prospects by applying different methods such as web graph analysis, the retrieval and analysis of newsgroup postings, and statistical methods for inferring meaning in text. The proposed model thus assists users in finding the existing data they need. The study proposes three heuristic models to characterize the balance between query and feedback information and so support adaptive relevance feedback. The authors also discuss the parameters responsible for efficient searching. These parameters can be taken into account in the future extension or development of search engines.
Abstract: This paper presents a keystroke-dynamics-based authentication system using the information set concept. Two types of membership functions (MFs) are computed: one based on the timing features of all the samples and another based on the timing features of a single sample. These MFs lead to two types of information components (spatial and temporal), which are concatenated and modified to produce different feature types. A Two Component Information Set (TCIS) is proposed for keystroke-dynamics-based user authentication. The keystroke features are converted into TCIS features, which are then classified by SVM, Random Forest and the proposed Convex Entropy Based Hanman Classifier. The TCIS features are capable of representing the spatial and temporal uncertainties. The performance of the proposed features is tested on the CMU benchmark dataset in terms of error rates (FAR, FRR, EER) and accuracy. In addition, the proposed features are tested on the Android Touch screen based Mobile Keystroke Dataset. The TCIS features improve performance, giving lower error rates and better accuracy than the existing features in the literature.
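The error rates named in this abstract (FAR, FRR, EER) can be illustrated with a minimal sketch. The assumptions are mine, not the paper's: each authentication attempt yields a similarity score, scores at or above a threshold are accepted, and the EER is approximated by scanning the observed scores as candidate thresholds. The function names and the toy scores are hypothetical.

```python
def far_frr(genuine, impostor, threshold):
    """Return (FAR, FRR) when scores >= threshold are accepted."""
    # False acceptance rate: fraction of impostor attempts accepted.
    far = sum(s >= threshold for s in impostor) / len(impostor)
    # False rejection rate: fraction of genuine attempts rejected.
    frr = sum(s < threshold for s in genuine) / len(genuine)
    return far, frr


def equal_error_rate(genuine, impostor):
    """Approximate the EER as the point where FAR and FRR are closest.

    Returns (eer, threshold); candidate thresholds are the observed scores.
    """
    best = None
    for t in sorted(set(genuine) | set(impostor)):
        far, frr = far_frr(genuine, impostor, t)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy scores (made up for illustration only).
genuine_scores = [0.9, 0.8, 0.7, 0.4]
impostor_scores = [0.6, 0.3, 0.2, 0.1]
eer, eer_threshold = equal_error_rate(genuine_scores, impostor_scores)
```

With these toy scores, FAR and FRR are both 0.25 at a threshold of 0.6, so the sketch reports an EER of 0.25.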
Funding: Supported by the National Natural Science Foundation of China (Grant Nos. 51991392 and 51922104).
Abstract: The axial selection of tunnels constructed in interlayered soft-hard rock mass affects stability and safety during construction. Previous optimization has been based primarily on experience or on the comparison and selection of alternative values under specific geological conditions. In this work, an intelligent optimization framework is proposed by combining numerical analysis, machine learning (ML) and an optimization algorithm. An automatic and intelligent numerical analysis process was proposed and coded to reduce redundant manual intervention. The conventional optimization algorithm was developed from two aspects and applied to the hyperparameter estimation of the support vector machine (SVM) model and to the axial orientation optimization of the tunnel. Finally, the comprehensive framework was applied to a numerical case study, and the results were compared with those of other studies. The results of this study indicate that the determination coefficients between the predicted and the numerical stability evaluation indices (STIs) on the training and testing datasets are 0.998 and 0.997, respectively. For a given geological condition, the STI first decreases and then increases as the axial orientation changes, and the optimal tunnel axial orientation is estimated to be 87. This method provides an alternative and quick approach to the overall design of tunnels.
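The determination coefficient reported in this abstract is commonly computed as one minus the ratio of the residual sum of squares to the total sum of squares. A minimal sketch follows, assuming that standard definition (the abstract does not state which variant was used); the function name `r_squared` is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: R^2 = 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    # Residual sum of squares between observed and predicted values.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares of the observed values around their mean.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

Here `observed` would hold the numerically computed STIs and `predicted` the SVM outputs; a value near 1 means the surrogate model reproduces the numerical results closely.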