Tensor representation helps reduce overfitting in vector-based learning algorithms for pattern recognition. This is mainly because the structural information of objects in pattern analysis is a reasonable constraint that reduces the number of unknown parameters used to model a classifier. In this paper, we generalize the vector-based learning algorithm TWin Support Vector Machine (TWSVM) to the tensor-based method TWin Support Tensor Machines (TWSTM), which accepts general tensors as input. To examine the effectiveness of TWSTM, we implement it for Microcalcification Clusters (MCs) detection. In the tensor subspace domain, the MCs detection procedure is formulated as a supervised learning and classification problem, and TWSTM is used as the classifier that decides whether MCs are present. A large number of experiments were carried out to evaluate and compare the performance of the proposed MCs detection algorithm. Compared with TWSVM, the tensor version reduces the overfitting problem.
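The parameter-reduction argument in this abstract can be illustrated with a small sketch. It assumes a second-order (matrix) input and a rank-one weight tensor, so that the decision function is f(X) = u^T X v + b; this is a common choice in support tensor machine formulations, but the abstract does not state the exact TWSTM model, and the function names and the 32 x 32 input size below are hypothetical.

```python
from math import prod


def vector_param_count(shape):
    """Parameters of a linear classifier on the flattened input: one weight per entry, plus a bias."""
    return prod(shape) + 1


def rank1_tensor_param_count(shape):
    """Parameters when the weight tensor is constrained to rank one: one vector per mode, plus a bias."""
    return sum(shape) + 1


def rank1_decision(X, u, v, b):
    """Evaluate f(X) = u^T X v + b for a matrix X given as a list of rows (assumed model form)."""
    return sum(
        u[i] * X[i][j] * v[j]
        for i in range(len(u))
        for j in range(len(v))
    ) + b


shape = (32, 32)  # hypothetical image-patch size
print(vector_param_count(shape))        # 1025
print(rank1_tensor_param_count(shape))  # 65
```

Under these assumptions, the tensor form estimates 65 parameters where the vectorized form estimates 1025, which is the kind of reduction that can limit overfitting when training samples are scarce.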
The effect of categorizing continuous variables, compared with leaving them uncategorized, on the number and nature of statistically significant predictors was examined using clinical trial data. The number of categories required for consistent statistical inference was also explored. Multiple logistic regression analysis was employed, in which the dependent variable may be dichotomous or multi-category and the independent variables (predictors) may be continuous, categorical, or ordinal. Real-life clinical trial data were used to address these objectives. No hard and fast rule was found for categorizing continuous variables. In some cases, the set of significant predictors changed with the categorization criteria. Certain variables left uncategorized produced odds ratios too large to interpret meaningfully. Although the nature and number of significant predictors changed with the classification criteria, which often forces authors to categorize variables, it is recommended that independent variables not be coded unless otherwise warranted. Coding is needed when the odds ratio is extremely high. In that situation, two or more categories, including a median cut-off point, will be sufficient to undertake the logistic regression analysis.
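The median cut-off mentioned in this abstract can be sketched as follows. The example dichotomizes a continuous predictor at its median and computes the unadjusted odds ratio from the resulting 2 x 2 table. The data and the function name are made up for illustration; the study's actual variables, cut-offs, and multivariable model are not given in the abstract.

```python
from statistics import median


def median_split_odds_ratio(values, outcomes):
    """Dichotomize `values` at the median and return (cut-off, odds ratio of the outcome, high vs. low group)."""
    cut = median(values)
    a = b = c = d = 0  # a: high & event, b: high & no event, c: low & event, d: low & no event
    for x, y in zip(values, outcomes):
        if x > cut:
            if y:
                a += 1
            else:
                b += 1
        else:
            if y:
                c += 1
            else:
                d += 1
    if b == 0 or c == 0:
        raise ValueError("odds ratio undefined: a cell in the denominator is empty")
    return cut, (a * d) / (b * c)


# Hypothetical data: eight subjects, continuous predictor and binary outcome.
values = [1, 2, 3, 4, 5, 6, 7, 8]
outcomes = [0, 0, 1, 0, 1, 1, 0, 1]
cut, odds_ratio = median_split_odds_ratio(values, outcomes)
print(cut, odds_ratio)  # 4.5 9.0
```

This is only the crude two-category case; a logistic regression with the dichotomized variable as the sole predictor would reproduce the same odds ratio, while adjusted models generally would not.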
Funding: Supported by the National Natural Science Foundation of China (No. 60771068) and the Natural Science Basic Research Plan in Shaanxi Province of China (No. 2007F248).