Abstract
This paper applies statistical testing to kernel function evaluation. The kernel, normalized kernel, centered kernel, and kernel distance serve as measures of the geometric relationship between samples in feature space. Seven statistical tests, including the t-test and the F-test, are used to test how the distribution of these measures among same-class samples differs from their distribution among different-class samples. This difference reflects the gap between within-class cohesion and between-class separation in feature space. Kernel selection experiments on 11 UCI datasets show that the kernel evaluation measures based on statistical testing match or exceed the performance of methods such as kernel target alignment (KTA) and the feature-space-based kernel evaluation measure (FSM), and are therefore suitable for kernel evaluation. The experiments also show that the difference between the two distributions lies mainly in their variances. In addition, transforming the kernel function (by normalization or centering) changes the feature space and distorts the evaluation results.
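The core idea of the abstract can be illustrated with a minimal sketch: compute kernel values for all sample pairs, split them into same-class and different-class groups, then compare the two groups with a mean-based statistic (t) and a variance-based statistic (F). This is not the authors' implementation. The RBF kernel, the `gamma` value, the toy data, and all function names are assumptions made for illustration, and only the two tests named in the abstract are shown, as raw statistics without p-values.

```python
import math
from itertools import combinations
from statistics import mean, variance


def rbf_kernel(x, y, gamma=0.5):
    """Gaussian (RBF) kernel; gamma is an illustrative choice, not from the paper."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def split_kernel_values(X, labels, kernel):
    """Collect kernel values for same-class and different-class sample pairs."""
    same, diff = [], []
    for i, j in combinations(range(len(X)), 2):
        value = kernel(X[i], X[j])
        (same if labels[i] == labels[j] else diff).append(value)
    return same, diff


def welch_t_statistic(a, b):
    """Welch's t statistic for a difference in means (unequal variances)."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def f_statistic(a, b):
    """Ratio of sample variances, the statistic used by the F-test."""
    return variance(a) / variance(b)


# Toy two-class data, invented for illustration only.
X = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (2.0, 2.1), (2.2, 1.9), (1.9, 2.3)]
labels = [0, 0, 0, 1, 1, 1]

same, diff = split_kernel_values(X, labels, rbf_kernel)
t_stat = welch_t_statistic(same, diff)
f_stat = f_statistic(same, diff)
print(f"t = {t_stat:.3f}, F = {f_stat:.3f}")
```

Under this reading, a kernel whose same-class and different-class values are more clearly separated would score higher, so candidate kernels could be ranked by the chosen statistic. Note that pairwise kernel values share samples and are not independent, so the statistics here act as separation scores rather than as strict hypothesis tests.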
Source
Computer Science (《计算机科学》)
CSCD
Peking University Core Journals (北大核心)
2015, Issue 4, pp. 199-205 (7 pages)
Funding
Supported by the National Key Technology R&D Program of China during the 12th Five-Year Plan (2012BAH14F00)
Keywords
Kernel function
Kernel evaluation
Statistical testing