Abstract
In the WML-kNN (weighted multi-label k nearest neighbor) algorithm, the number of neighbors is fixed and does not take the actual characteristics of the sample data into account. As a result, highly similar points may be excluded from the neighbor set, or points with low similarity may be included in it, and either case degrades classifier performance. Clinical data on diseases in traditional Chinese medicine (TCM) are likely to be multi-label, and because each case is distinctive, each case may have a different set of similar neighbors. This paper therefore improves the WML-kNN algorithm and proposes the WML-GkNN (WML-granular kNN) algorithm. WML-GkNN uses granular computing to control the granularity space and thereby determine the neighbor set, so that the sample points within a neighborhood are highly similar. Experimental results on meridian resistance data collected in TCM clinical practice show that WML-GkNN improves classification performance.
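The contrast the abstract draws between a fixed neighbor count and a similarity-controlled neighbor set can be illustrated with a small sketch. This is not the WML-GkNN procedure from the paper, whose details the abstract does not give. It assumes Euclidean distance, uses a hypothetical `radius` threshold as a stand-in for the granularity parameter, and uses a simple inverse-distance weight. All function names and the toy data are invented for illustration.

```python
import math

def fixed_k_neighbors(query, samples, k):
    """Indices of the k nearest samples: the neighbor count is fixed in advance."""
    order = sorted(range(len(samples)), key=lambda i: math.dist(query, samples[i]))
    return order[:k]

def granular_neighbors(query, samples, radius):
    """Indices of all samples within `radius` of the query, nearest first.

    `radius` is a hypothetical stand-in for a granularity parameter; the
    neighbor count now depends on how the data lie around the query.
    """
    dists = sorted((math.dist(query, s), i) for i, s in enumerate(samples))
    return [i for d, i in dists if d <= radius]

def weighted_label_scores(query, samples, labels, neighbor_idx):
    """Inverse-distance-weighted vote for each label over the neighbor set."""
    scores = [0.0] * len(labels[0])
    total = 0.0
    for i in neighbor_idx:
        w = 1.0 / (1.0 + math.dist(query, samples[i]))  # assumed weighting scheme
        total += w
        for j, y in enumerate(labels[i]):
            scores[j] += w * y
    return [s / total for s in scores] if total > 0 else scores

# Toy data: three samples close to the query and one far outlier.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
labels = [(1, 0), (1, 1), (1, 0), (0, 1)]  # two binary labels per sample
query = (0.0, 0.0)

k_idx = fixed_k_neighbors(query, samples, k=4)          # forced to include the outlier
g_idx = granular_neighbors(query, samples, radius=1.0)  # outlier excluded

k_scores = weighted_label_scores(query, samples, labels, k_idx)
g_scores = weighted_label_scores(query, samples, labels, g_idx)
```

With `k=4` the distant sample is pulled into the neighbor set and dilutes the score of the first label, while the radius-based set keeps only the three nearby samples. A real implementation would need a principled way to choose the granularity, which this sketch does not attempt.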
Authors
潘主强
张林
张磊
李国正
颜仕星
PAN Zhuqiang 1, ZHANG Lin 1, ZHANG Lei 2, LI Guozheng 3, YAN Shixing 4 (1. School of Computer Science, Southwest Petroleum University, Chengdu 610500, China; 2. Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China; 3. National Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China; 4. Shanghai Menorah Information Technology Co., Ltd., Shanghai 201800, China)
Source
《计算机科学与探索》
CSCD
Peking University Core Journal
2018, No. 8, pp. 1295-1304 (10 pages)
Journal of Frontiers of Computer Science and Technology
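Funding
National Natural Science Foundation of China, No. 81503680
Fundamental Research Funds for the Central Public Welfare Research Institutes, No. ZZ0908032
National Health Security Informatization Project, Traditional Chinese Medicine Project Research, No. 215005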
Keywords
traditional Chinese medicine clinical data
multi-label learning
granular computing
weight