
Research on Multi-Label Classification Method of Traditional Chinese Medicine Clinical Disease Data (中医临床疾病数据多标记分类方法研究)

Cited by: 1
Abstract The WML-kNN (weighted multi-label k nearest neighbor) algorithm uses a fixed number of nearest neighbors and does not take the actual characteristics of the sample data into account. As a result, highly similar points may be excluded from the neighbor set, or points with low similarity may be included in it, and either case degrades classifier performance. Disease data obtained in traditional Chinese medicine (TCM) clinical practice are likely to be multi-label, and because each case is distinctive, each case may have a different set of similar neighbors. This paper therefore improves the WML-kNN algorithm and proposes the WML-GkNN (WML-granular kNN) algorithm. WML-GkNN uses granular computing to control the granularity space and thereby determine the neighbor set, so that the sample points within a neighborhood are highly similar. Experimental results on meridian resistance data collected in TCM clinical practice show that WML-GkNN improves classification performance.
Authors PAN Zhuqiang (潘主强)1, ZHANG Lin (张林)1, ZHANG Lei (张磊)2, LI Guozheng (李国正)3, YAN Shixing (颜仕星)4 (1. School of Computer Science, Southwest Petroleum University, Chengdu 610500, China; 2. Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China; 3. National Data Center of Traditional Chinese Medicine, China Academy of Chinese Medical Sciences, Beijing 100700, China; 4. Shanghai Menorah Information Technology Co., Ltd., Shanghai 201800, China)
Source Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》), CSCD, Peking University Core Journal, 2018, Issue 8, pp. 1295-1304 (10 pages)
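Funding National Natural Science Foundation of China, No.81503680; Fundamental Research Funds for Central Public Welfare Research Institutes, No.ZZ0908032; Traditional Chinese Medicine Project Research of the National Health Security Informatization Project, No.215005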
Keywords Chinese medicine clinical data; multi-label learning; granular computing; weight
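
The abstract contrasts a fixed neighbor count with a neighbor set chosen by similarity. The sketch below illustrates that idea only; it is not the paper's WML-GkNN. It assumes, purely for illustration, that the granule is a Euclidean distance radius and that labels are scored by an inverse-distance weighted vote, neither of which the abstract specifies. The names `granular_neighbors`, `predict_labels`, `radius`, `min_neighbors`, and `threshold` are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def granular_neighbors(query, samples, radius, min_neighbors=1):
    """Return (distance, index) pairs for samples within `radius` of `query`.

    Assumption: the granule is a distance ball. If fewer than
    `min_neighbors` samples fall inside it, fall back to the closest ones
    so that a prediction is always possible.
    """
    ranked = sorted((euclidean(query, x), i) for i, x in enumerate(samples))
    inside = [(d, i) for d, i in ranked if d <= radius]
    if len(inside) < min_neighbors:
        inside = ranked[:min_neighbors]
    return inside


def predict_labels(query, samples, label_sets, radius, threshold=0.5):
    """Score each label by a normalized inverse-distance vote of the neighbors.

    Assumption: a plain weighted vote stands in for the paper's scoring rule.
    Returns the set of labels whose score reaches `threshold`, plus all scores.
    """
    neighbors = granular_neighbors(query, samples, radius)
    # Closer neighbors get larger weights; the epsilon avoids division by zero.
    weights = [(1.0 / (d + 1e-9), i) for d, i in neighbors]
    total = sum(w for w, _ in weights)
    all_labels = set().union(*label_sets)
    scores = {}
    for label in all_labels:
        vote = sum(w for w, i in weights if label in label_sets[i])
        scores[label] = vote / total
    predicted = {label for label, s in scores.items() if s >= threshold}
    return predicted, scores


# Toy data (made up for this example): three nearby cases and one distant case.
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.2), (5.0, 5.0)]
label_sets = [{"A"}, {"A", "B"}, {"A"}, {"C"}]
predicted, scores = predict_labels((0.05, 0.05), samples, label_sets, radius=0.5)
print(predicted, scores)
```

With this toy data the radius admits the three nearby cases and excludes the distant one, so the neighbor count follows the data rather than a preset k. A fixed k of 4 would have pulled the dissimilar fourth case into the vote.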
