Abstract
Similarity is a measure of how alike two objects are, and the method used to compute it depends on the type of object. Similarity computation is widely used in data mining algorithms and is the basis for object classification. This paper divides data objects into three types (numeric, non-numeric, and mixed) and discusses the similarity calculation methods appropriate to each type. Finally, it illustrates the application of similarity calculation in data mining through examples.
Authors
李俊磊
滕少华
LI Jun-lei, TENG Shao-hua (1. Guangdong Justice Police Vocational College, Guangzhou 510520, China; 2. School of Computer Science and Technology, Guangdong University of Technology, Guangzhou 510006, China)
Source
《电脑知识与技术》
2016, Issue 5, pp. 14-17 (4 pages)
Computer Knowledge and Technology
Funding
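Key Laboratory Fund Project of the Ministry of Education (110411)
Natural Science Foundation of Guangdong Province (10451009001004804, 9151009001000007)
Guangdong Province Science and Technology Plan Project (2012B091000173)
Guangzhou Science and Technology Plan Project (2012J5100054)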