
A Kernel-Based Classification Method for Nominal Data

Cited by: 1
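Abstract Computing inner products of nominal data is difficult, and this difficulty has held back research on kernel-based classification methods for such data. To address it, this paper gives a new definition of a dissimilarity measure for nominal data, presents a simplified method for computing inner products of nominal data, and examines the nature of kernel functions in kernel methods and support vector machines. On this basis it proposes a kernel-based classification method for nominal data and extends the method to the classification of heterogeneous data made up of nominal and continuous attributes. Simulation experiments show that the method is to some degree robust to outliers and that its overall performance is better than that of other methods in the literature. An application study in anomaly detection shows that the method classifies unbalanced data efficiently and has practical value.

A kernel-based nominal data classification (KNDC) method is proposed, built on a new distance definition and a simple method for computing inner products. Its insensitivity to outliers and its ability to classify unbalanced data in real datasets are further analyzed. Calculating the inner product of nominal data is difficult and is often regarded as the bottleneck of SVM. KNDC has lower computational complexity than SVM on nominal datasets, and its validity is discussed. Experimental results on standard datasets demonstrate that the proposed method performs well compared with other methods.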
Source Journal of Chinese Information Processing (《中文信息学报》), CSCD, Peking University Core Journals, 2010, No. 6, pp. 37-42 (6 pages)
Funding National Natural Science Foundation of China, Young Scientists Fund (60704047)
Keywords kernel-based classification method; nominal data; dissimilarity measure; inner product calculation
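
The abstract does not state the paper's dissimilarity measure or its inner-product formula, so the following is only a minimal sketch of the general idea, using the standard overlap (matching) kernel as a stand-in rather than the authors' definition. It illustrates three points the abstract relies on: an inner product for nominal records can be computed without an explicit encoding, a kernel induces a dissimilarity, and a nominal kernel can be combined with a kernel on continuous attributes. All function names here are hypothetical.

```python
# Illustrative stand-in only: the overlap kernel, NOT the measure defined in the paper.

def overlap_kernel(x, y):
    """Number of attributes on which two nominal records agree."""
    return sum(1 for a, b in zip(x, y) if a == b)

def one_hot(record, domains):
    """Explicit one-hot encoding, used here only to check the kernel."""
    return [1.0 if value == d else 0.0
            for value, domain in zip(record, domains)
            for d in domain]

def dot(u, v):
    """Ordinary inner product of two equal-length numeric sequences."""
    return sum(a * b for a, b in zip(u, v))

def kernel_distance_sq(x, y, kernel):
    """Squared distance induced by a kernel: k(x,x) - 2 k(x,y) + k(y,y)."""
    return kernel(x, x) - 2 * kernel(x, y) + kernel(y, y)

def mixed_kernel(x_nom, x_num, y_nom, y_num):
    """Heterogeneous records: overlap kernel on the nominal part plus a
    linear kernel on the continuous part (a sum of kernels is a kernel)."""
    return overlap_kernel(x_nom, y_nom) + dot(x_num, y_num)

# Hypothetical two-attribute example.
domains = [("red", "green", "blue"), ("s", "m", "l")]
x = ("red", "m")
y = ("red", "l")

k_xy = overlap_kernel(x, y)                           # agreement count
k_explicit = dot(one_hot(x, domains), one_hot(y, domains))
d2 = kernel_distance_sq(x, y, overlap_kernel)         # twice the Hamming distance
k_mixed = mixed_kernel(x, [1.0, 2.0], y, [0.5, 1.0])
```

For the overlap kernel, the matching count equals the inner product of the one-hot encodings, and the induced squared distance is twice the Hamming distance, which shows how a dissimilarity on nominal values and an inner product determine each other. The paper's own measure may weight attributes or values differently.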


