
A new email community clustering method based on EVS similarity
Abstract: The core of any clustering method is how it measures the proximity between objects. This paper introduces a vector representation of email features, constructs an email feature matrix, and fits the communication feature information between emails with a transformed extreme value distribution function model. On this basis, it proposes a new proximity measure, extreme value distribution similarity (EVS), to guide email community partitioning. The measure is validated with a micro-clustering/macro-clustering email community partitioning algorithm. Experiments show that, on the test dataset, the partitioning algorithm that uses EVS as its criterion discovers high-quality email communities more effectively than classical proximity measures such as cosine similarity and the Pearson correlation coefficient (PCC).
Source: Journal of Shandong University (Natural Science), 2010, No. 3, pp. 34-40 (7 pages). Indexed in CAS, CSCD, and the Peking University Core Journals list.
Funding: National Natural Science Foundation of China (60773048)
Keywords: social network; email community partition; extreme value distribution; EVS similarity
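The abstract names cosine similarity and the Pearson correlation coefficient (PCC) as the classical baselines against which EVS is compared. The sketch below shows how those two baselines could be computed between email feature vectors. It is only an illustration: the feature vectors (`alice`, `bob`, `carol`) and their meaning (message counts per correspondent) are hypothetical, and the EVS measure itself is not reproduced because the abstract does not give its formula.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)


def pearson_correlation(u, v):
    """Pearson correlation coefficient of two equal-length feature vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    du = [a - mean_u for a in u]
    dv = [b - mean_v for b in v]
    cov = sum(a * b for a, b in zip(du, dv))
    var_u = sum(a * a for a in du)
    var_v = sum(b * b for b in dv)
    if var_u == 0 or var_v == 0:
        return 0.0  # convention chosen here for constant vectors
    return cov / math.sqrt(var_u * var_v)


# Hypothetical feature vectors: messages exchanged with four correspondents.
alice = [5, 0, 2, 1]
bob = [10, 0, 4, 2]   # same communication pattern as alice, twice the volume
carol = [0, 7, 0, 3]  # mostly talks to a different correspondent

sim_ab = cosine_similarity(alice, bob)
sim_ac = cosine_similarity(alice, carol)
pcc_ab = pearson_correlation(alice, bob)
pcc_ac = pearson_correlation(alice, carol)
```

Both baselines ignore overall volume, so `alice` and `bob` score as near-identical while `alice` and `carol` score low. According to the abstract, EVS replaces this kind of measure with one derived from a fitted extreme value distribution model.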