Visualization Based on High-frequency Words for English Text

Cited by: 3
Abstract: To explore how closely high-frequency words are related through their contexts, this paper studies an algorithmic pipeline for visualizing English text based on its high-frequency words and implements the visualization. A statistical algorithm first extracts the high-frequency words and the contexts that link them. Three kinds of connection between words are then defined, a relation degree is computed for each pair of words that share a context, and the relation degrees are clustered with the k-means algorithm to show how close or distant the relations are. Finally, the clustering results are visualized with a radial tree layout. This visual form lets readers quickly grasp the content of an English text.
Source: Journal of Modern Information (《现代情报》), CSSCI, 2011, No. 8, pp. 21-24 (4 pages)
Keywords: text visualization; high-frequency words; k-means clustering algorithm; radial tree layout
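
The pipeline summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say what the three connection types are or how the relation degree is computed, so this sketch substitutes a simple co-occurrence count within a fixed token window. The stop-word list, the function names, the `window` parameter, and the ring-based placement are all assumptions made for illustration.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; the paper's list is not given).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "on", "it"}

def tokenize(text):
    """Lowercase the text and keep alphabetic tokens that are not stop words."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]

def high_frequency_words(tokens, top_n):
    """Return the top_n most frequent tokens."""
    return [w for w, _ in Counter(tokens).most_common(top_n)]

def relation_degrees(tokens, vocab, window=2):
    """Stand-in relation degree: how often two high-frequency words
    occur within `window` tokens of each other."""
    vocab = set(vocab)
    degrees = Counter()
    for i, w in enumerate(tokens):
        if w not in vocab:
            continue
        for v in tokens[i + 1 : i + 1 + window]:
            if v in vocab and v != w:
                degrees[tuple(sorted((w, v)))] += 1
    return degrees

def kmeans_1d(values, k, iterations=20):
    """Plain Lloyd's k-means on scalar values; returns (labels, centres)."""
    distinct = sorted(set(values))
    k = min(k, len(distinct))
    # Deterministic start: spread the centres evenly over the sorted distinct values.
    centres = [distinct[(i * (len(distinct) - 1)) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            if members:
                centres[c] = sum(members) / len(members)
    return labels, centres

def radial_positions(focus, degrees, labels, centres):
    """Put `focus` at the origin and its related words on concentric rings;
    words in the cluster with the largest centre sit on the innermost ring.
    `labels` must follow the iteration order of `degrees`."""
    by_strength = sorted(range(len(centres)), key=lambda c: -centres[c])
    ring_of = {c: rank + 1 for rank, c in enumerate(by_strength)}
    neighbours = []
    for pair, label in zip(degrees, labels):
        if focus in pair:
            other = pair[0] if pair[1] == focus else pair[1]
            neighbours.append((other, ring_of[label]))
    positions = {focus: (0.0, 0.0)}
    for i, (word, ring) in enumerate(sorted(neighbours)):
        angle = 2 * math.pi * i / len(neighbours)
        positions[word] = (ring * math.cos(angle), ring * math.sin(angle))
    return positions

if __name__ == "__main__":
    sample = "The cat chased the dog and the dog chased the cat in the garden"
    tokens = tokenize(sample)
    vocab = high_frequency_words(tokens, 3)
    degrees = relation_degrees(tokens, vocab)
    labels, centres = kmeans_1d(list(degrees.values()), k=2)
    print(radial_positions(vocab[0], degrees, labels, centres))
```

Clustering the scalar relation degrees, rather than the words themselves, follows the abstract's description: each cluster groups word pairs of similar closeness, which the sketch maps to a ring radius. A real implementation would likely add stemming and hand the node positions to a graph toolkit instead of computing them by hand.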
