Abstract
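To better support the centralized analysis of topics across different corpus sources, this paper proposes a visual analysis method that tightly clusters related or independent topics from texts of different sources into a panoramic topic map, enabling efficient interactive topic analysis. The method first uses correlated topic modeling to extract topics from texts of different sources. It then proposes a series of interactive visualization schemes that clearly show, from different perspectives, what each topic is about and how the topics relate to one another, so that users can analyze and understand in greater depth how the relationships among texts from different sources, and their respective characteristics, change over time. The visual analysis scheme is not only interactive but also presents the relations among text topics at different levels and granularities. Finally, the method is applied to paper collections from three sources and qualitatively evaluated together with experts; real-world case studies verify its effectiveness and robustness.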
This paper presents a visual analysis approach for developing a full picture of the relevant topics discussed in multiple textual sources. The key idea is to jointly cluster the topics extracted from each source so that common and distinctive topics can be analyzed interactively and effectively. First, the approach extracts topics from each textual corpus using a correlated topic model, and the different corpora are matched through the topics they share. Next, the paper develops a visualization tool consisting of three views for understanding and analyzing the matched topics as well as the relatively independent ones. The major feature of these three views is that they clearly present what each topic is and how the topics relate to one another from multiple perspectives, which allows users to analyze deeper relationships and features across multiple textual corpora and over different periods of time. To meet different users' needs and help them better understand the structure of the different corpora, the views provide interactive analysis methods at multiple scales. To verify the usability of the method, the paper applies it to paper collections from three sources: IEEE Transactions on Knowledge and Data Engineering (TKDE), ACM Transactions on Graphics (TOG), and IEEE Transactions on Visualization and Computer Graphics (TVCG). A qualitative evaluation and several real-world case studies with domain experts demonstrate the promise of the approach, especially in supporting the analysis of a visualization-based full picture of topics at different levels of detail.
Source
《计算机与现代化》
2016, No. 5, pp. 22-32, 38 (12 pages in total)
Computer and Modernization