Authors
Frank van Ham
Martin Wattenberg
Fernanda B. Viégas

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.165

Abstract
We present a new technique, the phrase net, for generating visual overviews of unstructured text. A phrase net displays a graph whose nodes are words and whose edges indicate that two words are linked by a user-specified relation. These relations may be defined either at the syntactic or lexical level; different relations often produce very different perspectives on the same text. Taken together, these perspectives often provide an illuminating visual overview of the key concepts and relations in a document or set of documents.


Authors
Yanhua Chen
Lijun Wang
Ming Dong
Jing Hua

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.140

Download - Video | Audio
Running Time: 18 min 28 sec

Abstract
With the rapid growth of the World Wide Web and electronic information services, text corpus is becoming available on-line at an incredible rate. By displaying text data in a logical layout (e.g., color graphs), text visualization presents a direct way to observe the documents as well as understand the relationship between them. In this paper, we propose a novel technique, Exemplar-based Visualization (EV), to visualize an extremely large text corpus. Capitalizing on recent advances in matrix approximation and decomposition, EV presents a probabilistic multidimensional projection model in the low-rank text subspace with a sound objective function. The probability of each document proportion to the topics is obtained through iterative optimization and embedded to a low dimensional space using parameter embedding. By selecting the representative exemplars, we obtain a compact approximation of the data. This makes the visualization highly efficient and flexible. In addition, the selected exemplars neatly summarize the entire data set and greatly reduce the cognitive overload in the visualization, leading to an easier interpretation of large text corpus. Empirically, we demonstrate the superior performance of EV through extensive experiments performed on the publicly available text data sets.


Authors
Jian Zhang
Chaomei Chen
Jiexun Li

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.202

Abstract
Visualizing the intellectual structure of scientific domains using co-cited units such as references or authors has become a routine for domain analysis. In previous studies, paper-reference matrices are usually transformed into reference-reference matrices to obtain co-citation relationships, which are then visualized in different representations, typically as node-link networks, to represent the intellectual structures of scientific domains. Such network visualizations sometimes contain tightly knit components, which make visual analysis of the intellectual structure a challenging task. In this study, we propose a new approach to reveal co-citation relationships. Instead of using a reference-reference matrix, we directly use the original paper-reference matrix as the information source, and transform the paper-reference matrix into an FP-tree and visualize it in a Java-based prototype system. We demonstrate the usefulness of our approach through visual analyses of the intellectual structure of two domains: Information Visualization and Sloan Digital Sky Survey (SDSS). The results show that our visualization not only retains the major information of co-citation relationships, but also reveals more detailed sub-structures of tightly knit clusters than a conventional node-link network visualization.


Authors
Hendrik Strobelt
Daniela Oelke
Christian Rohrdantz
Andreas Stoffel
Daniel A. Keim
Oliver Deussen

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.139

Abstract
Finding suitable, less space consuming views for a document’s main content is crucial to provide convenient access to large document collections on display devices of different size. We present a novel compact visualization which represents the document’s key semantic as a mixture of images and important key terms, similar to cards in a top trumps game. The key terms are extracted using an advanced text mining approach based on a fully automatic document structure extraction. The images and their captions are extracted using a graphical heuristic and the captions are used for a semi-semantic image weighting. Furthermore,
we use the image color histogram for classi?cation and show at least one representative from each non-empty image class. The approach is demonstrated for the IEEE InfoVis publications of a complete year. The method can easily be applied to other publication collections and sets of documents which contain images.


Authors
Fernanda B. Viégas
Martin Wattenberg
Jonathan Feinberg

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.171

Abstract
We discuss the design and usage of “Wordle,” a web-based tool for visualizing text. Wordle creates tag-cloud-like displays that give careful attention to typography, color, and composition. We describe the algorithms used to balance various aesthetic criteria and create the distinctive Wordle layouts. We then present the results of a study of Wordle usage, based both on spontaneous behaviour observed in the wild, and on a large-scale survey of Wordle users. The results suggest that Wordles have become a kind of medium of expression, and that a “participatory culture” has arisen around them.


Authors
Chris Muelder
Francois Gygi
Kwan-Liu Ma

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.196

Abstract
In serial computation, program pro?ling is often helpful for optimization of key sections of code. When moving to parallel computation, not only does the code execution need to be considered but also communication between the different processes which can induce delays that are detrimental to performance. As the number of processes increases, so does the impact of the communication delays on performance. For large-scale parallel applications, it is critical to understand how the communication impacts performance in order to make the code more ef?cient. There are several tools available for visualizing program execution and communications on parallel systems. These tools generally provide either views which statistically summarize the entire program execution or process-centric views. However, process-centric visualizations do not scale well as the number of processes gets very large. In particular, the most common representation of parallel processes is a Gantt char t with a row for each process. As the number of processes increases, these charts can become dif?cult to work with and can even exceed screen resolution. We propose a new visualization approach that affords more scalability and then demonstrate it on systems running with up to 16,384 processes.


Authors
Michael Bostock
Jeffrey Heer

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.174

Abstract
Despite myriad tools for visualizing data, there remains a gap between the notational efficiency of high-level visualization systems and the expressiveness and accessibility of low-level graphical systems. Powerful visualization systems may be inflexible or impose abstractions foreign to visual thinking, while graphical systems such as rendering APIs and vector-based drawing programs are tedious for complex work. We argue that an easy-to-use graphical system tailored for visualization is needed. In response, we contribute Protovis, an extensible toolkit for constructing visualizations by composing simple graphical primitives. In Protovis, designers specify visualizations as a hierarchy of marks with visual properties defined as functions of data. This representation achieves a level of expressiveness comparable to low-level graphics systems, while improving efficiency–the effort required to specify a visualization–and accessibility–the effort required to learn and modify the representation. We substantiate this claim through a diverse collection of examples and comparative analysis with popular visualization tools.


Authors
Harald Piringer
Christian Tominski
Philipp Muigg
Wolfgang Berger

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.110

Abstract
During continuous user interaction, it is hard to provide rich visual feedback at interactive rates for datasets containing millions of entries. The contribution of this paper is a generic architecture that ensures responsiveness of the application even when dealing with large data and that is applicable to most types of information visualizations. Our architecture builds on the separation of the main application thread and the visualization thread, which can be cancelled early due to user interaction. In combination with a layer mechanism, our architecture facilitates generating previews incrementally to provide rich visual feedback quickly. To help avoiding common pitfalls of multi-threading, we discuss synchronization and communication in detail. We explicitly denote design choices to control trade-offs. A quantitative evaluation based on the system VI S P L ORE shows fast visual feedback during continuous interaction even for millions of entries. We describe instantiations of our architecture in additional tools.


Authors
Bryan McDonnel
Niklas Elmqvist

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.191

Abstract
Modern programmable GPUs represent a vast potential in terms of performance and visual flexibility for information visualization research, but surprisingly few applications even begin to utilize this potential. In this paper, we conjecture that this may be due to the mismatch between the high-level abstract data types commonly visualized in our field, and the low-level floating-point model supported by current GPU shader languages. To help remedy this situation, we present a refinement of the traditional information visualization pipeline that is amenable to implementation using GPU shaders. The refinement consists of a final image-space step in the pipeline where the multivariate data of the visualization is sampled in the resolution of the current view. To concretize the theoretical aspects of this work, we also present a visual programming environment for constructing visualization shaders using a simple drag-and-drop interface. Finally, we give some examples of the use of shaders for well-known visualization techniques.


Authors
Michael Ogawa
Kwan-Liu Ma

DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2009.123

Download - Video | Audio
Running Time: 17 min 58 sec

Abstract
In May of 2008, we published online a series of software visualization videos using a method called code_swarm. Shortly thereafter, we made the code open source and its popularity took off. This paper is a study of our code swarm application, comprising its design, results and public response. We share our design methodology, including why we chose the organic information visualization technique, how we designed for both developers and a casual audience, and what lessons we learned from our experiment. We validate the results produced by code_swarm through a qualitative analysis and by gathering online user comments. Furthermore, we successfully released the code as open source, and the software community used it to visualize their own projects and shared their results as well. In the end, we believe code_swarm has positive implications for the future of organic information design and open source information visualization practice.

keep looking »