Visual Analytics for BigData Variety and Its Behaviours

Jinson Zhang, Mao Lin Huang, Zhao-Peng Meng

BigData, defined as structured and unstructured data containing images, videos, texts, audio and other forms of data collected from multiple datasets, is too big, too complex and moves too fast to analyze using traditional methods. This has given rise to a few issues that must be addressed; 1) how to analyze BigData across multiple datasets, 2) how to classify the different data forms, 3) how to identify BigData patterns based on its behaviours, 4) how to visualize BigData attributes in order to gain a better understanding of data. It is therefore necessary to establish a new framework for BigData analysis and visualization. In this paper, we have extended our previous works for classifying the BigData attributes into the „5Ws‟ dimensions based on different data behaviours. Our approach not only classifies BigData attributes for different data forms across multiple datasets, but also establishes the „5Ws‟ densities to represent the characteristics of data flow patterns. We use additional non-dimensional parallel axes in parallel coordinates to display the „5Ws‟ sending and receiving densities, which provide more analytic features for BigData analysis. The experiment shows that our approach with parallel coordinate visualization can be efficiently used for BigData analysis and visualization.