
Data Visualization in Big Data

How to visualize Large Amounts of Data

How to gain Insights into new Visualization Techniques

Photo by Nikolay Maslov on Unsplash

In the world of Big Data, data visualization tools and techniques are essential for analyzing large amounts of information and making data-driven decisions, as data is increasingly used for important management decisions. There is a trend away from gut feeling and emotional decisions toward rational choices based on numbers. Reports and visualizations therefore have to be meaningful and easy to understand.

Impact of Big Data

It is increasingly valuable for professionals to be able to use data to make decisions and to use visuals to tell stories that communicate how data informs the questions of who, what, when, where, and how [1]. In the area of Big Data, visualization comes with new approaches and new challenges because of the huge amounts of data. New visualization techniques therefore had to be created to make these volumes of data more tangible for the user.

Toolset to get Started

For the following examples of visualization possibilities in the area of Big Data, I used Google’s BigQuery and Data Studio. With the free tier of BigQuery, you can simply register and use the public data sets [2], which definitely fall under the Big Data label. Data Studio is free as well and is a great alternative to MS Power BI, Qlik, and other BI tools. Since you get a scalable data warehouse technology and the BI layer for free, I find Google perfectly suited as a sandbox for your first steps in Big Data visualization.

Visualization Examples

Here are a few examples of the visualizations that I have used and seen most often for representing Big Data.

Treemaps

A treemap, or tile chart, visualizes hierarchical structures as nested rectangles. Because the area of each rectangle is proportional to the size of the data unit it represents, size ratios are immediately visible.

Treemap Example - Image by Author

Here, I visualized the wholesale purchases of liquor in the State of Iowa by retailers for sale to individuals since January 1, 2012, grouped by category. The example shows how well this type of chart conveys size relationships.
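To make the proportional-area idea concrete, here is a minimal sketch of a one-level "slice" layout in Python. It is not how Data Studio computes its treemaps; the function name `slice_layout` and the category numbers are made up for illustration and are not taken from the Iowa data set.

```python
from typing import Dict, Tuple

Rect = Tuple[float, float, float, float]  # x, y, width, height


def slice_layout(values: Dict[str, float], width: float, height: float) -> Dict[str, Rect]:
    """Split a width x height rectangle into vertical strips, one per item.

    Each strip's area is proportional to the item's value.
    """
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")

    rects: Dict[str, Rect] = {}
    x = 0.0
    # Place the largest items first so the biggest tiles start on the left.
    for name, value in sorted(values.items(), key=lambda kv: kv[1], reverse=True):
        strip_width = width * value / total
        rects[name] = (x, 0.0, strip_width, height)
        x += strip_width
    return rects


# Hypothetical numbers, for illustration only.
bottles_by_category = {"Vodka": 500.0, "Whiskey": 300.0, "Rum": 200.0}
tiles = slice_layout(bottles_by_category, width=100.0, height=50.0)

for name, (x, y, w, h) in tiles.items():
    print(f"{name}: x={x:.1f}, width={w:.1f}, area={w * h:.0f}")
```

Real treemap implementations usually use more elaborate layouts that keep the tiles closer to squares and recurse into sub-categories, but the principle of dividing the area in proportion to the values stays the same.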

Maps

Speaking of states, maps are another great way to represent a lot of data. Here, the same data is shown on a map. The size of the bubbles can represent the number of bottles sold, or any other measure.

Maps as Dashboard - Image by Author

Maps have the advantage of being familiar. People know how to read a map, so the audience can understand the visualization relatively easily.
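If you build such a bubble map yourself, a common convention is to scale the bubble area, not the radius, with the measure, so that a value four times as large does not look sixteen times as large. The following Python sketch shows that scaling under this assumption; `bubble_radius` and the default `max_radius` are hypothetical names and values, and I am not claiming that Data Studio sizes its bubbles this way.

```python
import math


def bubble_radius(value: float, max_value: float, max_radius: float = 30.0) -> float:
    """Return a radius so that the bubble's area is proportional to value.

    The largest value in the data set (max_value) gets max_radius.
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    if value < 0:
        raise ValueError("value must not be negative")
    # Area grows with the square of the radius, so take the square root.
    return max_radius * math.sqrt(value / max_value)


# Hypothetical bottle counts per county, for illustration only.
bottles = {"County A": 100.0, "County B": 25.0}
largest = max(bottles.values())
radii = {name: bubble_radius(count, largest) for name, count in bottles.items()}
print(radii)
```

In this example, County B sells a quarter of the bottles of County A and gets half the radius, which corresponds to a quarter of the area.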

Gauge

A gauge diagram is well suited to showing a value against ranges that are rated differently and highlighted with colors. The diagram is based on a simple pie chart. It works particularly well for target/actual comparisons of key figures, customer satisfaction, or quality measurements.

Gauge Diagram - Image by Author

Here, for example, a company's sales are shown in relation to the overall average.
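The logic behind a gauge is a simple ratio of an actual value to a reference value, mapped to a needle position and a color band. Here is a small Python sketch of that idea; the function `gauge_reading`, the 0.8 and 1.0 thresholds, and the sales figures are assumptions chosen for illustration, not values from the dashboard above.

```python
from typing import Tuple


def gauge_reading(actual: float, reference: float, max_ratio: float = 2.0) -> Tuple[float, float, str]:
    """Return (ratio, needle angle in degrees, color band) for a half-circle gauge.

    The dial spans 0 to 180 degrees and covers ratios from 0 to max_ratio.
    """
    if reference <= 0:
        raise ValueError("reference must be positive")
    ratio = actual / reference
    # Keep the needle on the dial even for extreme values.
    clamped = min(max(ratio, 0.0), max_ratio)
    angle = 180.0 * clamped / max_ratio
    # Hypothetical thresholds: below 80% is red, below 100% is yellow.
    if ratio < 0.8:
        band = "red"
    elif ratio < 1.0:
        band = "yellow"
    else:
        band = "green"
    return ratio, angle, band


# Hypothetical figures: company sales versus the overall average.
ratio, angle, band = gauge_reading(actual=150.0, reference=100.0)
print(f"ratio={ratio:.2f}, angle={angle:.1f}, band={band}")
```

Which thresholds make sense depends entirely on the key figure, so they are best agreed on with the people who will read the report.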

Sunburst

The sunburst chart is ideal for displaying hierarchical data. Each level of the hierarchy is represented by a ring or circle, with the innermost circle representing the top level of the hierarchy. Sunburst diagrams with multiple category levels show how the outer rings relate to the inner rings. The sunburst diagram is particularly useful for showing how a ring is divided into its contributing components, while the treemap is particularly useful for comparing relative sizes [3].

Sunburst - Image by Author
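How the outer ring subdivides the inner ring can be sketched in a few lines of Python. The example below computes the angular span of every segment for a two-level hierarchy; `sunburst_segments` and the category tree are hypothetical and only illustrate the principle, not the way any particular charting tool implements it.

```python
from typing import Dict, List, Tuple

# label, ring level (0 = inner ring), start angle, end angle (degrees)
Segment = Tuple[str, int, float, float]


def sunburst_segments(tree: Dict[str, Dict[str, float]]) -> List[Segment]:
    """Compute angular spans for a two-level hierarchy.

    Each child gets a share of the full circle proportional to its value;
    each parent spans exactly the angles covered by its children.
    """
    total = sum(sum(children.values()) for children in tree.values())
    if total <= 0:
        raise ValueError("values must sum to a positive number")

    segments: List[Segment] = []
    angle = 0.0
    for parent, children in tree.items():
        parent_start = angle
        for child, value in children.items():
            span = 360.0 * value / total
            segments.append((child, 1, angle, angle + span))
            angle += span
        segments.append((parent, 0, parent_start, angle))
    return segments


# Hypothetical hierarchy, for illustration only.
sales = {
    "Spirits": {"Vodka": 50.0, "Rum": 25.0},
    "Liqueurs": {"Cream": 25.0},
}
for label, level, start, end in sunburst_segments(sales):
    print(f"ring {level}: {label} from {start:.0f} to {end:.0f} degrees")
```

Because every parent covers exactly the angles of its children, the reader can follow each outer segment back to the inner segment it belongs to.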

Other useful Viz

There are, of course, other common diagrams that can be used in the area of Big Data, for example:

  • Heat Maps
  • Word Clouds
  • Symbol Maps
  • Dendrograms
  • Network Models

Summary

With large amounts of data come new challenges in visualizing them. Techniques and diagrams beyond the commonly used tables, bar charts, and so on are therefore needed. In the best case, a visualization is simple and clear and still does not lose any information. In this article, I showed some examples and a good toolset to start with: BigQuery and Data Studio. Such cloud-based technologies are a prerequisite for processing these large amounts of data anyway.

Sources and Further Readings

[1] Tableau, Handbuch zur Datenvisualisierung: Definition, Beispiele und Lernressourcen (2021)

[2] Google, Solve real business challenges on Google Cloud (2021)

[3] Microsoft, Erstellen eines Sunburst-Diagramms in Office (2021)

