Spotlighting: A Visual Approach to Precise Clustering Interpretation
On spotlights, radar charts, and how to make sense of your clusters

Understanding what clusters mean is arguably more important than creating them. Creating clusters is largely a mathematical exercise; interpreting them is not as straightforward.
In this story, you will see a visual approach to interpreting clusters. It combines two visual techniques: the radar chart and the spotlight. Although the radar chart is well known, spotlighting is one of the techniques most under-utilized by data scientists. Here you will see how powerful and visually appealing it is.
But first, let us start with the problem at hand – that of interpreting clusters.
So you have your beautiful-looking clusters. Now what?
The figure below shows the results of K-Means clustering on car-related data. The data covers different brands of cars along with information such as length, width, horsepower, and price. There are more than 25 fields in the dataset, so PCA, a dimensionality reduction technique, is used to visualize the clusters in two dimensions.


The good news is that clusters are well-formed and very visible in the above figure. The not-so-good news is that the real work to understand what the clusters signify is not yet done.
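If you would like to produce this kind of clustering and 2D projection in code, here is a minimal sketch. It assumes NumPy and scikit-learn, and it runs on a small synthetic stand-in (three made-up groups with 25 numeric fields), not on the actual car data or the tool used for the figure. The choice of three clusters and the standardization step are assumptions as well.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the car data: 3 made-up groups, 25 numeric fields.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(3, 25))
X = np.vstack([c + rng.normal(size=(60, 25)) for c in centers])

# Standardize so that no single field dominates the distance computation.
X_scaled = StandardScaler().fit_transform(X)

# Cluster in the full 25-dimensional space.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = kmeans.fit_predict(X_scaled)

# Project to 2 dimensions with PCA, purely for plotting.
coords = PCA(n_components=2).fit_transform(X_scaled)

print(coords.shape, np.bincount(labels))
```

Note that the clustering happens on all fields and PCA is applied afterwards, so the 2D picture is only a view of the clusters, not the space in which they were formed.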
A quick tour of possible cluster interpretation approaches
There are several algorithmic approaches to interpreting clusters. You can refer to my article here on approaches such as PCA eigenvector analysis or using machine learning to interpret the clusters.
These algorithmic approaches are interesting, but they introduce additional complexity. So let us look at a visual approach to the problem.
Let’s bring the clusters on our radar!
Let us now put our clusters on a radar chart! But first, you may ask: why a radar chart? To answer that burning question, let me state two facts.
Cluster interpretation actually means defining the cluster in terms of dimensions in the data. As there are multiple dimensions in the data,
Cluster interpretation is a "multi-dimensional" analysis problem
Now on to the second fact.
A radar chart is a "multi-dimensional" visualization technique
Radar charts are cooler than scatter charts, bar charts, and the like, because they help visualize data in many dimensions at once. That makes them a natural fit for the cluster interpretation problem.
Here is the radar chart based on clustering output. The color of the groups corresponds to the clusters – red, green, and blue.

Wow! The multi-dimensional visualization looks much better than a two-dimensional scatterplot. The left side of the radar chart has the numeric fields in the data. The right side has the categorical fields.
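A radar chart like this one boils down to one polygon per cluster, where each corner is the cluster's average for one field. Here is a hedged sketch of that idea. It handles numeric fields only, scales each field to the 0 to 1 range (an assumption, so that fields with different units share one axis), and uses made-up rows. The helper names `cluster_profiles` and `plot_radar` are hypothetical, and the plotting part assumes matplotlib.

```python
from collections import defaultdict
from math import pi

def cluster_profiles(rows, labels, fields):
    """Mean of each min-max scaled field per cluster, as {cluster: [means]}."""
    lo = {f: min(r[f] for r in rows) for f in fields}
    hi = {f: max(r[f] for r in rows) for f in fields}
    sums = defaultdict(lambda: [0.0] * len(fields))
    counts = defaultdict(int)
    for row, label in zip(rows, labels):
        counts[label] += 1
        for i, f in enumerate(fields):
            span = (hi[f] - lo[f]) or 1.0  # avoid dividing by zero for constant fields
            sums[label][i] += (row[f] - lo[f]) / span
    return {c: [s / counts[c] for s in sums[c]] for c in sums}

def plot_radar(profiles, fields, colors):
    """Draw one closed polygon per cluster on a polar axis."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    angles = [2 * pi * i / len(fields) for i in range(len(fields))]
    fig, ax = plt.subplots(subplot_kw={"projection": "polar"})
    for cluster, values in profiles.items():
        # Repeat the first point to close the polygon.
        ax.plot(angles + angles[:1], values + values[:1], color=colors[cluster])
        ax.fill(angles + angles[:1], values + values[:1], color=colors[cluster], alpha=0.2)
    ax.set_xticks(angles)
    ax.set_xticklabels(fields)
    plt.show()

# Made-up rows, for illustration only.
rows = [
    {"price": 6000, "engine-size": 90, "weight": 2000},
    {"price": 8000, "engine-size": 100, "weight": 2200},
    {"price": 15000, "engine-size": 130, "weight": 2700},
    {"price": 30000, "engine-size": 200, "weight": 3500},
]
labels = ["red", "red", "blue", "green"]
fields = ["price", "engine-size", "weight"]

profiles = cluster_profiles(rows, labels, fields)
# Uncomment to draw the chart:
# plot_radar(profiles, fields, {"red": "red", "blue": "blue", "green": "green"})
```

Categorical fields would need to be encoded as numbers first (for example as the share of cars in each category) before they can sit on the same chart.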
Now let us go one step further by analyzing the differences between the red, green and blue groups as shown in the figure below.

You will observe that the numeric fields, which are on the left side, have a clear separation between the red, green, and blue areas. However, the categorical fields, on the right side, do not have a clear separation and appear mixed up. This implies that the numeric fields are good candidates to interpret the clusters.
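The "clear separation versus mixed up" observation can also be approximated with a number. The sketch below is one rough way to do it, not the method behind the chart: for each field it takes the spread between the highest and lowest cluster average on a 0 to 1 scale. The cluster averages are made up, `field_separation` is a hypothetical helper, and `fuel-type=gas` stands in for a one-hot encoded categorical field.

```python
def field_separation(profiles, fields):
    """Spread (max minus min) of the cluster means for each field."""
    spread = {}
    for i, field in enumerate(fields):
        means = [values[i] for values in profiles.values()]
        spread[field] = max(means) - min(means)
    return spread

# Made-up cluster means on a 0-1 scale, for illustration only.
# "fuel-type=gas" stands for a one-hot encoded categorical field:
# its mean is the share of cars in the cluster that run on gas.
fields = ["price", "engine-size", "fuel-type=gas"]
profiles = {
    "red": [0.10, 0.15, 0.92],
    "blue": [0.40, 0.45, 0.88],
    "green": [0.85, 0.80, 0.90],
}

spread = field_separation(profiles, fields)
ranked = sorted(fields, key=spread.get, reverse=True)
print(ranked)  # ['price', 'engine-size', 'fuel-type=gas']
```

Fields at the top of the ranking are the ones where the colored areas pull apart on the radar chart. The spread of averages ignores how much the values vary inside each cluster, so treat it as a quick screen and not as a test.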
We see that the red cluster has low values of height, weight, num-of-cylinders, engine size, and price. The blue cluster has medium values in these fields and the green cluster has high values. We can translate this observation into the following statements:
- the red cluster is a small-car cluster
- the blue cluster is a mid-size car cluster
- the green cluster is a large-size car cluster
Awesome! This is already a breakthrough, as it has given us what we were looking for – the meaning of each cluster.
Now let us take the cluster interpretation to the next level with the spotlighting technique.
Spotlight the clusters!
So far, we have a meaning associated with each cluster; for example, the red cluster represents small cars. However, we do not yet know which values of the numeric fields make a car a small car. For example, what price or engine size signifies a small car?
Let us answer this question using the spotlighting technique.
Spotlighting is a way to highlight certain data, without hiding the rest.
As I mentioned earlier, spotlighting is one of the most effective, yet most under-utilized, visualization techniques among data scientists.
To demonstrate the technique, we will take the scatter plot shown earlier. Then we will select one of the numeric fields, vary its value, and see which dots get spotlighted. Shown below is an animated visual that demonstrates the spotlight technique.

You will observe that for prices between 0 and approximately 11000, the red cluster is highlighted and the other clusters become colorless. However, they do not disappear. This is called spotlighting. It is much more powerful than filtering, as all data points remain in the visualization.
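The mechanics of spotlighting are simple enough to sketch in a few lines. The version below is an illustration under assumptions, not the tool used for the animation: `spotlight_colors` and `plot_spotlight` are hypothetical helpers, the prices and labels are made up, and the drawing part assumes matplotlib. The key point is that points outside the selected range are recolored grey and still drawn.

```python
GREY = "lightgrey"

def spotlight_colors(values, labels, low, high, palette):
    """Keep the cluster color for points whose value lies in [low, high]; grey out the rest."""
    return [
        palette[label] if low <= value <= high else GREY
        for value, label in zip(values, labels)
    ]

def plot_spotlight(xs, ys, colors):
    """Scatter plot in which every point is drawn, spotlighted or not."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for drawing

    plt.scatter(xs, ys, c=colors)
    plt.show()

# Made-up prices and cluster labels, for illustration only.
prices = [6000, 9500, 14000, 18000, 32000]
labels = ["red", "red", "blue", "blue", "green"]
palette = {"red": "red", "blue": "blue", "green": "green"}

colors = spotlight_colors(prices, labels, 0, 11000, palette)
print(colors)  # ['red', 'red', 'lightgrey', 'lightgrey', 'lightgrey']
```

To draw it, pass the x and y coordinates of the scatter plot (for example the two PCA components) together with `colors` to `plot_spotlight`. Sliding `low` and `high` over the field's range reproduces the effect of the animation one frame at a time.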
Here are the results of spotlighting shown as images.
Spotlighting the red cluster

Spotlighting the blue cluster

Spotlighting the green cluster

You will observe that, even though there is a slight overlap, spotlighting helps to determine the threshold values for the numeric fields.
We can draw the following conclusions:
- the red cluster, which is a small-car cluster, consists of cars priced below approximately 11000
- the blue cluster, which is a mid-size car cluster, consists of cars priced above 11000 with a weight of less than 3000
- the green cluster, which is a large-size car cluster, consists of cars priced above 11000 with a weight of more than 3000
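Because the overlap is only slight and not zero, it can be worth checking how well these rules reproduce the cluster assignments. Here is a small sketch of such a check. The cars are made up, `size_from_thresholds` is a hypothetical helper, and sending a car that sits exactly on a threshold to the higher group is my assumption.

```python
def size_from_thresholds(price, weight, price_cut=11000, weight_cut=3000):
    """Apply the approximate thresholds read off the spotlight views."""
    if price < price_cut:
        return "small"
    return "mid-size" if weight < weight_cut else "large"

# Made-up cars with the cluster each one was assigned to, for illustration only.
cars = [
    {"price": 7000, "weight": 2100, "cluster": "small"},
    {"price": 10500, "weight": 2400, "cluster": "small"},
    {"price": 16000, "weight": 2800, "cluster": "mid-size"},
    {"price": 12000, "weight": 3100, "cluster": "mid-size"},  # a car in the overlap
    {"price": 35000, "weight": 3600, "cluster": "large"},
]

matches = [size_from_thresholds(c["price"], c["weight"]) == c["cluster"] for c in cars]
agreement = sum(matches) / len(cars)
print(f"{agreement:.0%} of cars match the threshold rules")
```

On these made-up rows, four of the five cars agree with the rules. Running the same check on real cluster labels would show how much the slight overlap actually costs.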
This is unbelievable! We now have the precise definitions of our clusters! Let us give it a name – Precision Cluster Interpretation! You will not see this terminology anywhere and here you have seen it first! Congratulations!
Datasource citation
The data is from https://archive.ics.uci.edu/ml/datasets/automobile.
Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
It’s your turn now!
You can visit my website to make cluster interpretations as well as other analytics with no coding. https://experiencedatascience.com
Here is a step-by-step tutorial on my YouTube channel. You will be able to customize the demo to your data with zero coding.