
If someone gave you only the one-number summary (central tendency) of the five datasets shown below, you would assume they are all alike, since their means are the same. But when you plot every data point of each set and compare them visually, you realize that we need another measure to capture what distinguishes them.

Say "hi" to the spread measure – Range/Variance/Standard Deviation
We will use the same data as in the previous blog:
Mean, Median & Mode – Which central tendency measure to use & when?
Range
The quickest measure of spread for a data set is the range. We take the maximum and minimum values of the data set and subtract the minimum from the maximum to get the range.

(*The data is treated as population data for this blog.)
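As a quick check, here is a minimal Python sketch of the range calculation. It assumes the seven heights listed as the parent data set in this blog; the variable names are only illustrative.

```python
# Heights from the blog's parent data set (all in cm).
heights_cm = [150, 155, 160, 160, 170, 175, 180]

# Range = largest value minus smallest value.
data_range = max(heights_cm) - min(heights_cm)

print(f"Range: {data_range} cm")  # Range: 30 cm
```

Note that the range only looks at the two extreme points, so it says nothing about how the remaining values are distributed in between.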
Variance
For this measure, let's recall an idea from the previous blog: when plotted on the number line, these numbers are nothing but distances from the origin.

Let’s take one more data set with 7 data points, all of the same value (the mean of our parent data set, rounded to 164.3).

Parent Data Set = 150,155,160,160,170,175,180 (all in cm)
New Data Set = 164.3, 164.3, 164.3, 164.3, 164.3, 164.3, 164.3 (all in cm)
Now we will calculate the average of squared distances from the origin for both data sets:

Difference = 27092.85 – 26994.49 = 98.36 cm²
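The two averages above can be reproduced with a short Python sketch. The helper name `mean_square_from_origin` is my own label for "average of squared distances from the origin", not a standard function.

```python
parent = [150, 155, 160, 160, 170, 175, 180]  # parent data set (cm)
constant = [164.3] * 7                        # new data set (cm)

def mean_square_from_origin(values):
    """Average of the squared distances from the origin (0)."""
    return sum(v ** 2 for v in values) / len(values)

parent_ms = mean_square_from_origin(parent)
constant_ms = mean_square_from_origin(constant)
difference = parent_ms - constant_ms

print(round(parent_ms, 2))    # 27092.86
print(round(constant_ms, 2))  # 26994.49
print(round(difference, 2))   # 98.37
```

The small gap between 98.37 here and 98.36 in the text comes from truncating 27092.857 to 27092.85 before subtracting.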
There is an evident difference between these measures for the two data sets. But what if we change the reference from the origin to the mean? Let’s find out:
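Here is a minimal sketch of that calculation, treating both data sets as populations (dividing by N). The function name `population_variance` is illustrative; each data set is measured against its own mean.

```python
parent = [150, 155, 160, 160, 170, 175, 180]  # parent data set (cm)
constant = [164.3] * 7                        # new data set (cm)

def population_variance(values):
    """Average of the squared distances from the mean (divides by N)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

parent_var = population_variance(parent)
constant_var = population_variance(constant)

print(round(parent_var, 2))    # 103.06
print(round(constant_var, 2))  # 0.0
```

The data set with no spread now scores 0, and the parent data set scores about 103.06 cm². This differs slightly from the 98.36 cm² difference computed earlier because that figure used the mean rounded to 164.3; with the unrounded mean (1150/7 ≈ 164.2857) the two approaches agree.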

You will agree that the mean works better as a reference and gives a more meaningful output value. Visualizing the data sets again makes this clearer:

Standard Deviation
You might have noticed that the unit of variance is cm². If we want a measure of spread that uses the same mean reference but keeps the original unit (cm), all we need to do is take the square root of the variance. This is what the standard deviation is.
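For a quick check, Python's standard `statistics` module offers `pvariance` and `pstdev`, which both divide by N (the population versions). The sketch below assumes the parent data set from this blog.

```python
import math
import statistics

parent = [150, 155, 160, 160, 170, 175, 180]  # parent data set (cm)

variance_cm2 = statistics.pvariance(parent)  # population variance, in cm²
std_dev_cm = statistics.pstdev(parent)       # population standard deviation, in cm

print(round(variance_cm2, 2))  # 103.06
print(round(std_dev_cm, 2))    # 10.15

# The standard deviation is simply the square root of the variance.
assert math.isclose(std_dev_cm, math.sqrt(variance_cm2))
```

So the heights in the parent data set sit, in a root-mean-square sense, about 10.15 cm away from their mean.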

- Note – All the calculations above treat the data set as a population. The much-debated question about the denominator of these measures, i.e. why it is N-1 for a sample and N for a population, will be explained in an upcoming blog.
With the two-number summary (central tendency and spread), we can better distinguish among datasets. Remember that every statistical measure has a purpose to serve: each one exists to capture details that the measures already at hand could not. In the upcoming blogs, you will see the need to capture further details using more statistical measures. With this, I end this chewable bite here and look forward to sharing more blogs in the future.
Thanks!!!