Identifying colors is a very common task in Computer Vision.
More generally, a fascinating aspect of learning is that sometimes we know things but we can’t explain them. For example, if I ask you to describe the color red, your only option is to show me a red object, like the red orange shown above… but you can’t really explain what "red" is.
For this reason, it takes some effort to convey the definition of a color to a computer.
Let’s do it!
1. The image
The first step is to get an image. I’ve used this one:

The great advantage of this algorithm is that, as you will see, it is pretty robust and gives satisfying output largely independently of the input image, so you can choose your own photo.
2. The libraries
Let’s invoke some demons. You will need scikit-learn (sklearn) for the Machine Learning part, NumPy for the vector transformations, Pandas for the final summary, and some typical Image Processing libraries (cv2, skimage, matplotlib.pyplot, …).
3. The Machine Learning part
This great article gives us a really good hint. The main idea is that an image is just an array of shape (N_rows X N_columns X N_channels). If we flatten it into a list of N_rows X N_columns pixels, each described by its N_channels values, we can apply the K Means algorithm and identify k clusters, which will be our colors.
This is super interesting for several reasons. The first one is that it does not require any specific training on a huge set of images. The second one is that you can change the number of clusters (and, thus, the number of colors), choosing a smaller or larger number of tones.
In order to do that, you will need these functions:
And these commands:
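As a rough sketch of this step (not the original listing), the clustering could look like the code below. The function name `extract_colors`, the `seed` parameter and the synthetic three-band test image are all assumptions made for illustration; only the use of K Means and the `number_of_colors` setting come from the text.

```python
import numpy as np
from sklearn.cluster import KMeans


def extract_colors(image, number_of_colors=10, seed=0):
    """Cluster the pixels of an RGB image and return (centers, labels)."""
    # Flatten (N_rows, N_columns, 3) into (N_rows * N_columns, 3):
    # one row per pixel, one column per channel.
    pixels = image.reshape(-1, image.shape[-1]).astype(float)
    model = KMeans(n_clusters=number_of_colors, n_init=10, random_state=seed)
    labels = model.fit_predict(pixels)
    # Cluster centers are float RGB triplets; round them to valid 0-255 integers.
    centers = np.clip(np.rint(model.cluster_centers_), 0, 255).astype(int)
    return centers, labels


# Synthetic stand-in for a photo (an assumption): three flat color bands.
demo = np.zeros((30, 30, 3), dtype=np.uint8)
demo[:10] = (255, 0, 0)
demo[10:20] = (0, 255, 0)
demo[20:] = (0, 0, 255)

centers, labels = extract_colors(demo, number_of_colors=3)
```

On a real photo you would pass the loaded RGB array instead of `demo` and keep `number_of_colors=10`.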
These last lines may take a while, but not that much. Plus, this is the only "computationally expensive" part of the process.
Once it is done, you will have your 10 colors (I chose 10 by setting "number_of_colors=10").
And here they are.
In particular, it is useful to adopt the rgb encoding of colors (rgb_colors list)… but we’ll get there later.
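To make this concrete, here is a small sketch of how the cluster output could be turned into a list of RGB colors, their hex codes, and the share of pixels each one covers (the numbers a pie chart would show). The helper names `rgb_to_hex` and `color_shares` and the toy values are assumptions, not part of the original code.

```python
import numpy as np


def rgb_to_hex(color):
    """Format an (R, G, B) triplet of 0-255 integers as '#rrggbb'."""
    r, g, b = (int(c) for c in color)
    return "#{:02x}{:02x}{:02x}".format(r, g, b)


def color_shares(labels, number_of_colors):
    """Fraction of pixels assigned to each cluster index."""
    counts = np.bincount(labels, minlength=number_of_colors)
    return counts / counts.sum()


# Hypothetical K Means output for a 6-pixel image and 3 clusters.
rgb_colors = [(255, 0, 0), (0, 128, 255), (16, 16, 16)]
labels = np.array([0, 0, 0, 1, 1, 2])

hex_colors = [rgb_to_hex(c) for c in rgb_colors]
shares = color_shares(labels, number_of_colors=3)
```

The `shares` array can be passed directly to a pie chart, with `hex_colors` as the wedge colors.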
4. The Image Processing part
So we have our colors. The difficult part is that while we can see them and identify them, we can’t tell which part of the picture each element of the "label" list and its corresponding color belong to. In fact, K Means is an unsupervised learning method, so the labels are just arbitrary cluster indices.
So if we want to automatically detect the color of the sky and associate it with one of the colors in the pie chart, we need to be a little more creative.
The main idea is pretty basic. The image is RGB encoded, which means that if we compute the difference between the image and the RGB expression of one of the 10 colors, we get [0,0,0] exactly where the image is equal to that color.
Let me tell you, this is not orthodox. In fact, you will get negative pixel values. But it works, and if it works, don’t touch it 🙂
The first step is technical: converting the RGB values to integers. Images are typically loaded as unsigned 8-bit arrays, where a negative difference would wrap around instead of staying negative.
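A tiny example shows why the integer conversion matters. The pixel values below are made up for illustration, and the image is assumed to be stored as a `uint8` array, which is the usual case for loaded photos.

```python
import numpy as np

# A 1 x 2 toy image and one candidate color, both stored as unsigned 8-bit values.
image = np.array([[[10, 200, 30], [250, 250, 250]]], dtype=np.uint8)
color = np.array([20, 200, 30], dtype=np.uint8)

# On uint8 arrays the subtraction wraps around modulo 256 ...
wrapped = image - color
# ... so convert to a signed integer type first to get the true,
# possibly negative, differences.
diff = image.astype(int) - color.astype(int)
```

For the first pixel, `wrapped` holds 246 in the red channel, while `diff` holds the expected -10.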
Then, to identify the colors in the different parts of the image, the idea is to break the image into smaller squares. In this case, I’ve chosen the dimension of each square to be N_rows/10 X N_columns/10, thus obtaining 100 squares. These squares are obtained by using the following function.
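One possible way to write such a splitting function is sketched here. The name `split_into_squares`, the `n_side` parameter, and the choice to drop leftover rows and columns when the image size is not a multiple of 10 are assumptions; the original function may handle the edges differently.

```python
import numpy as np


def split_into_squares(image, n_side=10):
    """Split an image into n_side x n_side tiles, in row-major order.

    Rows and columns that do not divide evenly are dropped from the
    bottom and right edges (an assumption of this sketch).
    """
    tile_h = image.shape[0] // n_side
    tile_w = image.shape[1] // n_side
    squares = []
    for i in range(n_side):
        for j in range(n_side):
            tile = image[i * tile_h:(i + 1) * tile_h,
                         j * tile_w:(j + 1) * tile_w]
            squares.append(tile)
    return squares


# Toy 45 x 62 "image" whose size is deliberately not a multiple of 10.
demo = np.arange(45 * 62 * 3).reshape(45, 62, 3)
squares = split_into_squares(demo)
```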
Of course, the squares are not monochromatic: each square will contain multiple colors. Nonetheless, the average difference between the square and each color is the indicator that we need: we pick the color whose difference is, on average, closer to 0 than the others. In particular, we do that by using this function:
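A minimal sketch of this selection step follows. It assumes that "closer to 0" is measured with the mean absolute difference over all pixels and channels, so that positive and negative differences cannot cancel out; the original function may use a different score. The name `best_color` and the toy data are hypothetical.

```python
import numpy as np


def best_color(square, rgb_colors):
    """Return (index, scores): the candidate color with the smallest
    mean absolute difference from the square, and all the scores."""
    square = square.astype(int)
    colors = np.asarray(rgb_colors, dtype=int)
    # Shape (n_colors, h, w, 3): every pixel minus every candidate color.
    diff = square[None, ...] - colors[:, None, None, :]
    scores = np.abs(diff).mean(axis=(1, 2, 3))
    return int(np.argmin(scores)), scores


# Toy square: mostly sky blue, with a single white pixel.
square = np.full((4, 4, 3), (100, 150, 250), dtype=np.uint8)
square[0, 0] = (255, 255, 255)
palette = [(255, 255, 255), (100, 150, 240), (0, 0, 0)]

index, scores = best_color(square, palette)
```

Here the blue-ish palette entry wins even though the square is not perfectly uniform, which is exactly the behavior described above.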
With this function, we can plot the "best color" for each square.
Let me show you:
It is great! Isn’t it?
5. The final result
To get the summary of this experiment for all the squares of the image, the following function can be used:
And this is it:
P.S. The numbers under each column are the percentages of the color for that specific square.
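A summary table of this kind could be built with Pandas along the following lines. This is only a sketch under stated assumptions: the function name `summarize`, the column names, and the definition of the percentage (the share of pixels in the square whose nearest palette color is the winning one) are illustrative choices, not the original implementation.

```python
import numpy as np
import pandas as pd


def summarize(squares, rgb_colors):
    """One row per square: winning palette color and its pixel share."""
    colors = np.asarray(rgb_colors, dtype=int)
    rows = []
    for idx, square in enumerate(squares):
        pixels = square.reshape(-1, 3).astype(int)
        # Sum of absolute channel differences from every pixel to every color.
        dist = np.abs(pixels[:, None, :] - colors[None, :, :]).sum(axis=2)
        # Winner: the color with the smallest average difference over the square.
        winner = int(dist.mean(axis=0).argmin())
        # Share of pixels whose nearest palette color is the winner.
        counts = np.bincount(dist.argmin(axis=1), minlength=len(colors))
        rows.append({
            "square": idx,
            "color_index": winner,
            "rgb": tuple(int(c) for c in colors[winner]),
            "percentage": round(100.0 * counts[winner] / counts.sum(), 1),
        })
    return pd.DataFrame(rows)


# Two toy 2 x 2 squares and a two-color palette.
white_square = np.full((2, 2, 3), 250, dtype=np.uint8)
dark_square = np.full((2, 2, 3), (0, 0, 10), dtype=np.uint8)
dark_square[0, 0] = (255, 255, 255)
summary = summarize([white_square, dark_square], [(255, 255, 255), (0, 0, 0)])
```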
Conclusions
We are terrified by the idea that we live in a world with machines that "can see". And while this idea may be alarming, understandably so, I can’t help but think that it is also extremely fascinating. I like to think that we are somehow creators of new worlds, and of a new nature… or maybe I’m just too tired.
Ciao! 🙂