Learning How to Dress from Vincent

Extracting color patterns from Van Gogh’s paintings with KMeans

Shuyang Xiang
Towards Data Science


The story started with a dress I happened to see in an online store. The designer called it a “Van Gogh style” dress, and it did have an outstanding design, especially in its color matching.

Then I recalled those afternoons when I walked alone through the Musée d’Orsay, standing in front of Vincent Van Gogh’s paintings and being impressed by the way he used colors. One idea came to my mind: why not learn how to dress from Vincent? Color matching for everyday wear can be a big problem for some people, me included, and if we can use the great artist’s work as a guide, we might make fewer mistakes.

As a data scientist, I dare not take on tasks beyond my profession, so I would rather return to machine learning. In this article, you will read how I use KMeans to extract the main color patterns from Van Gogh’s paintings, and I hope the final result will leave you less confused when you stand in front of your wardrobe tomorrow morning.

Color separation from an individual painting

In this article, I used a subset of the Van Gogh Paintings dataset [1], i.e. paintings done in Arles, Paris, Auvers-Sur-Oise, Saint Remy, and Viellge. To extract his color patterns, the first task is to separate the different colors in every individual painting. This can be done with the KMeans clustering method, one of the simplest unsupervised machine learning algorithms. It identifies k centroids in a set of data points and assigns each point to the cluster with the nearest centroid.
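To make the two ingredients of KMeans concrete, here is a minimal NumPy sketch of one iteration: assign each point to its nearest centroid, then move each centroid to the mean of its points. The function names and the toy data are made up for illustration; a real run repeats these two steps until the assignments stop changing.

```python
import numpy as np

def assign_to_nearest(points, centroids):
    """Return, for each point, the index of its nearest centroid (Euclidean distance)."""
    # Pairwise squared distances, shape (n_points, n_centroids).
    diffs = points[:, None, :] - centroids[None, :, :]
    sq_dist = (diffs ** 2).sum(axis=2)
    return sq_dist.argmin(axis=1)

def update_centroids(points, labels, k):
    """Move each centroid to the mean of the points assigned to it.

    This sketch assumes no cluster is empty, which holds for the toy data below.
    """
    return np.array([points[labels == j].mean(axis=0) for j in range(k)])

# Toy data (hypothetical): two obvious groups of 2-D points.
points = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]])
centroids = np.array([[1.0, 1.0], [9.0, 9.0]])

labels = assign_to_nearest(points, centroids)
centroids = update_centroids(points, labels, k=2)
```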

Back to a painting: an image is nothing but a set of pixels, each of which consists of three components representing the values in three color channels: red, green, and blue. Note that when we read images with OpenCV, we end up with the BGR format, so we first need to convert BGR to RGB.
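As a small illustration of this step, the sketch below uses a synthetic array as a stand-in for what `cv2.imread` returns, so it runs without any image file. Reversing the channel axis is equivalent to calling `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`; the variable names are my own.

```python
import numpy as np

# Stand-in for the array returned by cv2.imread:
# shape (height, width, 3), dtype uint8, channels in B, G, R order.
bgr = np.zeros((2, 2, 3), dtype=np.uint8)
bgr[..., 2] = 255  # pure red is stored as (0, 0, 255) in BGR order

# Reverse the channel axis to get R, G, B order.
# With OpenCV available, cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB) gives the same result.
rgb = bgr[..., ::-1]

# Flatten the image into one row per pixel, the input shape KMeans expects.
pixels = rgb.reshape(-1, 3)
```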

Instead of using the well-known elbow method to decide the best number of clusters, we directly look for 6 clusters in each image at this step, so that every painting yields a color palette of the same form. Here is an example of the painting “vase with carnation and roses and a bottle” and its color palette.

Image by author: A particular painting and its color palette
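A palette like this one can be obtained roughly as follows. This is a sketch that assumes scikit-learn's `KMeans`; the helper name `extract_palette`, the seed, and the striped test image are hypothetical and stand in for a real painting.

```python
import numpy as np
from sklearn.cluster import KMeans

def extract_palette(rgb_image, n_colors=6, seed=0):
    """Cluster the pixels of an RGB image and return the centroids as integer colors."""
    pixels = rgb_image.reshape(-1, 3).astype(float)
    km = KMeans(n_clusters=n_colors, n_init=10, random_state=seed).fit(pixels)
    # Each centroid is the mean color of one cluster of pixels.
    return np.rint(km.cluster_centers_).astype(int)

# Synthetic stand-in for a painting: six flat horizontal stripes, one per color.
colors = np.array(
    [[20, 20, 20], [240, 240, 240], [200, 30, 30],
     [30, 160, 60], [40, 60, 200], [230, 200, 40]],
    dtype=np.uint8,
)
image = np.repeat(colors, 50, axis=0).reshape(6, 50, 3)

palette = extract_palette(image)  # shape (6, 3), rows in arbitrary cluster order
```

Note that the rows of `palette` come back in whatever order KMeans numbered the clusters, which is why the ordering issue discussed next matters.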

An issue that needs extra care is that the order of the colors should be ignored: for example, a palette of 3 colors [black, white, grey] is equivalent to [white, grey, black]. To avoid the unnecessary distraction brought by the order, we sort the colors of each output palette according to the values of all three color channels, so that every palette of 6 colors follows the same ordering. We save all the ordered color palettes in a list, which will be processed in the next step. I build the list of color palettes for all images in the dataset with this extra care.
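One way to get such a fixed order is a lexicographic sort on (R, G, B). That is my reading of "according to the values of all three color channels", not necessarily the exact rule used for the results in this article. The helper name `sort_palette` is hypothetical.

```python
import numpy as np

def sort_palette(palette):
    """Sort the rows of an (n_colors, 3) RGB palette by R, then G, then B."""
    palette = np.asarray(palette)
    # np.lexsort treats the LAST key as the primary one,
    # so the keys are passed as (B, G, R) to sort by R first.
    order = np.lexsort((palette[:, 2], palette[:, 1], palette[:, 0]))
    return palette[order]

# The same three colors in two different orders end up identical after sorting.
a = sort_palette([[0, 0, 0], [255, 255, 255], [128, 128, 128]])  # black, white, grey
b = sort_palette([[255, 255, 255], [128, 128, 128], [0, 0, 0]])  # white, grey, black

# Each ordered palette is flattened into one vector and collected in a list;
# with 6 colors per painting, every vector would have 18 entries.
palette_list = [p.flatten() for p in (a, b)]
```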

Color palette clustering

Now we have 6 main colors extracted from each image in the dataset of Van Gogh’s paintings, and we would like to see whether we can identify certain patterns in the way the artist used colors. We use the KMeans clustering method again to identify a number of centroids, but this time among data points each of which represents a palette of 6 colors.

Since an artist usually had several types of palettes, we use the elbow method to determine the number of clusters: we look for a number at which points with different labels are sufficiently far away from each other while each cluster still has sufficient samples.

We iterate over the values of k from 1 to 10 and calculate the corresponding inertia values, i.e. the sum of squared distances from each point to its nearest centroid.
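The loop can be sketched as below, assuming scikit-learn's `KMeans` and its `inertia_` attribute. The data here is randomly generated as a stand-in for the real list of palettes (each row is a flattened palette of 6 RGB colors, hence 18 values), so the numbers it produces say nothing about Van Gogh.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic stand-in for the palette list: 3 groups of 40 vectors of length 18.
rng = np.random.default_rng(0)
group_centers = rng.uniform(0, 255, size=(3, 18))
palette_vectors = np.vstack(
    [c + rng.normal(0, 5, size=(40, 18)) for c in group_centers]
)

# Fit KMeans for k = 1..10 and record the inertia of each fit.
inertias = []
for k in range(1, 11):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(palette_vectors)
    inertias.append(km.inertia_)

# Plotting inertias against k (for example with matplotlib) gives the elbow curve.
```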

According to the plot below, 6 is chosen as the number of clusters, since no significant improvement is observed for larger numbers.

Image by author: Plot for elbow method

We then get the 6 color palettes that represent the patterns in which Van Gogh used colors.

Image by author
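Turning the fitted centroids back into palettes only takes a reshape. The sketch below again assumes scikit-learn's `KMeans` and uses random stand-in data in place of the real palette list; `typical_palettes` is a name I made up.

```python
import numpy as np
from sklearn.cluster import KMeans

# Random stand-in for the real data: 200 flattened palettes of 6 RGB colors each.
rng = np.random.default_rng(1)
palette_vectors = rng.uniform(0, 255, size=(200, 18))

# Cluster the palettes themselves; each centroid is an "average" palette.
km = KMeans(n_clusters=6, n_init=10, random_state=0).fit(palette_vectors)

# Reshape each 18-value centroid back into 6 colors with 3 channels.
typical_palettes = np.rint(km.cluster_centers_).astype(int).reshape(6, 6, 3)
```

Here `typical_palettes[i]` is the i-th representative palette, a 6 x 3 array of RGB values that can be drawn as six color swatches.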

Next steps

The color patterns extracted from Van Gogh’s work with the simple implementation described above can already serve as a guide when we stand in front of our wardrobe: just choose the colors from these Van Gogh palettes.

However, the work is far from done: the recommendation is not automated, and we still have to make decisions based on our visual impression. Besides, the great artist has only given us an idea of color matching without other information. For example, you know that yellow and blue will be a good match, but you might still have no idea whether it should be a yellow dress with a blue hat or blue pants with a yellow scarf. One possible way to address this concern is to analyze the percentage of each color in the palettes and compare it with the relative sizes of the clothing items, to give a more precise suggestion.
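As a starting point for that idea, the share of each palette color in a painting can be read off the cluster labels. This is only a sketch of one possible approach, assuming scikit-learn's `KMeans`; the two-color "painting" below is made up.

```python
import numpy as np
from sklearn.cluster import KMeans

# Made-up painting: 75 yellow pixels and 25 blue pixels.
yellow = np.array([230.0, 200.0, 40.0])
blue = np.array([40.0, 60.0, 200.0])
pixels = np.vstack([np.tile(yellow, (75, 1)), np.tile(blue, (25, 1))])

km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels)

# Fraction of pixels assigned to each cluster, in the same order as the centroids.
shares = np.bincount(km.labels_, minlength=2) / len(pixels)

# Look up the share of the cluster whose centroid is closest to yellow.
yellow_idx = int(np.argmin(((km.cluster_centers_ - yellow) ** 2).sum(axis=1)))
yellow_share = shares[yellow_idx]
```

A large share could then be mapped to a large item such as a dress, and a small share to an accessory such as a hat or a scarf.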

Reference

[1] Kim, Alexander (2022), “Van Gogh Paintings dataset”, Mendeley Data, V2, doi: 10.17632/3sjjtjfhx7.2
