Machine Learning Meets Fashion

Yufeng G
Towards Data Science
5 min read · Nov 3, 2017


In this episode of AI Adventures, we will attempt to work through an entire machine learning workflow in one go, pulling in best practices from our previous episodes. It’s a lot of material, but I think we can do it!

Training a model with the MNIST dataset is often considered the “Hello world” of machine learning. That’s been done many times over, but unfortunately, just because a model does well on MNIST doesn’t necessarily mean it will perform well on other datasets, especially since most image data we have today is considerably more complex than handwritten digits.

Fashionable Machine Learning

Zalando decided it was time to make MNIST fashionable again, and recently released a dataset called fashion-mnist. It’s in the exact same format as ‘regular’ MNIST, except the data takes the form of pictures of various clothing types, shoes, and bags. It still spans 10 categories, and the images are still 28 by 28 pixels.

Let’s train a model to detect which type of clothing is being shown!

Linear Classifier

We’ll start by building a linear classifier, and see how we do. As usual, we’ll use TensorFlow’s Estimator framework to make our code easy to write and maintain. As a reminder, we’ll load in the data, create our classifier, and then run the training and evaluation. We’ll also make some predictions directly from our local model.

Let’s start by creating our model. We’ll flatten the dataset from being 28x28 to 1x784 pixels, and make a feature column called pixels. This is analogous to our flower_features from episode 3, Plain and Simple Estimators.
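As a quick sketch of the flattening step (the batch size here is just for illustration):

```python
import numpy as np

# Suppose `images` holds a batch of 28x28 grayscale images.
images = np.random.rand(100, 28, 28).astype(np.float32)

# Flatten each 28x28 image into a single 784-element vector,
# so every pixel becomes one entry in a "pixels" feature.
flat_images = images.reshape(-1, 28 * 28)

print(flat_images.shape)  # (100, 784)
```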

Next, let’s create our linear classifier. We have 10 different possible classes to label, instead of the 3 that we used previously with the Iris flowers.
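In code, that looks something like this (a sketch using the TensorFlow 1.x-era Estimator API; the model_dir path is an assumption, not necessarily what the episode’s code uses):

```python
import tensorflow as tf

# One numeric feature column, "pixels", holding all 784 flattened values.
feature_columns = [tf.feature_column.numeric_column("pixels", shape=[784])]

# 10 clothing categories this time, instead of the 3 Iris species.
classifier = tf.estimator.LinearClassifier(
    feature_columns=feature_columns,
    n_classes=10,
    model_dir="models/fashion_mnist/linear")
```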

To run our training, we’ll need to set up our dataset and input function. TensorFlow has a built-in utility that accepts a numpy array and generates an input function, so we’ll take advantage of it.

And we’ll load in our dataset using the input_data module. Point the function at the folder where the dataset is downloaded.

Now we can call classifier.train() to bring together our classifier, the input function, and the dataset.

Finally, we run an evaluation step to see how our model did. When we use the classic MNIST dataset, this model typically gets about 91% accuracy. However, fashion-mnist is a considerably more complex dataset, and we can only achieve an accuracy in the low 80s, and sometimes even lower than that.

How can we do better? As we saw in episode 6, let’s go deep!

Going Deep

Swapping in the DNNClassifier is a one line change, and we can now re-run our training and evaluation to see if a deep neural network can perform any better than the linear one.
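The swap looks roughly like this (TF 1.x-era API; DNNClassifier additionally requires hidden_units, and these layer sizes are assumed for illustration):

```python
import tensorflow as tf

feature_columns = [tf.feature_column.numeric_column("pixels", shape=[784])]

# The one-line change: DNNClassifier in place of LinearClassifier.
classifier = tf.estimator.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[128, 64],   # assumed layer sizes
    n_classes=10,
    model_dir="models/fashion_mnist/deep")
```

Training and evaluation then run exactly as before, since the Estimator interface is unchanged.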

As we discussed in episode 5, we should now bring up TensorBoard to take a look at these two models side by side!

$ tensorboard --logdir=models/fashion_mnist/

(Browse to http://localhost:6006)

TensorBoard

Looking at TensorBoard, it looks like my deep model is performing no better than my linear one did! This is perhaps an opportunity to tune some of my hyperparameters, as discussed in episode 2.

Looks like a race to the bottom…

Maybe my model needs to be larger to accommodate the complexity in this dataset? Or perhaps my learning rate needs to be lowered? Let’s try that out. Experimenting with these parameters a bit, we can break through and achieve a higher accuracy than our linear model can obtain.

The deep model (in blue/red) achieves a consistently lower loss

It takes quite a few more training steps to achieve this accuracy, but ultimately it was worth it for the higher accuracy numbers.

Notice also that the linear model plateaus earlier than the deep network. Because deep models are often more complex than linear ones, they can take longer to train.

At this stage, let’s say we are happy with our model. We’d be able to export it and produce a scalable fashion-mnist classifier API. You can see episode 4 for more details on how to do that.

Making Predictions

Let’s also take a quick peek at how you can make predictions using Estimators. In large part, it looks just like how we call train and evaluate; that’s one of the great things about estimators — the consistent interface.

Notice that this time we’ve specified a batch_size of 1, num_epochs of 1, and shuffle as False. This is because we want the predictions to go one by one over all the data exactly once, preserving the order. I’ve extracted 5 images from the middle of the evaluation dataset for us to try predicting on.

I picked these 5 not just because they were in the middle, but because the model got 2 of them wrong. Both were supposed to be shirts, but the model thought the 3rd example was a bag, and that the 5th example was a coat. You can see how these examples are more challenging than handwritten numbers, if for no other reason than the graininess of the images.

Next Steps

You can find the full code that I used to train this model and generate the images here. How did your model perform? And what parameters did you end up using to achieve that accuracy? Let me know in the comments!

Our next set of episodes will be focused on some of the tools of the machine learning ecosystem, to help you build out your workflow and toolchain, as well as showcasing even more architectures that you can employ to solve your machine learning problems. I look forward to seeing you there! Until then, keep on machine learning!

Thanks for reading this episode of Cloud AI Adventures. If you’re enjoying the series, please let me know by clapping for the article. If you want more machine learning action, be sure to follow me on Medium or subscribe to the YouTube channel to catch future episodes as they come out. More episodes coming at you soon!
