No Data, No Problem — TensorFlow.js Transfer Learning

Seek out new datasets to boldly train where no models have trained before

Gant Laborde
Towards Data Science

--

Photo by John Fowler from Unsplash

In this article, we’ll attempt to train a browser to identify Star Trek insignias from drawings on an HTML canvas. You’ll learn the best practices and frameworks that make solving this problem possible.

Why does this matter?

Web developers are surrounded by amazing feats of AI. The big hurdle that keeps most ideas from being born is a lack of data. Popular training sets include thousands upon thousands of images, and who has time for that? That’s why the method explained in this post is your friend. We’ll cover “transfer learning” with the following:

  • Why is transfer learning so important?
  • What is transfer learning?
  • How can we do this in JavaScript with TensorFlow.js?

In 2018 I gave a talk at React Native EU, and I started with an idea.

“How funny would it be if I were able to create a Nicolas Cage identifier, and we identified Nic Cage in the audience?”

That’s the way most products are created; you formulate a concept, and then you make it.

What’s strange is that in Machine Learning, you don’t know if you’re going to succeed or not. My idea failed at first. NOOOOOOOO!!!!

But… but… why?

Image created by author (scene from WandaVision)

When I looked at my training graphs for my AI, I could see that the machine wasn’t learning. For those of you who are new… it should NOT look like that.

Image by author (Failed training graph)

The culprit? I just didn’t have enough data to train the model.

Image of Star Trek from Giphy.com

I only had a few hundred images of Nic Cage. How could I imagine extending the process to thousands? That’s a lot, and gathering even a few hundred nearly cost me my sanity! It seemed like the idea was not going to happen.

That is until I remembered I could capitalize on transfer learning. Transfer learning allows you to utilize a previously trained machine learning model. The model simply has to be trained on something relatively similar to what you’re looking to do. By importing a model that was trained on 500k faces, would I now be able to create a model that can identify Nic Cage?

As many of you know, the talk was a success and continues to be one of the most fun moments I’ve ever had on stage.

From complete failure to a success performed live on stage? The answer was “transfer learning.”

Transfer learning makes small datasets a success!!! (Images by author)

What is Transfer Learning?

When any machine learning model is first created, it is pretty poor. It hasn’t learned anything. A model needs data to get smart. Watching a machine learn can be pretty entertaining. I made a website that allows you to play against an AI that hasn’t learned how to play Tic Tac Toe, but you can teach it how to play by letting it learn from each game.

If you want a guided tour of that site, I made a short video here.

“OK a model gets smarter with data, so what does that have to do with ‘transfer learning’?”

Now we know that a model gets smarter with data. How do we utilize “transfer learning” to get a benefit?

Transfer learning is the act of taking a trained model and repurposing it for a similar but different task.

As you might assume, repurposing requires less data and time. That’s why it’s so valuable. There’s no need to reinvent the wheel if an existing model will suffice.

Image of Star Trek from Giphy.com

⚠⚠⚠ Jargon Alert! ⚠⚠⚠

If you get lost in this blog article but like the lesson, just push on! You’ll learn a lot by leaving your comfort zone and diving headlong into the result.

Some of the vocabulary might seem strange, and that’s OK, especially if you’re new to AI or even Star Trek.

Just remember… YES, YOU CAN! Let’s transfer learn!

What are you starting with?

I’ve trained a TensorFlow.js model on nearly 100,000 drawings that I call the Riddikulus dataset. For the full story of the dataset, check it out here:

This existing model had lots of data. We’re going to use this as a starting point. We’re starting with a model that is only 400KB but has hit 90% accuracy with 100,000 black and white images. Loading this existing model can be done in one line.

The following code loads the model and prints the layers.

Image by author using carbon

Now we have a fully functional, off-the-shelf model that performs a task that isn’t what we want. However, we can modify the model into something brand new ✨ So, what do we really want?

Our mission!

Rather than identifying drawings of animals, we want to identify three Star Trek insignias. There’s Starfleet, the Klingon Empire, and the Ferengi Alliance. Our goal is to make the model properly identify drawings of each.

Images by author representing insignias from Star Trek

I do NOT have thousands of drawings of each logo. However, I can draw each logo 10 times and then slightly warp those drawings five times each. That makes 50 drawings of each logo: a mere dataset of 150 drawings.

Images drawn by author

In the world of data, a dataset of 150 images is minuscule. Nevertheless, with transfer learning, it might just be enough.

The plan is to keep most of the model’s layers and cut off only the final classification network so we can attach our own.

Diagram by author

We can cut at any point in the layers, but it makes sense to cut just after the fancy trained convolutional base and then combine it with our own untrained network. It’s critical that the convolutional layers stay frozen during the additional training; otherwise, we would destroy the benefits of the previous training.

Shaving the original model down to the convolutions can be done like so:

Image by author using carbon

This exports the original model’s 3x3x64 dimensional feature layer but does not classify the content. Next, you create a new model that expects those features as a flattened input and will be trained on the features extracted from our new images.

Image by author using carbon

So the first model creates a tensor of features and the second model interprets those features, like an encoder/decoder pair. The second model is then trained on the new data (the feature encodings). Splitting the work into two models effectively freezes the first one, since only the second is ever updated.

Image by author

So how does this code look in action? Does it work? Can we get a new model from such a small dataset?

Fortunately, the answer to all those questions has been combined into a website for experimentation.

https://aisortinghat.com/transfer/

Image by author

At the top of the page, you can see the steps we’ve identified in this post. Each button enables the next, and pressing it runs the corresponding code in real time.

Image by author
  • Load Training Images: loads the 150 new images into active memory
  • Create Feature Model: loads the sorting hat model and shaves off the original classification network
  • Create Transfer Model: creates a new, untrained model that expects the feature encodings
  • Train Model: trains on the data for ten epochs to make the decoder smart

AND NOW!!!? The moment of truth. The canvas is enabled. Draw your best insignia and see what happens!

Image by author

IT WORKS! A brand new model that can identify Star Trek drawings from only 50 (poorly drawn) images of each! The power of TensorFlow.js and transfer learning unleashed.

Yes, it’s in two parts, but that’s easy enough to fix.

Image by MyNiceProfile.com

If you want, you can now merge the two models together into one and do additional training for even more accuracy.

How do you merge two models? That’s beyond this blog post, but I do cover it in my book (see below). The takeaway is that TensorFlow.js plus existing models gives you a world of amazing things you can now create. Transfer learning with TensorFlow.js gives you the power of AI and the experience of previously trained models, conveniently wrapped up in the browser.

Did you find this exciting, enticing, entertaining? It’s only a taste!

“GANT! TEACH ME THE WAYS!” — You, right now

Image by O’Reilly Media for author Gant Laborde

Chapters 10 and 11 of my book cover this all in deep detail.

It’s a powerful start! JavaScript lets you create front-end websites that can leverage the power of AI directly on the browser.

Reserve your copy on Amazon

Want more help? Consulting available with Infinite Red

Gant Laborde is a co-owner and Chief Innovation Officer at Infinite Red, published author, adjunct professor, worldwide public speaker, and mad scientist in training. Follow/tweet or visit him at a conference.
