
I’ve been doing data analytics for almost ten years now. From time to time, I use machine learning techniques to get insights from data, and I’m comfortable using classic ML.
Although I’ve passed a few MOOCs on Neural Networks and Deep Learning, I have never used them in my work, and this domain seemed quite challenging for me. I had all these prejudices:
- You need to learn a lot to start using Deep Learning: maths, different frameworks (I’ve heard at least about three of them: PyTorch, TensorFlow and Keras) and network architectures.
- Huge datasets are required to fit a model.
- It’s impossible to achieve decent results without powerful computers (they also must have an Nvidia GPU), so it’s pretty hard to get a setup.
- There is a lot of boilerplate to get an ML-powered service up and running: you need to handle both the front-end and back-end sides.
I believe the primary goal of analytics is to help the product team make the right decisions based on data. Nowadays, Neural Networks can definitely improve our analysis, e.g. NLP helps to get many more insights from texts. So I’ve decided that it would be helpful for me to make another attempt to leverage the power of Deep Learning.
That’s how I started the Fast.AI course (it was updated at the beginning of 2022, so I suppose the content has changed since previous reviews on TDS). I’ve realised that solving your tasks using Deep Learning is not so difficult.
This course follows the top-down approach. So you’re starting with building a working system, and only afterwards will you dive deeper to understand all the needed basics and nuances.
I made my first ML-powered app in the second week (_you can try it here_). It’s an image classification model that can identify my favourite dog breeds. Surprisingly, it works well even though only a couple of thousand images were in my dataset. It’s inspiring for me how easily we can now build a service that was complete magic just ten years ago.

So in this article, you will find a beginner-level tutorial on building and deploying your first service powered by Machine Learning.
What is Deep Learning?
Deep Learning is a subset of Machine Learning in which we use multi-layered Neural Networks as the model.
Neural Networks are extremely powerful. According to the Universal Approximation Theorem, a sufficiently large Neural Network can approximate any continuous function as closely as needed, which means that, in principle, Neural Networks can tackle a very wide range of tasks.
For now, you can just treat this model as a black box that takes input (in our case – a dog image) and returns output (in our case – a label).

Building a model
You can find the complete code for this stage on Kaggle.
We will be using Kaggle Notebooks to build our Deep Learning model. If you don’t have an account on Kaggle yet, it’s worth going through the registration process. Kaggle is a popular platform for data scientists where you can find datasets, participate in competitions and run and share your code.
You can create a Notebook on Kaggle and execute code there just as in your local Jupyter Notebook. Kaggle even provides GPUs, so we will be able to train NN models pretty quickly.

Let’s start with importing all packages because we will use many Fast.AI tools.
from fastcore.all import *
from fastai.vision.all import *
from fastai.vision.widgets import *
from fastdownload import download_url
Loading data
It goes without saying we need a dataset to train our model. The easiest way to get a set of images is by using a search engine.
The DuckDuckGo search engine has an easy-to-use API and a handy Python package, duckduckgo_search (more info), so we will use it.
Let’s try to search for a dog image. We’ve specified license_image='any' to use only images with a Creative Commons license.
from duckduckgo_search import DDGS
import itertools

with DDGS() as ddgs:
    # take only the first search result
    res = list(itertools.islice(ddgs.images('photo samoyed happy',
                                            license_image='any'), 1))
In the output, we got all the information about the image: name, URLs and sizes.
{
"title": "Happy Samoyed dog photo and wallpaper. Beautiful Happy Samoyed dog picture",
"image": "http://www.dogwallpapers.net/wallpapers/happy-samoyed-dog-wallpaper.jpg",
"thumbnail": "https://tse2.mm.bing.net/th?id=OIP.BqTE8dYqO-W9qcCXdGcF6QHaFL&pid=Api",
"url": "http://www.dogwallpapers.net/samoyed-dog/happy-samoyed-dog-wallpaper.html",
"height": 834, "width": 1193, "source": "Bing"
}
Now we can use Fast.AI tools to download the image and show a thumbnail.
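The cell that does this is not reproduced in the text, so here is a minimal sketch of what it could look like. It assumes the fastdownload and fastai packages are installed; the function name download_and_preview, the file name samoyed.jpg and the 256-pixel thumbnail size are my own illustrative choices.

```python
def download_and_preview(url, dest="samoyed.jpg", size=256):
    """Download one image to `dest` and return a thumbnail of it."""
    # Imported inside the function so that defining it does not require the packages.
    from fastdownload import download_url
    from fastai.vision.all import Image  # PIL's Image module, with fastai's to_thumb patch

    download_url(url, dest, show_progress=False)
    im = Image.open(dest)
    return im.to_thumb(size, size)


# In a notebook, with `res` holding the search result shown above:
# download_and_preview(res[0]["image"])
```

When the call is the last expression in a notebook cell, the returned thumbnail is displayed inline.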

We see a happy samoyed, which means it’s working. So let’s load more photos.
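The download loop in this section calls a search_images helper that is not defined anywhere in the text. A minimal sketch of such a helper, built on the same DDGS call as above, might look like this. The helper image_urls and the default of 30 images per query are assumptions of mine, not values from the original notebook.

```python
import itertools


def image_urls(results, max_images=30):
    """Take the first `max_images` search results and keep only the image URLs."""
    return [r["image"] for r in itertools.islice(results, max_images)]


def search_images(term, max_images=30):
    """Return up to `max_images` Creative Commons image URLs for a search term."""
    # Imported lazily; requires the duckduckgo_search package.
    from duckduckgo_search import DDGS

    with DDGS() as ddgs:
        # Results are consumed inside the `with` block, while the session is open.
        return image_urls(ddgs.images(term, license_image="any"), max_images)


# Example: urls = search_images("photo corgi")
```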
I aim to identify five different dog breeds (my favourite ones). I will load pictures for each breed and store them in separate directories.
import tqdm
from time import sleep

breeds = ['siberian husky', 'corgi', 'pomeranian', 'retriever', 'samoyed']
path = Path('dogs_breeds')  # root folder for the dataset
for b in tqdm.tqdm(breeds):
    dest = path/b  # one sub-folder per breed; the folder name becomes the label
    dest.mkdir(exist_ok=True, parents=True)
    download_images(dest, urls=search_images(f'photo {b}'))
    sleep(10)  # pause between searches so we don't overload the server
    download_images(dest, urls=search_images(f'photo {b} puppy'))
    sleep(10)
    download_images(dest, urls=search_images(f'photo {b} sleep'))
    sleep(10)
    resize_images(path/b, max_size=400, dest=path/b)
After running this code, you will see all loaded photos on the right panel of Kaggle.

The next step is to convert data to a format suitable for the Fast.AI model – DataBlock.
There are a few arguments you need to specify for this object, but I will highlight only the most important ones:
- splitter=RandomSplitter(valid_pct=0.2, seed=18): Fast.AI requires you to select a validation set. The validation set is hold-out data that will be used to estimate model quality. The validation data isn’t used during training, so it lets us detect overfitting. In our case, the validation set is a random 20% of our dataset. We specified the seed parameter to be able to reproduce exactly the same split next time.
- item_tfms=[Resize(256, method='squish')]: Neural Networks process images in batches. That’s why we must have pictures of the same size. There are different methods for image resizing; we used squish for now, but we’ll discuss it in more detail later.
We’ve defined a data block. The function show_batch can show us a random set of images with labels.
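The data block definition itself is missing from the text. A sketch that is consistent with the arguments described above could look as follows. The splitter and item_tfms values come from the article; the other arguments (blocks, get_items, get_y) are my assumption of the usual fastai setup for images stored in one folder per class.

```python
def build_dataloaders(path):
    """Build fastai DataLoaders from a folder with one sub-folder per breed."""
    # Imported lazily so the function can be defined without fastai installed.
    from fastai.vision.all import (
        DataBlock, ImageBlock, CategoryBlock, get_image_files,
        RandomSplitter, parent_label, Resize,
    )

    dogs = DataBlock(
        blocks=(ImageBlock, CategoryBlock),               # input: image, target: category
        get_items=get_image_files,                        # collect all image files under `path`
        splitter=RandomSplitter(valid_pct=0.2, seed=18),  # random 20% validation set
        get_y=parent_label,                               # label = name of the parent folder
        item_tfms=[Resize(256, method='squish')],         # same size for every image
    )
    return dogs.dataloaders(path)


# In the notebook:
# dls = build_dataloaders(Path('dogs_breeds'))
# dls.show_batch(max_n=6)
```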
Photo by Brigitta Botrágyi on Unsplash | Photo by Charlotte Freeman on Unsplash
Data looks ok, so let’s proceed to training.
Training the model
You may be surprised, but the two lines of code below will do all the work.
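The training cell is not included in the text. Based on the description in this section (Resnet18, the accuracy metric, fine_tune, three epochs), it probably looked close to the two lines inside the function below. Treat this as a sketch: dls stands for the DataLoaders created from the data block, and the wrapper function train is my own addition.

```python
def train(dls, epochs=3):
    """Fine-tune a pre-trained Resnet18 on `dls` and return the learner."""
    # Imported lazily so the function can be defined without fastai installed.
    from fastai.vision.all import vision_learner, resnet18, accuracy

    learn = vision_learner(dls, resnet18, metrics=accuracy)  # pre-trained model + metric
    learn.fine_tune(epochs)                                  # adapt it to our dog breeds
    return learn


# In the notebook: learn = train(dls)
```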

We used a pre-trained model (a Convolutional Neural Network with 18 layers – Resnet18). That’s why we called the function fine_tune.
We trained the model for three epochs, which means the model saw the whole dataset three times.
We also specified the metric – accuracy (the share of correctly labelled pictures). You can see this metric in the results after each epoch (it’s calculated using only the validation set so as not to skew the results). However, it’s not used in the optimization process and is shown only for your information.
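To make the metric concrete, here is a tiny worked example in plain Python. The labels are made up for illustration, and accuracy_score is a hypothetical helper, not a fastai function.

```python
def accuracy_score(predicted, actual):
    """Share of predictions that match the true labels."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up validation labels: one corgi is mistaken for a husky.
actual = ["corgi", "corgi", "samoyed", "retriever", "pomeranian"]
predicted = ["corgi", "siberian husky", "samoyed", "retriever", "pomeranian"]

print(accuracy_score(predicted, actual))  # 4 of 5 correct -> 0.8
```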
The whole process took around 30 minutes, and now our model can predict dogs’ breeds with 94.45% accuracy. Good job! But could we improve this result?
Improving the model: data cleaning and augmentations
Feel free to leave this section for later and move on to the model’s deployment if you want to see your first model working as soon as possible.
First, let’s look at the model’s errors: does it confuse corgi with husky, or pomeranian with retriever? We can use a confusion matrix for this. Note that the confusion matrix is also calculated using only the validation set.
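Here is a sketch of how this might be done. The small confusion_counts helper and the toy labels are my own, only to show what the cells of the matrix mean. The second function uses fastai’s ClassificationInterpretation on a trained learner (called learn in this article).

```python
from collections import Counter


def confusion_counts(actual, predicted):
    """Count how often each (actual, predicted) pair of labels occurs."""
    return Counter(zip(actual, predicted))


def show_confusion_matrix(learn):
    """Plot the confusion matrix of a trained fastai learner on its validation set."""
    # Imported lazily so the function can be defined without fastai installed.
    from fastai.vision.all import ClassificationInterpretation

    interp = ClassificationInterpretation.from_learner(learn)
    interp.plot_confusion_matrix()
    return interp


# Toy validation labels (made up): one corgi is mistaken for a husky.
actual = ["corgi", "corgi", "corgi", "siberian husky", "samoyed"]
predicted = ["corgi", "siberian husky", "corgi", "siberian husky", "samoyed"]
counts = confusion_counts(actual, predicted)
print(counts[("corgi", "corgi")])           # corgis labelled correctly
print(counts[("corgi", "siberian husky")])  # corgis mistaken for huskies
```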

The other life hack shared in the Fast.AI course is that a model can be used to clean our data. To do this, we can look at the images with the highest loss: these could be cases where the model was wrong with high confidence, or correct but with low confidence.
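A short sketch of why both kinds of cases end up at the top. It assumes cross-entropy loss (as far as I know, the default fastai picks for a classification task like this one), where the loss of a single image is minus the logarithm of the probability assigned to the true label. The probabilities are made up, and show_top_losses is a hypothetical wrapper around fastai’s plot_top_losses.

```python
import math


def cross_entropy(prob_of_true_label):
    """Loss for one image: the lower the probability of the true label, the higher the loss."""
    return -math.log(prob_of_true_label)


def show_top_losses(learn, k=6):
    """Plot the k validation images with the highest loss for a trained fastai learner."""
    # Imported lazily so the function can be defined without fastai installed.
    from fastai.vision.all import ClassificationInterpretation

    interp = ClassificationInterpretation.from_learner(learn)
    interp.plot_top_losses(k, nrows=2)
    return interp


# Made-up probabilities that the model assigned to the TRUE label of three images.
examples = {
    "wrong, high confidence": 0.02,
    "correct, low confidence": 0.40,
    "correct, high confidence": 0.98,
}
ranked = sorted(examples, key=lambda name: cross_entropy(examples[name]), reverse=True)
print(ranked)  # images to review first come first
```

The confident mistake gets the highest loss, followed by the correct but hesitant prediction, so both show up before the images the model is sure about.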
Photo by Xennie Moore on Unsplash | Photo by Alvan Nee on Unsplash
Apparently, the first image has an incorrect label while the second one includes both husky and corgi. So there’s some room for improvement.
Luckily, Fast.AI provides a handy ImageClassifierCleaner widget that can help us quickly fix data issues. You can initialise it in your notebook, and then you will be able to change labels in your dataset.
cleaner = ImageClassifierCleaner(learn)
cleaner
After reviewing each category, you can run the following code to fix issues: delete the image or move it to the correct folder.
for idx in cleaner.delete(): cleaner.fns[idx].unlink()
for idx,breed in cleaner.change(): shutil.move(str(cleaner.fns[idx]), path/breed)
Now we can train our model again and see that accuracy improved: 95.4% vs 94.5%.

The share of correctly identified corgis has increased from 88% to 96%. Brilliant!

The other way to improve our model is to change our approach to resizing. We used the squish method, but as you may see, it can change the proportions of natural objects. Let’s try to be more imaginative and use augmentations.
Augmentations are changes to the images (for example, contrast improvements, rotations, or crops). They give our model more varied data and hopefully improve its quality.
As usual with Fast.AI, you need to change just a couple of parameters to add augmentations.
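The changed cell is not shown in the text. One common way to do it in fastai is sketched below: replace squish resizing with a random crop per item, and add the default batch augmentations. The use of RandomResizedCrop, min_scale=0.5 and aug_transforms() is my assumption, not necessarily what the original notebook used.

```python
def add_augmentations(dogs, path):
    """Return DataLoaders built from the data block `dogs` with augmentations enabled."""
    # Imported lazily so the function can be defined without fastai installed.
    from fastai.vision.all import RandomResizedCrop, aug_transforms

    dogs = dogs.new(
        item_tfms=RandomResizedCrop(256, min_scale=0.5),  # a different random crop each epoch
        batch_tfms=aug_transforms(),                      # default flips, rotations, lighting changes
    )
    return dogs.dataloaders(path)


# In the notebook, where `dogs` is the DataBlock defined earlier:
# dls = add_augmentations(dogs, Path('dogs_breeds'))
```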

Also, since with augmentations the model sees a slightly different picture at each epoch, we can increase the number of epochs. After six epochs, we’ve achieved 95.65% accuracy, a slightly better result. The whole process took around an hour.
Downloading model
The last step is to download our model. It’s pretty straightforward.
learn.export('cuttest_dogs_model.pkl')
Then you will have a standard pickle file (a common Python format for storing objects) saved. Just choose More actions next to the file in the right panel of the Kaggle Notebook, and you will get the model on your computer.

Now that we have our trained model, let’s deploy it so you can share the results with the world.
Deploying your model
We will use HuggingFace Spaces and Gradio to build our web app.
Setting up HuggingFace Space
HuggingFace is a company providing handy tools for Machine Learning, for example, the popular transformers library and a hub for sharing models and datasets. Today we will be using their Spaces to host our application.
First, you need to create an account if you haven’t registered yet. It will take just a couple of minutes. Follow this link.
Now it’s time to create a new Space. Head to the Spaces tab and push the "create" button. You can find instructions with more details in the documentation.
Then you need to specify the following parameters:
- name (it will be used for your app URL, so choose wisely),
- license (I’ve selected open-source Apache 2.0 license)
- SDK (I will be using Gradio in this example).

Then HuggingFace shows you user-friendly instructions. TL;DR: now you have a Git repository, and you need to commit your code there.
There’s one nuance with Git. Since your model is likely pretty big, it’s better to set up Git LFS (Large File Storage) so that Git doesn’t keep track of all the changes to this file. For installation, follow the instructions on the site.
# cloning repo
git clone https://huggingface.co/spaces/<your_login>/<your_app_name>
cd <your_app_name>
# setting up git-lfs
git lfs install
git lfs track "*.pkl"
git add .gitattributes
git commit -m "update gitattributes to use lfs for pkl files"
Gradio
Gradio is a framework that allows you to build pleasant and friendly web apps using just Python. That’s why it’s an invaluable tool for prototyping (especially for people without deep JavaScript knowledge, like me).
In Gradio, we will define our interface, specifying the following parameters:
- input – an image,
- output – labels with five possible classes,
- title, description and a set of example images (we will have to commit them to the repo as well),
- enable_queue=True, which helps the app handle a large amount of traffic if it becomes extremely popular,
- function to be executed for input images.
To get a label for an input image, we need to define the prediction function that loads our model and returns a dictionary with probabilities for each class.
In the end, we will have the following code for app.py
import gradio as gr
from fastai.vision.all import *

learn = load_learner('cuttest_dogs_model.pkl')
labels = learn.dls.vocab  # list of model classes

def predict(img):
    # convert the input to a fastai image and return a probability per class
    img = PILImage.create(img)
    pred, pred_idx, probs = learn.predict(img)
    return {labels[i]: float(probs[i]) for i in range(len(labels))}

gr.Interface(
    fn=predict,
    inputs=gr.inputs.Image(shape=(512, 512)),
    outputs=gr.outputs.Label(num_top_classes=5),
    title="The Cuttest Dogs Classifier 🐶🐕 🦮🐕 🦺",
    description="Classifier trained on images of huskies, retrievers, pomeranians, corgis and samoyeds. Created as a demo for Deep Learning app using HuggingFace Spaces & Gradio.",
    examples=['husky.jpg', 'retriever.jpg', 'corgi.jpg', 'pomeranian.jpg', 'samoyed.jpg'],
    enable_queue=True).launch()
If you would like to learn more about Gradio, read the docs.
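One caveat: the app.py above is written against the Gradio 3 API. As far as I know, newer Gradio releases (4.0 and later) no longer provide the gr.inputs and gr.outputs namespaces or the enable_queue argument, so the interface definition may need adjusting there. A hedged sketch of the same interface with top-level components is below. The build_demo wrapper is my own, and predict is the function defined in app.py.

```python
def build_demo(predict, examples=None):
    """Create the same interface using Gradio's top-level components."""
    # Imported lazily so the function can be defined without gradio installed.
    import gradio as gr

    return gr.Interface(
        fn=predict,
        inputs=gr.Image(),                    # passes the image to predict as a numpy array
        outputs=gr.Label(num_top_classes=5),  # shows probabilities for the five breeds
        title="The Cuttest Dogs Classifier",
        examples=examples,
    )


# In app.py:
# build_demo(predict, examples=['husky.jpg', 'corgi.jpg']).launch()
```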
Let’s also create a requirements.txt file containing fastai, so that this library will be installed on our server.
So the only bit left is to push everything to the HuggingFace Git repository.
git add *
git commit -am 'First version of Cuttest Dogs app'
git push
You can find the full code on GitHub.
After pushing files, return to the HuggingFace Space, and you will see a similar picture showing the building process. If everything is okay, your app will be running in a couple of minutes.

In case there are any problems, you will see a stack trace. Then you will have to return to your code, fix bugs, push a new version, and wait a few more minutes.
It’s working
Now we can use this model with real photos, for example, to verify that my family’s dog is actually a corgi.

Today we’ve gone through the whole process of building a Deep Learning application: from getting the dataset and fitting a model to writing and deploying a web app. Hopefully, you were able to finish this tutorial, and now you’re testing your fantastic model in production.
Thank you a lot for reading this article. I hope it was insightful to you. If you have any follow-up questions or comments, please leave them in the comments section. Also, don’t hesitate to share a link to your app.