PyTorch: Switching to the GPU

How and Why to train models on the GPU – Code Included.

Unlike TensorFlow, PyTorch won’t put your tensors and models on the GPU for you, so as a developer you’ll have to do some manual work here. In the end, though, it will save you a lot of time.

Photo by Artiom Vallat on Unsplash

Just in case you were wondering: installing CUDA on your machine or switching to a GPU runtime on Colab isn’t enough. Don’t get me wrong, it’s still a necessary first step, but doing only that won’t leverage the power of the GPU.
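If you want a quick sanity check that PyTorch can actually see the GPU, something like this will do (the device name is just whatever your runtime provides, e.g. a Tesla T4 on Colab):

import torch

torch.cuda.is_available()
>>> True

torch.cuda.get_device_name(0)
>>> 'Tesla T4'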

In this article you’ll find out how to switch from CPU to GPU for the following scenarios:

  1. Train/Test split approach
  2. Data Loader approach

The first one is most commonly used for tabular data, whilst you’ll use the second one pretty much every time you’re dealing with image data (at least in my experience).

There are quite a few differences between the two approaches, so each one will be explained in depth. I should also mention that I’ll be using Google Colab for this article. If you haven’t already, you can read my opinion on it and more here:

Google Colab: How does it compare to a GPU-enabled laptop?

The article is structured as follows:

  1. Why Should I Switch to the GPU?
  2. Train/Test Split Approach
  3. DataLoader Approach
  4. Conclusion

So without further ado, let’s get started!


Why Should I Switch to the GPU?

If you’re using a really deep neural network, e.g. transfer learning with ResNet152, training on the CPU can take a very long time. If you’re a sane person, you won’t even try it.

The GPU performs the underlying linear algebra operations in parallel, so you can achieve something like a 100x decrease in training time. Needless to say, it’s also an option to train on multiple GPUs, which would decrease training time even further.
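To get a feel for where that speedup comes from, here’s a rough sketch, not a rigorous benchmark, that times a single large matrix multiplication on the CPU and on the GPU; the matrix size is arbitrary:

import time
import torch

a = torch.randn(4000, 4000)
b = torch.randn(4000, 4000)

# time the multiplication on the CPU
start = time.perf_counter()
_ = a @ b
print(f'CPU: {time.perf_counter() - start:.3f}s')

# time the same multiplication on the GPU, synchronizing so we measure
# the actual kernel execution and not just the asynchronous launch
if torch.cuda.is_available():
    a_gpu, b_gpu = a.cuda(), b.cuda()
    torch.cuda.synchronize()
    start = time.perf_counter()
    _ = a_gpu @ b_gpu
    torch.cuda.synchronize()
    print(f'GPU: {time.perf_counter() - start:.3f}s')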

You don’t need to take my word for it. I decided to build a Cat vs Dog classifier based on this dataset. The model uses the ResNet50 architecture and was trained first on the CPU and then on the GPU.

Here are the training times:

GPU runtime: 00:11:57h; CPU runtime: 06:08:40h

Judge for yourself, but I’ll stick with the GPU runtime. It’s free on Colab, so there’s no reason not to. Okay, now that we know GPUs should be used for model training, let’s see how to make the switch.


Train/Test Split Approach

If you’ve done some machine learning with Python in Scikit-Learn, you are most certainly familiar with the train/test split. In a nutshell, the idea is to train the model on a portion of the dataset (let’s say 80%) and evaluate the model on the remaining portion (let’s say 20%).

Train/Test split is still a valid approach in Deep Learning – particularly with tabular data. The first thing to do is to declare a variable which will hold the device we’re training on (CPU or GPU):

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device
>>> device(type='cuda')
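As a side note, if you want to check how many GPUs are visible or target a specific one explicitly, you can do that too. On a standard Colab runtime there’s a single GPU, so 'cuda' and 'cuda:0' refer to the same device:

torch.cuda.device_count()
>>> 1

device = torch.device('cuda:0')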

Now I’ll declare some dummy data which will act as the X_train tensor:

X_train = torch.FloatTensor([0., 1., 2.])
X_train
>>> tensor([0., 1., 2.])

Cool! We can now check if the tensor is stored on the GPU:

X_train.is_cuda
>>> False

As expected, data isn’t stored on the GPU by default, but it’s fairly easy to move it there:

X_train = X_train.to(device)
X_train
>>> tensor([0., 1., 2.], device='cuda:0')

Neat. The same sanity check can be performed again, and this time we know that the tensor was moved to the GPU:

X_train.is_cuda
>>> True
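One related thing worth knowing: once a tensor lives on the GPU, it has to be moved back to the CPU before you can convert it to NumPy, otherwise PyTorch will raise an error:

X_train.cpu().numpy()
>>> array([0., 1., 2.], dtype=float32)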

Great, but what about model declaration?

I’m glad you’ve asked. Once again, it’s a pretty straightforward thing to do:

model = MyAwesomeNeuralNetwork()
model.to(device)
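If you want the same sanity check for the model, you can inspect where its parameters live. Note that for nn.Module objects, .to(device) moves the parameters in place, so there’s no need to reassign the result:

next(model.parameters()).is_cuda
>>> True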

And that’s it: you can now begin the training process. Just to recap, here’s a summary of how your code should be structured:

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# if X and y are NumPy arrays, convert the splits to tensors first,
# since .to(device) only works on tensors (adjust the dtypes to your task)
X_train, X_test = torch.tensor(X_train).float(), torch.tensor(X_test).float()
y_train, y_test = torch.tensor(y_train).float(), torch.tensor(y_test).float()

X_train, X_test = X_train.to(device), X_test.to(device)
y_train, y_test = y_train.to(device), y_test.to(device)

class MyAwesomeNeuralNetwork(nn.Module):
    # your model here

model = MyAwesomeNeuralNetwork()
model.to(device)
# training code here
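To make the recap concrete, here’s a minimal, self-contained sketch of the whole pattern on dummy tabular data. The tiny network, the dataset, and the hyperparameters are illustrative placeholders, not the model from the Cat vs Dog experiment:

import torch
import torch.nn as nn

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# dummy tabular data: 1000 rows, 10 features, binary target
X = torch.randn(1000, 10)
y = (X.sum(dim=1) > 0).float().unsqueeze(1)

# a simple 80/20 split in place of train_test_split
split = int(0.8 * len(X))
X_train, X_test = X[:split].to(device), X[split:].to(device)
y_train, y_test = y[:split].to(device), y[split:].to(device)

# a tiny stand-in for MyAwesomeNeuralNetwork
model = nn.Sequential(
    nn.Linear(10, 32),
    nn.ReLU(),
    nn.Linear(32, 1)
).to(device)

criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# training
for epoch in range(10):
    optimizer.zero_grad()
    loss = criterion(model(X_train), y_train)
    loss.backward()
    optimizer.step()

# evaluation
with torch.no_grad():
    preds = (model(X_test) > 0).float()
    print(f'Test accuracy: {(preds == y_test).float().mean().item():.2f}')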

Let’s proceed with the DataLoader approach.


DataLoader Approach

The DataLoader approach is more common for CNNs, and in this section we’ll see how to put the data (images) on the GPU. The first step remains the same: declare a variable which will hold the device we’re training on (CPU or GPU):

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
device
>>> device(type='cuda')

Now we will declare our model and place it on the GPU:

model = MyAwesomeNeuralNetwork()
model.to(device)

You’ve probably noticed that we haven’t placed any data on the GPU yet. A DataLoader can’t be transferred to a device directly, so we’ll have to be a bit smarter here. We’ll transfer each batch of images during the training process, like this:

for epoch in range(epochs):
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
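As an optional tweak, not something the basic setup requires, you can construct the DataLoader with pin_memory=True and pass non_blocking=True when moving batches, which lets the host-to-GPU copies overlap with computation. Here, train_dataset stands in for whatever Dataset you’re actually using:

from torch.utils.data import DataLoader

# train_dataset is assumed to be an existing Dataset object
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True, pin_memory=True)

for inputs, labels in train_loader:
    inputs = inputs.to(device, non_blocking=True)
    labels = labels.to(device, non_blocking=True)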

So the overall structure of your code should look something like this:

class MyAwesomeNeuralNetwork(nn.Module):
    # your model here

model = MyAwesomeNeuralNetwork()
model.to(device)

epochs = 10
for epoch in range(epochs):
    # training - move each batch to the GPU as it arrives
    model.train()
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        # backpropagation code here

    # evaluation - once per epoch, after the training batches
    model.eval()
    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            # ...
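If you’re wondering what goes in place of those comments, here’s one common way to fill them in. The loss function, the optimizer, and the accuracy bookkeeping are assumptions for a classification setup, not part of the original snippet:

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

for epoch in range(epochs):
    # training
    model.train()
    for inputs, labels in train_loader:
        inputs, labels = inputs.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(inputs), labels)
        loss.backward()
        optimizer.step()

    # evaluation
    model.eval()
    correct, total = 0, 0
    with torch.no_grad():
        for inputs, labels in test_loader:
            inputs, labels = inputs.to(device), labels.to(device)
            outputs = model(inputs)
            correct += (outputs.argmax(dim=1) == labels).sum().item()
            total += labels.size(0)
    print(f'Epoch {epoch + 1}: test accuracy = {correct / total:.3f}')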

And that’s all you have to do: both the data and the model are now placed on the GPU.


Conclusion

And there you have it: two ways to drastically reduce training time. At first it might seem like a lot of additional steps, but it’s straightforward once you get the gist of it.

Training on the CPU is something I would never advise you to do, and thanks to Google Colab you don’t have to, as you can use a GPU runtime for free.

Thanks for reading.

