
How to render 3D files using PyTorch3D


A hands-on guide with Python code to render 3D .obj files (polygonal meshes) using PyTorch3D API

Fig 1: How to render 3D files. Image created by the author; source: Behance.

3D understanding plays a critical role in numerous applications ranging from self-driving cars and autonomous robots to virtual reality and augmented reality. Over the past year, PyTorch3D has become an increasingly popular open-source framework for 3D deep learning with Python. Thankfully, the folks behind the PyTorch3D library have done the legwork of implementing several common 3D operators, loss functions, and a differentiable rendering API, making PyTorch3D even more accessible and easier to get started (or play) with [1]. Some of the key PyTorch3D components include:

  • Data structures for storing and manipulating triangle meshes
  • Efficient operations on triangle meshes
  • Differentiable mesh rendering API

Rendering is an essential building block in a computer graphics pipeline that converts 3D representations – be they meshes (.obj) or point clouds (.ply) – into 2D images.

In this post, we’ll build background knowledge on how to render a 3D .obj file from various viewpoints to create 2D images. We’ll also build a basic 3D rendering pipeline using PyTorch3D in Python with components shown below.

Fig 2: PyTorch3D rendering pipeline. Source: 3D Deep Learning with PyTorch3D.

This post assumes only a basic knowledge of 3D file representation, so hopefully it'll be accessible to everyone 🙂 However, if you'd like to read more about 3D reconstruction, check out this fabulous, up-to-date resource list [2] or the course notes from the Stanford CS231A [3] and CS468 [4] classes.

At the end of this article, you will know how to:

  • Load a 3D mesh using .obj and .mtl files
  • Create a renderer
  • Render the mesh
  • Optional: Use batch properties to render the mesh efficiently from different viewpoints

Just want the code? The entire code is available in this GitHub repository [5].

Ready? Let’s start! 🎉

Step # 1: Import libraries and initialize parameters

We start by importing the prerequisite libraries, such as torch and numpy, along with a variety of utility functions and structures from the PyTorch3D library.

Fig 3: Importing libraries and utility modules. Code snippet is hosted on GitHub and was created using Carbon.
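As a rough sketch, the imports might look like the following (the exact set depends on which components you use; Params is the configuration helper from utils.py described below):

import torch
import numpy as np
import matplotlib.pyplot as plt

# I/O and data structures
from pytorch3d.io import load_obj
from pytorch3d.structures import Meshes

# Rendering components
from pytorch3d.renderer import (
    look_at_view_transform,
    FoVPerspectiveCameras,
    PointLights,
    RasterizationSettings,
    MeshRasterizer,
    SoftPhongShader,
    MeshRenderer,
)

from utils import Params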

Lastly, line 43 imports the Params class from utils.py, which loads the important hyperparameters from a configuration file. Generally, it is good practice to keep all your parameters in a single file and load them from that file. This allows you to keep track of the hyperparameters you are testing and review which ones led to the best performance. In our case, the hyperparameters are stored in params_demo.json:

Fig 4: Looking at params_demo.json.
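For illustration, a configuration covering the parameters used in this tutorial might look something like this; the key names are an assumption (apart from elevation, which we access below), so check the repository's params_demo.json [5] for the real ones:

{
    "distance": 3,
    "elevation": 0,
    "azimuth": 90,
    "image_size": 256,
    "batch_size": 4
}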

Don't worry if some of those hyperparameters don't make sense; I'll explain them later in this tutorial!

Loading the hyperparameters is done via:

params = Params("params_demo.json")
# Access the elevation parameter
print(params.elevation)

Once your params object is initialized, you can also update it with another .json file using the params.update("your_other_params.json") method.
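For reference, a minimal sketch of such a Params class might look like this (the actual implementation lives in utils.py in the repository [5]):

import json

class Params:
    """Loads hyperparameters from a .json file and exposes them as attributes."""

    def __init__(self, json_path):
        self.update(json_path)

    def update(self, json_path):
        # Merge parameters from the given .json file into this object's attributes
        with open(json_path) as f:
            self.__dict__.update(json.load(f))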

Okay, so now that we've imported the libraries and declared the parameters, we can load the mesh. 🎉

Step # 2: Load the 3D mesh

There are a couple of ways to represent 3D data, such as point clouds, meshes, or voxels [6]. In this tutorial, we'll focus on 3D meshes, although the same procedure in PyTorch3D applies to point clouds too [7].

Information about a 3D textured mesh is typically stored in the following files:

  • .obj file, which stores vertices and faces
  • .mtl file, which stores material properties
  • .jpg or .png texture image

In this tutorial, we will be working with a 3D capsule object, which is stored in the data/capsule folder. The example files were obtained from a public repository hosted here [8]. To visualize the mesh we are working with, we can use Blender:

Fig 5: Visualizing the capsule mesh in Blender. Screenshot created by the author.

PyTorch3D contains several functions to load .obj files, such as load_obj or load_objs_as_meshes. We will use the first one and load the .obj file with the following syntax:

verts, faces, aux = load_obj(filename)

Here, verts is a (V, 3) tensor of vertices, faces.verts_idx is an (F, 3) tensor of indices for the corners of each face, and aux stores auxiliary information about the mesh, such as UV coordinates, material colors, and textures. We then use verts, faces.verts_idx, and aux to construct a Meshes object named capsule_mesh:

Fig 6: Loading the mesh. Code snippet is hosted on GitHub.
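A minimal sketch of this step, assuming the repository's data/capsule layout (texture creation from aux, e.g. via TexturesUV, is omitted here for brevity):

from pytorch3d.io import load_obj
from pytorch3d.structures import Meshes

# Load vertices, faces, and auxiliary information from the .obj file
verts, faces, aux = load_obj("data/capsule/capsule.obj")

# Build a Meshes object holding a batch of one mesh
capsule_mesh = Meshes(verts=[verts], faces=[faces.verts_idx])

print(f"We have {capsule_mesh.num_verts_per_mesh().item()} vertices "
      f"and {capsule_mesh.num_faces_per_mesh().item()} faces.")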

Lastly, line 33 checks the number of vertices and faces in the capsule mesh. This returns:

We have 5252 vertices and 10200 faces.

which is what we would expect by checking the .obj file structure.

If you’d like to learn more, the official documentation for Meshes object can be found here [9].

Step # 3: Create a renderer

This is probably the most important step. Now that we have successfully read our capsule mesh, we need to create a renderer using the MeshRenderer class. Looking at the MeshRenderer documentation [10], we see that it consists of two components:

  • rasterizer
  • shader

So let's break this task down into two steps and put them together at the end.

Step # 3a: Create a rasterizer

Rasterization refers to taking an image representation described in polygons or triangles (.obj file) and converting it into a raster image described in pixels (.png or .jpg file).

Our rasterizer is created using a class called MeshRasterizer, which itself has several subcomponents, such as the cameras and raster_settings arguments. Basically, cameras is responsible for transforming 3D coordinates from world space to screen space. To initialize the camera, we need three important arguments: 1) distance, 2) azimuth angle, and 3) elevation angle. If that sounds like a lot, don't worry; I'll go through them step by step.

Distance refers to the distance between the camera and the object.

Elevation angle refers to the angle between the vector from the object to the camera and the horizontal plane y = 0 (the xz plane). Elevation basically tells us from how high up we are looking at the object.

Azimuth angle is defined by projecting the vector from the object to the camera onto the horizontal plane y = 0; it is the angle between this projected vector and a reference vector at (0, 0, 1) on that plane. The azimuth angle takes values in the interval from 0° to 360°. It basically tells us from which side (e.g. left side, right side, front view, back view, etc.) we are looking at the object. See more info here [11], [12].

In our params_demo.json file (Fig 4), we declared that the distance is 3, the elevation is 0, and the azimuth angle is 90, so if we render this mesh, we should be looking at it directly from a distance of 3 units.

Regarding raster settings, the most important parameter is the size of the resulting 2D image. The smaller the size, the more pixelated the image will appear.

Fig 7: Creating a rasterizer.
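As a sketch, assuming the config keys from the illustrative params_demo.json above:

import torch
from pytorch3d.renderer import (
    look_at_view_transform,
    FoVPerspectiveCameras,
    RasterizationSettings,
    MeshRasterizer,
)

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

# Position the camera from the distance, elevation, and azimuth parameters
R, T = look_at_view_transform(
    dist=params.distance, elev=params.elevation, azim=params.azimuth
)
cameras = FoVPerspectiveCameras(device=device, R=R, T=T)

# image_size controls the resolution of the rendered 2D image
raster_settings = RasterizationSettings(
    image_size=params.image_size,
    blur_radius=0.0,
    faces_per_pixel=1,
)

rasterizer = MeshRasterizer(cameras=cameras, raster_settings=raster_settings)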

Step # 3b: Create a shader

PyTorch3D offers multiple types of shaders, including SoftPhongShader or HardPhongShader. Here we’ll use a predefined SoftPhongShader and pass in the camera and the device on which to initialize the default parameters.

Last but not least, we combine both the rasterizer and the shader:

Fig 8: Creating a renderer.
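A sketch of this step; the point-light position here is an assumption:

from pytorch3d.renderer import MeshRenderer, SoftPhongShader, PointLights

# A single point light placed in front of the object (position is illustrative)
lights = PointLights(device=device, location=[[0.0, 0.0, 3.0]])

# The shader consumes the rasterizer's output and computes per-pixel colors
shader = SoftPhongShader(device=device, cameras=cameras, lights=lights)

# The renderer simply chains the rasterizer and the shader
renderer = MeshRenderer(rasterizer=rasterizer, shader=shader)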

Step # 4: Render the mesh

This is a very easy step because we only have to call the renderer on our Meshes object. Let's render our capsule mesh and plot the results:

Fig 9: Rendering the mesh.
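As a sketch, reusing the mesh and renderer from the previous steps:

import matplotlib.pyplot as plt

# The renderer returns an RGBA image tensor of shape (1, H, W, 4)
images = renderer(capsule_mesh.to(device))

plt.figure(figsize=(8, 8))
plt.imshow(images[0, ..., :3].cpu().numpy())
plt.axis("off")
plt.show()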
Fig 10: Rendered capsule mesh, which looks very similar to the result in Fig 5.

The rendered result looks pretty much the same as the Blender visualization in Fig 5, which is a good sign! 🙂

Optional: Step # 5: Using batch properties

Using batch properties can be useful if you want to render the mesh from multiple viewpoints. Before we dive into the code, it's worth understanding how the current batching implementation works. It relies on a single argument, the batch size, which divides both the elevation and the azimuth angle space into n equally spaced values. So, if your batch size is 4, then both the elevation and the azimuth angle spaces are torch.linspace(0, 360, 4), which is tensor([0., 120., 240., 360.]). Within the batch, the index moves along the elevation and azimuth angle lists in lockstep and stops once all elements are exhausted. As a result, we get only 4 rendered pictures: a) with both elevation and azimuth = 0, b) with both = 120, c) with both = 240, and d) with both = 360.

This is analogous to the Python map() function with two iterable arguments: elements are paired up position by position, so you don't get all combinations (the Cartesian product) of the two parameters. If you do expect all combinations of elevations and azimuth angles, then something like a list comprehension or itertools.product is the way to go, as the snippet below shows.
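The difference is easy to see in plain Python:

from itertools import product

elev = [0, 120, 240, 360]
azim = [0, 120, 240, 360]

# Element-wise pairing (what the batched renderer does): 4 viewpoints
print(list(zip(elev, azim)))   # [(0, 0), (120, 120), (240, 240), (360, 360)]

# Cartesian product (all combinations): 16 viewpoints
print(len(list(product(elev, azim))))  # 16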

Alright, alright, alright, back to batch properties… We set the batch size to 4, which represents the number of viewpoints from which we want to render the mesh. We use this batch size to extend our mesh and to build the elevation and azimuth angle vectors. After the images are rendered, the resulting tensor has shape [4, 256, 256, 4].

Fig 11: Using batched rendering to render the mesh from multiple viewpoints.
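A sketch of the batched rendering step, reusing renderer, capsule_mesh, device, and params from the previous steps (params.batch_size is an assumed config key):

import torch
from pytorch3d.renderer import look_at_view_transform, FoVPerspectiveCameras

batch_size = params.batch_size  # 4 viewpoints

# Repeat the mesh once per viewpoint
meshes = capsule_mesh.extend(batch_size).to(device)

# Elevation and azimuth vary together along the batch dimension
elev = torch.linspace(0, 360, batch_size)
azim = torch.linspace(0, 360, batch_size)

R, T = look_at_view_transform(dist=params.distance, elev=elev, azim=azim)
cameras = FoVPerspectiveCameras(device=device, R=R, T=T)

# Render all viewpoints in a single forward pass
images = renderer(meshes, cameras=cameras)
print(images.shape)  # torch.Size([4, 256, 256, 4])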
Fig 12: Resulting rendered images from multiple viewpoints.

Congratulations! 🎉 You now know the ins and outs of rendering a 3D mesh from both single and multiple viewpoints.

Here’s what we’ve gone through:

  • We've set up PyTorch3D and loaded the hyperparameters from a configuration file
  • We’ve loaded the mesh and textures from .obj and .mtl files
  • We’ve created a renderer to render the mesh
  • We’ve utilized PyTorch3D batching features to extend the mesh and render it from multiple viewpoints in a single forward pass

I've helped you explore some basic PyTorch3D features, but please don't stop here. It's just the beginning, and it's up to you where this journey takes you!

Before you go

Like the tutorial and have any comments or suggestions? Anything I may have missed? I’d love to hear your thoughts 🙂 Send me a message or follow me on GitHub!

References

[1] https://pytorch3d.org/

[2] https://github.com/timzhang642/3D-Machine-Learning

[3] https://web.stanford.edu/class/cs231a/syllabus.html

[4] https://graphics.stanford.edu/courses/cs468-17-spring/schedule.html

[5] https://github.com/adelekuzmiakova/pytorch3d-renderer

[6] https://towardsdatascience.com/how-to-represent-3d-data-66a0f6376afb

[7] https://github.com/facebookresearch/pytorch3d/blob/master/docs/tutorials/render_colored_points.ipynb

[8] http://paulbourke.net/dataformats/obj/minobj.html

[9] https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/structures/meshes.py

[10] https://github.com/facebookresearch/pytorch3d/blob/master/pytorch3d/renderer/mesh/renderer.py

[11] https://pvpmc.sandia.gov/modeling-steps/1-weather-design-inputs/sun-position/

[12] https://www.celestis.com/resources/faq/what-are-the-azimuth-and-elevation-of-a-satellite/

