
Torch and Torchvision C++ installation and debugging on Linux

Example debugging RoIAlign from Torchvision

Photo credit: https://unsplash.com/@udayawal

Introduction

In this tutorial we are going to install the Torch and Torchvision C++ libraries on a Linux machine to debug a function that is written not in Python but in C++. Because our function of interest is implemented in C++ rather than in PyTorch, we cannot step into it from the Python API to Torchvision, so we need to dive into the C++ code. If you want to know more about how Python and C++ link together and how C++ code is called from Python, you can refer to my previous article here.

Torch installation

First of all we need to download LibTorch, the Torch C++ library, which provides binary distributions of all the headers and libraries needed to use Torch from C++. This can be done from the official PyTorch website here. I selected the following configuration for the download:

Image by Author – Figure 1

Then we create a new folder for the project, move the .zip archive we have just downloaded into it, and unzip it there.

mkdir torchinstall
cd torchinstall
# move the downloaded .zip archive into this folder, then unzip
unzip libtorch-cxx11-abi-shared-with-deps-1.13.1+cpu.zip

From here we will follow the official Torch documentation to compile and build files that depend on the Torch library. First of all, let's create two files:

  • main.cpp – a C++ file where we will write some code to make sure the installed Torch library works on our machine
  • CMakeLists.txt – a text file containing instructions for the CMake tool to generate and build the project

And a build folder where our compiled main.cpp file will be stored

touch main.cpp
touch CMakeLists.txt

#################################################################
The main.cpp file will contain the following code:

// import libraries
#include <iostream>
#include <torch/torch.h>

int main() {
  torch::manual_seed(0);                    // set manual seed for reproducibility
  torch::Tensor x = torch::randn({2, 3});   // create a random 2x3 tensor
  std::cout << x;                           // print the tensor
}
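
If you want a slightly stronger sanity check than printing one random tensor, here is a minimal sketch of my own (not part of the official instructions) that also exercises basic tensor math through LibTorch:

// sanity_check.cpp - self-contained check that LibTorch tensor ops work
#include <iostream>
#include <torch/torch.h>

int main() {
  torch::manual_seed(0);                    // reproducible random numbers
  torch::Tensor a = torch::randn({2, 3});   // 2x3 random matrix
  torch::Tensor b = torch::randn({3, 2});   // 3x2 random matrix
  torch::Tensor c = torch::mm(a, b);        // matrix product -> 2x2
  std::cout << c << std::endl;
  std::cout << "sum = " << c.sum().item<float>() << std::endl; // extract a scalar
}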

#################################################################
The CMakeLists.txt file will contain the following:

cmake_minimum_required(VERSION 3.0)
# project name
project(debugfunc)

# define path to the libtorch extracted folder
set(CMAKE_PREFIX_PATH /home/alexey/Desktop/torchinstall/libtorch)

# find torch library and all necessary files
find_package(Torch REQUIRED)
set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")

# executable target that we want to compile and run
add_executable(debugfunc main.cpp)
# link torch libraries to our executable
target_link_libraries(debugfunc "${TORCH_LIBRARIES}")
set_property(TARGET debugfunc PROPERTY CXX_STANDARD 14)
#################################################################

# create build folder
mkdir build

Now we have everything ready to compile, build and run our main.cpp file.

# go into the build folder
cd build
# configure the project and generate build files
cmake ..
# compile and link
make
# run the built executable
./debugfunc

If everything was done correctly, you should see the following output:

Image by Author – Figure 2

Congratulations, you can now build and run files that use the Torch C++ library! The next step is to install the Torchvision C++ library.

Torchvision Installation

Let’s go back to our Desktop directory and create another folder called torchvision. First of all, download the Torchvision C++ library as a zip from here, place it into our torchvision directory and unzip it. After that we go into the unzipped folder vision-main, create a build directory and open the CMakeLists.txt file in vision-main to amend and add some things.

mkdir torchvision
cd torchvision
# move vision-main.zip into this folder, then:
unzip vision-main.zip
cd vision-main

mkdir build

The final version of the CMakeLists.txt file looks like below for me. I added CMAKE_PREFIX_PATH and turned off all the options like WITH_CUDA, WITH_PNG, WITH_JPEG and USE_PYTHON.

cmake_minimum_required(VERSION 3.12)
project(torchvision)
set(CMAKE_CXX_STANDARD 14)
file(STRINGS version.txt TORCHVISION_VERSION)

# added CMAKE_PREFIX_PATH
set(CMAKE_PREFIX_PATH /home/alexey/Desktop/torchinstall/libtorch)

# turned off all the options
option(WITH_CUDA "Enable CUDA support" OFF)
option(WITH_PNG "Enable features requiring LibPNG." OFF)
option(WITH_JPEG "Enable features requiring LibJPEG." OFF)
option(USE_PYTHON "Link to Python when building" OFF)

if(WITH_CUDA)
  enable_language(CUDA)
  add_definitions(-D__CUDA_NO_HALF_OPERATORS__)
  add_definitions(-DWITH_CUDA)
  set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} --expt-relaxed-constexpr")
  # CUDA-11.x can not be compiled using C++14 standard on Windows
  string(REGEX MATCH "^[0-9]+" CUDA_MAJOR ${CMAKE_CUDA_COMPILER_VERSION})
  if(${CUDA_MAJOR} GREATER 10 AND MSVC)
    set(CMAKE_CXX_STANDARD 17)
  endif()
endif()

find_package(Torch REQUIRED)

if (WITH_PNG)
    add_definitions(-DPNG_FOUND)
    find_package(PNG REQUIRED)
endif()

if (WITH_JPEG)
    add_definitions(-DJPEG_FOUND)
    find_package(JPEG REQUIRED)
endif()

if (USE_PYTHON)
  add_definitions(-DUSE_PYTHON)
  find_package(Python3 REQUIRED COMPONENTS Development)
endif()

function(CUDA_CONVERT_FLAGS EXISTING_TARGET)
    get_property(old_flags TARGET ${EXISTING_TARGET} PROPERTY INTERFACE_COMPILE_OPTIONS)
    if(NOT "${old_flags}" STREQUAL "")
        string(REPLACE ";" "," CUDA_flags "${old_flags}")
        set_property(TARGET ${EXISTING_TARGET} PROPERTY INTERFACE_COMPILE_OPTIONS
            "$<$<BUILD_INTERFACE:$<COMPILE_LANGUAGE:CXX>>:${old_flags}>$<$<BUILD_INTERFACE:$<COMPILE_LANGUAGE:CUDA>>:-Xcompiler=${CUDA_flags}>"
            )
    endif()
endfunction()

if(MSVC)
  set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} /wd4819")
  if(WITH_CUDA)
    set(CMAKE_CUDA_FLAGS "${CMAKE_CUDA_FLAGS} -Xcompiler=/wd4819")
    foreach(diag cc_clobber_ignored integer_sign_change useless_using_declaration
      set_but_not_used field_without_dll_interface
      base_class_has_different_dll_interface
      dll_interface_conflict_none_assumed
      dll_interface_conflict_dllexport_assumed
      implicit_return_from_non_void_function
      unsigned_compare_with_zero
      declared_but_not_referenced
      bad_friend_decl)
      string(APPEND CMAKE_CUDA_FLAGS " -Xcudafe --diag_suppress=${diag}")
    endforeach()
    CUDA_CONVERT_FLAGS(torch_cpu)
    if(TARGET torch_cuda)
      CUDA_CONVERT_FLAGS(torch_cuda)
    endif()
    if(TARGET torch_cuda_cu)
      CUDA_CONVERT_FLAGS(torch_cuda_cu)
    endif()
    if(TARGET torch_cuda_cpp)
      CUDA_CONVERT_FLAGS(torch_cuda_cpp)
    endif()
  endif()
endif()

include(GNUInstallDirs)
include(CMakePackageConfigHelpers)

set(TVCPP torchvision/csrc)
list(APPEND ALLOW_LISTED ${TVCPP} ${TVCPP}/io/image ${TVCPP}/io/image/cpu ${TVCPP}/models ${TVCPP}/ops
  ${TVCPP}/ops/autograd ${TVCPP}/ops/cpu ${TVCPP}/io/image/cuda)
if(WITH_CUDA)
    list(APPEND ALLOW_LISTED ${TVCPP}/ops/cuda ${TVCPP}/ops/autocast)
endif()

FOREACH(DIR ${ALLOW_LISTED})
    file(GLOB ALL_SOURCES ${ALL_SOURCES} ${DIR}/*.*)
ENDFOREACH()

add_library(${PROJECT_NAME} SHARED ${ALL_SOURCES})
target_link_libraries(${PROJECT_NAME} PRIVATE ${TORCH_LIBRARIES})

if (WITH_PNG)
    target_link_libraries(${PROJECT_NAME} PRIVATE ${PNG_LIBRARY})
endif()

if (WITH_JPEG)
    target_link_libraries(${PROJECT_NAME} PRIVATE ${JPEG_LIBRARIES})
endif()

if (USE_PYTHON)
  target_link_libraries(${PROJECT_NAME} PRIVATE Python3::Python)
endif()

set_target_properties(${PROJECT_NAME} PROPERTIES
  EXPORT_NAME TorchVision
  INSTALL_RPATH ${TORCH_INSTALL_PREFIX}/lib)

include_directories(torchvision/csrc)

if (WITH_PNG)
    include_directories(${PNG_INCLUDE_DIRS})
endif()

if (WITH_JPEG)
    include_directories(${JPEG_INCLUDE_DIRS})
endif()

set(TORCHVISION_CMAKECONFIG_INSTALL_DIR "share/cmake/TorchVision" CACHE STRING "install path for TorchVisionConfig.cmake")

configure_package_config_file(cmake/TorchVisionConfig.cmake.in
  "${CMAKE_CURRENT_BINARY_DIR}/TorchVisionConfig.cmake"
  INSTALL_DESTINATION ${TORCHVISION_CMAKECONFIG_INSTALL_DIR})

write_basic_package_version_file(${CMAKE_CURRENT_BINARY_DIR}/TorchVisionConfigVersion.cmake
  VERSION ${TORCHVISION_VERSION}
  COMPATIBILITY AnyNewerVersion)

install(FILES ${CMAKE_CURRENT_BINARY_DIR}/TorchVisionConfig.cmake
  ${CMAKE_CURRENT_BINARY_DIR}/TorchVisionConfigVersion.cmake
  DESTINATION ${TORCHVISION_CMAKECONFIG_INSTALL_DIR})

install(TARGETS ${PROJECT_NAME}
  EXPORT TorchVisionTargets
  LIBRARY DESTINATION ${CMAKE_INSTALL_LIBDIR}
  )

install(EXPORT TorchVisionTargets
  NAMESPACE TorchVision::
  DESTINATION ${TORCHVISION_CMAKECONFIG_INSTALL_DIR})

FOREACH(INPUT_DIR ${ALLOW_LISTED})
    string(REPLACE "${TVCPP}" "${CMAKE_INSTALL_INCLUDEDIR}/${PROJECT_NAME}" OUTPUT_DIR ${INPUT_DIR})
    file(GLOB INPUT_FILES ${INPUT_DIR}/*.*)
    install(FILES ${INPUT_FILES} DESTINATION ${OUTPUT_DIR})
ENDFOREACH()

After that we go into the build folder and configure, build and install the library.

cd build
cmake ..
make
sudo make install

If you didn’t see any errors (warnings are not errors!) then everything should be fine. Note that sudo make install places the library and the TorchVision CMake config files under the default install prefix (typically /usr/local on Linux). Now let’s test that we can import the torchvision library in a project. Go back to the torchinstall folder and amend the main.cpp and CMakeLists.txt files as follows:

// main.cpp: now importing the torchvision library as well

#include <iostream>
#include <torch/torch.h>
#include <torchvision/vision.h>

int main(){
  torch::manual_seed(0);
  torch::Tensor x = torch::randn({2,3});
  std::cout << x;
}
#################################################################
# CMakeLists.txt: now also finds and links TorchVision

cmake_minimum_required(VERSION 3.0)
project(debugfunc)

set(CMAKE_PREFIX_PATH /home/alexey/Desktop/torchinstall/libtorch)

find_package(Torch REQUIRED)
find_package(TorchVision REQUIRED)

set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} ${TORCH_CXX_FLAGS}")

add_executable(${PROJECT_NAME} main.cpp)
target_compile_features(${PROJECT_NAME} PUBLIC cxx_range_for)
target_link_libraries(${PROJECT_NAME} TorchVision::TorchVision)
set_property(TARGET ${PROJECT_NAME} PROPERTY CXX_STANDARD 14)
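
Before rebuilding, you can optionally verify that the TorchVision ops themselves are callable, not just that the header compiles. Here is a minimal sketch of my own (it assumes the public nms operator declared in torchvision/ops/nms.h, which the library registers with the Torch dispatcher):

// main.cpp variant: call an actual TorchVision op (nms) to verify linkage
#include <iostream>
#include <torch/torch.h>
#include <torchvision/vision.h>
#include <torchvision/ops/nms.h>

int main() {
  // two overlapping boxes in (x1, y1, x2, y2) format, and their scores
  torch::Tensor dets = torch::tensor(
      {{0.0f, 0.0f, 10.0f, 10.0f},
       {1.0f, 1.0f, 11.0f, 11.0f}});
  torch::Tensor scores = torch::tensor({0.9f, 0.8f});
  // indices of boxes that survive non-maximum suppression at IoU 0.5
  torch::Tensor keep = vision::ops::nms(dets, scores, 0.5);
  std::cout << keep << std::endl; // the two boxes overlap heavily, so only
                                  // the higher-scoring one should survive
}

If this links and runs, the TorchVision shared library is being found correctly at both compile and link time.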

Now compile, build and run as before:

# go into the build folder
cd build
# configure and generate build files
cmake ..
# compile and link
make
# run the built executable
./debugfunc

You should see no errors and again the same tensor output:

Image by Author – Figure 3

Cool, we have now installed the Torchvision library as well!

Debug a function

Now, let’s say we want to debug the roi_align function from the Torchvision library, which is written in C++. I use Visual Studio Code for debugging, and because we are using CMake you will need to install some extensions in VS Code to be able to debug from it. I found this video quite useful to get started. Also make sure the project is configured with debug symbols (for example by passing -DCMAKE_BUILD_TYPE=Debug to cmake), otherwise breakpoints will not map back to source lines.

In order to debug the roi_align function I had to copy-paste some bits from this path: torchvision/csrc/ops/cpu, ending up with the following main.cpp file:

#include <iostream>
#include <torch/torch.h>
#include <torchvision/vision.h>
#include <torchvision/ops/nms.h>
#include <torch/script.h>

#include <ATen/core/dispatch/Dispatcher.h>
#include <torch/library.h>
#include <torch/types.h>

#include <ATen/ATen.h>
#include <torchvision/ops/cpu/roi_align_common.h>

namespace vision {
namespace ops {

namespace {

template <typename T>
void roi_align_forward_kernel_impl(
    int n_rois,
    const T* input,
    const T& spatial_scale,
    int channels,
    int height,
    int width,
    int pooled_height,
    int pooled_width,
    int sampling_ratio,
    bool aligned,
    const T* rois,
    T* output) {
  // (n, c, ph, pw) is an element in the pooled output
  // can be parallelized using omp
  // #pragma omp parallel for num_threads(32)
  for (int n = 0; n < n_rois; n++) {
    int index_n = n * channels * pooled_width * pooled_height;

    const T* offset_rois = rois + n * 5;
    int roi_batch_ind = offset_rois[0];

    // Do not use rounding; this implementation detail is critical
    T offset = aligned ? (T)0.5 : (T)0.0;
    T roi_start_w = offset_rois[1] * spatial_scale - offset;
    T roi_start_h = offset_rois[2] * spatial_scale - offset;
    T roi_end_w = offset_rois[3] * spatial_scale - offset;
    T roi_end_h = offset_rois[4] * spatial_scale - offset;

    T roi_width = roi_end_w - roi_start_w;
    T roi_height = roi_end_h - roi_start_h;
    if (!aligned) {
      // Force malformed ROIs to be 1x1
      roi_width = std::max(roi_width, (T)1.);
      roi_height = std::max(roi_height, (T)1.);
    }

    T bin_size_h = static_cast<T>(roi_height) / static_cast<T>(pooled_height);
    T bin_size_w = static_cast<T>(roi_width) / static_cast<T>(pooled_width);

    // We use roi_bin_grid to sample the grid and mimic integral
    int roi_bin_grid_h = (sampling_ratio > 0)
        ? sampling_ratio
        : ceil(roi_height / pooled_height); // e.g., = 2
    int roi_bin_grid_w =
        (sampling_ratio > 0) ? sampling_ratio : ceil(roi_width / pooled_width);

    // We do average (integral) pooling inside a bin
    // When the grid is empty, output zeros.
    const T count = std::max(roi_bin_grid_h * roi_bin_grid_w, 1); // e.g. = 4

    // we want to precalculate indices and weights shared by all channels,
    // this is the key point of optimization
    std::vector<detail::PreCalc<T>> pre_calc(
        roi_bin_grid_h * roi_bin_grid_w * pooled_width * pooled_height);
    detail::pre_calc_for_bilinear_interpolate(
        height,
        width,
        pooled_height,
        pooled_width,
        roi_start_h,
        roi_start_w,
        bin_size_h,
        bin_size_w,
        roi_bin_grid_h,
        roi_bin_grid_w,
        pre_calc);

    for (int c = 0; c < channels; c++) {
      int index_n_c = index_n + c * pooled_width * pooled_height;
      const T* offset_input =
          input + (roi_batch_ind * channels + c) * height * width;
      int pre_calc_index = 0;

      for (int ph = 0; ph < pooled_height; ph++) {
        for (int pw = 0; pw < pooled_width; pw++) {
          int index = index_n_c + ph * pooled_width + pw;

          T output_val = 0.;
          for (int iy = 0; iy < roi_bin_grid_h; iy++) {
            for (int ix = 0; ix < roi_bin_grid_w; ix++) {
              detail::PreCalc<T> pc = pre_calc[pre_calc_index];
              output_val += pc.w1 * offset_input[pc.pos1] +
                  pc.w2 * offset_input[pc.pos2] +
                  pc.w3 * offset_input[pc.pos3] + pc.w4 * offset_input[pc.pos4];

              pre_calc_index += 1;
            }
          }
          output_val /= count; // Average pooling

          output[index] = output_val;
        } // for pw
      } // for ph
    } // for c
  } // for n
}

template <class T>
inline void add(T* address, const T& val) {
  *address += val;
}

at::Tensor roi_align_forward_kernel(
    const at::Tensor& input,
    const at::Tensor& rois,
    double spatial_scale,
    int64_t pooled_height,
    int64_t pooled_width,
    int64_t sampling_ratio,
    bool aligned) {
  TORCH_CHECK(input.device().is_cpu(), "input must be a CPU tensor");
  TORCH_CHECK(rois.device().is_cpu(), "rois must be a CPU tensor");
  TORCH_CHECK(rois.size(1) == 5, "rois must have shape as Tensor[K, 5]");

  at::TensorArg input_t{input, "input", 1}, rois_t{rois, "rois", 2};

  at::CheckedFrom c = "roi_align_forward_kernel";
  at::checkAllSameType(c, {input_t, rois_t});

  auto num_rois = rois.size(0);
  auto channels = input.size(1);
  auto height = input.size(2);
  auto width = input.size(3);

  at::Tensor output = at::zeros(
      {num_rois, channels, pooled_height, pooled_width}, input.options());

  if (output.numel() == 0)
    return output;

  auto input_ = input.contiguous(), rois_ = rois.contiguous();
  AT_DISPATCH_FLOATING_TYPES_AND_HALF(
      input.scalar_type(), "roi_align_forward_kernel", [&amp;] {
        roi_align_forward_kernel_impl<scalar_t>(
            num_rois,
            input_.data_ptr<scalar_t>(),
            spatial_scale,
            channels,
            height,
            width,
            pooled_height,
            pooled_width,
            sampling_ratio,
            aligned,
            rois_.data_ptr<scalar_t>(),
            output.data_ptr<scalar_t>());
      });
  return output;
}

} // namespace

} // namespace ops
} // namespace vision

int main(){
  torch::manual_seed(0);

  // load tensors saved from Python 
  torch::jit::script::Module tensors = torch::jit::load("/media/sf_Linux_shared_folder/roialign/tensors.pth");
  c10::IValue feats = tensors.attr("cr_features");
  torch::Tensor feat_ts = feats.toTensor();

  c10::IValue boxes = tensors.attr("cr_proposal");
  torch::Tensor boxes_ts = boxes.toTensor();

  std::cout << boxes_ts << std::endl;

  double spatial_scale = 1.0;
  int64_t pooled_height = 2, pooled_width = 2, sampling_ratio = -1;
  bool aligned = false;

  at::Tensor out = vision::ops::roi_align_forward_kernel(feat_ts, boxes_ts, spatial_scale, pooled_height, pooled_width, sampling_ratio, aligned);

  std::cout << out;

}
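
If you don’t have a tensors.pth file saved from Python, a simple alternative (my own sketch, with made-up shapes and values, just to exercise the kernel) is to construct dummy inputs directly in C++ and replace the tensor-loading code in main with something like:

// Hypothetical stand-in inputs so the example runs without tensors.pth.
// feat_ts: [N, C, H, W] feature map; boxes_ts: [K, 5] rows of
// (batch_index, x1, y1, x2, y2), the layout the kernel above expects.
torch::Tensor feat_ts = torch::randn({1, 3, 16, 16});
torch::Tensor boxes_ts = torch::tensor(
    {{0.0f, 2.0f, 2.0f, 10.0f, 10.0f}});

at::Tensor out = vision::ops::roi_align_forward_kernel(
    feat_ts, boxes_ts,
    /*spatial_scale=*/1.0,
    /*pooled_height=*/2,
    /*pooled_width=*/2,
    /*sampling_ratio=*/-1,
    /*aligned=*/false);
std::cout << out;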

You will notice that I load the features and proposals tensors that I used in this article for roi_align. You can now place breakpoints wherever you want in the code and step through it line by line to see what is happening.

Conclusions

In this article we have seen how to install the Torch and Torchvision C++ libraries on a Linux machine and how to debug functions that we cannot directly debug from the PyTorch Python API. If you know of easier ways to debug such functions without copy-pasting them as I did, please let me know in the comments.

