Deep Learning Framework Deployment (Ubuntu)

Authored by Tony Feng

Created on June 26th, 2021

Last Modified on Sept 1st, 2022

Intro

This is a tutorial on deep learning framework deployment (GPU version), covering the basic installation steps, solutions to some common errors, and other key points. If you run into any errors that are not mentioned here, Google them :)

Prerequisites

To build a deep learning environment, you need to install an NVIDIA driver that supports your GPU. You also need to pay attention to the versions of your deep learning framework, CUDA, CUDNN, and Python, and make sure they are compatible with each other.

You can refer to these webpages: - The relationship between GPU, CUDA, CUDNN - What is GPU, NVCC, CUDA, CUDNN?
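To see why the version pairings matter, the combinations installed later in this tutorial can be written down as a small lookup table. This is only an illustrative sketch built from the versions this guide uses, not an official compatibility matrix, and the helper name `required_cuda_cudnn` is mine:

```python
# Framework version -> (CUDA toolkit, cuDNN) pairings used later in this guide.
# Illustrative only; consult the official compatibility tables for other versions.
COMPAT = {
    ("pytorch", "1.6.0"): ("10.2", "7.6.5"),
    ("tensorflow-gpu", "1.15.0"): ("10.0", "7.6.5"),
}

def required_cuda_cudnn(framework, version):
    """Return the (CUDA, cuDNN) pair this guide installs for a framework version."""
    try:
        return COMPAT[(framework, version)]
    except KeyError:
        raise ValueError(f"No pairing recorded for {framework} {version}")

print(required_cuda_cudnn("pytorch", "1.6.0"))  # ('10.2', '7.6.5')
```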

Anaconda

Anaconda is a Python distribution that is particularly popular for data analysis and scientific computing. The platform integrates many useful tools, such as Visual Studio Code, Jupyter Notebook, RStudio, PyCharm, etc.

The software provides a flexible solution that allows users to build, distribute, install, update, and manage software in a cross-platform manner. Multiple environments can be maintained and run separately without interfering with each other.

You can download Anaconda here and choose a version matching your OS.

After installing the software, (base) will be displayed at the front of each line in your terminal. This means you are in the base (root) environment.

Now, I am going to show you some basic conda operations in the terminal. Firstly, let's build a new environment.

conda create -n [name of your environment] python=[version]
# e.g. conda create -n py36 python=3.6

If you already have multiple environments and have forgotten their names, you can enter this command under the base environment.

conda info --env

You can delete an environment if necessary, and all packages inside will be deleted as well. Be careful and don't do this to the base environment.

conda remove -n [name_env] --all

Then, move into the environment.

conda activate [name_env]

You can exit the environment and go to the base with the command:

conda deactivate 

If you want to check what packages have been installed in the environment, you can enter:

conda list 

For more instructions, please refer to this webpage.

PyTorch

In this section, I will illustrate how to install PyTorch under the Anaconda environment.

Firstly, we create an environment for PyTorch, namely pt, and move into it:

conda create -n pt python=3.7
conda activate pt

Next, we go to PyTorch's official website and click Get Started. Choose the configuration you want, and the webpage will provide the corresponding command. You could also visit here to download historical versions. For example,

conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.2 -c pytorch 

Then, we install cuDNN through conda:

conda install cudnn==7.6.5

You can use the following code to test if the installation is successful.

import torch 

flag = torch.cuda.is_available() 
print(flag) 

ngpu = 1

# Decide which device we want to run on 
device = torch.device("cuda:0" if (torch.cuda.is_available() and ngpu > 0) else "cpu") 

print(device) 
print(torch.cuda.get_device_name(0)) 
print(torch.rand(3,3).cuda())
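The device-selection line in the script above is a common pattern that degrades gracefully to the CPU when no GPU is visible. It can be factored into a small helper; this is just a sketch, and the name `pick_device` is mine, not part of PyTorch:

```python
def pick_device(cuda_available, ngpu):
    """Return the device string the test script above would select."""
    return "cuda:0" if (cuda_available and ngpu > 0) else "cpu"

# In the script above this would be used as:
#   device = torch.device(pick_device(torch.cuda.is_available(), ngpu))
print(pick_device(True, 1))   # cuda:0
print(pick_device(False, 1))  # cpu
print(pick_device(True, 0))   # cpu
```

Writing the fallback explicitly means the same script still runs on a CPU-only machine, which is handy when debugging an installation.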

Tensorflow 1.x

The deployment of TF 1.x is quite different from that of TF 2.x. If you need to install TF 2.x, you can refer to this webpage. In this part, I will go through the process of TF 1.x deployment. If the following instructions are not clear, you can visit this webpage.

Firstly, we create an environment for Tensorflow 1.x, namely tf1, and step into it:

conda create -n tf1 python=3.6
conda activate tf1

Then, install the GPU version of TF 1.x:

conda install tensorflow-gpu=1.15.0

Before installing CUDA and CUDNN, you can visit here to find the versions of the CUDA Toolkit and cuDNN SDK that are compatible with your TF version.

conda install cudatoolkit=10.0.0
conda install cudnn=7.6.5 # no cudnn 7.4 in the conda source; cudnn 7.6.5 supports cuda 10.0

You can use the following code to test if the installation is successful.

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

import tensorflow as tf

# Check if the gpu is available
print('********************************')
print(tf.test.is_gpu_available())
print('********************************')
	
# Creates a graph.
a = tf.constant([1.0, 2.0, 3.0], shape=[1, 3], name='a')
b = tf.constant([3.0, 2.0, 1.0], shape=[3, 1], name='b')
c = tf.matmul(a, b)

# Creates a session with log_device_placement set to True.
sess = tf.Session(config=tf.ConfigProto(log_device_placement=True))

# Runs the op.
print(sess.run(c))
sess.close()

If the result of print(tf.test.is_gpu_available()) is False, check whether Tensorflow actually detects your GPU. Re-installing Tensorflow may solve the issue. You can visit this webpage for more details.

conda uninstall tensorflow 
conda install tensorflow-gpu=1.15.0

Some tips

  1. If your network is slow, you can configure a mirror to speed up package downloads. You may refer to this webpage for more details.

    conda config --add channels https://mirrors.ustc.edu.cn/anaconda/pkgs/free/
    
  2. You can check the versions of CUDA and CUDNN with the commands:

    conda list cuda
    conda list cudnn
    
  3. If you want to find a compatible combination of CUDA and CUDNN, the following commands will list a series of supported versions.

    conda search cudatoolkit
    conda search cudnn
    

MIT License