NVIDIA-TensorFlow Docker Image on Alibaba Cloud ECS, Part 2
Running TensorFlow with GPU support inside Docker
In the previous article, we prepared the ECS instance and the Docker environment. In this second part, we will install Nvidia-Docker, pull the TensorFlow GPU Docker image, and prepare the environment for running a pre-trained image classification model in Docker.
The code examples are compatible with TensorFlow version 2.1.0 with GPU support, so we will pull exactly this version from the Docker registry after a few checks.
First, we will check GPU card availability:
lspci | grep -i nvidia
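For scripted provisioning, the same check can be wrapped in a small shell function. This is only a sketch: the name has_nvidia_gpu is hypothetical, and it simply searches the text piped into it for the string "nvidia", just as the grep above does.

```shell
#!/bin/sh
# Hypothetical helper, not part of any NVIDIA or Docker tooling.
# Reads a device listing on stdin and succeeds if any line mentions NVIDIA
# (case-insensitive), printing nothing.
has_nvidia_gpu() {
    grep -qi 'nvidia'
}

# Example usage on the ECS instance:
#   if lspci | has_nvidia_gpu; then
#       echo "NVIDIA GPU detected"
#   else
#       echo "No NVIDIA GPU found" >&2
#   fi
```

Reading from stdin instead of calling lspci inside the function keeps the helper usable with saved listings as well as live output.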
Then we verify the nvidia-docker installation:
docker run --rm --gpus all nvidia/cuda:10.1-base nvidia-smi
In the same manner, we verify the GPU-enabled TensorFlow image:
docker run --gpus all -it --rm tensorflow/tensorflow:latest-gpu \
python -c "import tensorflow as tf; print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
Finally, we pull TensorFlow:
docker pull tensorflow/tensorflow:2.1.0-gpu
Let’s check the Docker images on the machine:
docker images
The original name of the Docker image has no connection with the current project, so let’s retag it with a name that reflects the project. As an example, we use a dog and cat classification task and build the image name from the machine learning library name, its version, and the project name; the tag holds the version of the Docker image. This makes it easy to identify each image once there are a dozen or more Docker images on the machine.
docker tag tensorflow/tensorflow:2.1.0-gpu tf.2.1.0-gpu-cat_dog:v0.0.1
docker images
Change the tag and view existing images.
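If many images follow this convention, composing the name by hand becomes error-prone. Below is a minimal sketch of a helper that builds the name from its parts. The function image_name is hypothetical, and the separators are an assumption inferred from the single example tf.2.1.0-gpu-cat_dog:v0.0.1.

```shell
#!/bin/sh
# Hypothetical helper: prints "<library>.<version>-<project>:<tag>",
# the pattern assumed from tf.2.1.0-gpu-cat_dog:v0.0.1 in this article.
#   $1 = library short name, $2 = library version,
#   $3 = project name,       $4 = image version tag
image_name() {
    printf '%s.%s-%s:%s\n' "$1" "$2" "$3" "$4"
}

# Example usage (requires the source image to be present locally):
#   docker tag tensorflow/tensorflow:2.1.0-gpu \
#       "$(image_name tf 2.1.0-gpu cat_dog v0.0.1)"
```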
Now the list shows two entries with different tags but the same image ID. They refer to the same image, so it’s better to untag the redundant one and print the list of Docker images again:
docker rmi -f tensorflow/tensorflow:2.1.0-gpu
docker images
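The tag-then-untag sequence can also be combined into one step. The following is a sketch rather than a definitive implementation: retag_image is a hypothetical name, and it calls docker rmi without -f, on the assumption that removing one of several tags on the same image only untags it.

```shell
#!/bin/sh
# Hypothetical helper: gives an image a new name, then drops the old name.
#   $1 = existing reference, e.g. tensorflow/tensorflow:2.1.0-gpu
#   $2 = new reference,      e.g. tf.2.1.0-gpu-cat_dog:v0.0.1
# The old reference is removed only if tagging succeeded, so the image
# never ends up without a name.
retag_image() {
    docker tag "$1" "$2" && docker rmi "$1"
}

# Example usage:
#   retag_image tensorflow/tensorflow:2.1.0-gpu tf.2.1.0-gpu-cat_dog:v0.0.1
#   docker images
```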
Manage Containers
Congratulations, you now have a Docker image with TensorFlow installed and GPU support. The next step is to start a container from it and access it. Run the docker images command to obtain the ImageId value, which is cb908459d986 in this example. Then run the docker run command to start the container. To keep the container running in the background, we run it with the --detach (or -d) argument. By default, Docker assigns a random name to the container, but we can choose the name ourselves by adding the --name parameter; here the container is named cat_dog. Note that the command below does not pass --gpus all; add it, as in the verification commands above, if the container needs access to the GPU.
docker run -t -d --name cat_dog cb908459d986 /bin/bash
By running the docker ps -a command, we can see the list of containers.
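In a script, reading the docker ps -a table by eye is not practical. A sketch of a programmatic check follows, assuming the container name cat_dog from above. The function container_is_running is hypothetical and relies on docker inspect with a Go template that prints the container's State.Running field.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the named container exists
# and is currently running.
#   $1 = container name or ID, e.g. cat_dog
container_is_running() {
    # docker inspect fails if the container does not exist;
    # treat that the same as "not running".
    state=$(docker inspect --format '{{.State.Running}}' "$1" 2>/dev/null) || return 1
    [ "$state" = "true" ]
}

# Example usage:
#   if container_is_running cat_dog; then
#       echo "cat_dog is up"
#   else
#       echo "cat_dog is not running" >&2
#   fi
```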
Let’s access the container that runs in the background.
docker exec -it cat_dog /bin/bash
To leave the container’s bash shell, run the exit command.
We now have an environment that is ready to run the inference model on this machine. Alibaba Cloud ECS instances with Docker are a convenient platform for deploying and using AI technologies. The combination lets us run applications in isolated containers, manage the application lifecycle, and access a range of AI and big data services, so developers can build and deploy intelligent applications efficiently. The next article will show how to run a machine learning model in Docker in the Alibaba Cloud environment.