Prepare GPU-Container Environment on the Alibaba Cloud ECS Part 1
We live in a rapidly changing environment, and much of that change is driven by the rise of Artificial Intelligence (AI). The revolution is not powered by AI alone; it builds on other technologies as well. One of those enablers is containerization, and containers are a core part of MLOps. We start our journey into MLOps with easy-to-use containers. This article describes how to create and prepare an environment for training AI models or running inference. Since Alibaba Cloud provides comprehensive tools and environments for deploying and using AI models, we will run the example code and environment on it.
Containerization allows you to replicate your working environment quickly and easily. It makes it possible to bootstrap MLOps and to quickly reach Level 3 technical support. Moreover, containerization allows developers to reproduce the same results repeatedly on different machines. However, creating containers is time-consuming and requires expert knowledge. We will use Docker images as our containers.
Docker is a popular open-source platform that helps to automate the deployment, scaling, and management of applications and services. Its container-based approach to software development makes it easy to package and deploy applications consistently and repeatedly.
Alibaba Cloud is a leading cloud computing platform that offers a range of services, including Elastic Compute Service (ECS), which allows users to deploy and use Docker on any Operating System (OS). As an example, we run ECS with Ubuntu. However, ECS supports many operating systems and even user-defined OS images. Using Docker on ECS, AI developers can build, deploy, and run their applications quickly and efficiently.
Using Docker on ECS can bring you several benefits:
- Allows applications to run in isolated containers, meaning they can be deployed and run on any machine without worrying about dependencies or configuration issues. This makes it easy to move applications from one environment to another and ensures that they can run seamlessly across multiple machines, making it possible to scale them as needed.
- Provides a range of tools that make it easy to manage applications' deployment, scaling, and maintenance. For example, Docker Compose is a tool that makes it easy to define and run multi-container applications, which means developers can define the dependencies between different containers and manage them as a single unit.
- Provides a range of platform and AI services, such as machine learning and deep learning algorithms, which can be used to build and deploy intelligent applications. It also provides a range of big data services, such as data warehousing and data analytics, which can be used to analyze vast amounts of data and make accurate predictions.
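To make the Docker Compose point above concrete, here is a minimal sketch of a two-container application defined as a single unit. The service names, images, and ports are illustrative assumptions, not part of this article's setup:

```shell
# Write a minimal, illustrative docker-compose.yml (service and image
# names are hypothetical examples, not part of this article's setup).
cat > docker-compose.yml <<'EOF'
services:
  api:
    image: python:3.10-slim
    command: python -m http.server 8000
    ports:
      - "8000:8000"
  cache:
    image: redis:7
EOF

# Once Docker is installed (below), both containers can be started and
# managed together as one unit:
# docker compose up -d
```

The point of Compose is that the dependency between `api` and `cache` lives in one file, so the whole application can be brought up, scaled, or torn down with a single command.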
As an initial step, sign up for an Alibaba Cloud account and create an instance running the Ubuntu 20.04 operating system. For more information, see Create an instance by using the wizard.
For this article, we have chosen an instance for AI model inference. Developers need to consider not only the inference model but also pricing, so an instance with a T4 GPU is an excellent choice for inferring most AI models.
Deploy Docker
To deploy the Docker image, first connect to the ECS instance; detailed information about ECS connection methods can be found at the following link. It is recommended to update the package source list so that it reflects the latest package versions in the repositories. Then install the packages that allow apt to use a repository over HTTPS:
apt update
apt install \
ca-certificates \
curl \
gnupg \
lsb-release
It is recommended to review the official documentation before installing any software; here is the official guide on how to Install Docker on Ubuntu. Nevertheless, we provide all the steps for installing Docker on Ubuntu 20.04 below.
1. Check whether Docker is already installed on the machine. If it is, uninstall the existing/old versions:
apt remove docker docker-engine docker.io containerd runc
2. Add Docker's official GPG key and set up the repository:
mkdir -m 0755 -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
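To see what the repository line above actually expands to, you can build it step by step. The fallback values are assumptions for machines where `dpkg` or `lsb_release` is unavailable (on the Ubuntu 20.04 instance itself, the real commands will run):

```shell
# Reconstruct the repository line piece by piece; fall back to common
# defaults (amd64, focal) when dpkg/lsb_release are unavailable.
ARCH=$(dpkg --print-architecture 2>/dev/null || echo amd64)
CODENAME=$(lsb_release -cs 2>/dev/null || echo focal)   # focal = Ubuntu 20.04
REPO_LINE="deb [arch=${ARCH} signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu ${CODENAME} stable"
echo "$REPO_LINE"
```

The `signed-by` option ties the repository to the GPG key added in the previous command, so apt will only trust packages signed with Docker's key.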
3. To pick up the newly added repository, update apt again, then install Docker Engine, containerd, and Docker Compose:
apt update
apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
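The command above installs the latest Docker version. If you need a specific version instead, `apt-cache madison docker-ce` lists the ones available in the repository. Here is a sketch of extracting a version string from that output; the sample lines below are illustrative, not live repository data:

```shell
# Sample lines in the format produced by `apt-cache madison docker-ce`
# (the version numbers shown are illustrative).
MADISON_OUTPUT='docker-ce | 5:24.0.2-1~ubuntu.20.04~focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages
docker-ce | 5:24.0.1-1~ubuntu.20.04~focal | https://download.docker.com/linux/ubuntu focal/stable amd64 Packages'

# Take the second column of the first line: the newest available version.
VERSION=$(printf '%s\n' "$MADISON_OUTPUT" | head -n1 | awk -F'|' '{gsub(/ /,"",$2); print $2}')
echo "$VERSION"

# Then pin that version during installation (on the real instance):
# apt install docker-ce=$VERSION docker-ce-cli=$VERSION containerd.io
```

Pinning a version is useful when every machine in an MLOps pipeline must run an identical environment.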
Docker is now installed, but we should verify that the installation succeeded by checking the status of the service and running the hello-world image:
systemctl status docker
docker run hello-world
If the installation succeeded, the docker service reports active (running) and the hello-world container prints a greeting message confirming that Docker is working correctly.
Let's check what Docker images the machine has:
docker images
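The `docker images` listing is tabular: repository, tag, image ID, creation time, and size. As a sketch of working with that output, the snippet below filters a sample listing down to repository:tag pairs; the sample text is illustrative, since the real list depends on your machine:

```shell
# `docker images` output has this tabular shape (sample shown here,
# because the actual list depends on the machine).
SAMPLE='REPOSITORY    TAG       IMAGE ID       CREATED        SIZE
hello-world   latest    d2c94e258dcb   8 months ago   13.3kB'

# Print just the repository:tag pairs, skipping the header row.
# On a real instance you would pipe `docker images` in directly.
printf '%s\n' "$SAMPLE" | awk 'NR>1 {print $1":"$2}'
```

At this point, only the hello-world image should be present, which is exactly what we expect before pulling the TensorFlow images.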
We have installed Docker and tested it. The logical next step is to pull Docker images with TensorFlow and NVIDIA Docker, which we will do in the second part.