Docker: Containerizing the Future of AI Engineering

In the world of software development and deployment, there exists a notorious phrase: "It works on my machine."

For years, this phrase has epitomized the challenge developers face when an application runs seamlessly in one environment but fails in another.

Enter Docker: a solution that has revolutionized how we think about consistency, scalability, and deployment in software development. For AI engineers, the adoption of Docker can be a game-changer. In this chapter, we will explore why every modern-day AI engineer should make Docker a key tool in their repertoire.

What is Docker?

Docker is a platform designed to develop, ship, and run applications inside containers. A container encapsulates an application along with all its dependencies, libraries, and binaries in one package. This ensures that the application will run identically regardless of where the container is deployed.
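
To make this concrete, here is a minimal sketch using the Docker SDK for Python (an assumption beyond this chapter's tooling: the docker package is installed via pip install docker, and a Docker daemon is running). It launches a throwaway container and captures its output, which is identical on any host because the image ships its own Python runtime.

import docker  # Docker SDK for Python: pip install docker

# Start a throwaway container from a public image and capture its output.
# Because the image carries its own Python runtime, the result is the same
# on any machine with Docker installed.
client = docker.from_env()
output = client.containers.run(
    "python:3.8-slim",
    ["python", "-c", "print('hello from a container')"],
    remove=True,
)
print(output.decode())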

AI Engineering and the Need for Consistency

AI models, by nature, are complex entities that rely on a specific stack of libraries, dependencies, and environment variables. A slight change in any of these elements can lead to variations in the model's performance or, worse, complete failures. Docker's containerization ensures that once a model is trained, it can be wrapped together with its entire environment, ensuring consistent results from development to production.
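
A simple habit that supports this consistency is recording the exact interpreter and library versions of the training environment, then pinning the same versions in the image. The sketch below is illustrative only; the package names are placeholders for whatever your model actually depends on.

import sys
from importlib.metadata import PackageNotFoundError, version

# Print the exact interpreter and library versions of the current
# environment, so the serving image can pin the same versions.
print(f"python=={sys.version.split()[0]}")
for pkg in ("fastapi", "sentence-transformers", "annoy"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg} is not installed")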

Scalability and Portability in the Age of AI

Modern AI applications are not confined to powerful servers or data centers. They are deployed in the cloud, on edge devices, and even on IoT hardware. Docker containers can be moved across these environments with minimal friction, keeping AI applications scalable and portable.

Streamlining the AI Workflow

Training AI models is just one part of an AI engineer's journey. Model serving, versioning, and continuous integration and deployment are integral aspects of bringing AI to the real world. Docker simplifies these processes by providing a unified framework where models can be easily versioned, shared, and deployed without the overhead of traditional setup and configuration.
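
As a small illustration of versioning, the sketch below uses the Docker SDK for Python to build an image and tag it with a version label. The tag name is hypothetical, and in practice this step usually runs inside a CI pipeline rather than by hand.

import docker  # Docker SDK for Python: pip install docker

client = docker.from_env()
# Build the image from the current directory and tag it with a version,
# so every serving image is traceable to the model release it wraps.
image, build_logs = client.images.build(path=".", tag="fastapi-service:v1")
for entry in build_logs:
    if "stream" in entry:
        print(entry["stream"], end="")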

Embracing Docker isn't merely about adopting a new technology; it's about embracing a paradigm that ensures the fruits of AI engineering can be reliably and consistently enjoyed by end-users.

As we delve deeper into this chapter, we will unpack the technicalities, best practices, and transformative potential of Docker in the world of AI.

Below is a Dockerfile to set up and run the FastAPI service:

Dockerfile

# Use an official Python runtime as the parent image
FROM python:3.8-slim

# Install build utilities for Annoy
RUN apt-get update && \
    apt-get install -y --no-install-recommends build-essential && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# Set the working directory in the container
WORKDIR /usr/src/app

# Copy the local code to the container
COPY . .

# Install the required Python packages
RUN pip install --no-cache-dir "fastapi[all]" uvicorn sentence_transformers annoy

# Make port 80 available to the world outside this container
EXPOSE 80

# Default host and port for Uvicorn (also passed explicitly in CMD below)
ENV UVICORN_HOST=0.0.0.0
ENV UVICORN_PORT=80

# Run the FastAPI application using Uvicorn when the container launches
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "80"]

Here's a quick guide to building and running the Docker container:

  • Create a directory locally for the project.
  • Save the above content in a file named Dockerfile in the root of that directory.
  • In the same directory, save your FastAPI code in a file named main.py (a sketch of its expected shape appears above).
  • Download the vecindex.ann and chunks.db files from your Kaggle notebook into the same directory where main.py and the Dockerfile are stored.
  • Navigate to the root directory (where the Dockerfile is located) in your terminal or command prompt.
  • Build the Docker image by running the following command in the terminal: docker build -t fastapi-service .
  • After the build completes, run the Docker container: docker run -p 80:80 fastapi-service

You can then access the FastAPI service in your browser or using tools like curl at http://localhost:80.

Remember, Docker provides a self-contained environment, so this setup ensures that all dependencies, including the necessary build tools for Annoy, are encapsulated within the Docker image, allowing for easy deployment and scaling.
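
That encapsulation is also what makes scaling straightforward: the same image can be launched several times on different host ports, a process that orchestrators such as Kubernetes automate. A hypothetical sketch with the Docker SDK for Python:

import docker  # Docker SDK for Python: pip install docker

client = docker.from_env()
# Launch two replicas of the same image, mapped to different host ports;
# a load balancer or orchestrator would sit in front of these in practice.
for host_port in (8001, 8002):
    client.containers.run(
        "fastapi-service",
        detach=True,
        ports={"80/tcp": host_port},
    )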

The code snippet below will allow you to test your service.


import requests

# Define the endpoint URL
url = "http://localhost:80/find_similar_text/"

# Define the query parameters
params = {
    "query_text": "few-shot prompting"
}

# Make the GET request
response = requests.get(url, params=params)

# Check if the request was successful
if response.status_code == 200:
    similar_texts = response.json()
    for i, text in enumerate(similar_texts, 1):
        print(f"{i}. {text['chunk_text']}\n")
else:
    print(f"Error {response.status_code}: {response.text}")