Creating and Optimizing a Dockerfile for a Simple Web Application

#90daysofDevOps Day-17


Welcome to today’s blog, where we’ll dive into Dockerfiles and learn how to optimize them for production using multi-stage builds. If you’re looking to build, run, and deploy containerized applications efficiently, learning to create Dockerfiles is essential. We’ll walk through creating a Dockerfile for a Python web application, explore why multi-stage builds are helpful, and see a comparison of the image sizes with and without multi-staging.

What is a Dockerfile?

In simple terms, a Dockerfile is a set of instructions that Docker follows to build an image. Think of it as a recipe that includes everything Docker needs to create an environment for your app. With a Dockerfile, you can:

  1. Choose a Base Image: Decide on the base environment (like Python or Node.js) that has essential components for your app.

  2. Add Application Files: Copy your application files, set up configurations, and install dependencies.

  3. Define Commands: Instruct Docker on how to run or serve your app.

A Dockerfile ensures that your application behaves consistently across different systems, from your laptop to a production server.

What is a Multi-Stage Build?

A multi-stage build allows you to create a more optimized, smaller image by separating the build and runtime environments. Here’s why it’s advantageous:

  • Reduced Size: By copying only necessary files into the final image, we avoid adding unneeded dependencies, which helps reduce image size.

  • Improved Security: Using a slim runtime image keeps unnecessary build tools out of production, minimizing potential vulnerabilities.

  • Better Performance: Smaller images are faster to push, pull, and deploy, and they use less disk space on hosts and in registries.

Task Overview

  1. Create a Dockerfile for a simple Python (Flask) web application. The example app serves random programming quotes.

  2. Optimize the Dockerfile with a multi-stage build.

  3. Compare the image sizes with and without multi-staging.

Step 1: Setting Up a Simple Python Flask Application

Let’s start by creating a basic Flask application.

  1. Create a project folder with the following files:

     your-project/
     ├── app/
     │   ├── app.py
     │   └── templates/
     │       └── index.html
     ├── requirements.txt
     ├── Dockerfile
     └── Dockerfile.multi
    
  2. app.py: This file is the main application script.

     from flask import Flask, jsonify, render_template
     import random
    
     app = Flask(__name__)
    
     # List of programming quotes
     quotes = [
         "Talk is cheap. Show me the code. - Linus Torvalds",
         "Programs must be written for people to read, and only incidentally for machines to execute. - Hal Abelson",
         "Code is like humor. When you have to explain it, it’s bad. - Cory House",
         "Any fool can write code that a computer can understand. Good programmers write code that humans can understand. - Martin Fowler",
         "First, solve the problem. Then, write the code. - John Johnson"
     ]
    
     @app.route('/')
     def index():
         return render_template("index.html")
    
     @app.route('/quote')
     def quote():
         return jsonify(random.choice(quotes))
    
     if __name__ == '__main__':
         app.run(host='0.0.0.0', port=5000)
    
  3. requirements.txt: This file lists the dependencies.

     flask==2.0.3
     Werkzeug==2.0.3
    
  4. index.html: A simple HTML file to display.

     <!DOCTYPE html>
     <html lang="en">
     <head>
         <meta charset="UTF-8">
         <meta name="viewport" content="width=device-width, initial-scale=1.0">
         <title>Random Programming Quotes</title>
         <style>
             body { font-family: Arial, sans-serif; display: flex; justify-content: center; align-items: center; height: 100vh; margin: 0; }
             .container { text-align: center; }
             h1 { color: #333; }
             p { color: #666; }
             button { padding: 10px 20px; background-color: #007bff; border: none; color: white; cursor: pointer; }
             button:hover { background-color: #0056b3; }
         </style>
         <script>
             async function fetchQuote() {
                 const response = await fetch('/quote');
                 const data = await response.json();
                 document.getElementById('quote').innerText = data;
             }
         </script>
     </head>
     <body>
         <div class="container">
             <h1>Random Programming Quotes</h1>
             <p id="quote">Press the button for a new quote.</p>
             <button onclick="fetchQuote()">Get Quote</button>
         </div>
     </body>
     </html>
    

Step 2: Writing the Dockerfile

A. Without Multi-Stage Build (Dockerfile)

Let’s create a simple Dockerfile without optimization.

# Dockerfile (without multi-stage build)
FROM python:3.8

# Set the working directory
WORKDIR /app

# Copy the requirements file and install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application files
COPY app/ .

# Expose the port for Flask
EXPOSE 5000

# Command to run the app
CMD ["python", "app.py"]
  • This Dockerfile installs dependencies and copies the application into a single image. It works, but the full python:3.8 base image ships compilers and other build tooling that the app never uses at runtime, and all of it ends up in the final image.

B. Optimizing with Multi-Stage Build (Dockerfile.multi)

Now, let’s optimize this Dockerfile using a multi-stage build. We’ll use python:3.8 for building the dependencies and then copy only the necessary files into a lighter python:3.8-slim image.

# Stage 1: Build Stage
FROM python:3.8 AS build

# Set the working directory
WORKDIR /app

# Copy only the requirements file to install dependencies
COPY requirements.txt .

# Install dependencies
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Runtime Stage
FROM python:3.8-slim

# Set the working directory
WORKDIR /app

# Copy the installed dependencies from the build stage
COPY --from=build /usr/local/lib/python3.8/site-packages /usr/local/lib/python3.8/site-packages

# Copy the application files
COPY app/ .

# Expose the port for Flask
EXPOSE 5000

# Command to run the app
CMD ["python", "app.py"]

Explanation of Multi-Stage Dockerfile

  1. Build Stage: In the first stage, we use a full python:3.8 image to install all required dependencies for our app. This stage is heavier but contains all necessary libraries for building the application.

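  2. Runtime Stage: In the second stage, we use a lighter python:3.8-slim image. We copy the dependencies installed in the first stage (site-packages) into this image, but we avoid including any unnecessary build tools, resulting in a leaner final image. Note that only site-packages is copied, so console scripts that pip places in /usr/local/bin (such as the flask command) are not available in the final image. That is fine here because the container starts with python app.py.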
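To make the stage structure concrete, here is a minimal Python sketch that lists the stages of a Dockerfile and shows which stage copies from which. It is an illustration only: the helper name `list_stages` is hypothetical, the embedded Dockerfile is a trimmed copy of Dockerfile.multi, and the parsing is deliberately simple (it does not handle `FROM --platform=...` or line continuations).

```python
import re

# Trimmed copy of Dockerfile.multi: only the instructions relevant to staging.
DOCKERFILE_MULTI = """\
# Stage 1: Build Stage
FROM python:3.8 AS build
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Stage 2: Runtime Stage
FROM python:3.8-slim
COPY --from=build /usr/local/lib/python3.8/site-packages /usr/local/lib/python3.8/site-packages
CMD ["python", "app.py"]
"""

FROM_RE = re.compile(r"^FROM\s+(\S+)(?:\s+AS\s+(\S+))?\s*$", re.IGNORECASE)
COPY_FROM_RE = re.compile(r"^COPY\s+--from=(\S+)\s", re.IGNORECASE)


def list_stages(dockerfile_text):
    """Return one dict per stage: base image, optional name, stages it copies from."""
    stages = []
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        match = FROM_RE.match(line)
        if match:
            # Every FROM starts a new stage.
            stages.append({"base": match.group(1), "name": match.group(2), "copies_from": []})
            continue
        match = COPY_FROM_RE.match(line)
        if match and stages:
            stages[-1]["copies_from"].append(match.group(1))
    return stages


if __name__ == "__main__":
    for index, stage in enumerate(list_stages(DOCKERFILE_MULTI)):
        print(index, stage)
```

The last entry in the returned list is the stage that becomes the final image. Everything it needs from earlier stages has to arrive through an explicit `COPY --from`.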

Step 3: Building and Running the Image

To see the difference, let’s build both Dockerfiles.

  1. Without Multi-Stage Build:

     docker build -t flask-app-no-multi -f Dockerfile .
    
  2. With Multi-Stage Build:

     docker build -t flask-app-multi -f Dockerfile.multi .
    
  3. Run the Container:

    To start the container and access the app, you can run it in either attached or detached mode.

    Run in Attached Mode:

     docker run -p 5000:5000 flask-app-multi
    

    Run in Detached Mode:

     docker run -d -p 5000:5000 flask-app-multi
    

    In detached mode (-d), Docker runs the container in the background, allowing you to continue using your terminal for other commands.

  4. Access the App:

    • Local Development: Open a browser and visit http://localhost:5000.

    • AWS Instance: If you’re running this on an AWS EC2 instance, you’ll need to access the app using the public IP address of your instance instead of localhost. For example, use http://<your-public-ip>:5000.

  5. Firewall/Port Access:

    • Local Firewall: Ensure port 5000 is open in your firewall/security settings, especially if you’re accessing from an external network.

    • AWS Security Groups: For AWS EC2 instances, you must configure the security group associated with your instance to allow inbound traffic on port 5000. Add an inbound rule of type Custom TCP for port 5000 from your desired IP range (the predefined HTTP rule type only covers port 80). You can allow all source addresses for quick testing, but this is not recommended for production environments.

    • Best Practices for Security: When configuring security groups, it's best to restrict access to specific IP addresses or ranges to enhance security, rather than allowing access from all IPs.
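If the app does not load in the browser, a quick script can tell you whether the problem is the port or the app itself. The sketch below uses only the Python standard library; the helper names `port_is_open` and `fetch_quote` are hypothetical, and it assumes the container from this step is publishing port 5000.

```python
import http.client
import json
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


def fetch_quote(host, port, timeout=5.0):
    """GET /quote from the app and return the decoded JSON body (a string for this app)."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/quote")
        response = conn.getresponse()
        if response.status != 200:
            raise RuntimeError(f"unexpected status {response.status}")
        return json.loads(response.read().decode("utf-8"))
    finally:
        conn.close()


# Example (assumes the container is running locally; use the EC2 public IP remotely):
#   if port_is_open("localhost", 5000):
#       print(fetch_quote("localhost", 5000))
#   else:
#       print("port 5000 is not reachable; check the container and firewall rules")
```

If `port_is_open` returns False from a remote machine but True on the host itself, the firewall or security group is the likely culprit rather than the container.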


Step 4: Comparing the Image Sizes

To understand the benefit of multi-staging, let’s compare the sizes of the images.

  1. List Image Sizes:

     docker images | grep flask-app
    

    Sample Output:

     REPOSITORY           TAG             IMAGE ID       CREATED             SIZE
     flask-app-no-multi   latest          abc1234        5 minutes ago       1.01GB
     flask-app-multi      latest          def5678        2 minutes ago       142MB
    

    In this example, you’ll notice the flask-app-multi image is significantly smaller than flask-app-no-multi, demonstrating the efficiency of multi-stage builds.
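To put a number on "significantly smaller", the short Python sketch below converts the two size strings from the sample output into bytes and computes the reduction. The function names are hypothetical, and it assumes the sizes use decimal units (1 GB = 1000 MB), which is how the Docker CLI reports them to my knowledge.

```python
import re

# Decimal multipliers (assumption: sizes are reported in powers of 1000).
UNITS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9}


def parse_size(text):
    """Convert a size string such as '142MB' or '1.01GB' to a number of bytes."""
    match = re.fullmatch(r"\s*([\d.]+)\s*([KMG]?B)\s*", text, re.IGNORECASE)
    if match is None:
        raise ValueError(f"unrecognised size: {text!r}")
    return float(match.group(1)) * UNITS[match.group(2).upper()]


def reduction_percent(before, after):
    """Percentage by which the image shrank going from `before` to `after`."""
    return (1 - parse_size(after) / parse_size(before)) * 100


if __name__ == "__main__":
    # Sizes taken from the sample output above.
    print(f"{reduction_percent('1.01GB', '142MB'):.1f}% smaller")  # prints "85.9% smaller"
```

With the sample figures, the multi-stage image is roughly one seventh the size of the single-stage one.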

Docker Hub: Pushing Your Image

To share the image with others, push it to Docker Hub:

  1. Log in to Docker Hub:

     docker login
    
     docker login -u <username>   # pass the username up front; Docker then prompts for the password
    
  2. Tag Your Image: Replace <username> with your Docker Hub username.

     docker tag flask-app-multi <username>/flask-app-multi:latest
    
  3. Push the Image:

     docker push <username>/flask-app-multi:latest
    

This enables users to pull your image directly from Docker Hub for quick testing and deployment.

Additional Dockerfile Tips

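  1. Minimize Layers: RUN, COPY, and ADD instructions each add a layer to the image. Combine related shell commands into a single RUN (chained with &&) and clean up temporary files in that same instruction so they never end up in a layer.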

  2. Use .dockerignore: Exclude unnecessary files (e.g., .git, local config files) to keep your image clean and lightweight.

  3. Set Environment Variables: Use ENV to define configuration values within Dockerfiles.
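As an illustration of the ENV tip, the application can read its settings from the environment instead of hard-coding them. The sketch below is an assumption layered on top of the article's app.py: the variable names `APP_HOST` and `APP_PORT` and the helper `server_settings` are hypothetical, and the defaults mirror the values the article already uses.

```python
import os


def server_settings(environ=os.environ):
    """Read host and port from the environment, falling back to the article's defaults."""
    host = environ.get("APP_HOST", "0.0.0.0")    # hypothetical variable name
    port = int(environ.get("APP_PORT", "5000"))  # hypothetical variable name
    return host, port


if __name__ == "__main__":
    host, port = server_settings()
    print(f"would start the server on {host}:{port}")
```

A Dockerfile line such as `ENV APP_PORT=5000` would set the default inside the image, and `docker run -e APP_PORT=8080 ...` would override it per container. In app.py the call would then become `app.run(host=host, port=port)`.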

Conclusion

Multi-stage builds are invaluable for creating optimized, production-ready Docker images. By separating the build and runtime environments, you reduce the final image size, enhance security, and streamline deployment. In our example, the multi-stage build reduced the image size from 1.01GB to 142MB, which saves bandwidth and speeds up deployments.

However, for smaller applications a single-stage Dockerfile based on a slim Python image can also suffice and may produce an even smaller image, around 135MB in this case. That approach would have worked well for our simple Flask application, which has few dependencies.

PS: Considerations for Single-Stage vs. Multi-Stage Builds

While single-stage builds may seem appealing due to their simplicity and reduced image size for small applications, they often fall short for larger, enterprise-level applications with more complex dependencies. Here are some reasons why multi-stage builds should be prioritized in such scenarios:

  1. Complex Dependencies: Larger applications typically require more libraries and tools that may not be available in a slim image. A multi-stage build allows you to include all necessary build tools in the first stage without bloating the final image.

  2. Enhanced Security: By using multi-stage builds, you can ensure that the final image contains only the essential runtime dependencies. This minimizes the attack surface and reduces the number of potential vulnerabilities, which is critical for applications in production.

  3. Layer Management: Multi-stage builds help in managing image layers more effectively. Only the layers of the final stage end up in the shipped image. Layers created by RUN, COPY, and ADD in earlier stages are left behind, so the build stage can use as many steps as it needs without affecting the final image.

  4. Future Scalability: Using multi-stage builds sets a solid foundation for future expansion. As applications grow, they often require additional dependencies. A multi-stage approach allows for easier integration of these dependencies without significant rework of the Dockerfile.

  5. Performance Optimization: Smaller images generally result in faster pull times, which is particularly beneficial in continuous integration/continuous deployment (CI/CD) environments. Multi-stage builds contribute to faster deployment processes by keeping the final image lightweight.
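The layer-management point can be checked mechanically. The following Python sketch collects the layer-creating instructions (RUN, COPY, ADD) that belong to the last stage of a Dockerfile, which are the ones stacked on top of the final base image. The function name is hypothetical, the sample Dockerfile is a condensed version of Dockerfile.multi, and the parser ignores line continuations, so treat it as a rough aid rather than a full Dockerfile parser.

```python
# Instructions that add a filesystem layer to the image.
LAYER_INSTRUCTIONS = {"RUN", "COPY", "ADD"}

SAMPLE = """\
FROM python:3.8 AS build
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

FROM python:3.8-slim
WORKDIR /app
COPY --from=build /usr/local/lib/python3.8/site-packages /usr/local/lib/python3.8/site-packages
COPY app/ .
CMD ["python", "app.py"]
"""


def final_stage_layer_instructions(dockerfile_text):
    """Return the RUN/COPY/ADD lines that belong to the last stage of a Dockerfile."""
    current = []
    for raw in dockerfile_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        keyword = line.split(None, 1)[0].upper()
        if keyword == "FROM":
            current = []  # a new stage starts; earlier stages are not shipped
        elif keyword in LAYER_INSTRUCTIONS:
            current.append(line)
    return current


if __name__ == "__main__":
    for instruction in final_stage_layer_instructions(SAMPLE):
        print(instruction)
```

For the sample, only the two COPY lines of the runtime stage are reported. The COPY and RUN in the build stage contribute nothing to the final image beyond what `COPY --from=build` brings across.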

Key Takeaways

Key Dockerfile Commands Recap:

  • FROM <image>: Sets the base image for subsequent instructions.

  • WORKDIR <path>: Defines the working directory inside the container.

  • COPY <source> <destination>: Copies files from the host to the container.

  • RUN <command>: Executes commands in the container (e.g., installing dependencies).

  • CMD ["executable", "param1", "param2"]: Specifies the command to run when the container starts.

Multi-Stage Dockerfile Commands:

  • FROM <base-image> AS <stage-name>: Defines a specific stage in a multi-stage build.

  • COPY --from=<stage-name> <source> <destination>: Copies files from a previous stage to optimize final image size.

    • Example: Use a build stage to compile code, and then copy only the compiled output to the final image, reducing size by leaving out unnecessary build dependencies.

Practical Docker Commands:

  • docker build -t <name>:<tag> <context>: Build an image from the Dockerfile in the build context (for example, .) and tag it with a specific name.

  • docker run -p <hostPort>:<containerPort> <image>: Run a container from the image and publish a container port on the host.

  • docker images: List available Docker images and their sizes.

  • docker rmi <image>: Remove unused Docker images to free up space.

  • docker ps: List running containers to monitor your active applications.

  • docker exec -it <container> <command>: Run a command inside a running container for troubleshooting or debugging.

  • docker logs <container>: View logs from a specific container to troubleshoot issues.

By using Docker multi-stage builds, you're well on your way to mastering efficient containerization for DevOps and production-ready environments. Happy Dockerizing! 🐳