
Get started with language model post-training using Training Hub


Open source language models can deliver impressive performance at a fraction of the size and cost of frontier models, but they might not deliver the required results for more targeted use cases. There is, however, one more key benefit that open source models have over their frontier counterparts: customizability. Often, you can fit an open source model to your specific use case via post-training, which can offer better and cheaper performance than even frontier models provide, while running securely and safely for private data, offline platforms, and so on.

This leads to the question: how do you actually get started with language model post-training? Many methods are available today, spanning dozens of libraries with diverging requirements, APIs, setup, usage, and more. How can you learn and resolve all of these differences while also trying to learn what method works best for a given task, data, or hardware? On top of this, what if you are trying to run multiple methods in sequence or to include them in the same project or application?

Therein lies the value of Training Hub, an open source library with algorithm-level abstractions for common, modern post-training techniques. It pulls from a collection of community implementations and provides a common, Pythonic interface for techniques ranging from supervised fine-tuning to reinforcement learning.

An intuitive, uniform entrypoint

Training Hub is a Python-based library built by Red Hat’s AI Innovation team. It helps developers focus on language model post-training algorithms without having to manage the unique constraints and overhead of discovering, understanding, and running a number of independent libraries of varying complexity.

More specifically, Training Hub provides a mapping of algorithms to community libraries and official backend implementations, and also a common interface for running the post-training algorithms (Figure 1).

A flow chart that maps four top-level algorithms for AI model fine-tuning to their respective backend implementations: Supervised Fine-Tuning (SFT) maps to Instructlab Training; Orthogonal Subspace Fine-Tuning (OSFT) maps to Red Hat AI Innovation Mini Trainer; Low Rank Adaptation (LoRA) maps to Hugging Face PEFT (Parameter-Efficient Fine-Tuning); Group Relative Policy Optimization (GRPO) maps to Volcano Engine Reinforcement Learning (VERL).
Figure 1: Training Hub uses multiple community libraries and official implementations for each available algorithm.

Every algorithm is exposed and runnable as a simple Python function, with a set of common arguments (base model, data, learning rate, batch size, GPU/node distributed setup, and so on), alongside a set of algorithm-specific arguments. For example, running standard Supervised Fine-Tuning (SFT) is as simple as calling the sft function with desired model, data, and hyperparameters:

from training_hub import sft

sft(
    model_path="/path/to/model",
    data_path="/path/to/data",
    ckpt_output_dir="/path/to/checkpoints",
    num_epochs=3,
    learning_rate=1e-5,
    effective_batch_size=16
)

If you wanted to run the same algorithm, but with a different backend implementation (perhaps the default community library doesn’t support a model or data format you need during training, or another library includes an enticing optimization for certain hardware), switching out the backend is as simple as just asking for it! For the same algorithm, the interface remains constant:

from training_hub import sft

sft(
    model_path="/path/to/model",
    data_path="/path/to/data",
    ckpt_output_dir="/path/to/checkpoints",
    num_epochs=3,
    learning_rate=1e-5,
    effective_batch_size=16,
    backend="alternate-backend"  # <-- switch to another implemented backend
)

If you instead wanted to run a continual learning method like Orthogonal Subspace Fine-Tuning (OSFT), you would call the osft function with the same arguments plus one new algorithm-specific argument, unfreeze_rank_ratio, which controls what fraction of the model's weights is unfrozen for training (the most critical weights stay frozen):

from training_hub import osft

osft(
    model_path="/path/to/model",
    data_path="/path/to/data",
    ckpt_output_dir="/path/to/checkpoints",
    num_epochs=3,
    learning_rate=1e-5,
    effective_batch_size=16,
    unfreeze_rank_ratio=0.3 #<--- OSFT-specific
)

Plug-and-play development

With Training Hub, adding new algorithms and new implementations for existing algorithms is incredibly simple.

Every algorithm has an entrypoint function under its name (sft, osft, etc.) and an Algorithm class defining the core train function, the required_params, and the optional_params.

There are then Backend classes that define how to execute_training for a given implementation—that is, how to parse the input arguments and run the training job using the selected (or default) community implementation. For any given algorithm, there can be any number of backend providers. For example, if you want to add a Hugging Face library, VERL, Unsloth, Llama Factory, or another framework as a backend implementation for a given algorithm, you can define a new Backend class for it. It will then be usable under the same common, algorithm-centered interface as the rest of the implementations.

If you want to add a new algorithm, simply create a file under the algorithms directory defining a new Algorithm class and the entrypoint function (lora, grpo, cpt, etc.).
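
To make these pieces concrete, here is a rough sketch of what a new backend might look like. The names Backend, execute_training, and AlgorithmRegistry come from the description above, but the base-class details and the registration call are assumptions for illustration only; check the Training Hub source for the real hooks (in practice your class would subclass the library's Backend base class).

from training_hub import AlgorithmRegistry

class MyLibraryBackend:  # in the real library, this would subclass the Backend base class
    """Hypothetical backend that runs SFT jobs via an alternate community library."""

    def execute_training(self, algorithm_params: dict) -> None:
        # Translate Training Hub's common arguments into whatever the
        # underlying library expects, then launch the training job.
        model_path = algorithm_params["model_path"]
        data_path = algorithm_params["data_path"]
        ...

# Registering the class under an existing algorithm name would make it reachable
# via sft(..., backend="my-library"). The exact registry method name here is a guess:
# AlgorithmRegistry.register_backend("sft", "my-library", MyLibraryBackend)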

Outside of documentation, algorithms and backends can also be discovered directly through the library:

from training_hub import AlgorithmRegistry

# List all available algorithms
algorithms = AlgorithmRegistry.list_algorithms()
print("Available algorithms:", algorithms) # ['sft', 'osft']

# List backends for SFT
sft_backends = AlgorithmRegistry.list_backends('sft')
print("SFT backends:", sft_backends) # ['instructlab-training']

# Get algorithm class directly
SFTAlgorithm = AlgorithmRegistry.get_algorithm('sft')

What's more, you can dynamically register, list, and use new algorithms and backends on the fly, enabling you to make use of the library in whatever way is most convenient to your workflow.

The first home for OSFT

The Training Hub is also home to the first openly available official Orthogonal Subspace Fine-Tuning (OSFT) implementation! This parameter-efficient method addresses continual learning and iterative updates in a language model. By analyzing component matrices via adaptive singular value decomposition (SVD), the method identifies the critical directions in the model and the corresponding weights, then restricts updates to directions orthogonal to them (the least critical components).

By setting unfreeze_rank_ratio, you choose how much of the model's least critical capacity stays unfrozen for learning new tasks, while the most critical weights remain frozen. Setting the value to 1.0 is effectively equivalent to full fine-tuning with SFT, whereas setting it to 0.5 leaves 50% of the model frozen and 50% trainable. See Figure 2.

Diagram illustrating the effect of the unfreeze_rank_ratio (urr) value on model fine-tuning.
Figure 2: OSFT provides a controllable balance for new task learning while retaining previous task performance and general model capability.

With OSFT now officially merged into Hugging Face PEFT, we expect further adoption and collaboration with popular community training libraries. All of these integrations will be available through the Training Hub as a common source, with direct support from the method's inventors.

How to get started with Training Hub

Let’s now do a quick run-through of getting set up with the Training Hub. The package is available on PyPI and can also be installed from source.

Step 1: Set up your environment

You’ll need Python 3.11+ installed. Then, create a project directory and virtual environment:

mkdir training-project && cd training-project

Create a virtual environment with python3 -m venv .venv (or uv venv if you prefer uv). Then activate it:

source .venv/bin/activate

Step 2: Install training-hub

For basic installation with GPU support:

pip install training-hub[cuda]

For development, one can do an editable install from source:

git clone https://github.com/Red-Hat-AI-Innovation-Team/training_hub
cd training_hub
pip install -e .[cuda]

Note: If you encounter build issues with flash-attn, install the base package first:

# Install base package (provides torch, packaging, wheel, ninja)
pip install training-hub
pip install training-hub[cuda] --no-build-isolation

Similarly, for development, you can run:

pip install -e .
pip install -e .[cuda] --no-build-isolation

Step 3: Begin training

To start training, either create your own Python file or notebook, or start with one of our existing examples. To see a list of available algorithms, you can run:

from training_hub import AlgorithmRegistry
algorithms = AlgorithmRegistry.list_algorithms()
print("Available algorithms:", algorithms)

All algorithms are also listed in the Training Hub main README and on the examples page.

3.1: General guidance

To begin training with a given algorithm, start by importing the desired algorithm:

from training_hub import <algorithm_name>

All you have to do from there is run:

<algorithm_name>(
    model_path="/path/to/model",
    data_path="/path/to/data",
    ckpt_output_dir="/path/to/save/checkpoints",
    ...
)

Most algorithms share a set of common parameters along with a few algorithm-specific ones. Only a handful of parameters are required, but a wide variety of optional parameters is available. To view the parameters for a given algorithm, you can run:

from training_hub import create_algorithm
osft_algo = create_algorithm('osft', 'mini-trainer')
required_params = osft_algo.get_required_params()
print("Required parameters:", list(required_params.keys()))
optional_params = osft_algo.get_optional_params()
print("Optional parameters:", list(optional_params.keys()))

Or, simply view the documentation for the algorithm within our docs directory.

3.2: Algorithm-specific examples

Getting started with an algorithm like SFT is relatively straightforward. Run:

from training_hub import sft
sft(
    model_path="/path/to/model",
    data_path="/path/to/data",
    ckpt_output_dir="/path/to/checkpoints",
    num_epochs=3,
    learning_rate=1e-5,
    effective_batch_size=16,
)

However, the options for customizing to fit your use case and hardware are quite expansive:

  • To set a limit on the number of tokens per GPU (a hard cap for memory), set max_tokens_per_gpu.
  • To specify the number of GPUs per node, set nproc_per_node.
  • To configure your checkpoint frequency, adjust checkpoint_at_epoch and save_samples, or add accelerate_full_state_at_epoch to also save optimizer state.
  • If you are starting a multi-node job, nnodes, node_rank, rdzv_id, and rdzv_endpoint will all be critical.

The Training Hub lets you access the full power of the underlying implementations:

# All possible SFT parameters
training_params = {
    # Required parameters
    'model_path': model_path,
    'data_path': data_path,
    'ckpt_output_dir': ckpt_output_dir,
    
    # Core training parameters
    'num_epochs': num_epochs,
    'effective_batch_size': effective_batch_size,
    'learning_rate': learning_rate,
    'max_seq_len': max_seq_len,
    'max_tokens_per_gpu': max_tokens_per_gpu,
    
    # Data and processing parameters
    'data_output_dir': data_output_dir,
    'warmup_steps': warmup_steps,
    
    # Checkpointing parameters
    'checkpoint_at_epoch': checkpoint_at_epoch,
    'save_samples': save_samples,
    'accelerate_full_state_at_epoch': accelerate_full_state_at_epoch,
    
    # Distributed training parameters
    'nproc_per_node': nproc_per_node,
    'nnodes': nnodes,
    'node_rank': node_rank,
    'rdzv_id': rdzv_id,
    'rdzv_endpoint': rdzv_endpoint,
}

For example, if running on 8xH100 (single node) and training Llama-3.1-8B-Instruct, your function might look like:

from training_hub import sft
sft(
    model_path="meta-llama/Llama-3.1-8B-Instruct",
    data_path="/path/to/data",
    ckpt_output_dir="llama-ckpts",
    num_epochs=3,
    learning_rate=1e-5,
    effective_batch_size=16,
    nproc_per_node=8,
    max_tokens_per_gpu=25000,
    max_seq_len=8192,
    checkpoint_at_epoch=True,
    accelerate_full_state_at_epoch=False
)
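
If you were instead spreading the same job across two nodes, the multi-node parameters listed earlier come into play. The sketch below uses placeholder rendezvous values; each node runs the same call with its own node_rank (0 on the first node, 1 on the second):

from training_hub import sft

sft(
    model_path="meta-llama/Llama-3.1-8B-Instruct",
    data_path="/path/to/data",
    ckpt_output_dir="llama-ckpts",
    num_epochs=3,
    learning_rate=1e-5,
    effective_batch_size=16,
    nproc_per_node=8,
    max_tokens_per_gpu=25000,
    max_seq_len=8192,
    nnodes=2,
    node_rank=0,                              # set to 1 on the second node
    rdzv_id=123,                              # placeholder job ID
    rdzv_endpoint="node0.example.com:29500"   # placeholder rendezvous endpoint
)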

For runnable examples for both SFT and OSFT (using settings for 2x48GB GPUs by default), check out the following notebooks:

  • <SFT RUNNABLE NB>
  • <OSFT RUNNABLE NB>

For more in-depth guidance, tutorials, documentation, and examples, check out our examples page.

Continued progress

The Training Hub will continue to gain support for new algorithms and community implementations, and it is fully open to contribution. If there is a core post-training algorithm or library you would like to use in your workflows under the Training Hub interface, feel free to open issues or PRs directly on the GitHub repository. For any further inquiries, you can tag me directly on GitHub @Maxusmusti. Let's work together to build a common home for all language model post-training!

The post Get started with language model post-training using Training Hub appeared first on Red Hat Developer.


How to Use Docker with Node.js: A Handbook for Developers


In this handbook, you’ll learn what Docker is, why it’s become an essential, must-have skill for backend and full-stack developers in 2025, and most importantly, how to use it in real-world projects from start to finish.

We will go far beyond the usual “Hello World” examples and walk you through containerizing a complete full-stack JavaScript application (Node.js + Express backend, HTML/CSS/JS frontend, MongoDB database, and Mongo Express admin UI).

You’ll learn about networking multiple containers, orchestrating everything with Docker Compose, building and versioning your own images, persisting data with volumes, and securely pushing your images to a private AWS ECR repository for sharing and production deployment.

By the end, you’ll be able to eliminate “it works on my machine” issues, confidently manage multi-service applications, deploy consistent environments anywhere, and integrate Docker into your daily workflow and CI/CD pipelines like a pro.

Since Docker is such a key skill for backend developers, we’ll start by covering its basic concepts.

Prerequisites

This technical handbook is designed for developers who have some practical, hands-on experience in full-stack development. You should be comfortable deploying applications and have a basic understanding of CI/CD pipelines.

While we’ll cover Docker from the ground up, this guide is not for absolute beginner developers. I assume you have real-world development experience and want to level up your workflow with Docker.

Finally, a basic familiarity with AWS and general deployment concepts will also be useful, though you don’t need to be an expert. This handbook is ideal for developers looking to enhance their production-grade skills and confidently integrate Docker into their projects.

Table of Contents:

  1. What is a Container?

  2. Docker vs Virtual Machines

  3. Docker Installation

  4. Basic Docker Commands

  5. Practice with JavaScript

  6. How to Run the Mongo Container

  7. How to Run the Mongo Express Container

  8. How to Connect Node.js to MongoDB

  9. How to Use Docker Compose

  10. How to Build Our Own Docker Image

  11. How to Manage Your Containers

  12. How to Create a Private Docker Repository

  13. Assignment: Create and Push a New Version

  14. Docker Volumes

  15. Conclusion

What is a Container?

A container is a way to package an application together with everything it needs, including its dependencies, libraries, and configuration files.

Because containers are portable, they can be shared across teams and deployed on any machine without worrying about compatibility.

stacked shipping containers, a visual analogy for Docker containers

Where Do Containers Live?

Since containers are portable and can be shared across teams and systems, they need a place to live. That’s where container repositories come in – special storage locations for containers. Organizations can have private repositories for internal use, while public ones like Docker Hub let anyone browse and use shared containers.

an image of docker hub, showing a catalogue of images

If you visit the catalog page on Docker Hub, you will see a variety of container repositories, both official and community-made, from developers and teams like Redis, Jenkins, and many others.

In the past, when multiple developers worked on different projects, each had to manually install services on their own systems. Since different developers often use different operating systems like Linux, macOS, and Windows, the setup process was never the same. It took a lot of time, led to plenty of errors, and made setting up new environments a real headache, especially when you had to repeat it for multiple services.

Docker changed the game for developers and teams. Instead of manually installing every service and dependency, you can just run a single Docker command to start a container. Each container has its own isolated environment with everything it needs, so it runs the same on any machine, no matter if it’s Windows, macOS, or Linux. This makes collaboration smoother and eliminates all the bottlenecks that come from different setups, missing dependencies, or version mismatches.

In short, Docker is a platform that packages your app and its dependencies into a single, portable container, so it runs the same way everywhere.

Docker vs Virtual Machines

Docker and virtual machines (VMs) are both ways to run apps in a “virtual” environment, but they work differently. To understand the differences, it helps to know a bit about how computers run software.

A quick look at the layers:

  • Kernel: This is the part of the operating system that talks to your computer’s hardware, like the CPU, memory, and disk. Think of it as the middleman between your apps and your computer.

  • Application layer: This is where programs and apps run. It sits on top of the kernel and uses it to access hardware resources.

So, now let’s get into a bit more detail about Virtual Machines. A VM virtualizes the entire operating system, which means it comes with its own kernel and its own application layer. When you download a VM, you are basically getting a full OS inside your computer, often several gigabytes in size.

Because it has to boot its own OS, VMs start slowly. But VMs are very compatible, and can run on almost any host because they include everything they need.

Docker, on the other hand, only virtualizes the application layer, not the full OS. Containers share the host system’s kernel but include everything the app needs: dependencies, libraries, and configuration.

Docker images are small, often just a few megabytes. Containers start almost instantly because they don’t boot a full OS. A Docker container can run anywhere Docker is installed, no matter what operating system your computer uses.

In simple terms, to summarize:

  • A VM is like running a whole computer inside your computer – big, heavy, and slow.

  • A Docker container is like a self-contained app package – small, fast, and portable.

Here’s a quick comparison:

Feature          Virtual Machine             Docker Container
Size             GBs (large)                 MBs (small)
Startup Speed    Slow                        Fast
OS Layer         Full OS + kernel            Shares host kernel
Portability      Runs on compatible host     Runs anywhere Docker is installed

Docker Installation

Alright, now that you know what Docker is, let’s get it running on your own machine.

Docker works on Windows, macOS, and Linux, but each system has slightly different steps. The official Docker documentation has clear instructions for all operating systems under Docker Docs: Install Docker.

If you are more of a visual learner, this YouTube video walks you through installing Docker on Windows and Linux step by step: Watch here.

Here is a simple roadmap:

First, check your system requirements. Docker won’t run on every computer, so make sure your OS version is supported (the official docs have a checklist).

  1. Windows and macOS users:

    • Newer systems: Download and install Docker Desktop. It’s the easiest way to get started.

    • Older systems: If your computer doesn’t support Docker Desktop (for example, missing Hyper-V or older OS versions), you can use Docker Toolbox. Toolbox installs Docker using a lightweight virtual machine, so you can still run containers even on older machines.

  2. Linux users: You will usually install Docker through your package manager (apt for Ubuntu/Debian, yum for CentOS/Fedora, etc.). The official docs show the commands for your distro.

Then verify your installation: Open a terminal or command prompt and type:

docker --version

If you see the Docker version displayed, congratulations! Docker is ready to go.

docker version displayed on cli

Once Docker is installed, you’ll be ready to start running containers, pulling images, and experimenting with your apps in a safe, isolated environment.

Tip for beginners:

If you’re on an older machine and using Docker Toolbox, commands are mostly the same, but you will run them inside the Docker Quickstart Terminal, which sets up the virtual machine for you.

Basic Docker Commands

So far, we have been throwing around terms like images and containers, sometimes even interchangeably. But there is an important difference:

  • Docker image: Think of an image as a blueprint or a package. It contains everything your app needs: the code, libraries, dependencies, and configuration, but it’s not running yet.

  • Docker container: A container is a running instance of an image. When you start a container, Docker takes the image and runs it in its own isolated environment.

A helpful way to remember it is this: the image is the recipe, while the container is the cake. You can have one recipe (image) and make multiple cakes (containers) from it.

Important note: Docker Hub stores images, not containers. So when you pull something from Docker Hub, you’re downloading an image. For example:

docker pull redis

Here’s what you’ll see:

docker pull redis shown on cli

This command downloads the Redis image to your machine. Once the download is complete, you can see all the images you have locally with:

docker images

running docker images on cli

From there, you can start a container from an image whenever you need it:

docker run -d --name my-redis redis

This command starts a container, my-redis, from the redis image you just pulled.

  • docker run tells Docker to start a new container from an image.

  • -d stands for “detached mode.” It means the container runs in the background so you can keep using your terminal.

  • --name my-redis gives your container a friendly name (my-redis) instead of letting Docker assign a random one. It makes it easier to manage later.

  • redis is the image you are using to start the container.

To see all containers that are currently running, you can use:

docker ps

ran docker ps in the terminal to list all running containers

This will list containers with details like:

  • Container ID

  • Name

  • Status (running or stopped)

  • The image it’s running from

If you want to see all containers, even ones that aren’t running, you can add the -a flag:

docker ps -a

How to Specify a Version of an Image:

By default, Docker pulls the latest version of an image. But sometimes you might need a specific version. You can do this using a colon (:) followed by the version tag. For example:

docker pull redis:7.2
docker run -d --name my-redis redis:7.2

To know which versions are available, you can visit Docker Hub or check the image tags online. Also, running docker images on your machine will show you all downloaded images and their versions.

How to Stop, Start, and Remove a Container

If you want to stop a running container, run this:

docker stop my-redis

To start it again:

docker start my-redis

You can also remove a container if you no longer need it:

docker rm my-redis

How to Restart a Container

You can restart a container using its container ID (or name) if something crashes, needs a refresh, or you just want to apply changes.

For example:

docker ps
CONTAINER ID   IMAGE     COMMAND                  CREATED         STATUS         PORTS      NAMES
c002bed0ae9a   redis     "docker-entrypoint.s…"   3 minutes ago   Up 3 minutes   6379/tcp   my-redis

Restart it like this:

docker restart c002bed0ae9a

or by name:

docker restart my-redis

Other handy ways:

  • Stop then start

      docker stop c002bed0ae9a
      docker start c002bed0ae9a
    
  • Start with logs

      docker start c002bed0ae9a && docker logs -f c002bed0ae9a
    

starting a docker container with logs

How to Run Multiple Redis Containers and Understanding Ports

Right now, you have a Redis container running:

docker ps

It shows something like this:

CONTAINER ID   IMAGE     COMMAND                  STATUS          PORTS      NAMES
c002bed0ae9a   redis     "docker-entrypoint.s…"   Up 20 minutes   6379/tcp   my-redis

Notice the PORTS column: 6379/tcp. This means the container is running Redis on its internal port 6379. By default, this port is inside the container and is not automatically exposed to your computer (the host). Docker maps it only if you specify it.

Trying to Run Another Redis Container on the Same Port

If you try:

docker run -d --name my-redis2 redis:7.4.7-alpine

The container itself will start fine, because no host port is published yet. But the moment you want to reach both containers from your host, you hit a problem: you can’t publish two containers on the same host port (for example, -p 6379:6379 for each), because only one process can bind a given host port at a time. This is where port binding comes in.

What is Port Binding?

Port binding (also called port mapping) is the mechanism Docker uses to connect a port inside a container to a port on your host machine (your laptop/desktop/server).

Without port binding, any service running inside a container is completely isolated: it can listen on its internal ports (for example, Redis on 6379, a Node.js app on 3000, MongoDB on 27017), but nothing outside the container, including your browser, another app on your computer, or even another container on a different network, can reach it.

  • Container Port: The port inside the container where the app is running (Redis defaults to 6379).

  • Host Port: The port on your computer that you want to use to access that container.

Docker lets you map a container port to a different host port using the -p flag.

Running a Second Redis Container on a Different Host Port

docker run -d --name my-redis2 -p 6380:6379 redis:7.4.7-alpine

-p 6380:6379 maps host port 6380 to container port 6379.

  • Now you can connect to Redis in the second container using localhost:6380.

  • Inside the container, Redis still runs on port 6379.
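
If you have redis-cli installed on your host (this assumes a local install, not the one inside the container), you can quickly confirm the mapping works:

redis-cli -p 6380 ping
# PONG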

Check both containers:

docker ps

Output will look like this:

CONTAINER ID   IMAGE     STATUS          PORTS                    NAMES
c002bed0ae9a   redis     Up 20 minutes   6379/tcp                 my-redis
d123abcd5678   redis     Up 1 minute     0.0.0.0:6380->6379/tcp   my-redis2

The first container is running internally on 6379 (host port not exposed), while the second container is mapped so host port 6380 forwards traffic to container port 6379.

Think of each container as a room with a phone line (container port).

  • You want to call that room from the outside (host).

  • You can’t use the same external phone line for two rooms at the same time.

  • With port binding, you assign a different external line for each room, even if the internal phone number is the same.

Why Port Binding Exists

  1. Avoid port conflicts on the host: Only one process on your computer can use a given port at a time. If you already have one Redis container using host port 6379, a second container cannot also bind to the same host port. Port binding lets you run many identical containers side-by-side by mapping each one to a different host port (6379 → 6380, 6381, etc.).

  2. Access containerised services from your host: Your browser, Postman, MongoDB Compass, redis-cli, curl, etc., all run on the host. Without -p, they have no way to talk to services inside containers.

  3. Selective exposure: You don’t have to expose every port a container uses. Only map the ports you actually need externally, keeping the rest private and secure.

It also gives you more flexibility in development and production. In development, you might map container 3000 to host 3000. But in production (for example, behind a reverse proxy), you might map container 3000 to host 80 or 443, or not expose it at all and let another container talk to it over Docker’s internal network.
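
For example, assuming you had already built an image for the Node.js app (my-node-app here is just a hypothetical image name), the same container port can be published differently per environment:

# Development: reach the app at http://localhost:3000
docker run -d -p 3000:3000 my-node-app

# Production behind a reverse proxy: publish container port 3000 on host port 80
docker run -d -p 80:3000 my-node-app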

How to Explore a Container

To explore a container, run:

docker exec -it my-redis2 /bin/sh

  • docker exec runs a command in the container.

  • -it gives you an interactive terminal (lets you type and see output).

  • /bin/sh starts a shell inside the container.

Once inside, your prompt changes to something like:

/data #

Now you can list files, navigate directories, or run programs, all inside the container, without affecting your host machine.

result of running docker exec -it my-redis2 /bin/sh

docker run vs docker start

We have been using docker run and docker start throughout this article, but here’s why the difference is important:

  • Avoid accidental duplicates: Using docker run every time creates a new container. If you just want to restart something you already set up, docker start is faster and safer.

  • Maintain configuration: docker start preserves the container’s original settings, ports, volumes, and names so you don’t risk breaking anything by changing options.

  • Work efficiently with multiple containers: When running multiple services or different versions of the same app, knowing when to run vs start helps you manage resources, avoid port conflicts, and keep your workflow smooth.

  • Speed up your workflow: Starting existing containers is almost instant, while creating a new one takes slightly longer.

Bottom line: docker run = create something new, while docker start = resume what you already have.

Practice with JavaScript

Now that we have covered the core Docker concepts, let’s put them into action. In this section, we’ll containerize a simple JavaScript project that consists of:

  • A frontend: Built with HTML, CSS, and JavaScript

  • A backend: A simple Node.js server (server.js)

  • A database: A MongoDB instance pulled directly from Docker Hub

  • A UI for MongoDB: Using Mongo Express to visualize and manage our database

This example demonstrates how Docker can manage multiple components of an application, including code, dependencies, and services in isolated, consistent environments.

You can pull the starter project from GitHub here.

Or clone it directly using your terminal:

git clone https://github.com/Oghenekparobo/docker_tut_js.git
cd docker_tut_js

This contains the basic HTML and JavaScript files along with the Node.js backend.

Next, we will prepare to set up our database. Head over to Docker Hub and type “mongo” in the search box. You will see the official MongoDB image published by Docker.

official mongo db database in dockerhub

How to Pull the MongoDB Image

Now that you have explored the official MongoDB image on Docker Hub, let’s actually pull it into your local environment.

Open your terminal, navigate to your project directory (for example, docker_tut_js), and run:

docker pull mongo

This command tells Docker to download the latest version of the MongoDB image from Docker Hub.

You will see output similar to this:

Using default tag: latest
latest: Pulling from library/mongo
b8a35db46e38: Already exists 
a637dbfff7e5: Pull complete 
0c9047ace63c: Pull complete 
02cd4cf70021: Pull complete 
dfb5d357a025: Pull complete 
007bf0024f67: Pull complete 
67fd8af3998d: Pull complete 
d702312e8109: Pull complete 
Digest: sha256:7d1a1a613b41523172dc2b1b02c706bc56cee64144ccd6205b1b38703c85bf61
Status: Downloaded newer image for mongo:latest
docker.io/library/mongo:latest

Here’s what’s happening:

  • “Using default tag: latest”: Docker pulls the most recent version of MongoDB since no specific version was provided.

  • “Pulling from library/mongo”: It’s downloading from Docker’s official image library.

  • “Pull complete”: Each line represents a layer of the image being successfully downloaded.

  • “Downloaded newer image for mongo:latest”: Confirms that the MongoDB image is now stored locally on your system.

You can confirm that it’s available by running:

docker images

You should see mongo listed in the repository column.

mongo db listed in the repository column after running docker images

How to Pull the Mongo Express Image

Now that the MongoDB image is ready, let’s pull the Mongo Express image.

Mongo Express is a lightweight web-based interface that lets you view and manage your MongoDB collections through a browser, similar to how phpMyAdmin works for MySQL.

Open your terminal (still in your project directory) and run:

docker pull mongo-express

You’ll see output similar to this:

Using default tag: latest
latest: Pulling from library/mongo-express
b8a35db46e38: Already exists
a637dbfff7e5: Pull complete
4e0e0977e9c3: Pull complete
02cd4cf70021: Pull complete
Digest: sha256:3d6dbac587ad91d0e2eab83f09a5b31a1c8f9d91a8825ddaa6c7453c25cb4812
Status: Downloaded newer image for mongo-express:latest
docker.io/library/mongo-express:latest

Here’s what this means:

  • docker pull mongo-express downloads the official Mongo Express image from Docker Hub.

  • Each “Pull complete” line represents a successfully downloaded layer of the image.

  • mongo-express:latest confirms that the latest version is now stored locally.

To verify that both images are available, run:

docker images

You should see mongo and mongo-express listed in the output.

docker images command showing both the mongo and mongo-express images, verifying they are installed locally

Now that both images are downloaded, the next step is to run the containers to make sure MongoDB is up and accessible, and then connect it to Mongo Express so we can manage it through the browser.

Before we do that, let’s briefly look at how these two containers will communicate.

Docker Network

When MongoDB and Mongo Express run in separate containers, they need a way to talk to each other. Docker handles this using something called a Docker Network, a virtual bridge that lets containers communicate securely without exposing internal ports to the outside world.

When you run containers in Docker, it automatically creates an isolated network for them. Think of it like a private space where your containers can talk to each other safely without exposing everything to the outside world.

For example, if our MongoDB container and Mongo Express container are on the same Docker network, they can communicate just by using their container names (like mongo or mongo-express). You don’t need to use localhost or port numbers, as Docker handles that part internally.

But anything outside the Docker network (like your host machine or a Node.js app) connects through the exposed ports.

So later, when we package our entire application, the Node.js backend, MongoDB, Mongo Express, and even the frontend (index.html) into Docker, all these containers will interact smoothly through the Docker network. The browser on your computer will then connect to your Node.js app using the host address and port we have exposed.

By default, Docker already provides a few built-in networks. You can see them by running:

docker network ls

You will get something like this:

NETWORK ID     NAME      DRIVER    SCOPE
712a7144f1a0   bridge    bridge    local
4ae27eedea5b   host      host      local
4806000201ce   none      null      local

These are automatically created by Docker. You don’t need to worry too much about them right now – we will just focus on creating our own custom network.

For our setup, we will create a separate network that both MongoDB and Mongo Express can share. Let’s call it mongo-network:

docker network create mongo-network

mongo-network created with docker network create mongo-network; running docker network ls shows it in the list

How to Run the Mongo Container

To make sure our MongoDB and Mongo Express containers can communicate, we need to run them inside the same Docker network. That’s why we created mongo-network earlier.

Let’s start with MongoDB. Remember, the docker run command is used to start a container from an image. In this case, we will run the official MongoDB image and attach it to our network.

We will also expose the default MongoDB port 27017 so it’s accessible from outside the container, and set up environment variables for the root username and password.

Here is the command:

docker run -p 27017:27017 -d \
  -e MONGO_INITDB_ROOT_USERNAME=admin \
  -e MONGO_INITDB_ROOT_PASSWORD=password \
  --name mongo \
  --network mongo-network \
  mongo

Here’s what each part does:

  • -p 27017:27017 maps the container’s MongoDB port to your host machine.

  • -d runs the container in detached mode (in the background).

  • -e sets environment variables for the database’s root credentials.

  • --name mongo gives the container a custom name for easier reference.

  • --network mongo-network connects the container to the network we created.

Once it runs successfully, your MongoDB instance will be up and running inside the Docker network, ready for other containers like Mongo Express to connect to it.

After creating your MongoDB container, you can easily check if it’s running and healthy.

This is a quick way to verify that everything is working correctly before connecting other services, like Mongo Express, to it.

First, list the active containers:

docker ps

This will list all running containers. You should see your MongoDB container (mongo) with its port 27017 exposed.

Next, check the container’s logs:

docker logs mongo
# or use the container ID, e.g.:
docker logs 7abb38175ae283429354609866c8d97521f37b535c475ae448295f8fc0ed947f

This will show startup messages. Look for lines indicating MongoDB started successfully and is ready to accept connections.

checking if the mongo container is running

How to Run the Mongo Express Container

Now that MongoDB is up and running, we can run Mongo Express, which is a web-based interface to manage and view your MongoDB databases. We will connect it to the same network (mongo-network) so it can communicate with MongoDB.

Here’s the command:

docker run -d \
  -e ME_CONFIG_MONGODB_ADMINUSERNAME=admin \
  -e ME_CONFIG_MONGODB_ADMINPASSWORD=password \
  -e ME_CONFIG_MONGODB_SERVER=mongo \
  --name mongo-express \
  --network mongo-network \
  -p 8081:8081 \
  mongo-express

Here’s what each part does:

  • -d runs the container in detached mode (in the background).

  • -e ME_CONFIG_MONGODB_ADMINUSERNAME=admin sets the MongoDB admin username for Mongo Express to use.

  • -e ME_CONFIG_MONGODB_ADMINPASSWORD=password sets the corresponding MongoDB password.

  • -e ME_CONFIG_MONGODB_SERVER=mongo tells Mongo Express which MongoDB server to connect to. Here we use the container name mongo because both containers are on the same network.

  • --name mongo-express gives the container a friendly name for easier reference.

  • --network mongo-network connects the container to the same Docker network as MongoDB so they can talk to each other.

  • -p 8081:8081 exposes the Mongo Express web interface on port 8081 of your host machine.

  • mongo-express is the name of the Docker image we’re running.

Once the container is running, you can open your browser and visit http://localhost:8081 to access Mongo Express and interact with your MongoDB instance.

For more details about the available environment variables and options, you can check the official Docker Hub page for Mongo Express here.

Before opening your browser at http://localhost:8081, it’s a good idea to check if the Mongo Express container is running properly. You can do this by viewing its logs:

docker logs <container-id>
# or
docker logs mongo-express

You should see output similar to this:

Waiting for mongo:27017...
No custom config.js found, loading config.default.js
Welcome to mongo-express 1.0.2
------------------------
Mongo Express server listening at http://0.0.0.0:8081
Server is open to allow connections from anyone (0.0.0.0)
basicAuth credentials are "admin:pass", it is recommended you change this in your config.js!

This confirms that Mongo Express is up and running and ready to connect to your MongoDB instance.

Take note of the basicAuth credentials shown in the logs (admin:pass). If these credentials are present, you’ll need to use them when accessing Mongo Express from your browser. Later, you can change them in a custom config.js file for better security.

Once everything looks good in the logs, you can safely visit http://localhost:8081 to access the Mongo Express interface.

mongo-express interface from http://localhost:8081

If your browser asks for a username and password when accessing Mongo Express, use the basicAuth credentials shown in the container logs:

Username: admin
Password: pass

These are the default credentials, and it’s strongly recommended to change them later in a custom config.js file for better security.

When you open Mongo Express, you will notice some default databases already created. For this project, we will create a new database called todos. Once it’s created, your Node.js application can connect to this database to store and retrieve data.

How to Connect Node.js to MongoDB

You already have MongoDB running inside a Docker container (mongo). The container exposes the default MongoDB port 27017 to the host, so any process on your laptop/desktop can reach it via localhost:27017.

Important: The Node.js app is outside Docker (it’s just a regular node server.js process you start from your terminal).

Because the app is external, we must use localhost (or 127.0.0.1) as the host name – not the container name mongo.

Once we later containerise the Node.js app and put it on the same Docker network, we’ll switch the host to mongo. For now, keep it localhost.

Node.js Backend

Here’s a version of our server.js using MongoDB:

const express = require("express");
const multer = require("multer");
const path = require("path");
const fs = require("fs");
const { MongoClient, ObjectId } = require("mongodb");

const app = express();
const PORT = 3000;

// Host = localhost  →  talks to the MongoDB container via the exposed port
// Port = 27017      →  default MongoDB port
// User / Pass       →  admin / password (the credentials you gave the container)
const mongoUrl = "mongodb://admin:password@localhost:27017";
const dbName = "todos";
let db;

MongoClient.connect(mongoUrl)
  .then((client) => {
    db = client.db(dbName);
    console.log("Connected to MongoDB →", dbName);
  })
  .catch((err) => console.error("MongoDB connection error:", err));

const uploadDir = path.join(__dirname, "uploads");
if (!fs.existsSync(uploadDir)) fs.mkdirSync(uploadDir);

const storage = multer.diskStorage({
  destination: (req, file, cb) => cb(null, uploadDir),
  filename: (req, file, cb) => {
    const unique = Date.now() + "-" + Math.round(Math.random() * 1e9);
    cb(null, "photo-" + unique + path.extname(file.originalname));
  },
});
const upload = multer({ storage });

app.use(express.static(__dirname));
app.use("/uploads", express.static(uploadDir));
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

app.get("/todos", async (req, res) => {
  const todos = await db.collection("todos").find().toArray();
  res.json(todos);
});

app.post("/todos", upload.single("photo"), async (req, res) => {
  const text = req.body.text?.trim();
  if (!text) return res.status(400).json({ error: "Text required" });

  const todo = {
    text,
    image: req.file ? `/uploads/${req.file.filename}` : null,
    createdAt: new Date(),
  };

  const result = await db.collection("todos").insertOne(todo);
  todo._id = result.insertedId;
  res.json(todo);
});

// Start server
app.listen(PORT, () => {
  console.log(`Server → http://localhost:${PORT}`);
});

Frontend

index.html:

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <title>Todo + Image</title>
    <style>
      body {
        font-family: sans-serif;
        margin: 2rem;
        max-width: 800px;
      }
      .todo {
        border: 1px solid #ccc;
        padding: 1rem;
        margin-bottom: 1rem;
        border-radius: 8px;
      }
      .todo img {
        max-height: 150px;
        margin-top: 0.5rem;
      }
      .error {
        color: red;
      }
      input[type="text"] {
        width: 100%;
        padding: 0.5rem;
        margin-bottom: 0.5rem;
      }
      #preview {
        max-width: 300px;
        margin-top: 0.5rem;
        display: none;
      }
    </style>
  </head>
  <body>
    <h1>Todo List with Images</h1>

    <div id="addForm">
      <input type="text" id="textInput" placeholder="What needs to be done?" />
      <input type="file" id="imageInput" accept="image/*" />
      <img id="preview" alt="preview" />
      <button id="addBtn">Add Todo</button>
      <p id="status"></p>
    </div>

    <h2>Todos</h2>
    <div id="todos"></div>

    <script>
      const $ = document.querySelector.bind(document);

      const textInput = $("#textInput");
      const imageInput = $("#imageInput");
      const preview = $("#preview");
      const addBtn = $("#addBtn");
      const status = $("#status");
      const todosDiv = $("#todos");

      imageInput.addEventListener("change", () => {
        const file = imageInput.files[0];
        if (!file) {
          preview.style.display = "none";
          return;
        }
        const reader = new FileReader();
        reader.onload = (e) => {
          preview.src = e.target.result;
          preview.style.display = "block";
        };
        reader.readAsDataURL(file);
      });

      addBtn.addEventListener("click", async () => {
        const text = textInput.value.trim();
        if (!text) {
          status.textContent = "Please enter a todo text.";
          status.className = "error";
          return;
        }

        const form = new FormData();
        form.append("text", text);
        if (imageInput.files[0]) form.append("photo", imageInput.files[0]);

        try {
          const res = await fetch("/todos", { method: "POST", body: form });
          const data = await res.json();
          if (!res.ok) throw new Error(data.error || "failed");
          status.textContent = "Todo added!";
          status.className = "";
          textInput.value = "";
          imageInput.value = "";
          preview.style.display = "none";
          loadTodos(); // refresh list
        } catch (err) {
          status.textContent = "Error: " + err.message;
          status.className = "error";
        }
      });

      async function loadTodos() {
        const res = await fetch("/todos");
        const todos = await res.json();
        todosDiv.innerHTML = "";
        todos.forEach((t) => {
          const div = document.createElement("div");
          div.className = "todo";
          div.innerHTML = `<strong>${escapeHtml(t.text)}</strong>`;
          if (t.image) {
            div.innerHTML += `<br><img src="${t.image}" alt="todo image">`;
          }
          todosDiv.appendChild(div);
        });
      }

      function escapeHtml(s) {
        const div = document.createElement("div");
        div.textContent = s;
        return div.innerHTML;
      }

      loadTodos();
    </script>
  </body>
</html>

Now your Node.js app can connect to the MongoDB container running in Docker. Since the app is running outside Docker for now, it connects through localhost:27017 using the credentials you set (admin / password).

Once connected, your Node.js backend stores and retrieves todos directly from the todos database in MongoDB, replacing the in-memory array. Later, when you containerize the Node.js app and put it on the same Docker network as MongoDB, you can switch the host from localhost to the container name mongo (we’re getting there).
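
Concretely, the only line in server.js that will need to change is the host portion of the connection string; the second form below is commented out because it only applies once the app itself runs inside Docker on the same network:

// Current setup: the app runs on the host and reaches MongoDB via the published port
const mongoUrl = "mongodb://admin:password@localhost:27017";

// Later, once server.js runs in a container on the same Docker network as mongo:
// const mongoUrl = "mongodb://admin:password@mongo:27017";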

You can get the full backend and frontend code ready to run and tweak it for your setup here: GitHub repo.

How to Use Docker Compose

So we now have our Node.js app connected to MongoDB and Mongo Express, both running inside containers. We’ve created the network, started the containers, and everything is talking to each other perfectly.

But let’s be honest: typing out all those long docker run commands every time can get tedious. You probably want a simpler, cleaner way to spin everything up with just one command. That’s where Docker Compose comes in.

Docker Compose is a tool that lets you define and run multi-container applications with a single command. Instead of manually running multiple docker run commands, you describe your setup in a simple docker-compose.yml file, specifying each service (like your Node.js app, MongoDB, and Mongo Express), their configurations, environment variables, and shared networks.

Basically, it lets you manage multiple containers as one project, easy to start, stop, and maintain with a single file and a single command.

The standard naming convention is docker-compose.yml (or docker-compose.yaml; both work, but .yml is more common).

Docker automatically detects it when you run:

docker compose up

So yeah, stick with docker-compose.yml for convention.

Now, to run the containers for MongoDB and Mongo Express, we can use the following two commands, respectively:

# MongoDB container
docker run -p 27017:27017 -d \
  -e MONGO_INITDB_ROOT_USERNAME=admin \
  -e MONGO_INITDB_ROOT_PASSWORD=password \
  --name mongo \
  --network mongo-network \
  mongo

# Mongo Express container
docker run -d \
  -e ME_CONFIG_MONGODB_ADMINUSERNAME=admin \
  -e ME_CONFIG_MONGODB_ADMINPASSWORD=password \
  -e ME_CONFIG_MONGODB_SERVER=mongo \
  --name mongo-express \
  --network mongo-network \
  -p 8081:8081 \
  mongo-express

Now, instead of typing these long commands every time, we will combine them and run everything at once using a Docker Compose file.

The docker-compose.yml file will be located at the root of our Node.js project.

docker-compose.yml file in the root of the project

Here’s how our docker-compose.yml file looks:

version: "3.8"

services:
  mongodb:
    image: mongo
    container_name: mongo
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: password

  mongo-express:
    image: mongo-express
    container_name: mongo-express
    ports:
      - "8081:8081"
    environment:
      ME_CONFIG_MONGODB_ADMINUSERNAME: admin
      ME_CONFIG_MONGODB_ADMINPASSWORD: password
      ME_CONFIG_MONGODB_SERVER: mongodb
    depends_on:
      - mongodb

Let’s break down what’s going on here:

  • version: "3.8": This defines the Compose file version. Each version has slightly different syntax rules and features. Version 3.8 is modern and works with the latest Docker Engine.

  • services:: All the containers we want to run are defined here. In our case, two services: mongodb and mongo-express.

MongoDB service:

  • image: mongo pulls the official MongoDB image from Docker Hub.

  • container_name: mongo gives the container a friendly name.

  • ports: "27017:27017" exposes MongoDB’s default port to our host, so Node.js or other apps can connect.

  • environment: sets up the root username and password for MongoDB.

Mongo Express service:

  • image: mongo-express is the official Mongo Express image.

  • container_name: mongo-express is a friendly name for easier reference.

  • ports: "8081:8081" exposes Mongo Express web interface on host port 8081.

  • environment: lets Mongo Express know how to connect to MongoDB (username, password, host).

  • depends_on: - mongodb ensures MongoDB starts first, so Mongo Express can connect immediately.

Why Use Docker Compose?

  • Single command: Instead of running multiple long docker run commands, just run:
docker compose up -d
  • Automatic networking: Compose creates a default network so services can communicate using their service names (mongodb in our case).

  • Easier maintenance: You can stop, start, or rebuild all services with simple commands.

Before we run our new docker-compose.yml, it’s important to make sure no conflicting containers are running. Remember, we already had MongoDB and Mongo Express running from the previous docker run commands.

To avoid conflicts (like ports already in use), we should stop and remove any running containers first.

Here’s how:

# List all running containers
docker ps

# Stop a specific container (replace <container_name> with mongo or mongo-express)
docker stop mongo
docker stop mongo-express

# Remove the stopped containers
docker rm mongo
docker rm mongo-express

# Optional: stop and remove all running containers at once
docker stop $(docker ps -q)
docker rm $(docker ps -a -q)

  • docker ps shows currently running containers.

  • docker stop <name> stops a container gracefully.

  • docker rm <name> removes the container from Docker.

  • docker stop $(docker ps -q) stops all running containers.

  • docker rm $(docker ps -a -q) removes all containers (running or stopped).

Once all previous containers are stopped and removed, we’re ready to run our Docker Compose setup safely without conflicts.

Now that all previous containers are stopped, we can start MongoDB and Mongo Express together using our docker-compose.yml file.

From the root of your Node.js project (where the docker-compose.yml file is located), run:

docker compose up -d

Here’s what this does:

  • docker compose tells Docker to use Compose.

  • up builds (if needed) and starts all the services defined in the Compose file.

  • -d runs the containers in detached mode, meaning they run in the background.

After running this command, Docker will start both MongoDB and Mongo Express, connect them on the same internal network, and expose the ports we defined (27017 for MongoDB and 8081 for Mongo Express).

If everything worked correctly, after running:

docker compose up -d

You should see output similar to this:

[+] Running 3/3
 ✔ Network docker_tut_default  Created                                                                                               0.0s 
 ✔ Container mongo             Started                                                                                               0.6s 
 ✔ Container mongo-express     Started                                                                                               0.8s 
stephenjohnson@Oghenekparobo docker_tut %

What this means:

  • Network docker_tut_default Created: Docker Compose automatically creates a network for your services so they can communicate with each other.

  • Container mongo Started: Your MongoDB container is running.

  • Container mongo-express Started: Your Mongo Express container is running.

You can confirm that the containers are running by using:

docker ps

This will list all active containers. You should see both mongo and mongo-express with their respective ports (27017 for MongoDB and 8081 for Mongo Express) exposed.

  • To access Mongo Express, open your browser and go to http://localhost:8081 to interact with MongoDB through the web interface.

  • To access MongoDB, your Node.js app can connect to MongoDB at localhost:27017 using the credentials you set in the Compose file.

Compared to running long docker run commands for each container, Docker Compose is easier because it:

  • Starts multiple containers with one command.

  • Automatically sets up networking between containers.

  • Makes it easier to stop, remove, or rebuild containers later.

In short, Docker Compose simplifies and organizes everything, making it much easier to manage your development environment.
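
And when you are done working, the counterpart command stops and removes everything the Compose file created (the containers and the default network):

docker compose down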

docker compose up -d successfully created the containers and docker ps shows the containers

At this stage, it’s important to know that any data you add to MongoDB is temporary. If you stop or remove your containers and then start them again, you will notice that all your data is gone. This happens because data inside a container isn’t persistent by default.

Don’t worry, this is expected, and we’ll cover how to make data persistent later in the tutorial when we introduce Docker volumes. For now, just be aware that each time you restart your containers, MongoDB starts fresh with no previous data.

You can get a full sample, including the Dockerfile and the docker‑compose file, here.

How to Build Our Own Docker Image

Now that we have tested our Node.js application locally and seen it working perfectly with MongoDB and Mongo Express, the next step is preparing it for deployment.

Running the app directly on our machine works fine for development, but it’s not practical when we want to move it to another environment or server. By creating a Docker image, we can package the application together with all its dependencies, configuration, and environment setup into a single, portable unit. This image can then run anywhere Docker is installed, ensuring our app works the same way across development, testing, and production.

In short, building a Docker image is how we containerize our app and make it deployment-ready.

In order to containerize our Todo app, we need a Dockerfile. A Dockerfile is essentially a blueprint that tells Docker how to build an image for our application. It defines the base environment, copies our application code, installs dependencies, and specifies how the app should start. With this blueprint, Docker can create a consistent image that behaves the same way on any machine, making our Node.js app fully portable and ready for deployment.

In our Dockerfile, notice the capital D, which is the standard naming convention. Place this file in the root directory of your Node.js project. In simple projects like ours, our main app file (like server.js or index.js) is usually in the root too, along with package.json. Docker will use this file as a blueprint to build a container image of your application.

If your main app file is inside a subfolder, that’s fine too. Just make sure the Dockerfile’s COPY and CMD commands point to the correct location. The important thing is that the Dockerfile lives in the root so Docker knows where to start building your app.

Here’s how the contents of our Dockerfile look:

# Use full Node 18 (Debian-based)
FROM node:18

# Set environment variables
ENV MONGO_DB_USERNAME=admin \
    MONGO_DB_PASSWORD=password

# Set working directory
WORKDIR /home/app

# Copy package files
COPY package*.json ./

# Install dependencies
RUN npm install

# Copy source code
COPY . .

# Expose port
EXPOSE 3000

# Start the app
CMD ["node", "server.js"]

Let’s see what’s going on here:

  • FROM node:18 is the base image for our container. It’s the official Debian-based Node.js 18 image, so Node.js and npm come preinstalled.

  • ENV MONGO_DB_USERNAME=admin \ MONGO_DB_PASSWORD=password sets environment variables inside the container so the Node.js app can connect to MongoDB.

  • WORKDIR /home/app sets the working directory inside the container. All subsequent commands like COPY or RUN will run relative to this folder.

  • COPY package*.json ./ copies just package.json and package-lock.json first, so Docker can cache the installed dependencies as their own layer.

  • RUN npm install installs all the Node.js dependencies listed in package.json inside the container.

  • COPY . . then copies the rest of your project (server.js, static files, and any other files needed to run the app) into the container’s working directory.

  • EXPOSE 3000 tells Docker that the container will listen on port 3000, which is the port our Node.js app runs on.

  • CMD ["node", "server.js"] defines the command that runs when the container starts, which launches our Node.js server.

By placing this Dockerfile in the root of your project, Docker knows exactly where to find your app’s files and dependencies. When we build the image, it packages everything inside a portable container that can run anywhere Docker is installed, making deployment straightforward and consistent.
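
As mentioned earlier, if your main app file sits in a subfolder (say src/server.js), only the start instruction really needs to change. Here’s a hedged sketch of what the end of the Dockerfile might look like in that case (our project keeps everything in the root, so we don’t need this):

# Hypothetical variant: the entry point lives in ./src instead of the project root
# (COPY . . still copies the whole project, so only the start command changes)
CMD ["node", "src/server.js"]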

Dockerfile VS CODE Illustration

Now that we have our Dockerfile ready, the next step is to build the Docker image for our Node.js app.

To build the image, open your terminal, make sure you are in the root directory of your project (where the Dockerfile is), and run:

docker build -t todo-app:1.0 .
  • todo-app is the name of your image.

  • :1.0 is the version tag (you can use any versioning scheme, like 1.0, v1, latest, and so on; there’s an example right after this list).

  • . tells Docker to use the current folder (root of your project) as the build context.
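
As an aside, you can attach more than one tag in a single build if you want both a fixed version and a moving latest tag (optional, shown here only as an example):

# Tag the same build as both 1.0 and latest
docker build -t todo-app:1.0 -t todo-app:latest .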

After running:

docker build -t todo-app:1.0 .

Docker reads your Dockerfile, packages your Node.js app with all its dependencies, and creates a Docker image. You can confirm the image exists by running:

docker images

You should see output like this:

REPOSITORY      TAG       IMAGE ID       CREATED          SIZE
todo-app        1.0       d85dd4ed97f9   45 seconds ago   147MB
mongo           latest    1d659cebf5e9   2 weeks ago      894MB
mongo-express   latest    1133e12468c7   20 months ago    182MB

This shows that your todo-app image has been created successfully, alongside the images for MongoDB and Mongo Express.

Running Your Node.js App Container

Now that the image exists, the next step is to run a container from it. A container is basically a running instance of your image. To do this:

docker run -p 3000:3000 todo-app:1.0

Here’s what this command does:

  • docker run starts a new container from the image.

  • -p 3000:3000 publishes the container’s port 3000 on port 3000 of your machine (EXPOSE in the Dockerfile only documents the port; it doesn’t publish it).

  • todo-app:1.0 tells Docker which image to use (the one we just built).

Once this runs, your Node.js app will be live inside a container, separate from your local environment. You can open your browser at http://localhost:3000 and see your Todo app working just like it did locally.

To see all running containers, use:

docker ps

You’ll see something like:

CONTAINER ID   IMAGE           COMMAND         CREATED       STATUS       PORTS                  NAMES
d85dd4ed97f9   todo-app:1.0    "node server.js"  10s ago      Up 10s       0.0.0.0:3000->3000/tcp   awesome_todo

This confirms your container is running. If you ever need to stop it:

docker stop <container-id>

Troubleshooting Errors

We started facing some issues here: when you run the todo-app container, you'll see an error like this:

Server → http://localhost:3000
MongoDB connection error: MongoServerSelectionError: getaddrinfo ENOTFOUND mongodb
    at Topology.selectServer (/home/app/node_modules/mongodb/lib/sdam/topology.js:346:38)
    ...
    [cause

especially when you try to perform an operation like creating a todo list.

The error getaddrinfo ENOTFOUND mongodb tells us that your Node.js container can't find MongoDB. Even though MongoDB is running in another container, your app container is isolated and doesn't know how to reach it.

Why This Happens:

Remember in our server.js, we connect to MongoDB using:

const mongoUrl = "mongodb://admin:password@localhost:27017";

The problem is with localhost. When you run your app locally on your machine (not in Docker), localhost works perfectly because MongoDB is running on the same machine. But when your app runs inside a Docker container, localhost refers to the container itself, not your host machine or other containers.

Think of it like this:

  • Running locally: Your app and MongoDB are like two people in the same room, localhost works

  • Running in Docker: Each container is like a separate room, localhost only refers to that specific room

The Solution

We need to change the MongoDB connection URL to use the Docker service name instead of localhost. Update your server.js file:

const mongoUrl = "mongodb://admin:password@localhost:27017";

To this:

const mongoUrl = "mongodb://admin:password@mongodb:27017";

Here's the complete updated server.js:

const express = require("express");
const multer = require("multer");
const path = require("path");
const fs = require("fs");
const { MongoClient, ObjectId } = require("mongodb");

const app = express();
const PORT = 3000;

// Host = mongodb  →  the Docker Compose service name for the MongoDB container
// Port = 27017    →  default MongoDB port
// User / Pass     →  admin / password (the credentials you gave the container)
const mongoUrl = "mongodb://admin:password@mongodb:27017";
const dbName = "todos";
let db;

MongoClient.connect(mongoUrl)
  .then((client) => {
    db = client.db(dbName);
    console.log("Connected to MongoDB →", dbName);
  })
  .catch((err) => console.error("MongoDB connection error:", err));

const uploadDir = path.join(__dirname, "uploads");
if (!fs.existsSync(uploadDir)) fs.mkdirSync(uploadDir);

const storage = multer.diskStorage({
  destination: (req, file, cb) => cb(null, uploadDir),
  filename: (req, file, cb) => {
    const unique = Date.now() + "-" + Math.round(Math.random() * 1e9);
    cb(null, "photo-" + unique + path.extname(file.originalname));
  },
});
const upload = multer({ storage });

app.use(express.static(__dirname));
app.use("/uploads", express.static(uploadDir));
app.use(express.json());
app.use(express.urlencoded({ extended: true }));

app.get("/todos", async (req, res) => {
  const todos = await db.collection("todos").find().toArray();
  res.json(todos);
});

app.post("/todos", upload.single("photo"), async (req, res) => {
  const text = req.body.text?.trim();
  if (!text) return res.status(400).json({ error: "Text required" });

  const todo = {
    text,
    image: req.file ? `/uploads/${req.file.filename}` : null,
    createdAt: new Date(),
  };

  const result = await db.collection("todos").insertOne(todo);
  todo._id = result.insertedId;
  res.json(todo);
});

// Start server
app.listen(PORT, () => {
  console.log(`Server → http://localhost:${PORT}`);
});

Why mongodb Works

The hostname mongodb matches the service name we defined in our docker-compose.yml:

services:
  mongodb:    # ← This is the hostname other containers use
    image: mongo
    container_name: mongo
    ...

When containers run in the same Docker Compose network, Docker provides an internal DNS that resolves service names to the correct container IP addresses. So when your app tries to connect to mongodb:27017, Docker automatically routes it to the MongoDB container.
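
If you also want the same code to keep working when you run it outside Docker (plain node server.js on your machine), one option, which is not part of this tutorial’s final code, is to read the host from an environment variable and fall back to localhost. MONGO_HOST here is a hypothetical variable name you would set yourself, for example under environment: in docker-compose.yml:

// Hypothetical tweak: pick the Mongo host from an environment variable so the
// same code runs locally (localhost) and inside Docker (mongodb).
const mongoHost = process.env.MONGO_HOST || "localhost";
const mongoUrl = `mongodb://admin:password@${mongoHost}:27017`;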

Rebuild Your Docker Image

Now that we have updated the code, we need to rebuild the Docker image to include this change:

docker build -t todo-app:1.0 .

You should see output confirming the build completed successfully:
[+] Building 8.1s (10/10) FINISHED
 => [internal] load build definition from Dockerfile
 => => transferring dockerfile: 443B
 ...
 => => naming to docker.io/library/todo-app:1.0

Add Your App to Docker Compose

Now update your docker-compose.yml file to include the todo-app service:

version: "3.8"

services:
  mongodb:
    image: mongo
    container_name: mongo
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: password

  mongo-express:
    image: mongo-express
    container_name: mongo-express
    ports:
      - "8081:8081"
    environment:
      ME_CONFIG_MONGODB_ADMINUSERNAME: admin
      ME_CONFIG_MONGODB_ADMINPASSWORD: password
      ME_CONFIG_MONGODB_SERVER: mongodb
    depends_on:
      - mongodb

  todo-app:
    image: todo-app:1.0
    container_name: todo-app
    ports:
      - "3000:3000"
    depends_on:
      - mongodb

The todo-app service includes:

  • image: todo-app:1.0 that uses the Docker image we just rebuilt

  • container_name: todo-app that gives the container a friendly name

  • ports: "3000:3000" that exposes the app on port 3000

  • depends_on: mongodb that ensures MongoDB starts before the app

Start All Services

First, stop any running containers:

docker compose down

If you have port 3000 running in your local system, then stop it (that is, free up port 3000).

We were running the server locally before, but now that we’ve built a Docker image, the app runs inside a container, so it’s no longer dependent on the local machine’s environment.

node server.js
Server → http://localhost:3000

Now stop it with Ctrl + C in that terminal. That’s it.

Then start everything together:

docker compose up -d

You should see:
[+] Running 4/4
 ✔ Network docker_tut_default  Created
 ✔ Container mongo             Started
 ✔ Container mongo-express     Started
 ✔ Container todo-app          Started

Verify Everything Works

Check that all containers are running:

docker ps

Expected output:
CONTAINER ID   IMAGE           COMMAND                  CREATED          STATUS          PORTS                      NAMES
a1b2c3d4e5f6   todo-app:1.0    "node server.js"         30 seconds ago   Up 28 seconds   0.0.0.0:3000->3000/tcp     todo-app
3d7c797fde1d   mongo-express   "/sbin/tini -- /dock…"   30 seconds ago   Up 29 seconds   0.0.0.0:8081->8081/tcp     mongo-express
4511ade73c38   mongo           "docker-entrypoint.s…"   30 seconds ago   Up 29 seconds   0.0.0.0:27017->27017/tcp   mongo

Test Your Application

Now let's verify everything works:

1. Access Your Todo App

Open your browser and go to:

http://localhost:3000

2. Create Some Todos

Add a few todo items to test the functionality. Try uploading images too!

3. Verify in Mongo Express

Open Mongo Express:

http://localhost:8081

Navigate to the todos database, then the todos collection. You should see all the todos you just created with their complete data.

What Changed and Why It Works

Before the fix:

  • Connection string used localhost:27017

  • Container looked for MongoDB on itself

  • Connection failed with ENOTFOUND error

After the fix:

  • Connection string uses mongodb:27017

  • Docker's internal DNS resolves mongodb to the MongoDB container

  • Connection succeeds and data flows properly

This is a crucial lesson in Docker networking: containers communicate using service names, not localhost. Docker Compose automatically creates a network where all services can find each other by name.
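
If you’re curious, you can see this network for yourself. The network name (docker_tut_default in our earlier output) comes from the project folder name, so yours may differ:

# List networks; Compose creates one named after the project folder
docker network ls

# Inspect it to see which containers are attached and their internal IP addresses
docker network inspect docker_tut_default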

How to Manage Your Containers

Here’s a quick overview of how to manage your containers once you have them up and running. You’ll typically use these common commands:

Stop all services:

docker compose down

View logs from your app:

docker compose logs todo-app

View logs in real-time:

docker compose logs -f todo-app

Rebuild after code changes:

docker build -t todo-app:1.0 .
docker compose up -d --force-recreate todo-app

Your application is now fully containerized and production-ready. All three services work together seamlessly, and you can deploy this entire stack anywhere Docker is supported with just the docker-compose.yml file and your built image.

Get the full updated code here.

How to Create a Private Docker Repository

Now we want to store our custom Docker image in a private container registry (instead of our local machine only). This gives you three major advantages:

  1. Controlled access – Only people or servers you explicitly authorize can pull (or push) the image. Your code and dependencies stay private and secure.

  2. Reliable distribution – Anyone (or any server) with the correct AWS credentials can pull the exact same image from anywhere in the world, eliminating “it works on my machine” problems.

  3. Versioning and lifecycle management – You can keep multiple tagged versions (1.0, 2.0, latest, and so on) and easily roll back if needed.

The first step is to create a private Docker repository, also known as a container registry. In this case, we will use AWS Elastic Container Registry (ECR). Amazon ECR is a fully managed container registry that makes it easy to store, manage, share, and deploy your container images and artifacts securely from anywhere.

Amazon ECR Landing page

Once you’re on the home page, just click on the Create button. Name the repository the same as your image, todo-app, and then click Create to finalize the setup.

creating our repository on AWS ECR

Don’t worry about the extra options – this isn’t an AWS tutorial.

Note: In AWS ECR, each image has its own repository, where we store the different tagged versions of that image.

AWS ECR our todo-app empty repository

Now, to push our image into the private repository, we need to do two things. First, we have to log in to the private repo. This is necessary because you’ll need to authenticate yourself before AWS allows you to push anything. In other words, when you push your local image to the repo, you’re basically saying, “Yes, I have access to this registry. Here are my credentials.”

In our case, since we’re using AWS ECR, we will authenticate through AWS instead of typing our username and password manually.

Step 1: Get Your AWS Access Keys

To locate your access keys in the AWS console, follow these steps:

  1. Log in to the AWS Console at https://console.aws.amazon.com

  2. Click your account name (top right corner) and go to Security Credentials

  3. Scroll down to "Access keys" section

  4. If you don't have an access key:

    • Click "Create access key"

    • Select "Command Line Interface (CLI)"

    • Check the confirmation box and click Next

    • Add a description (optional) and click "Create access key"

  5. IMPORTANT: Copy both the access key ID (looks like: AKIAIOSFODNN7EXAMPLE) and the secret access key (looks like: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY). Save these immediately. The secret key is only shown once. If you lose it, you'll need to create a new key pair.

Alternatively, if someone else manages your AWS account, you’ll need to ask your AWS administrator for:

  • An IAM user with ECR permissions

  • The Access Key ID and Secret Access Key for that user

Step 2: Check if AWS CLI is installed

You can do this by running this:

aws --version

Step 3: Configure AWS CLI with your credentials

Here’s how you can do this:

aws configure

It will prompt you for 4 things:
AWS Access Key ID [None]: <paste your Access Key ID here>
AWS Secret Access Key [None]: <paste your Secret Access Key here>
Default region name [None]: eu-north-1 or any region of your choice
Default output format [None]: json

Just paste your keys when prompted, type eu-north-1 or any region of your choice for region, and json for format (or just press Enter for format).

Step 4: Test your AWS configuration

Now you’ll want to test your config to make sure everything is set up properly:

aws sts get-caller-identity

This should show your AWS account details if everything is configured correctly.

Step 5: Login to ECR (Docker Registry)

Now, login to ECR:

aws ecr get-login-password --region eu-north-1 | docker login --username AWS --password-stdin 244836489456.dkr.ecr.eu-north-1.amazonaws.com

You should see: "Login Succeeded".

Understanding Image Naming in Docker Repositories

Every Docker image has a name that tells Docker where to find or store it. For example, when you run:

docker pull mongo:4.2

Docker is actually pulling from:

docker.io/library/mongo:4.2

Here’s what’s happening:

  • docker.io is the registry (in this case, Docker Hub)

  • library is the default namespace for official images

  • mongo is the repository name

  • 4.2 is the image tag

If you build a local image like todo-app:1.0, that image exists only on your machine. Docker won’t know where to push it unless you include the full registry path.

For AWS ECR, the image name must include your ECR registry URL. For example:

docker tag todo-app:1.0 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0

Then you can push it with:

docker push 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0

Without that full path, Docker won’t know which remote repository you’re referring to. That’s why just todo-app:1.0 alone won’t work.

Step 6: Build, Tag, and Push your image

aws push commands for the ecr todo-app repo

# Tag your local image with the full ECR path
docker tag todo-app:1.0 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0

# Now push it
docker push 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0

⚠️ Note: Be careful when tagging and pushing your image, as every ECR repository URL is tied to a specific AWS account and region.

For example, in this tutorial, we’re using:

244836489456.dkr.ecr.eu-north-1.amazonaws.com

But your own ECR URL will be different depending on your AWS account and the region you selected (like us-east-1, ap-south-1, and so on).

So before you run your docker tag or docker push commands, make sure to replace the registry URL and region with your own.

If you don’t, Docker will throw errors like “tag does not exist” or “repository not found.”

In short, stay calm, double-check your region, and always confirm the exact ECR URL shown in your AWS console before pushing.

If you successfully ran Step 6, you should see output similar to this in your terminal:

stephenjohnson@Oghenekparobo docker_tut % docker tag todo-app:1.0 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0
stephenjohnson@Oghenekparobo docker_tut % docker push 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0
The push refers to repository [244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app]
4f94b5cbe8ab: Pushed 
85ba7bf54231: Pushed 
4ea46a43fa07: Pushed 
dee30873f229: Pushed 
e78159dbd370: Pushed 
a358a725b813: Pushed 
cd8a6003174c: Pushed 
abb63e49e652: Pushed 
6cc65bdde70e: Pushed 
41a4e3939504: Pushed 
3520c50ae60e: Pushed 
75ba6634710f: Pushed 
1.0: digest: sha256:51f07267936fc94d9b677db8a760801e6c5fd4764f4bb2bd7b4dd150c756a39b size: 2842

This confirms your image was successfully pushed to your private AWS ECR repository.

You can now go to the AWS Management Console and then ECR, and you should see your todo-app image listed there, along with the tag 1.0.

At this point, your image is safely stored in AWS ECR and ready to be pulled or deployed anywhere that has access to your repository.

your image now deployed on AWS ECR

Assignment: Create and Push a New Version of Your App

Now that your first image (todo-app:1.0) has been successfully pushed to AWS ECR, it’s time to simulate a real-world workflow where developers make updates and release new versions of their applications.

Now, you’ll make a small change to your Node.js app, rebuild it, and push the updated version as todo-app:2.0.
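
If you want a starting point, the workflow mirrors what we did for version 1.0 (this uses the tutorial’s example registry URL, so replace it with your own):

# Rebuild the image with a new version tag
docker build -t todo-app:2.0 .

# Tag it with the full ECR path
docker tag todo-app:2.0 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:2.0

# Push the new version to the same repository
docker push 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:2.0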

Deploying Our Image

Now it’s time to deploy our image using Docker Compose.

Up to this point, we have been running our app using a local image:

image: todo-app:1.0

But now that your image lives inside AWS ECR, we need to replace that line with the full ECR image URI, because Docker must know exactly where to pull the image from.

Local image:

image: todo-app:1.0

Private repository image (ECR):

image: 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0

Docker cannot magically guess where “todo-app:1.0” is stored. If you don’t include the full registry URL, Docker will assume it’s looking at your local machine, not AWS.

Here is the clean, fixed, properly formatted docker-compose file that pulls your app from ECR:

version: "3.8"

services:
  my-app:
    image: 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0
    container_name: my-app
    ports:
      - "3000:3000"
    depends_on:
      - mongodb

  mongodb:
    image: mongo
    container_name: mongo
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: password

  mongo-express:
    image: mongo-express
    container_name: mongo-express
    ports:
      - "8081:8081"
    environment:
      ME_CONFIG_MONGODB_ADMINUSERNAME: admin
      ME_CONFIG_MONGODB_ADMINPASSWORD: password
      ME_CONFIG_MONGODB_SERVER: mongodb
    depends_on:
      - mongodb

Why “my-app” instead of “todo-app”?

In this case, we renamed it to avoid confusion between:

  • our local “todo-app:1.0”

  • our ECR “todo-app:1.0”

This keeps things clean, but you can rename it back if you want.

Why Must We Use the Full Image URL for ECR?

Other containers like mongo and mongo-express work like this:

image: mongo
image: mongo-express

Because Docker knows these are on Docker Hub.

But for a private repo like AWS ECR, Docker has no idea where “todo-app” is unless you give the full path:

AWS_ACCOUNT_ID.dkr.ecr.REGION.amazonaws.com/repository_name:tag

This tells Docker:

  • which account

  • which region

  • which repo

  • which version

Without this URL, Docker can’t pull the image.

Every time we want to pull from a private ECR repo, including using Docker Compose, we must be logged in.

Run this:

aws ecr get-login-password --region eu-north-1 | docker login --username AWS --password-stdin 244836489456.dkr.ecr.eu-north-1.amazonaws.com

If you’re not logged in, Docker Compose will throw:

pull access denied
repository does not exist
no basic auth credentials

Deploy Your App Using Docker Compose

Before deploying, it’s best practice to stop and remove any existing containers to avoid port conflicts or orphaned containers:

# Stop all running containers in this project
docker-compose down --remove-orphans

# Optional: verify nothing is running
docker ps

This ensures that port 3000 and other mapped ports are free, preventing errors when starting new containers.

Once the environment is clean, deploy your stack:

docker-compose up -d

Docker Compose will:

  1. Connect to AWS ECR – Authenticate and pull the todo-app:1.0 image from your private repository.

  2. Start MongoDB – Launch the database container with your configured credentials.

  3. Start Mongo Express – Launch the web-based MongoDB admin interface.

  4. Start your Node.js app – Launch the my-app container, linked to MongoDB.

Check the running containers:

docker ps

You should see:

  • mongo

  • mongo-express

  • my-app

If my-app fails to start, it’s usually because port 3000 is already in use. Ensure it’s free by stopping any process using it:

lsof -i :3000
kill -9 <PID>  # if a process is using it

Then rerun:

docker-compose up -d

To access your app, open http://localhost:3000 in your browser; Mongo Express is still available at http://localhost:8081.

This workflow ensures a clean start and avoids common port or container conflicts.

Sharing our Private Docker Image

Once your Node.js app is pushed to AWS ECR, it’s safely stored in your private repository. But what if another developer, team member, or server needs to run that same image? Since it’s private, Docker cannot pull it automatically like public images (e.g., mongo or nginx). They need authenticated access.

Here’s how they can get and use your image:

1. Grant IAM Access

Your collaborator needs an AWS IAM user or role with permissions for ECR. At minimum, the policy should allow:

  • ecr:GetAuthorizationToken

  • ecr:BatchCheckLayerAvailability

  • ecr:GetDownloadUrlForLayer

  • ecr:BatchGetImage

You can create a dedicated IAM user for this and provide them an Access Key ID and a Secret Access Key.
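
For reference, here’s a minimal sketch of what that IAM policy document could look like (pull-only access; adjust it to your organization’s standards):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ecr:GetAuthorizationToken",
        "ecr:BatchCheckLayerAvailability",
        "ecr:GetDownloadUrlForLayer",
        "ecr:BatchGetImage"
      ],
      "Resource": "*"
    }
  ]
}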

2. Install and Configure AWS CLI

The collaborator must have the AWS CLI installed. Then they configure it with their credentials:

aws configure

They enter:

  • Access Key ID

  • Secret Access Key

  • Default region (the same region where the ECR repo exists, for example, eu-north-1)

  • Default output format (usually json)

3. Authenticate Docker with ECR

Before pulling the image, Docker must authenticate using the AWS credentials:

aws ecr get-login-password --region eu-north-1 | docker login --username AWS --password-stdin 244836489456.dkr.ecr.eu-north-1.amazonaws.com

If successful, Docker will respond with:

Login Succeeded

4. Pull the Image

Now the collaborator can pull the image using the full ECR URI, which includes your AWS account, region, repository name, and tag:

docker pull 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0

5. Run the Container

After pulling, they can run the container locally:

docker run -p 3000:3000 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0

Or include it in a Docker Compose file, replacing the image: field with the full ECR URI.

  • Public images like mongo don’t require this because Docker Hub is open. Private ECR images require explicit authentication.

  • Every pull from a private repository requires an active login. Docker cannot guess credentials.

  • Using the full image URI ensures Docker knows exactly where to fetch the image.

This setup allows your team to share, deploy, or run your application anywhere, on local machines, staging servers, or production, while keeping your repository private and secure.

Docker Volumes

When running containers like MongoDB, all data created inside a container is ephemeral. If the container stops or is removed, all data inside it disappears. This is fine for testing, but not suitable for production.

To solve this, Docker provides volumes, which allow containers to store data outside the container, either on the host machine or in Docker-managed storage, so it survives container restarts, rebuilds, or removals.
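
Before we wire volumes into Compose, it helps to see the idea with the plain Docker CLI (a quick sketch, separate from our project setup; the container name is just an example):

# Create a named volume managed by Docker
docker volume create mongo-data

# Mount it into a MongoDB container at the path where Mongo stores its data
docker run -d --name mongo-with-volume -v mongo-data:/data/db mongo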

How Docker Volumes Work

Think of Docker volumes as persistent folders for containers:

  • Data written inside a volume remains safe, even if the container is removed.

  • Containers can read/write to these volumes.

  • Volumes are essential for databases, logs, file uploads, or any persistent data your application needs.

Types of Docker Volumes

Docker has three main types of volumes:

1. Named Volumes

Named volumes are user-defined volumes with a clear name, that are fully managed by Docker. You’d typically use them in production databases and for persistent data that containers can share.

Here’s an example:

volumes:
  mongo-data:

And in a service:

volumes:
  - mongo-data:/data/db

2. Bind Mounts

Bind mounts map a folder from your host machine into the container. They’re often used for development, live syncing files, logs, and uploaded files.

Here’s an example:

volumes:
  - ./uploads:/usr/src/app/uploads

3. Anonymous Volumes

These are volumes without a name. Docker just assigns them a random name. You’d use them for temporary data for testing (and they’re not commonly used in production).

Here’s an example:

volumes:
  - /data/tmp

Example Docker Compose File Using Volumes

Here’s a full docker-compose.yml file using the most common volume types for a Node.js + MongoDB + Mongo Express stack:

version: "3.8"

services:
  my-app:
    image: 244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0
    container_name: my-app
    ports:
      - "3000:3000"
    depends_on:
      - mongodb
    volumes:
      - ./uploads:/usr/src/app/uploads  # bind mount for file uploads

  mongodb:
    image: mongo
    container_name: mongo
    ports:
      - "27017:27017"
    environment:
      MONGO_INITDB_ROOT_USERNAME: admin
      MONGO_INITDB_ROOT_PASSWORD: password
    volumes:
      - mongo-data:/data/db  # named volume for persistent database storage

  mongo-express:
    image: mongo-express
    container_name: mongo-express
    ports:
      - "8081:8081"
    environment:
      ME_CONFIG_MONGODB_ADMINUSERNAME: admin
      ME_CONFIG_MONGODB_ADMINPASSWORD: password
      ME_CONFIG_MONGODB_SERVER: mongodb
    depends_on:
      - mongodb

volumes:
  mongo-data:  # named volume definition

How this code is working:

  1. MongoDB Volume (mongo-data): This is a named volume. It stores all database files under /data/db inside the container. It survives container restarts, removals, or rebuilds.

  2. Node.js Uploads (./uploads): This is a bind mount. It maps the uploads folder on your host to /usr/src/app/uploads inside the container. Any uploaded files are immediately visible on your host.

  3. Anonymous Volumes: Not shown in this file because they’re rarely used in production. Docker creates them automatically (with a random name) whenever a container path is listed without a named source, which makes them suitable only for temporary data.

Visual Concept (Simplified):

Host Machine
├─ /project/uploads          bind mount, synced with container
└─ Docker volumes
   └─ mongo-data             named volume, persistent MongoDB data

Containers
├─ my-app
│  └─ /usr/src/app/uploads   sees host uploads folder
├─ mongodb
│  └─ /data/db               uses named volume mongo-data
└─ mongo-express

Takeaways

  • Always use volumes for data you care about.

  • Named volumes are best for databases in production.

  • Bind mounts are best for development and live syncing.

  • Anonymous volumes are rarely needed outside testing.

  • Volumes separate container lifecycle from data lifecycle, which is a cornerstone of Docker best practices.

Start Your Application

Once your Docker Compose is configured with volumes, the next step is to start your application and make sure the volumes are working correctly. Here’s a simple step-by-step guide.

1. Start the Containers

Run:

docker-compose up -d

The -d flag runs the containers in detached mode (in the background).

Docker will:

  • Pull your app image from AWS ECR (if you’re logged in)

  • Start MongoDB with the named volume

  • Start Mongo Express

  • Start your Node.js app

2. Check Running Containers

To see if everything started correctly:

docker ps

You should see something like:

CONTAINER ID   IMAGE                                               STATUS          PORTS
2a2e120cc912   244836489456.dkr.ecr.eu-north-1.amazonaws.com/todo-app:1.0   Up 5s    0.0.0.0:3000->3000/tcp
f4d5a1ab1234   mongo                                               Up 5s          0.0.0.0:27017->27017/tcp
c3d5b2bc2345   mongo-express                                      Up 5s          0.0.0.0:8081->8081/tcp

3. Verify Volumes

List Docker volumes:

docker volume ls

You should see your named volume. Docker Compose prefixes it with the project folder name, so it will show up as something like docker_tut_mongo-data.

Inspect the volume:

docker volume inspect docker_tut_mongo-data

This will show where Docker stores your MongoDB data on the host, for example:

[
    {
        "Name": "docker_tut_mongo-data",
        "Driver": "local",
        "Mountpoint": "/var/lib/docker/volumes/docker_tut_mongo-data/_data",
        "Labels": {},
        "Scope": "local"
    }
]

Anything stored in /data/db inside MongoDB is actually saved here on your host.

4. Test Data Persistence

  1. Connect to MongoDB or your app and add some data.

  2. Stop and remove the containers:

docker-compose down

  3. Restart the app:

docker-compose up -d

  4. Check your data again.

Because MongoDB uses the named volume, your data is still there. This proves the volume is persistent.

5. Optional: Check Node.js Uploads (Bind Mount)

  • If you uploaded a file through your app, check your project folder ./uploads.

  • You should see the file appear on your host machine because bind mounts sync host and container directories.

Conclusion

Well done, you have made it to the end of this comprehensive Docker tutorial. From unraveling the basics of containers and images, to networking, Docker Compose, volumes, and even deploying to a private AWS ECR repository, you've built a fully containerized Node.js application stack that's production-ready and scalable. These are hands-on skills that will transform how you develop, collaborate, and deploy applications in real-world scenarios.

Thank you for sticking with it. Docker can feel overwhelming at first – those long commands, networking quirks, and persistent data challenges aren't trivial. But getting to this point? It means you've conquered a steep learning curve and reached new heights in your development journey. You're now equipped to eliminate "it works on my machine" headaches, streamline CI/CD pipelines, and level up as a backend or full-stack pro.

Keep experimenting: Tweak your todo-app, try multi-stage builds in your Dockerfile, or explore orchestration tools like Kubernetes next. The Docker ecosystem is vast, but with this foundation, you're ready to dive deeper. If you hit snags or have questions, the communities on Docker Hub, Stack Overflow, and GitHub are there to help.

You can find the final code here: https://github.com/Oghenekparobo/docker_tut_js/tree/final




🚀 AG-UI + Agent Framework + .NET + Aspire: Web-Enabling Your Intelligent Agents (Blog + Demo + Code!)

1 Share

⚠ This blog post was created with the help of AI tools. Yes, I used a bit of magic from language models to organize my thoughts and automate the boring parts, but the geeky fun and the 🤖 in C# are 100% mine.

📺 VIDEO COMING SOON — stay tuned!
(I’ll embed the YouTube player here as soon as the video goes live.)

Hola friends! Bruno here 🙋‍♂️ — Cloud Advocate at Microsoft, lover of .NET, AI, Blazor, and the occasional dog-walk debugging session with ACE 🐶.

Today we’re diving into something very cool for .NET AI developers:
👉 How to expose your Agent Framework agents to the web using AG-UI
👉 How Aspire orchestrates everything for you
👉 And how this makes building multi-client, AI-powered experiences ridiculously easier

This post accompanies my 5-minute video… which is coming in hot 🔥
Until then, let’s go step-by-step and explore what AG-UI brings to your .NET AI apps.


🎯 What We’re Building

A Web Server hosting an Agent Framework agent, orchestrated by Aspire, and consumed by two different websites via AG-UI.

Here’s the diagram that I also use in the video:

           ┌──────────────────────┐
           │      Aspire App      │
           │    (Orchestrator)    │
           └───────────▲──────────┘
                       │
           ┌───────────┴──────────┐
           │      Web Server      │
           │ (Agent Hosted Here)  │
           └──────▲────────▲──────┘
                  │        │
            AG-UI │        │ AG-UI
                  │        │
        ┌─────────┘        └─────────┐
        │                            │
┌───────────────┐            ┌───────────────┐
│   Website A   │            │   Website B   │
│  AG-UI Client │            │  AG-UI Client │
└───────────────┘            └───────────────┘


🧠 What is AG-UI?

AG-UI is the Agent-User Interaction protocol used by Agent Framework apps to:

  • Stream messages (via SSE)
  • Sync agent → UI state
  • Trigger human-in-the-loop workflows
  • Connect multiple frontends to the same agent

If you’re building AI-powered web apps with .NET — AG-UI is your new best friend.

🧪 Demo – Web Server Agent + Web Client + Aspire Orchestration

We’ll use the sample repo structure:

samples/
  AgentFx-AIWebChatApp-AG-UI/
    Agents/
    Web/
    AppHost/

1. Create & Publish Agent (Agents/Program.cs)

In the Agents project you’ll find Program.cs like this (trimmed):

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddAGUI();
builder.Services.AddHttpClient();

var app = builder.Build();

// create the agent
var azureOpenAiEndpoint = builder.Configuration["AZURE_OPENAI_ENDPOINT"];
var openAiKey = builder.Configuration["AZURE_OPENAI_KEY"];
// GetChatClient needs your model deployment name; the config key below is an assumption
var client = new AzureOpenAIClient(new Uri(azureOpenAiEndpoint), new AzureKeyCredential(openAiKey))
    .GetChatClient(builder.Configuration["AZURE_OPENAI_DEPLOYMENT"]);

var agent = client.AsIChatClient().CreateAIAgent(
    name: "aiwebagent",
    instructions: "You are an AI assistant..." );

app.MapAGUI("/", agent);

await app.RunAsync();

Key points

  • The AddAGUI() call wires up the AG-UI protocol endpoint.
  • MapAGUI("/") exposes the agent at the root.
  • The agent is built from Azure OpenAI client (could be swapped).

2. Web Front-end (Web/Program.cs)

In the Web project’s Program.cs we have:

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRazorPages();
builder.Services.AddServerSideBlazor();

var aguiServerUrl = builder.Configuration["AGUI_SERVER_URL"];

builder.Services.AddHttpClient<AGUIChatClient>(client =>
    client.BaseAddress = new Uri(aguiServerUrl));

builder.Services.AddScoped(sp => {
    var chatClient = sp.GetRequiredService<AGUIChatClient>();
    return chatClient.CreateAIAgent(
        name: "web-client-agent",
        description: "Talks to remote AG-UI server");
});

var app = builder.Build();
app.UseStaticFiles();
app.UseRouting();
app.MapBlazorHub();
app.MapFallbackToPage("/_Host");
await app.RunAsync();

Key points

  • AGUI_SERVER_URL from config points to the agent host.
  • AGUIChatClient wraps HTTP + SSE to communicate with the agent.
  • The front-end is a Blazor app (could be any web tech) that uses the agent via DI.

3. Aspire Orchestration (AppHost/AppHost.cs)

In the AppHost project:

var builder = DistributedApplication.CreateBuilder(args);

var agents = builder.AddProject<Projects.Agents>("agents")
    .WithEndpoint("http", e => e.Port = 8888);

var web = builder.AddProject<Projects.Web>("web")
    .WithReference(agents)
    .WithEnvironment("AGUI_SERVER_URL", agents.GetEndpoint("http")!.Uri);

web.WithExternalHttpEndpoints();

await builder.Build().RunAsync();

Key points

  • AddProject defines the components (Agents, Web).
  • WithReference(agents) ensures Web knows about the agent project.
  • WithEnvironment("AGUI_SERVER_URL", …) injects the correct URL dynamically.
  • WithExternalHttpEndpoints() exposes the web front-end externally (for QA/dev).
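
To try the whole thing locally, you typically just start the AppHost project and let Aspire spin up both services. A sketch, assuming the sample layout shown earlier (your path may differ):

dotnet run --project samples/AgentFx-AIWebChatApp-AG-UI/AppHost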

🏁 Final Thoughts

AG-UI + Agent Framework + Aspire is what I call the modern stack for .NET AI applications:

  • Agents with memory, tools, streaming, context
  • Web UI with state sync and rich interactivity
  • Full orchestration with Aspire
  • Easy deployment to Azure Container Apps or Azure App Service

It’s everything we wished we had when trying to build AI-powered web apps without reinventing the streaming / workflow / UI plumbing.


📺 VIDEO COMING SOON

Stay tuned — I’ll embed the YouTube video right here as soon as it’s uploaded.


🔗 Useful References

Happy coding!

Greetings

El Bruno

More posts in my blog ElBruno.com.

More info in https://beacons.ai/elbruno







Davide's Code and Architecture Notes - Metrics, Logs, and Traces: the three pillars of Observability

1 Share
Learn the differences between metrics, logs, and traces - the three pillars of observability in distributed systems - and how to use them effectively

How to Monitor and Optimize Batched Deletion Processes in SQL Server

1 Share

Batched deletions are a common strategy in SQL Server to manage large datasets without overloading the system, but poorly tuned deletes can cause blocking, long-running transactions, and heavy log usage. Learn how to monitor and optimize these processes for smooth, efficient database performance.

In previous articles I showed patterns for working with large amounts of data on big tables while keeping locking at a minimum. These processes can allow migrations and maintenance without requiring downtime but, in environments with unpredictable database workloads, there is a risk of heavy traffic starting at any time and disrupting a once smooth operation. In this article, I’ll demonstrate how to augment these processes to allow dynamic adjustment of the configuration.

For most systems, the main limitation these techniques run into is the speed and throughput of I/O (input/output). During periods of low traffic, a large batch size may perform great with no impact to production, but as traffic increases, the storage subsystem may not be able to keep up.

I’ll show two workarounds to deal with this issue: lowering the batch size, and introducing a small delay in between batches. Both will allow the process to continue running with less I/O demand but, if this still isn’t enough, we can easily stop the process and restart at a different time.

Logging Tables in SQL Server

Before we get into the configuration options, let’s consider how we can get feedback on our process while it’s running, so we’re better informed about the adjustments we want to make. Let’s review the code for purging in batches explored in this article:

SET NOCOUNT ON;
--control the number of rows deleted per iteration
DECLARE @BatchSize INT = 5000;
 
--variable used to tell the process to stop
DECLARE @Stop INT = 0;
 
IF (OBJECT_ID ('tempdb..#ToProcess') IS NULL)
  CREATE TABLE #ToProcess (Id INT NOT NULL PRIMARY KEY CLUSTERED);
IF (OBJECT_ID ('tempdb..#Batch') IS NULL)
  CREATE TABLE #Batch (Id INT NOT NULL PRIMARY KEY CLUSTERED);
 
-----------------Gather Ids------------------------------------
INSERT INTO #ToProcess (Id)
SELECT Id
FROM dbo.Posts
WHERE CreationDate < '2011';
 
-----------------Main Loop------------------------------------
WHILE (@Stop = 0)
BEGIN
  --Load up our batch table while deleting from the main set
  DELETE TOP (@BatchSize) #ToProcess
  OUTPUT DELETED.Id INTO #Batch (Id);
 
  --Once the rowcount is less than the batchsize,
  -- we can stop (after this loop iteration)
  IF @@ROWCOUNT < @BatchSize
    SELECT @Stop = 1;
 
--Perform the DELETE
  DELETE FROM p
  FROM dbo.Posts p
    JOIN #Batch b ON p.Id = b.Id;
 
  --Clear out Batch table
  TRUNCATE TABLE #Batch
 
END;

Now, we’ll add a new table to hold our logging:

IF (OBJECT_ID('dbo.PurgeLogging') IS NULL)
	CREATE TABLE dbo.PurgeLogging (
	 LogId INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
	,StartTime DATETIME2(7) NOT NULL DEFAULT GETDATE()
	,EndTime DATETIME2(7) NULL
	,Duration_ms AS DATEDIFF(MILLISECOND, StartTime, ISNULL(EndTime, GETDATE()))
	,RowsAffected INT NULL
	,[BatchSize] INT NOT NULL
	,DelayTime varchar(8) NOT NULL
	,ErrorCode INT NULL
	,ErrorMessage NVARCHAR(255) NULL
	)

This table will contain a row for every iteration of the loop, tracking the time it takes for the DELETE statement to run (because that’s what will alert us to any blocking). Depending on the latency demands of your system, even a 1-2 second iteration may be too slow, but other systems may function without issue as long as this is below the default timeout of 30 seconds.

Besides the time tracking, we’ll also record the values of our configuration parameters and any errors that come up during the operation. We do this by inserting a record at the beginning of our loop:

INSERT INTO dbo.PurgeLogging (StartTime,[BatchSize], DelayTime)
SELECT GETDATE(), @BatchSize, @DelayTime
SELECT @LogId = SCOPE_IDENTITY()

And by updating that record at the end of our loop:

UPDATE dbo.PurgeLogging 
	SET EndTime = GETDATE()
	, RowsAffected = @RowsAffected
	, ErrorMessage = @ErrorMsg
	, ErrorCode = @ErrorNum
WHERE LogId = @LogId

Configuration Tables in SQL Server

In our previous script we used a static variable to hold the BatchSize, but we’ll now store this in a table along with other parameters to adjust the process.

This allows us to make changes on the fly without stopping the script and rolling back a midflight transaction. It also opens up the possibility of using an automated process to tweak these parameters based on system load.

Let’s look at our configuration table:

IF (OBJECT_ID ('dbo.PurgeConfig') IS NULL)
	CREATE TABLE dbo.PurgeConfig (
		ConfigId INT NOT NULL DEFAULT (1) PRIMARY KEY CLUSTERED 
		, DelayTime varchar(8) NOT NULL
		, [BatchSize] INT NOT NULL
		, [Stop] bit NOT NULL DEFAULT (0)
		, CONSTRAINT CH_PurgeConfig_ConfigId CHECK (ConfigId=1)
		);

This table contains our parameters as well as a check constraint to ensure we only have one row stored at any time. We’ll update the running variables at the end of each loop (as well as at the beginning of the script):

SELECT @BatchSize = [BatchSize]
		, @DelayTime = [DelayTime]
		--ONLY UPDATE @Stop when it is 0
		, @Stop = CASE WHEN @Stop = 1 THEN 1 ELSE [Stop] END
FROM dbo.PurgeConfig

Error Handling in SQL Server

While we are making improvements to the script, let’s add some polish by introducing error handling inside the main loop. This can be as simple as adding a TRY/CATCH block and some variables to store the error message and number.

You might also opt to set @Stop = 1 in this section if you want the script to stop any time it hits an error. In this example, I’m letting it continue because I can address the error and rerun the script later; any records that failed to delete will be scooped up by the insert to the #ToProcess table on the next run.

Putting It All Together

Now let’s look at what our script looks like with these new tables:

SET NOCOUNT ON;
--control the number of rows deleted per iteration
DECLARE @BatchSize INT;
 
--variable used to tell the process to stop
DECLARE @Stop INT = 0;

--add a delay between iterations
DECLARE @DelayTime VARCHAR(8);

--logging variable for rowcount
DECLARE @RowsAffected INT;

--logging variable for the log record
DECLARE @LogId INT;

--logging variables for errors
DECLARE @ErrorMsg nvarchar(4000)
	, @ErrorNum INT;

IF (OBJECT_ID ('tempdb..#ToProcess') IS NULL)
  CREATE TABLE #ToProcess (Id INT NOT NULL PRIMARY KEY CLUSTERED);
IF (OBJECT_ID ('tempdb..#Batch') IS NULL)
  CREATE TABLE #Batch (Id INT NOT NULL PRIMARY KEY CLUSTERED);

--------------Logging Table----------------------------------- 
IF (OBJECT_ID('dbo.PurgeLogging') IS NULL)
	CREATE TABLE dbo.PurgeLogging (
	 LogId INT IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED
	,StartTime DATETIME2(7) NOT NULL DEFAULT GETDATE()
	,EndTime DATETIME2(7) NULL
	,Duration_ms AS DATEDIFF(MILLISECOND, StartTime, ISNULL(EndTime, GETDATE()))
	,RowsAffected INT NULL
	,[BatchSize] INT NOT NULL
	,DelayTime varchar(8) NOT NULL
	,ErrorCode INT NULL
	,ErrorMessage NVARCHAR(255) NULL
	);

-----------------Configuration Table---------------------------
IF (OBJECT_ID ('dbo.PurgeConfig') IS NULL)
BEGIN
	CREATE TABLE dbo.PurgeConfig (
		ConfigId INT NOT NULL DEFAULT (1) PRIMARY KEY CLUSTERED 
		, DelayTime varchar(8) NOT NULL
		, [BatchSize] INT NOT NULL
		, [Stop] bit NOT NULL DEFAULT (0)
		, CONSTRAINT CH_PurgeConfig_ConfigId CHECK (ConfigId=1)
		);

-----------------Add Initial Configuration---------------------
	INSERT INTO dbo.PurgeConfig (DelayTime, [BatchSize], [Stop])
	VALUES ('00:00:00', 5000, 0);
END

-----------------Read Starting Configuration-------------------
--read outside the IF block so the variables are always set,
--even when the config table already exists from an earlier run
SELECT @BatchSize = [BatchSize]
		, @DelayTime = [DelayTime]
		, @Stop = [Stop]
FROM dbo.PurgeConfig;

-----------------Gather Ids------------------------------------
INSERT INTO #ToProcess (Id)
SELECT Id
FROM dbo.Posts
WHERE CreationDate < '2011';

-----------------Main Loop------------------------------------
WHILE (@Stop = 0)
BEGIN
  --Reset per-iteration variables so a previous error or rowcount
  --doesn't carry over, then create a logging record
  SELECT @RowsAffected = NULL, @ErrorMsg = NULL, @ErrorNum = NULL;
  INSERT INTO dbo.PurgeLogging (StartTime,[BatchSize], DelayTime)
  SELECT GETDATE(), @BatchSize, @DelayTime;
  SELECT @LogId = SCOPE_IDENTITY();

  --Load up our batch table while deleting from the main set
  DELETE TOP (@BatchSize) #ToProcess
  OUTPUT DELETED.Id INTO #Batch (Id);
  
  --Once the rowcount is less than the batchsize,
  -- we can stop (after this loop iteration)
  IF @@ROWCOUNT < @BatchSize
    SELECT @Stop = 1;
  
  BEGIN TRY
    --Perform the DELETE
    DELETE FROM p
    FROM dbo.Posts p
      JOIN #Batch b ON p.Id = b.Id;
    --Store the rowcount
    SELECT @RowsAffected = @@ROWCOUNT;
  END TRY
  BEGIN CATCH
    --if we have an error, store the message and log it
    -- optionally you can stop the process here by setting @stop =1
		SELECT @ErrorMsg = ERROR_MESSAGE()
		, @ErrorNum = ERROR_NUMBER()
	END CATCH

  --Update the logging table
  UPDATE dbo.PurgeLogging 
	SET EndTime = GETDATE()
    , RowsAffected = @RowsAffected
    , ErrorMessage = @ErrorMsg
    , ErrorCode = @ErrorNum
  WHERE LogId = @LogId;

  --Refresh running configuration from config table
  SELECT @BatchSize = [BatchSize]
		, @DelayTime = [DelayTime]
		--ONLY UPDATE @Stop when it is 0
		, @Stop = CASE WHEN @Stop = 1 THEN 1 ELSE [Stop] END
  FROM dbo.PurgeConfig;

  --Clear out Batch table
  TRUNCATE TABLE #Batch;

  --Wait before running the next batch
  WAITFOR DELAY @DelayTime
 
END;

After we kick off this script we can monitor the progress by querying the PurgeLogging table:

SELECT TOP 100 * FROM dbo.PurgeLogging
ORDER BY LogId DESC

This results in each batch taking less than 200 milliseconds to complete:

An image showing each batch taking less than 200 milliseconds to complete.
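
If you prefer a single summary number over scanning individual rows, a quick aggregate over the most recent completed batches works too (a sketch using the columns we defined above):

-- Average duration and throughput over the last 20 completed batches
SELECT AVG(Duration_ms) AS AvgDuration_ms
     , AVG(RowsAffected * 1000.0 / NULLIF(Duration_ms, 0)) AS AvgRowsPerSecond
FROM (
    SELECT TOP (20) Duration_ms, RowsAffected
    FROM dbo.PurgeLogging
    WHERE EndTime IS NOT NULL
    ORDER BY LogId DESC
) AS recent;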

Then, if we want to change the batch size or add a delay in between batches, we can update the PurgeConfig table like so:

UPDATE PurgeConfig
	SET DelayTime = '00:00:02'
	, [BatchSize] = 100000

We can see in our logging table that the change takes effect seamlessly:

An image showing that the change takes effect seamlessly.

Notice that there’s now a second gap between the StartTime and EndTime of the previous row, which could help slower disks keep up.

We can also see that each batch is now taking more than a second – if we feel this is too long, we can update our PurgeConfig table once again to lower the batch size:

Image showing updating the PurgeConfig table to lower the batch size.

Conclusion

Adding logging and dynamic configuration to this technique allows us to tune the process to the unique capabilities and requirements of any environment. By looking at the time each loop takes to execute, we can adjust the batch size to keep our impact to an acceptable amount. If our I/O is saturated, we can add a delay in between batches to allow other processes to complete.

This technique can be used to purge old data as I have shown here, but it can also be used for more advanced processes like changing the datatype on a column while a table is being used, or deleting from multiple tables with foreign key relationships.

The post How to Monitor and Optimize Batched Deletion Processes in SQL Server appeared first on Simple Talk.


Hands-on with MCP Resources in Visual Studio

1 Share

In the first post of this series, we explored what MCP resources are and why they're the overlooked piece of the MCP puzzle. In a second post we showed how to use MCP resources in Visual Studio Code. Before I continue with a next post on building your own MCP server, I first want to show you how Visual Studio handles MCP resources.

Setting up your MCP server with resources in Visual Studio.

Let's start by installing an MCP server that provides resources. We'll use the GitHub MCP Server as our example because it's widely used and demonstrates several resource patterns.

We’ll use an mcp.json file to configure our mcp server:

  • Create .vscode/mcp.json in your workspace root
  • Add your server configuration (a sample is shown after this list)
  • Save the file—Visual Studio will detect it and try to load the MCP server.
  • If you now try to use this MCP server, you’ll notice that it doesn’t work yet.
  • This is because we first need to authenticate and fetch an OAuth token.
  • Click on the icon next to the MCP server and choose the Authentication option.
  • On the Authentication popup, click on Authenticate.

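For reference, here’s a minimal sketch of what mcp.json could contain, assuming you point it at the remote GitHub MCP server endpoint (double-check the exact URL and schema against the GitHub MCP Server documentation):

{
  "servers": {
    "github": {
      "url": "https://api.githubcopilot.com/mcp/"
    }
  }
}
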
Using MCP resources

Once you have an MCP server installed and running, let's explore its resources.

  • Click on the + icon in the chat menu and choose MCP resources:

 

  • Choose the MCP resource template you want to use and provide the necessary parameters
    • Remark: Notice that the path used in the Github MCP server is case sensitive!

 

  • Now our agent can use this resource in the same way as a local resource.

That’s it!
