Restoring Database in Microsoft Sql Server Docker Container

UPDATE: At the time of publishing thie blog post, Microsoft has removed the mssql-tools from the container image.


Context

Searching on the internet about using databases in containers, they will find hoards of blogs, posts, and suggestions to not use databases in containers. But for testing deployment, configurations, and integration testing, it’s really convenient to spin up a temporary database in a container and run isolated tests. Recently, I was working on a project where multiple applications were manually deployed in the test environment. I was tasked to automate the deployment process, and this is the culmination of what I learned from that implementation.

Initial Attempt

Microsoft provides a very rudimentary sample (mssql-customize) for customizing Microsoft SQL Server container image from their Docker Hub image.

This is the Dockerfile for the custom image:

FROM mcr.microsoft.com/mssql/server:2017-latest

# Create a config directory
RUN mkdir -p /usr/config
WORKDIR /usr/config

# Bundle config source
COPY . /usr/config

# Grant permissions for our scripts to be executable
RUN chmod +x /usr/config/entrypoint.sh
RUN chmod +x /usr/config/db-restore.sh

ENTRYPOINT ["./entrypoint.sh"]

In the Dockerfile, we run the entrypoint.sh which executes db-restore.sh (and detaches it from then current process) and starts the database server.

#!/bin/bash

# Start the script to restore the database
/usr/config/db-restore.sh &

# Start SQL Server
/opt/mssql/bin/sqlservr

In the db-restore.sh, it will try to connect with the database (with a fixed retry count) and run the query to restore from the backup file.

#!/bin/bash

# if the backup doesn't exist, we can skip this script
if [ ! -f "/home/Database.bak" ]
then
	echo "SETUP :: database backup file not found, exiting"
	exit 0
fi

SUCCESS=false
MAX_ATTEMPTS=60
ATTEMPT_COUNT=0
DBSTATUS=1

while [ $SUCCESS = false ] && [ $ATTEMPT_COUNT -le $MAX_ATTEMPTS ]; do
	echo "SETUP :: looking for existing database"
	DBSTATUS=$(/opt/mssql-tools/bin/sqlcmd -h -1 -t 1 -U sa -P $MSSQL_SA_PASSWORD -Q "SET NOCOUNT ON; SELECT COUNT(1) FROM sys.databases WHERE name = N'Database'")
	if [ $? -eq 0 ]; then
		SUCCESS=true
		DBSTATUS=$(echo $DBSTATUS | tr -cd '[:digit:]')
		break
	fi
	echo "SETUP :: Attempt $ATTEMPT_COUNT failed. Trying again..."
	ATTEMPT_COUNT=$(( ATTEMPT_COUNT + 1 ))
    echo SETUP :: sleeping
	sleep 1
done

if [ $ATTEMPT_COUNT -ge $MAX_ATTEMPTS ]; then
	echo SETUP :: max attempt count reached, exiting
	exit 1
fi

if ! [ "$DBSTATUS" == "0" ]; then 
	echo "SETUP :: database already exists, exiting"
	exit 0
fi

# Run the setup script to create the DB and the schema in the DB
echo "SETUP :: starting database restoration"
/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $MSSQL_SA_PASSWORD -Q "RESTORE DATABASE [Database] FROM DISK = '/home/Database.bak' WITH RECOVERY"
echo "SETUP :: database restoration complete"

This might seem like a crude implementation, but it is simple and does what it was designed for. Now, we have a custom SQL Server image that can-

  • automatically restore the database to a certain point to run specific tests
  • setup mock data to run specific integration tests

The Conundrum

After integrating the Docker image with other apps in a Docker Compose, a new problem emerges. The database restoration was aborted after some time.

This was odd because it was working perfectly while running separately from the other apps.

Turns out, one of the apps was configured to run database migration at startup. When the Docker services started, that app was racing with the restoration script to run its migration. For some reason, the app was starting a bit later than the restoration process. But before the restoration process completes, that app would connect with the server and create an empty database immediately.

I have two guesses for the database to abort restoration-

  • The database backup was taken before some of the current features were implemented. That means the newly created database would have a different schema than the restoring backup. Thus the backup’s schema didn’t match in the midway of the restoration.
  • There was no database when the restoration started and it started to create files (data, logs, etc.). However, the app created an empty database, thus creating files that were meant to be created by the backup restoration. When it tried to copy files from the backup to an existing file, it aborted.

But, I can’t say any of those were the exact cause because I didn’t try to dig down any deeper into this one, and I had to finish the implementation as soon as possible. That’s why I went on a different route.

Manual Restoration

Instead of automatically restoring the database from inside the Docker container, we can trigger the restoration script from outside using docker exec. The procedure would be somewhat like this-

  • start services with docker compose
  • stop backend container
  • restore database
  • start the backend container again

This can be automated with a simple script, which can be executed on the setup phase of the container services.

Now, the ENTRYPOINT of the Dockerfile file will be the SQL Server binary-

FROM mcr.microsoft.com/mssql/server:2022-latest

WORKDIR /home

# Bundle config source
COPY . /home

ENTRYPOINT ["/opt/mssql/bin/sqlservr"]

This is the updated db-restore.sh-

#!/bin/bash

MSG_PREFIX=""

function msg {
	echo "[ setup ][ $MSG_PREFIX ] ${1}"
}

function check_backup {
	if [ ! -f "${1}" ]
	then
		return 1
	fi
	return 0
}

SUCCESS=false
MAX_ATTEMPTS=60
ATTEMPT_COUNT=0
DBSTATUS=1

MSG_PREFIX="init"
while [ $SUCCESS = false ] && [ $ATTEMPT_COUNT -le $MAX_ATTEMPTS ]; do
	msg "looking for existing database"
	/opt/mssql-tools/bin/sqlcmd -h -1 -t 1 -U sa -P "$MSSQL_SA_PASSWORD" -Q "SET NOCOUNT ON; SELECT COUNT(1) FROM sys.databases"
	if [ $? -eq 0 ]; then
		SUCCESS=true
		break
	fi
	msg "attempt $ATTEMPT_COUNT failed. Trying again..."
	ATTEMPT_COUNT=$(( ATTEMPT_COUNT + 1 ))
    msg "sleeping"
	sleep 1
done

if [ $ATTEMPT_COUNT -ge $MAX_ATTEMPTS ]; then
	msg "max attempt count reached, exiting"
	exit 1
fi

function restore_db {
	echo ""
	MSG_PREFIX="${1}"
	if ! check_backup "${2}"
	then
		msg "backup file not found, skipping"
	else
		msg "starting database restoration"
		cat &> "/tmp/${1}.sql" <<EOF
USE master
GO

IF EXISTS (SELECT name FROM sys.databases WHERE name = N'${1}')
	ALTER DATABASE [${1}]
	SET SINGLE_USER
	WITH ROLLBACK IMMEDIATE

	DROP DATABASE [${1}]
GO

RESTORE DATABASE [${1}] FROM DISK = '${2}' WITH REPLACE
GO
EOF
		/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $MSSQL_SA_PASSWORD -i "/tmp/${1}.sql"
		msg "database restoration complete"
	fi
}

restore_db "Database" "/home/Database.bak"

MSG_PREFIX="init"
msg "done"

This time, the script will-

  • make the database to SINGLE_USER mode, thus outside connections will be terminated
  • drop if the database was created
  • restore the database from the backup file

Now, we can run run-db-restore.sh to start the restoration process-

#!/bin/bash

if ! docker ps | grep "org/backend" &> /dev/null
then
	echo "backend not running, exiting"
	exit 1
fi

if ! docker ps | grep "org/mssql" &> /dev/null
then
	echo "mssql server not running, exiting"
	exit 1
fi

BACKEND_CONTAINER_ID=$(docker ps | grep "org/backend" | awk '{print $1}')
MSSQL_CONTAINER_ID=$(docker ps | grep "org/mssql" | awk '{print $1}')
echo "stopping backend"
docker stop "$BACKEND_CONTAINER_ID"

echo "running db-restore.sh script"
docker exec -it "$MSSQL_CONTAINER_ID" bash -c "/home/db-restore.sh"

echo "starting backend"
docker start "$BACKEND_CONTAINER_ID"

echo "done"

The Problem With Self-Signed Certificate Error

The sqlcmd client in SQL Server 2022 container image, will throw an error because of the self-signed certificate. This is because the server creates a self-signed certificate if not configured.

As our goal is to spin up a container for testing, and not for production. We can bypass setting up the signed certificate for now.

For this, we can use the -C parameter for sqlcmd to disable certificate signing check-

-/opt/mssql-tools/bin/sqlcmd -S localhost -U sa -P $MSSQL_SA_PASSWORD -i "/tmp/${1}.sql"
+/opt/mssql-tools/bin/sqlcmd -C -S localhost -U sa -P $MSSQL_SA_PASSWORD -i "/tmp/${1}.sql"

Thanks to this Reddit post for this solution.

Backup Restoration Time

Another big problem was the restoration time. The console prints that restoration was complete within 20-30 seconds. But, it takes 15-20 minutes for the sqlcmd process to actually exit.

We can’t wait this much time only to set up an ephemeral test environment. For example, setting up

3 separate tests will take almost 45 minutes to 1 hour just for the database restoration.

Looking out for some time, I stumbled upon this StackOverflow question where this behavior was described in detail.

The solution is to install sqlcmd from Microsoft’s repository instead of the one already installed in the SQL Server container image.

apt-get install curl -y
curl https://packages.microsoft.com/keys/microsoft.asc | tee /etc/apt/trusted.gpg.d/microsoft.asc
apt-get update
apt-get install mssql-tools18 unixodbc-dev -y
curl https://packages.microsoft.com/config/ubuntu/22.04/prod.list | tee /etc/apt/sources.list.d/mssql-release.list

After installation, we have to use /opt/mssql-tools18/bin/sqlcmd for the database restoration instead of using /opt/mssql-tools/bin/sqlcmd .

Rootless Containers

The SQL Server 2022 Docker image is a rootless container, meaning the default user is not the root user and sudoor su binary is not installed on the image.

To install the sqlcmd tool, we need to tell docker exec to execute as root user-

-docker exec -it "$MSSQL_CONTAINER_ID" bash -c "/home/db-restore.sh"
+docker exec -it -u root "$MSSQL_CONTAINER_ID" bash -c "/home/db-restore.sh"

Have a look at this Github issue for more details on this.

Updated Codes

Final db-restore.sh would be the following-

#!/bin/bash

MSG_PREFIX=""

function msg {
	echo "[ setup ][ $MSG_PREFIX ] ${1}"
}

function check_backup {
	if [ ! -f "${1}" ]
	then
		return 1
	fi
	return 0
}

SUCCESS=false
MAX_ATTEMPTS=60
ATTEMPT_COUNT=0
DBSTATUS=1

MSG_PREFIX="init"
while [ $SUCCESS = false ] && [ $ATTEMPT_COUNT -le $MAX_ATTEMPTS ]; do
	msg "looking for existing database"
	/opt/mssql-tools/bin/sqlcmd -h -1 -t 1 -U sa -P "$MSSQL_SA_PASSWORD" -Q "SET NOCOUNT ON; SELECT COUNT(1) FROM sys.databases"
	if [ $? -eq 0 ]; then
		SUCCESS=true
		break
	fi
	msg "attempt $ATTEMPT_COUNT failed. Trying again..."
	ATTEMPT_COUNT=$(( ATTEMPT_COUNT + 1 ))
    msg "sleeping"
	sleep 1
done

if [ $ATTEMPT_COUNT -ge $MAX_ATTEMPTS ]; then
	msg "max attempt count reached, exiting"
	exit 1
fi

# install mssql-tools18, which has the updated sqlcmd binary
apt-get install curl -y
curl https://packages.microsoft.com/keys/microsoft.asc | tee /etc/apt/trusted.gpg.d/microsoft.asc
apt-get update
apt-get install mssql-tools18 unixodbc-dev -y
curl https://packages.microsoft.com/config/ubuntu/22.04/prod.list | tee /etc/apt/sources.list.d/mssql-release.list

function restore_db {
	echo ""
	MSG_PREFIX="${1}"
	if ! check_backup "${2}"
	then
		msg "backup file not found, skipping"
	else
		msg "starting database restoration"
		cat &> "/tmp/${1}.sql" <<EOF
USE master
GO

IF EXISTS (SELECT name FROM sys.databases WHERE name = N'${1}')
	ALTER DATABASE [${1}]
	SET SINGLE_USER
	WITH ROLLBACK IMMEDIATE

	DROP DATABASE [${1}]
GO

RESTORE DATABASE [${1}] FROM DISK = '${2}' WITH REPLACE
GO
EOF
		/opt/mssql-tools18/bin/sqlcmd -C -S localhost -U sa -P $MSSQL_SA_PASSWORD -i "/tmp/${1}.sql"
		msg "database restoration complete"
	fi
}

restore_db "Database" "/home/Database.bak"

MSG_PREFIX="init"
msg "done"

The final run-db-restore.sh would be the following-

#!/bin/bash

if ! docker ps | grep "org/backend" &> /dev/null
then
	echo "backend not running, exiting"
	exit 1
fi

if ! docker ps | grep "org/mssql" &> /dev/null
then
	echo "mssql server not running, exiting"
	exit 1
fi

BACKEND_CONTAINER_ID=$(docker ps | grep "org/backend" | awk '{print $1}')
MSSQL_CONTAINER_ID=$(docker ps | grep "org/mssql" | awk '{print $1}')
echo "stopping backend"
docker stop "$BACKEND_CONTAINER_ID"

echo "running db-restore.sh script"
docker exec -it -u root "$MSSQL_CONTAINER_ID" bash -c "/home/db-restore.sh"

echo "starting backend"
docker start "$BACKEND_CONTAINER_ID"

echo "done"

Resources