We’ve seen the usefulness of docker, and how we can use it to run applications that normally require huge amounts of configuration, such as web servers and databases. This seems like the perfect solution for complex packages that require expert maintenance and setup. And it is! There’s a reason why docker is widely used and loved. However, one potential catch is that docker containers don’t have persistent storage by default! Changes you make inside a container live only in that container, so they disappear once you remove it and create a new one from the same image. And that’s why we create docker volumes.
Stateless Containers
When docker first launched in 2013, its containers were meant to run as “stateless” containers. This means that the process inside the container had no persistent configuration or storage. There was a method to connect it to the host OS via something called a “bind mount”, but this was an imperfect solution that relied on the specifics of the host OS, and the process of creating the binds wasn’t portable, thereby defeating the purpose of docker in the first place.
There’s nothing wrong with stateless containers – there are many applications that can run perfectly well in the background without any configuration or storage changes. For example, NGINX running as a reverse proxy merely needs to forward requests to a backend server while it serves static files. A load balancer, or something like Redis in ephemeral mode, also doesn’t require any persistent state. These processes do their job regardless of the incoming data and don’t require external information on how to proceed.
But stateless containers are very limiting. Most processes cannot run in a stateless manner and need persistent storage.
The Need for Persistent Storage
Let’s say you want to use docker to run a database, against which you want to run queries. Even if you don’t need to write to this database, you might well want to make other changes. For example, you might want to create a useful view object by combining data from other tables. With a stateless container running a database, anytime the container is removed and recreated, your changes are lost. This means that your view object is lost, and any scripts that run against it won’t work.
Similarly, if your docker container runs a web application like WordPress, it’s pretty useless if you can’t make any changes to it like creating new posts, or changing the database password! Other applications, such as logging services and development environments, also depend heavily on stored data that changes over time. Running such applications in a stateless environment makes no sense.
Using Docker Volumes to Create Stateful Environments
Using docker, we can create entities known as “volumes” that act as persistent storage buckets for applications to use. For example, see the following code:
docker volume create my-data-volume
In the above code, we create a new volume called “my-data-volume”. I can assign this volume to any container when I create it, and map it to a directory inside that container. The same volume can be used by multiple containers.
These volumes are managed by docker – not by the OS, so it’s not particularly important where they are stored in the host OS. Here’s an example of how to attach the volume to the container:
docker run -dit --name funny_mirzakhani -v my-data-volume:/data alpine sh
The above command creates a container from the “alpine” image, attaches the “my-data-volume” volume to it, and maps that volume to a directory called “/data” inside the container.
As proof, we can attach to the container like this:
docker attach funny_mirzakhani
Once inside, we can list the root directory to check for the “data” directory.
You can see above that the “data” directory indeed exists. Let’s write something to it and see if the changes are reflected in the volume.
In the above screenshot, you can see that I entered the “data” folder and created a new file. If this were a normal folder inside the container, the data would be ephemeral. It would vanish once I removed the container and created a new one. But because this folder was attached to a volume in the “docker run” command, it’ll remain persistent. Let’s test it and see if it works:
Here, I first exit the container and then stop it. Then I use “docker rm” to remove the container entirely. I then run the container again using the same command, navigate to the “data” folder, and there it is – the file I created in the data folder still exists!
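Since the steps above are described in screenshots, here’s a rough sketch of the same test as a script. It uses “docker exec” instead of an interactive “docker attach” session so that it can run unattended. The file name “hello.txt” and the function name are my own placeholders, not the ones from the screenshots.

```shell
# Sketch of the persistence test, assuming docker is installed and
# the container name funny_mirzakhani is not already in use.
# hello.txt is a hypothetical file name chosen for illustration.
persistence_test() {
    name="funny_mirzakhani"
    volume="my-data-volume"

    # Start the container with the volume mapped to /data, then write a file into it.
    docker run -dit --name "$name" -v "$volume":/data alpine sh
    docker exec "$name" sh -c 'echo "hello from the volume" > /data/hello.txt'

    # Stop the container and remove it entirely.
    docker stop "$name"
    docker rm "$name"

    # Recreate it with the same command and read the file back from /data.
    docker run -dit --name "$name" -v "$volume":/data alpine sh
    docker exec "$name" cat /data/hello.txt
}
```

Calling “persistence_test” should end by printing the contents of the file, even though the container that wrote it no longer exists.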
This shows that by using volumes, you can create persistent storage inside a docker container, making it perfect for “stateful” applications. Of course, the application has to be structured so that all the data it changes lives in a few known directories that can be mapped to volumes, which means that not all applications are suitable for docker. But modern programming practices encourage the separation of static and dynamic data, so with a bit of elbow grease, you should be able to get your modern application to work with docker.
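I mentioned earlier that the same volume can be used by multiple containers. Here’s a minimal sketch of that idea, assuming docker is installed. The function name and the file “note.txt” are hypothetical and only there for illustration.

```shell
# Sketch: two short-lived containers sharing one named volume.
# note.txt is a hypothetical file name, not taken from the article.
share_volume_demo() {
    volume="${1:-my-data-volume}"

    # The first container writes a file into the volume, then exits and is removed (--rm).
    docker run --rm -v "$volume":/data alpine sh -c 'echo "written by container one" > /data/note.txt'

    # The second container mounts the same volume and reads the file back.
    docker run --rm -v "$volume":/data alpine cat /data/note.txt
}
```

Running “share_volume_demo” should print the line written by the first container, because both containers see the same “/data” contents.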
Examining Docker Volumes
Once you’ve created a volume and been using it for a while, you might want to periodically check up on which volumes you’ve created and what their status is. The reason is simple – it’s easy to create volumes and forget about them. They’re not visible on your usual folder structure, so unless you explicitly know they’re there, they can use up space silently, even when the container for which they were created no longer exists.
To see which volumes are available, use the following command:
docker volume ls
Here’s a sample output:
In the above screenshot, there are three volumes in use. Out of these, the last one is the one we created specifically for use with my test Alpine container. The other two are anonymous volumes that existing containers have created for temporary data storage. The author of a docker image can request such anonymous volumes with the VOLUME instruction in the Dockerfile, and they aren’t meant to be permanent storage.
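To see how an anonymous volume comes into being, here’s a small sketch, assuming docker is installed. Passing “-v” with only a container path and no volume name asks docker for an anonymous volume. The container name “anon_demo”, the path “/scratch” and the function name are placeholders of mine.

```shell
# Sketch: create a container with an anonymous volume and show its generated name.
# anon_demo and /scratch are hypothetical names chosen for illustration.
anonymous_volume_demo() {
    container="${1:-anon_demo}"

    # -v with only a container path (no volume name) requests an anonymous volume.
    docker run -d --name "$container" -v /scratch alpine sleep 300

    # Print each volume's generated name and where it is mounted in the container.
    docker inspect --format '{{ range .Mounts }}{{ .Name }} -> {{ .Destination }}{{ end }}' "$container"
}
```

The name it prints should be a long random-looking ID, like the anonymous entries in “docker volume ls”. To clean up afterwards, “docker rm -f -v anon_demo” removes the container along with its anonymous volumes.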
Examining Single Docker Volumes
If you want to see the details of any single docker volume, you can use the following command:
docker volume inspect my-data-volume
This will give you the details of any single volume, including the date of its creation, and where the data it contains is stored on the system. In the above example, the data is stored at:
/var/lib/docker/volumes/my-data-volume/_data
If you want, you can manually inspect the data at this location using the “sudo ls” command like this:
sudo ls /var/lib/docker/volumes/my-data-volume/_data
If you’ve written to the volume from inside a container as we did earlier, then the above command will show you what you’ve written, indicating the presence of persistent storage as expected.
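The path above is what you’d typically see on a Linux host. On Docker Desktop for Mac or Windows, the volumes live inside a virtual machine, so the path may not be directly reachable. As a hedged alternative, the sketch below asks docker for the mount point and lists the volume’s files through a throwaway container instead of “sudo ls”. The function names are my own.

```shell
# Sketch: inspect a volume without relying on the host path being accessible.

# Print only the mount point from "docker volume inspect".
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "${1:-my-data-volume}"
}

# List the volume's contents from a temporary container, mounted read-only (:ro).
list_volume_files() {
    docker run --rm -v "${1:-my-data-volume}":/data:ro alpine ls -la /data
}
```

Both functions default to “my-data-volume”, and you can pass another volume name as the first argument.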
Pruning Volumes
Now and then, you should clean up volumes that aren’t used by any container. The following command asks for confirmation and then deletes them:
docker volume prune
This has the following output:
As you can see, the two anonymous volumes weren’t in use, and we saved 222.5 MB of space!
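Because pruning deletes data, it can be worth previewing the unused volumes first. Here’s a minimal sketch that does so, assuming docker is installed. The function names are my own. Note that on newer docker releases (23.0 and later, as far as I know), “docker volume prune” only removes unused anonymous volumes unless you add “--all”, so the preview may list more volumes than the prune actually removes.

```shell
# Sketch: preview unused volumes, then hand over to "docker volume prune".

# -q prints only volume names; dangling=true keeps volumes no container references.
preview_unused_volumes() {
    docker volume ls -q --filter dangling=true
}

prune_if_any() {
    unused="$(preview_unused_volumes)"
    if [ -z "$unused" ]; then
        echo "No unused volumes found."
        return 0
    fi
    echo "Unused volumes:"
    echo "$unused"
    # docker volume prune asks for confirmation itself unless -f is passed.
    docker volume prune
}
```

Running “prune_if_any” prints the candidates before docker shows its own confirmation prompt, so you get a chance to back out.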
Conclusion
By default, docker containers are stateless: they don’t preserve the data changes you make within them once the container is removed. While this can work fine for certain stateless applications like an NGINX reverse proxy, or certain other web servers, most processes need to preserve and update data. To address this problem, docker uses named volumes that map to specific directories in the underlying OS. This is how we convert stateless docker applications to stateful ones.

I’m a NameHero team member, and an expert on WordPress and web hosting. I’ve been in this industry since 2008. I’ve also developed apps on Android and have written extensive tutorials on managing Linux servers. You can contact me on my website WP-Tweaks.com!