Backup & Restore Docker Named Volumes

I finally started implementing a backup & restore feature for Puffin. The first issue I encountered was how to back up named volumes.

The official Docker documentation mentions only data volume containers and the --volumes-from option. There’s also the docker cp command, but it requires knowing the path where the volumes are mounted in the container that uses them.

It turns out it’s pretty easy to do using volume mounts and tar.

To back up some_volume to /tmp/some_archive.tar.bz2 simply run:
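A minimal sketch of such a backup, wrapped in a shell function so the pieces are easy to see. The function name, the /volume and /backup mount points, and the DOCKER override variable are assumptions made for illustration; the alpine image and the tar -cjf invocation follow the approach described in this post.

```shell
# Sketch: archive the contents of a named volume into a .tar.bz2 file on the host.
# Setting DOCKER=echo prints the command instead of running it (illustrative helper).
backup_volume() {
    volume=$1        # name of the volume, e.g. some_volume
    backup_dir=$2    # host directory that receives the archive, e.g. /tmp
    archive=$3       # archive file name, e.g. some_archive.tar.bz2
    # Mount the volume and the host directory, then tar the volume's contents.
    ${DOCKER:-docker} run --rm \
        -v "$volume":/volume \
        -v "$backup_dir":/backup \
        alpine \
        tar -cjf "/backup/$archive" -C /volume ./
}
```

Calling backup_volume some_volume /tmp some_archive.tar.bz2 should then leave the archive in /tmp on the Docker host. Using -C /volume ./ stores paths relative to the volume root, so the archive can later be unpacked into any mount point.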

And to restore run:
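A matching restore could look roughly like the sketch below. As before, the function name, the mount points and the DOCKER override are illustrative assumptions; the cleanup pattern is the one quoted in the comments, which also removes hidden files before unpacking.

```shell
# Sketch: empty a named volume, then unpack a .tar.bz2 archive from the host into it.
# Setting DOCKER=echo prints the command instead of running it (illustrative helper).
restore_volume() {
    volume=$1        # name of the volume to restore into
    backup_dir=$2    # host directory containing the archive
    archive=$3       # archive file name
    # The archive name is passed to the inner shell as $1 to avoid quoting problems.
    ${DOCKER:-docker} run --rm \
        -v "$volume":/volume \
        -v "$backup_dir":/backup \
        alpine \
        sh -c 'rm -rf /volume/* /volume/..?* /volume/.[!.]* ; tar -C /volume -xjf "/backup/$1"' sh "$archive"
}
```

For example, restore_volume some_volume /tmp some_archive.tar.bz2 would replace the current contents of some_volume with those of the archive. Note that the existing data is deleted first, so it is worth double-checking the volume name.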

I chose the alpine image since it’s lightweight and contains everything that’s needed. One potential issue is preserving file ownership, since different users and groups exist in different containers. The classic solution to this problem is to run the tar command using the same image as the one that normally uses the volume instead of alpine, but what if there’s no tar there? Using numeric owners generally preserves ownership correctly, unless you also use user namespaces. Also, remember to stop all the containers using the volume being backed up or restored; otherwise the data might be archived in an inconsistent or intermediate state.

Ultimately I wrote my own little volume-backup utility for backup and restore of volumes that simplifies the process even further and offers some improvements. Example usage (see README for more details):

Feel free to check it out and let me know what you think.

Edit: Changed the cleanup code to delete hidden files. Thanks for the comment, Olivier.

Edit: It’s also possible to back up to standard output and restore from standard input. I added this capability to volume-backup. Thanks for the comment, suggestion and example, Holger.
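As a rough illustration of the streaming variant using plain docker and tar (not the volume-backup utility itself), the sketch below writes the archive to standard output and reads it back from standard input. The function names and the DOCKER override are assumptions for illustration; the -i flag is needed so the container receives the piped archive.

```shell
# Sketch: stream a volume's contents as a .tar.bz2 archive to standard output.
backup_volume_to_stdout() {
    ${DOCKER:-docker} run --rm -v "$1":/volume alpine tar -cjf - -C /volume ./
}

# Sketch: unpack a .tar.bz2 archive read from standard input into a volume.
# -i keeps the container's stdin open so the piped archive reaches tar.
restore_volume_from_stdin() {
    ${DOCKER:-docker} run --rm -i -v "$1":/volume alpine tar -xjf - -C /volume
}
```

Typical use would be backup_volume_to_stdout some_volume > /tmp/some_archive.tar.bz2 and restore_volume_from_stdin some_volume < /tmp/some_archive.tar.bz2. This restore sketch does not empty the volume first, so files missing from the archive stay in place.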

22 thoughts on “Backup & Restore Docker Named Volumes”

  1. Thank you! Been looking all over for this. It’s incomprehensible to me that there is not a docker volume save and docker volume load command as you can do with images.

  2. Is there any special reason to do it this way and not just grab it from /var/lib/docker/volumes/volume_name?

  3. Can you elaborate on stopping containers during backup? Is that to prevent a situation where some kind of “transaction” (in database terms) happens during the volume backup, so that some files are in the new state and others in the old state? Is there anything more we have to worry about than in a usual tar-based backup?

  4. Can you elaborate on stopping the container while the volume is backed up? Is there anything more we have to worry about with Docker volumes than during a usual tar-based backup (file inconsistency)?

  5. Very helpful, thank you. I felt inspired to avoid the dependency for mounting a host backup directory, by writing the archive content directly to stdout.

    docker run --rm -v named-vol:/volume alpine tar -cjf - /volume > /tmp/backup.tar.bz2

    Because my backup is not stored on the docker host directly:

    docker run --rm -v named-vol:/volume alpine tar -cjf - /volume | ssh user@backuphost 'cat > /tmp/backup.tar.bz2'

  6. Hey,

    Thanks for sharing.

    By the way, you should use “rm -rf /volume/* /volume/.[^.] /volume/.??* ; tar …” if you really want to empty the /volume mount point also removing all hidden files and folders.

    Cheers,
    Olivier

  7. Unbelievable that this requires a script and isn’t part of docker volume’s built-in capabilities.

  8. Hi Olivier,

    You are absolutely right. In fact in the source code of volume-backup project the cleanup code evolved to: rm -rf /volume/* /volume/..?* /volume/.[!.]* which is more or less equivalent to your version AFAIK (although yours seems to be more popular on ServerFault). I didn’t update the article to keep it simple, but you are right, it should be clear, so I did it now.

    Thanks,
    Jarek

  9. Hi Holger,

    This is really cool. I have added an issue for this on the volume-backup project. I would just like to keep the old syntax with two volumes as well, for backwards compatibility.

    If your inspiration still lasts, then please feel free to send a pull request:) If not, no problem, I will do it in the coming weeks.

    Thanks,
    Jarek

  10. Regarding simply grabbing the contents of the /var/lib/docker/volumes/<volume>/_data directory: this could work, but I needed to talk to a remote Docker daemon, so this solution was more convenient. I was also wary of relying on an implementation detail and worried about file permissions, but maybe for no reason.

  11. Hi badsector,

    Thanks for the comment – I updated the article to clarify what I meant.

    Regarding stopping the container while performing the backup: yes, I was worried that an inconsistent or intermediate state would be archived. Other projects use specialized backup scripts to copy the database (libre.sh for example); in Puffin I decided to simply shut down the container, perform the backup and start it again.

    Thanks,
    Jarek

  12. Hello everyone. Sorry for publishing and responding to your comments with such a delay. I just noticed that I had a configuration issue with my WordPress.

    It’s now fixed, comments should appear immediately.

  13. I have the same question :

    “Me 19 October 2017 at 12:15
    Is there any special reason to do it this way and not just grab it from /var/lib/docker/volumes/volume_name?”
