Integration with BTRFS Snapshot #251

Open
opened 2019-11-22 22:33:31 +00:00 by drewkett · 10 comments

What I'm trying to do and why

I've got systems that use btrfs for the file system. I set up borgmatic to run a pre-backup script that takes a read-only snapshot of the filesystem and any subvolumes I care about before the backup. After the backup, a separate script deletes the read-only snapshots.

The following script takes the snapshots:

#!/usr/bin/python3
"""Pre-backup hook: take read-only btrfs snapshots for borgmatic to back up."""

import sys
from pathlib import Path
from subprocess import run

backup_dir = Path("/.backups")
error_occurred = False

# (subvolume path, snapshot name) pairs; snapshots are created under backup_dir
subvols = [("/", "root"), ("/data/subvol1", "subvol1"), ("/data/subvol2", "subvol2")]
for path, name in subvols:
    backup_path = backup_dir / name
    # This cleans up any snapshots that didn't get cleaned up on a previous run
    if backup_path.exists():
        res = run(["btrfs", "subvolume", "delete", str(backup_path)])
        error_occurred |= res.returncode != 0
    res = run(["btrfs", "subvolume", "snapshot", "-r", path, str(backup_path)])
    error_occurred |= res.returncode != 0

# Exit non-zero if any btrfs command failed
if error_occurred:
    sys.exit(1)

And then the post-backup script is this:

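#!/usr/bin/python3
"""Post-backup hook: delete the read-only snapshots taken before the backup."""

import sys
from pathlib import Path
from subprocess import run

backup_dir = Path("/.backups")
error_occurred = False

# Same (subvolume path, snapshot name) pairs as in the pre-backup script
subvols = [("/", "root"), ("/data/subvol1", "subvol1"), ("/data/subvol2", "subvol2")]
for path, name in subvols:
    backup_path = backup_dir / name
    if backup_path.exists():
        res = run(["btrfs", "subvolume", "delete", str(backup_path)])
        error_occurred |= res.returncode != 0

# Exit non-zero if any deletion failed
if error_occurred:
    sys.exit(1)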
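
For illustration only, the two scripts above could be folded into one module, since they differ only in whether a new snapshot is taken. This is a sketch, not the author's code: the `btrfs_commands` and `run_all` helper names and the `create`/`delete` mode strings are made up here, while the paths and `btrfs` invocations are copied from the scripts.

```python
import subprocess
from pathlib import Path

BACKUP_DIR = Path("/.backups")
SUBVOLS = [("/", "root"), ("/data/subvol1", "subvol1"), ("/data/subvol2", "subvol2")]


def btrfs_commands(mode, subvols=SUBVOLS, backup_dir=BACKUP_DIR, exists=Path.exists):
    """Build the btrfs commands for mode 'create' (pre-backup) or 'delete' (post-backup).

    `exists` is injectable so the planning logic can be checked without btrfs.
    """
    if mode not in ("create", "delete"):
        raise ValueError(f"unknown mode: {mode!r}")
    commands = []
    for source, name in subvols:
        snapshot = backup_dir / name
        # Both modes remove a snapshot that is already present (stale or current).
        if exists(snapshot):
            commands.append(["btrfs", "subvolume", "delete", str(snapshot)])
        if mode == "create":
            commands.append(["btrfs", "subvolume", "snapshot", "-r", source, str(snapshot)])
    return commands


def run_all(commands, runner=subprocess.run):
    """Run every command, even after a failure; return True only if all succeeded."""
    ok = True
    for command in commands:
        ok &= runner(command).returncode == 0
    return ok
```

A pre-backup hook would then call `run_all(btrfs_commands("create"))` and a post-backup hook `run_all(btrfs_commands("delete"))`, exiting non-zero when the result is `False`.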

I don't currently attempt to handle restoration, though it seems doable. Btrfs subvolumes can be mounted in multiple locations at once, so if you mount the destination subvolume in /.backups (or create it if it doesn't exist) and then restore into that, you would get the files restored. Obviously there would be details to work out.

Owner

Awesome.. Seeing how btrfs snapshotting is in use in the wild is super helpful. One of the current theories with the [LVM snapshotting ticket](https://projects.torsion.org/witten/borgmatic/issues/80) is that borgmatic could walk through the list of configured borgmatic repositories and [probe](https://projects.torsion.org/witten/borgmatic/issues/80#issuecomment-2025) for whether a given repository path is on LVM, btrfs, etc., and then use the result of that to make the appropriate snapshots.

Given what you know about btrfs and how you use it, would that sort of probing approach work for btrfs snapshotting?
Author

I've never tried that, but I would be surprised if it weren't doable to determine whether a path uses btrfs and which btrfs subvolume that path is in.
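
As a rough sketch of the probing idea: on Linux the filesystem type for a path can be read from the mount table by picking the longest mount point that contains the path. The `find_mount` helper below is hypothetical and takes the text of `/proc/mounts` as an argument. It only sees mounted filesystems, so a nested subvolume that is not separately mounted would still need something like `btrfs subvolume show` to identify.

```python
import os


def find_mount(path, mounts_text):
    """Return (mount_point, fstype) for the mount containing `path`, or None.

    `mounts_text` is expected in /proc/mounts format:
    "device mount_point fstype options dump pass", one mount per line.
    Assumption: mount points contain no spaces (the kernel escapes them as
    \\040, which this sketch does not decode).
    """
    path = os.path.normpath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fstype = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Longest mount point wins; on a tie the later line wins, because
            # a later mount on the same directory hides the earlier one.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, fstype)
    return best


def is_on_btrfs(path, mounts_text):
    """True if the mount containing `path` is a btrfs filesystem."""
    mount = find_mount(path, mounts_text)
    return mount is not None and mount[1] == "btrfs"
```

On a real system this could be called as `is_on_btrfs("/data/subvol1", open("/proc/mounts").read())`.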

One hiccup that I see is that when you take btrfs snapshots of a subvolume they are not recursive (as in you don't get snapshots of subvolumes within that path), so there would be an implicit one_file_system enabled since I don't think Borg crosses subvolume boundaries when --one-file-system is used. I'd imagine someone has created a script or something to do recursive snapshots, but it seems like something that would be prone to edge cases.

Another related issue is if a non-btrfs filesystem is mounted at a subfolder of a btrfs subvolume. This obviously would not get snapshotted at the same time.

Another thing I'm not sure of from a desired-behavior perspective is the case where the selected path is not the root of the subvolume and the subvolume root is actually one level up. You could still take a snapshot and only back up the requested path, but there may be implications of taking a snapshot that includes an unintended directory, such as a VM image in the snapshot, which may hurt performance. I feel that would not be best practice anyway, because it's easy to set up subvolumes to avoid that issue.

The more I think about it, the more complicated an automatic approach seems. In particular, the first issue is undesirable if the snapshot doesn't capture everything under the path by default. I can look around and see what other people have come up with regarding recursive btrfs snapshots. I think something might be doable (maybe with caveats); it would just require some thought and testing.

Owner

> One hiccup that I see is that when you take btrfs snapshots of a subvolume they are not recursive

Good point. Do you have a sense for how common it'd be to back up a btrfs hierarchy with subvolumes? Is that the sort of feature we could simply skip in the first version of this? Or is it pretty essential?

> Another related issue is if a non btrfs filesystem is mounted at a subfolder of a btrfs subvolume. This obviously would not get snapshot at the same time.

Correct.. And if that's encountered, I could see two options: 1. Blow up, saying that it's unsupported, or 2. Snapshot the btrfs portion while not snapshotting the non-btrfs subfolder. I'm not sure what other expectation there could be!
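
The two options could look roughly like this. A sketch under assumptions: `foreign_mounts_under` and `check_snapshot_coverage` are hypothetical names, the mount table is passed in as `/proc/mounts`-style text, and `strict` selects between option 1 (fail) and option 2 (snapshot the btrfs part and report what was left out).

```python
def foreign_mounts_under(path, mounts_text, fstype="btrfs"):
    """List mount points strictly below `path` whose filesystem type is not `fstype`."""
    path = path.rstrip("/") or "/"
    root = path if path == "/" else path + "/"
    found = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, mount_type = fields[1], fields[2]
        if mount_point != path and mount_point.startswith(root) and mount_type != fstype:
            found.append(mount_point)
    return sorted(found)


def check_snapshot_coverage(path, mounts_text, strict=True):
    """Return the mounts a snapshot of `path` would miss.

    With strict=True (option 1) any such mount raises ValueError.
    With strict=False (option 2) the caller gets the list and can warn about it.
    """
    foreign = foreign_mounts_under(path, mounts_text)
    if foreign and strict:
        raise ValueError(
            f"non-btrfs filesystems mounted under {path}: {', '.join(foreign)}"
        )
    return foreign
```

Returning the skipped mount points in the non-strict case leaves it to the caller to decide whether to log a warning or back those paths up without a snapshot.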

> The more I think about it, an automatic approach is complicated.

Is the alternative an explicit list of volumes/paths to snapshot, separate from the list of source directories to back up?

Author

I honestly don't know how other people use btrfs. With some quick googling, it doesn't seem like there is a widely used technique for snapshotting a subvolume together with all of its nested subvolumes, so I'd imagine it's not that common. That said, I think a good place to start would be the simplest thing: require `one_file_system: true` and take an explicit list of paths to subvolumes to snapshot. Some thought would need to be given to the interaction between this feature and path exclusions, though that seems manageable at first glance.
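
To make the explicit-list idea concrete, here is a minimal sketch of how configured source directories might be rewritten to point into snapshots. Everything named here is an assumption for illustration: borgmatic has no such function, the `subvolumes` mapping (subvolume path to snapshot name) is a made-up shape for the explicit list, and `/.backups` is just the snapshot location used earlier in this thread.

```python
from pathlib import PurePosixPath


def snapshot_sources(source_directories, subvolumes, snapshot_root="/.backups",
                     one_file_system=True):
    """Map each source directory to its location inside a snapshot.

    `subvolumes` maps a subvolume path to the snapshot name created for it.
    Sources outside every listed subvolume are returned unchanged.
    """
    if not one_file_system:
        # Snapshots are not recursive, so crossing into nested subvolumes
        # would mix snapshotted and live data.
        raise ValueError("snapshotting requires one_file_system: true")
    root = PurePosixPath(snapshot_root)
    owners = {PurePosixPath(path): name for path, name in subvolumes.items()}
    rewritten = []
    for source in map(PurePosixPath, source_directories):
        candidates = [sub for sub in owners if sub == source or sub in source.parents]
        if not candidates:
            rewritten.append(str(source))
            continue
        # The deepest listed subvolume containing the source owns it.
        owner = max(candidates, key=lambda sub: len(sub.parts))
        rewritten.append(str(root / owners[owner] / source.relative_to(owner)))
    return rewritten
```

Exclude patterns would need the same prefix rewrite, which is the interaction with path exclusions mentioned above; this sketch does not attempt it.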

Owner

Okay, thanks. That's helpful!

witten added this to the filesystem snapshots milestone 2019-12-05 21:14:20 +00:00

I fail to understand the *why* part of this issue. Why is it useful to integrate btrfs functionality into a generic backup application?

As for btrfs not taking snapshots of subvolumes within another subvolume, that's an important feature and not an "issue" within btrfs. That's how you separate certain parts of the filesystem from others. For example, you don't want /home to be included in a snapshot of /, because reverting to that snapshot would result in data loss in the /home directory. That's why you should always create a separate /home subvolume with btrfs.

As for automatically snapshotting all subvolumes, you could automatically get a list of all the subvolumes to snapshot. But in general it's better to use SUSE's snapper tool for snapshots rather than the built-in btrfs snapshot feature. It has more features and is easier to use and to integrate with other tools.
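
Getting that list automatically could be sketched by parsing the output of `btrfs subvolume list <path>`. The parser below assumes the usual `ID <n> gen <n> top level <n> path <path>` line format; the helper names and the returned dictionary keys are illustrative, and lines that don't match are skipped.

```python
import re

# Assumed line format: "ID 257 gen 1200 top level 5 path home"
SUBVOLUME_LINE = re.compile(r"^ID (\d+) gen (\d+) top level (\d+) path (.+)$")


def parse_subvolume_list(output):
    """Turn `btrfs subvolume list` output into a list of dictionaries."""
    subvolumes = []
    for line in output.splitlines():
        match = SUBVOLUME_LINE.match(line.strip())
        if match is None:
            continue
        subvol_id, _generation, top_level, path = match.groups()
        subvolumes.append(
            {"id": int(subvol_id), "top_level": int(top_level), "path": path}
        )
    return subvolumes


def children_of(subvolumes, subvol_id):
    """Paths of the subvolumes whose direct parent is `subvol_id`."""
    return [sub["path"] for sub in subvolumes if sub["top_level"] == subvol_id]
```

The text to parse could come from `subprocess.run(["btrfs", "subvolume", "list", "/"], capture_output=True, text=True).stdout`, which typically needs root privileges.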

Owner

> I fail to understand the *why* part of this issue. Why is it useful to integrate a btrfs functionality inside a generic backup application?

I think the short answer here is convenience. You could script everything yourself with pre- and post-backup hooks or shell scripts, but that raises the barrier to getting backups working. Which isn't to say that absolutely every conceivable feature should get integrated directly into borgmatic..

> As for automatically snapshotting all subvolumes, you could automatically get a list of all subvolumes to be taken snapshots of. But in general it’s better to use SUSE’s snapper tool for snapshots rather than the built-in btrfs snapshot feature. It has more features and is easier to use and to integrate with other tools.

Thanks for the tip. Snapper has also come up in the tickets for other filesystems, so maybe it could help with support for multiple types of snapshots. Link for the lazy: https://github.com/openSUSE/snapper


This is a rather old issue, but since BTRFS is now the default on Fedora (and is used on openSUSE and Arch Linux), it's now a desirable use case.

I'll do a complete write up later, but there are a number of resources explaining how to use snapper with borgmatic. Most of the work is actually with setting up snapper and not borgmatic.

https://gist.github.com/craftyc0der/3cbdb1f9ed60aa94f8cfc0f54b719b95
https://teddit.pussthecat.org/r/Fedora/comments/pfdycf/fedora_auto_snapshots_snapper/

What I do then is set the backup path to, for example, `/.snapshots`. I change my patterns.lst so that entries are prefixed with `/.snapshots/*/snapshot/whatever`. If backing up root, you may want to exclude directories like proc, run, etc. Further, you may want `/var/log` on a separate subvolume so your logs don't get rolled back as well.
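
Since a glob such as `/.snapshots/*/snapshot` would match every snapshot that is still retained, a hook might instead resolve only the newest one. A sketch, assuming snapper's numbered layout (`/.snapshots/<number>/snapshot`) as implied by the pattern above; `latest_snapshot` is a hypothetical helper, not part of snapper or borgmatic.

```python
from pathlib import Path


def latest_snapshot(snapshots_dir="/.snapshots"):
    """Return the `snapshot` directory with the highest number, or None.

    Only entries that are purely numeric and contain a `snapshot`
    directory are considered.
    """
    numbered = [
        entry
        for entry in Path(snapshots_dir).iterdir()
        if entry.name.isdigit() and (entry / "snapshot").is_dir()
    ]
    if not numbered:
        return None
    # Compare numerically so that 10 sorts after 9.
    return max(numbered, key=lambda entry: int(entry.name)) / "snapshot"
```

The returned path could then be used as the prefix for the patterns instead of the wildcard.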

This setup assumes you're using `snapper-cleanup` with `snapper-timeline`. Of course, creating the snapshot before the backup works too, but an alternative is just to run the backup shortly after the hour, once the timeline snapshot has been taken, without the need for hooks.

As for recursively traversing subvolumes, I don't think this is desirable behavior. I believe there is an issue on the Borg tracker about data loss because of this. I think it'd be better to explicitly back up subvolumes rather than rely on recursion.

EDIT: Snapper does not descend into subvolumes, which is why putting things on a separate subvolume solves this issue. Just a clarification. It's a different story with manual snapshots that do include subvolumes.

Owner

FYI some work has begun on Snapper integration here: https://github.com/borgmatic-collective/borgmatic/pull/51

Contributor

FWIW, I configured borgmatic and a Borg repository as I wished (with `one_file_system: true`, though that's not strictly necessary), then used the command below to back up the entirety of a btrfs subvolume snapshot more or less 1:1 with a given archive:

SUBVOL='backup/home-20230101'; borgmatic create --override location.source_directories='[.]' location.working_directory="/mnt/btrfs/${SUBVOL}" storage.archive_name_format="${SUBVOL##*/}"

where `/mnt/btrfs` is a mount of the top-level subvolume (ID 5). Works great.
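
The same invocation could be assembled from a script, for example to loop over several snapshots. This is only a sketch: `borgmatic_create_command` is a made-up helper, the `--override` option names are copied verbatim from the command above and may differ between borgmatic versions, and the archive name is the last path component of the subvolume, mirroring `${SUBVOL##*/}`.

```python
from pathlib import PurePosixPath


def borgmatic_create_command(subvol, btrfs_mount="/mnt/btrfs"):
    """Build the argument list for backing up one snapshot as one archive.

    `subvol` is the snapshot path relative to `btrfs_mount`,
    e.g. "backup/home-20230101".
    """
    working_directory = PurePosixPath(btrfs_mount) / subvol
    archive_name = PurePosixPath(subvol).name  # same as ${SUBVOL##*/}
    return [
        "borgmatic",
        "create",
        "--override",
        "location.source_directories=[.]",
        f"location.working_directory={working_directory}",
        f"storage.archive_name_format={archive_name}",
    ]
```

Passing the list to `subprocess.run(..., check=True)` avoids shell quoting, which is why the `[.]` value carries no quotes here.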

witten added the new feature area label 2023-06-28 18:53:39 +00:00
Reference: borgmatic-collective/borgmatic#251