#261 ZFS filesystem snapshotting

Open
opened 2 months ago by witten · 15 comments
witten commented 2 months ago

What I'm trying to do and why

Much like with #80, this is a ticket for borgmatic to provide filesystem snapshotting: create a read-only snapshot before backups, pass that snapshot to Borg, and then remove / clean up the snapshot afterwards. This ticket is for ZFS support, in particular.

witten added this to the filesystem snapshots milestone 2 months ago
jucor_ commented 1 month ago

This would be absolutely amazing :) Thanks for considering it! Happy to beta-test it, too.

jucor_ commented 1 month ago

In the meantime, could that be achieved via [hooks](https://torsion.org/borgmatic/docs/how-to/add-preparation-and-cleanup-steps-to-backups/)?

witten commented 1 month ago
Owner

Yup, hooks would be a totally valid work-around, assuming you know the right snapshotting incantations.

jucor_ commented 4 weeks ago

I'll have a stab as soon as I have time, and report here. Should be simple-ish, as long as we do not try to have nice error checking ;)

The only downside of hooks is the redundancy: each path has to be specified as a location to back up, and its ZFS volume has to be specified in the snapshot hook. Automatic detection of the volumes from the paths would be nicer. There will be room for automation :)

jucor_ commented 3 weeks ago

Done with hooks, it works, but there are some downsides.

Explanations

Here is the setup, using the ZFS terminology:

  • I have a “pool” pool0 containing a dataset data.
  • That dataset is mounted in /mnt/mirrored/.
  • I can therefore create snapshots of /mnt/mirrored/ with zfs snapshot pool0/data@snapshot_name.
  • This new snapshot is then visible in /mnt/mirrored/.zfs/snapshot/snapshot_name. Note that /mnt/mirrored/.zfs is a special ZFS directory that does not appear in listings (e.g. it does not show in ls -a /mnt/mirrored/), so borg does not see it when walking the tree and there is no need to exclude it explicitly.
  • We can create that snapshot in the hook.
  • We can then back up that snapshot.
  • And delete that snapshot when we are done (see the command-line sketch just below).
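
For illustration, here is a minimal command-line sketch of those manual steps (using the pool, dataset, and mountpoint names above; it needs sufficient ZFS privileges):

```sh
# Create a read-only snapshot of the dataset, addressed by its ZFS path, not its mountpoint.
zfs snapshot pool0/data@snapshot_name

# The snapshot's contents are now browsable under the hidden .zfs directory.
ls /mnt/mirrored/.zfs/snapshot/snapshot_name

# Remove the snapshot once the backup is done.
zfs destroy pool0/data@snapshot_name
```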

Resulting config.yaml

location:
  source_directories:
    # Path where the snapshot appears
    - /mnt/mirrored/.zfs/snapshot/borg-ongoing

hooks:
  before_backup:
    # Take a snapshot of /mnt/mirrored, referring to it by its full ZFS path, not by its mountpoint.
    # IMPORTANT: Try to delete any existing snapshot of that name first, just in case of leftovers from aborted backups
    - zfs destroy pool0/data@borg-ongoing 2> /dev/null; zfs snapshot pool0/data@borg-ongoing  

  after_backup:
    - zfs destroy pool0/data@borg-ongoing                                                                                                                                                            

Upsides

  • It seems to work \o/

Downsides

  • No idea how to avoid having the .zfs/snapshot/borg-ongoing/ part of the path in the archive. It is ugly, but I could not find a way to process paths in Borg pre-backup. Any idea?
  • The config file needs to use both the ZFS path (pool0/data) in the hook and its mount path in source_directories. Potentially confusing.
  • The paths are not automatically inferred. Not ideal.
witten commented 3 weeks ago
Owner

I appreciate you doing some useful R&D on this!

> No idea how to avoid having the .zfs/snapshot/borg-ongoing/ part of the path in the archive. It is ugly, but I could not find a way to process paths in Borg pre-backup. Any idea?

Would it be possible to do a bind mount of /mnt/mirrored/.zfs/snapshot/borg-ongoing to appear to be mounted at /mnt/mirrored/, but only within the borgmatic process (so that it doesn't affect other processes on the system)?

Here's a contrived example using unshare such that a bind mount is only visible to a single process: https://projects.torsion.org/witten/borgmatic/issues/80#issuecomment-2072
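
To give a rough idea of the mechanism, here is a minimal sketch along those lines (assuming the snapshot path from earlier in this thread; run as root):

```sh
# Run a command in its own mount namespace, so the bind mount it creates is
# invisible to the rest of the system (unshare defaults to private mount
# propagation when creating a new mount namespace).
unshare --mount sh -c '
  mount --bind /mnt/mirrored/.zfs/snapshot/borg-ongoing /mnt/mirrored
  ls /mnt/mirrored   # this shell sees the snapshot contents at the original path
'
# Other processes still see the live filesystem at /mnt/mirrored.
```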

> The paths are not automatically inferred. Not ideal.

Meaning that the ZFS path can't be inferred from the mount path? Is there any ZFS command-line utility that would allow us to parse that out at runtime? (I'm thinking ahead to built-in borgmatic ZFS snapshotting, not just doing it with hooks.)

jucor_ commented 3 weeks ago

Very happy to help with the R&D: borgmatic is so useful, least I can do :)

Bind mount

Oh, I like this idea. Where would we declare the bind mount in the config file so that it carries to the call to borg? Would it carry from the hooks to the borg call?

ZFS mount path inference

Yup, I'm right there with you, dreaming of that feature :) Easily feasible with zfs list:

julien@supercompute:cron.d$ zfs list 
NAME         USED  AVAIL  REFER  MOUNTPOINT   
pool0       14.3G  2.62T    96K  /mnt/pool0
pool0/data  14.2G  2.62T  9.16G  /mnt/mirrored   
pool1       20.1G  2.61T    96K  /mnt/pool1
pool1/data  20.0G  2.61T  7.22G  /mnt/unmirrored 
julien@supercompute:cron.d$   

To make it easier to parse, from [the official doc](https://docs.oracle.com/cd/E18752_01/html/819-5461/gazsu.html):

> You can use the -H option to omit the zfs list header from the generated output. With the -H option, all white space is replaced by the Tab character. This option can be useful when you need parseable output, for example, when scripting. The following example shows the output generated from using the zfs list command with the -H option.

Hence the easy-to-parse, tab-separated result below:

julien@supercompute:cron.d$ zfs list -H -o name,mountpoint   
pool0   /mnt/pool0 
pool0/data      /mnt/mirrored 
pool1   /mnt/pool1
pool1/data      /mnt/unmirrored 
julien@supercompute:cron.d$   
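
For example, mapping a mount path back to its dataset could then be a one-liner along these lines (a sketch; it assumes the given path matches a dataset's mountpoint exactly):

```sh
# Print the ZFS dataset whose mountpoint is exactly /mnt/mirrored.
zfs list -H -o name,mountpoint | awk -v mp=/mnt/mirrored '$2 == mp { print $1 }'
# -> pool0/data
```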

Update on the hooks

I need to amend the hooks above. The zfs destroy in the pre- hook was a bad idea: if another borgmatic/borg invocation has not yet finished, zfs destroy will ruin it, because the hook (and hence the snapshot destruction) runs before borg fails to obtain a lock on the repository.

So no preemptive clean-up should be done. If someone kills borgmatic mid-way without the post- hook being called, that person is responsible for destroying the snapshot too. Otherwise, the pre- hook will just fail and throw an error.

Ideally, a snapshot would have a UUID determined in the pre- hook, and the source_directories and the post- hook would use that UUID too, but that would require using variables. Definitely something for the proper borgmatic feature, though!

TLDR:

location:
  source_directories:
    # Path where the snapshot appears
    - /mnt/mirrored/.zfs/snapshot/borg-ongoing

hooks:
  before_backup:
    # Take a snapshot of /mnt/mirrored, referring to it by its full ZFS path, not by its mountpoint.
    # IMPORTANT: If a snapshot of the same name is already in place, due to a still-running backup *or* to an interrupted prior backup, this hook will throw an error, and ask the user to clean up the mess.
    - zfs snapshot pool0/data@borg-ongoing  

  after_backup:
    - zfs destroy pool0/data@borg-ongoing    
witten commented 3 weeks ago
Owner

> Where would we declare the bind mount in the config file so that it carries to the call to borg? Would it carry from the hooks to the borg call?

Today, with hooks, you could make the bind mount in the before_backup hook and unmount it in the after_backup hook. In the future world where ZFS snapshotting support is built into borgmatic, it would hopefully not even require configuration: borgmatic would just do it transparently for any source directory enabled for snapshotting.
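
For concreteness, a hedged sketch of what those hook commands might look like (hypothetical; as discussed below, a bind mount made from a plain hook like this is visible system-wide):

```sh
# before_backup: make the snapshot's contents appear at the original path
mount --bind /mnt/mirrored/.zfs/snapshot/borg-ongoing /mnt/mirrored

# after_backup: undo the bind mount before destroying the snapshot
umount /mnt/mirrored
```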

I'm glad to hear that there's a way to get the ZFS path from the mount path! Thanks for describing that.

> The zfs destroy in the pre- hook was a bad idea: if another borgmatic/borg invocation has not yet finished, zfs destroy will ruin it, because the hook (and hence the snapshot destruction) runs before borg fails to obtain a lock on the repository.

This ticket may be relevant to your interests: #250

> So no preemptive clean-up should be done. If someone kills borgmatic mid-way without the post- hook being called, that person is responsible for destroying the snapshot too. Otherwise, the pre- hook will just fail and throw an error.

Note that there is a borgmatic on_error hook too. You could put cleanup there.

jucor_ commented 3 weeks ago

Mount: Mmm, I'm not sure I understand, sorry. How do you arrange that unshare in the before_backup hook makes the bind mount only visible to borgmatic and the programs it calls? Then how do you undo it in the after_backup hook before calling zfs destroy?

Locking: #250 would solve it indeed, thanks! That'll be very useful when it's merged :)

Cleanup in on_error: I think that has the same problem as after_backup. Let's say borgmatic instance B1 creates the snapshot and starts a long run. Meanwhile, instance B2 starts, fails to create a snapshot (or even ignores that error), and then fails to acquire the lock on the repo. It then goes into its on_error hook, and there destroys the snapshot. B1 is suddenly left without a source. #250-like locking is really the only mechanism I can think of.

witten commented 3 weeks ago
Owner

> Mount: Mmm, I'm not sure I understand, sorry. How do you arrange that unshare in the before_backup hook makes the bind mount only visible to borgmatic and the programs it calls? Then how do you undo it in the after_backup hook before calling zfs destroy?

Ah, yes... This approach won't work with hooks as written without making the bind mount also visible to other processes (which would defeat the purpose). It would have to be baked into an eventual borgmatic filesystem snapshotting feature, such that when borgmatic invokes Borg, it wraps it with an unshare call to create any bind mounts.

Regarding cleanup: I'm not that familiar with ZFS, but would it be possible to create uniquely named snapshots, so that no borgmatic invocation impacts another?

jucor_ commented 3 weeks ago

Thanks for confirming about the hooks, that's what I feared.

For cleanup: yes, totally feasible, and it is what I was suggesting when I mentioned UUIDs above: just generate a unique name (from a timestamp or anything, really) and change the name of the snapshot after the @ in the zfs calls.

However, I cannot find a way to generate and store a unique name at runtime in the YAML file so that both the before- and after- hooks use that same name. Do you know how? Otherwise, this too might need to wait for the eventual borgmatic filesystem snapshotting feature :/

witten commented 3 weeks ago
Owner

Yeah, I think that's getting beyond what you can do easily with YAML. I can imagine that you could call out to a shell script in the before_backup hook that writes its unique name to a temporary file, and then another script in the after_backup hook that reads from that file. But at that point, might as well implement the feature for real...
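
As a rough sketch of that temporary-file idea (hypothetical script contents and file location; the snapshot path in source_directories would still need to be known ahead of time, which is part of why a built-in feature makes more sense):

```sh
# Called from the before_backup hook: pick a unique snapshot name and record it.
name="borg-$(date +%Y%m%d-%H%M%S)-$$"
echo "$name" > /run/borgmatic-zfs-snapshot-name
zfs snapshot "pool0/data@$name"

# Called from the after_backup (or on_error) hook: read the name back and clean up.
name="$(cat /run/borgmatic-zfs-snapshot-name)"
zfs destroy "pool0/data@$name"
rm -f /run/borgmatic-zfs-snapshot-name
```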

jucor_ commented 3 weeks ago

Makes a lot of sense! How can I help in that regard?

witten commented 3 weeks ago
Owner

Thanks for asking. :)

Before implementation: Design discussion, as we've been doing here. On that front: Do you have an opinion on how the feature should determine which ZFS volumes should be snapshotted? Options include automatically introspecting source_directories to see if they reside on ZFS volumes (and using that fact to determine whether to snapshot), or just requiring the user to explicitly provide a list of volumes to snapshot. Discussed in other contexts here: #80 and #251.
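
As an illustration of the introspection option, here is a hedged sketch of one way to check a source directory at runtime (it assumes util-linux's findmnt is available):

```sh
# Report the filesystem type and the backing dataset of the mount containing a path.
findmnt -n -o FSTYPE,SOURCE --target /mnt/mirrored/some/subdir
# -> zfs  pool0/data   (when the directory resides on a ZFS dataset)
```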

During implementation: I'm more than happy to field PRs. :)

After implementation: Testing and feedback. I don't use ZFS myself, so getting feedback from actual users would be invaluable.

jucor_ commented 3 weeks ago

Introspecting source_directories would be great :) But if it is simpler to have the user specify the volumes manually, then I'd vote for having the manual version earlier rather than a dream solution in an unspecified future. Automatic introspection can always be added later. A bird in the hand, etc.

Happy to test too!
