ZFS filesystem snapshotting #261

Open
opened 2019-11-28 17:01:17 +00:00 by witten · 45 comments
Owner

What I'm trying to do and why

Much like with witten/borgmatic#80, this is a ticket for borgmatic to provide filesystem snapshotting: create a read-only snapshot before backups, pass that snapshot to Borg, and then remove / clean up the snapshot afterwards. This ticket is for ZFS support, in particular.

witten added this to the filesystem snapshots milestone 2019-12-05 21:13:35 +00:00

This would be absolutely amazing :) Thanks for considering it! Happy to beta-test it, too.


In the meantime, could that be achieved via hooks?

Author
Owner

Yup, hooks would be a totally valid work-around, assuming you know the right snapshotting incantations.


I'll have a stab at it as soon as I have time, and report here. Should be simple-ish, as long as we do not try to have nice error checking ;)

The only downside of hooks is the redundancy: each path has to be specified as a location to back up, and its ZFS volume has to be specified in the snapshot hook as well. Auto-detecting the volumes from the paths would be nicer. There is room for automation :)


Done with hooks, it works, but there are some downsides.

Explanations
Here is the setup, using the ZFS terminology:

  • I have a "pool" pool0 containing a dataset data.
  • That dataset is mounted in /mnt/mirrored/.
  • I can therefore create snapshots of /mnt/mirrored/ with zfs snapshot pool0/data@snapshot_name.
  • This new snapshot is then visible in /mnt/mirrored/.zfs/snapshot/snapshot_name. Note that /mnt/mirrored/.zfs is a special ZFS directory that does not appear in listings (e.g. it does not show up in ls -a /mnt/mirrored/), so borg does not see it when browsing the tree and there is no need to explicitly exclude it.
  • We can create that snapshot in the hook.
  • We can then back up that snapshot.
  • And delete that snapshot when we are done.

Resulting config.yaml

location:
  source_directories:
    # Path where the snapshot appears
    - /mnt/mirrored/.zfs/snapshot/borg-ongoing

hooks:
  before_backup:
    # Take a snapshot of /mnt/mirrored, referring to it by its full ZFS path, not by its mountpoint.
    # IMPORTANT: Try to delete any existing snapshot of that name first, just in case of leftovers from aborted backups
    - zfs destroy pool0/data@borg-ongoing 2> /dev/null; zfs snapshot pool0/data@borg-ongoing

  after_backup:
    - zfs destroy pool0/data@borg-ongoing                                                                                                                                                            

Upsides

  • It seems to work \o/

Downsides

  • No idea how to avoid having the .zfs/snapshot/borg-ongoing/ part of the path in the archive. It is ugly, but I could not find a way to process paths in Borg pre-backup. Any idea?
  • The config file needs to use both the ZFS (pool0/data) path in the hook, then its mount path in source_directories. Potentially confusing.
  • The paths are not automatically inferred. Not ideal.
Author
Owner

I appreciate you doing some useful R&D on this!

No idea how to avoid having the .zfs/snapshot/borg-ongoing/ part of the path in the archive. It is ugly, but I could not find a way to process paths in Borg pre-backup. Any idea?

Would it be possible to do a bind mount of /mnt/mirrored/.zfs/snapshot/borg-ongoing to appear to be mounted at /mnt/mirrored/, but only within the borgmatic process (so that it doesn't affect other processes on the system)?

Here's a contrived example using unshare such that a bind mount is only visible to a single process: witten/borgmatic#80 (comment)
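
For illustration, a contrived sketch along those lines (assuming root and the pool0/data layout from this thread, with the borg-ongoing snapshot name from your hooks) might look like:

```bash
# The bind mount exists only inside the unshared mount namespace, so other
# processes keep seeing the live dataset at /mnt/mirrored.
unshare --mount sh -c '
  mount --bind /mnt/mirrored/.zfs/snapshot/borg-ongoing /mnt/mirrored
  ls /mnt/mirrored    # inside the namespace: the snapshot contents, at the "real" path
'
ls /mnt/mirrored      # outside the namespace: the live dataset, no bind mount in sight
```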

The paths are not automatically inferred. Not ideal.

Meaning that the ZFS path can't be inferred from the mount path? Is there any ZFS command-line utility that would allow us to parse that out at runtime? (I'm thinking ahead to built-in borgmatic ZFS snapshotting, not just doing it with hooks.)


Very happy to help with the R&D: borgmatic is so useful, least I can do :)

Bind mount

Oh, I like this idea. Where would we declare the bind mount in the config file so that it carries to the call to borg? Would it carry from the hooks to the borg call?

ZFS mount path inference

Yup, I'm right there with you, dreaming of that feature :) Easily feasible with zfs list:

julien@supercompute:cron.d$ zfs list 
NAME         USED  AVAIL  REFER  MOUNTPOINT   
pool0       14.3G  2.62T    96K  /mnt/pool0
pool0/data  14.2G  2.62T  9.16G  /mnt/mirrored   
pool1       20.1G  2.61T    96K  /mnt/pool1
pool1/data  20.0G  2.61T  7.22G  /mnt/unmirrored 
julien@supercompute:cron.d$   

To make it easier to parse, from the official doc:

You can use the -H option to omit the zfs list header from the generated output. With the -H option, all white space is replaced by the Tab character. This option can be useful when you need parseable output, for example, when scripting. The following example shows the output generated from using the zfs list command with the -H option

Hence the easy-to-parse, tab-separated result below:

julien@supercompute:cron.d$ zfs list -H -o name,mountpoint   
pool0   /mnt/pool0 
pool0/data      /mnt/mirrored 
pool1   /mnt/pool1
pool1/data      /mnt/unmirrored 
julien@supercompute:cron.d$   
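
So a hook script (or eventually borgmatic itself) could infer the dataset backing a given source path by picking the dataset whose mountpoint is the longest prefix of that path. A minimal sketch, assuming zfs is on the PATH and ignoring the edge case of a dataset mounted at / itself (the function name is just for illustration):

```bash
#!/usr/bin/env bash
# Map a path to the ZFS dataset that contains it, by choosing the dataset
# whose mountpoint is the longest prefix of the path.
path_to_dataset() {
  local path=$1 best_name='' best_mount=''
  while IFS=$'\t' read -r name mount; do
    # Skip datasets that aren't mounted at a regular path.
    if [ "$mount" = "none" ] || [ "$mount" = "legacy" ] || [ "$mount" = "-" ]; then
      continue
    fi
    # Is "$mount" equal to "$path", or a prefix of it followed by a slash?
    if [ "$path" = "$mount" ] || [ "${path#"$mount"/}" != "$path" ]; then
      if [ "${#mount}" -gt "${#best_mount}" ]; then
        best_name=$name
        best_mount=$mount
      fi
    fi
  done < <(zfs list -H -o name,mountpoint)
  [ -n "$best_name" ] && echo "$best_name"
}

path_to_dataset /mnt/mirrored/some/file   # prints pool0/data with the layout above
```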

Update on the hooks

I need to amend the hooks above. The zfs destroy in the pre- hook was a bad idea: if there is already a borgmatic/borg invocation that has not yet finished, zfs destroy will ruin it, because the hook (and hence the snapshot destruction) will be called before borg fails to obtain a lock on the repository.

So no preemptive clean-up should be done. If someone kills borgmatic mid-way without the post- hook being called, that person is responsible for destroying the snapshot too. Otherwise, the pre- hook will just fail and throw an error.

Ideally, a snapshot would have a UUID determined in the pre- hook, and the source_directories and the post- hook would use that UUID too, but that would require using variables. Definitely something for the proper borgmatic feature, though!

TLDR:

location:
  source_directories:
    # Path where the snapshot appears
    - /mnt/mirrored/.zfs/snapshot/borg-ongoing

hooks:
  before_backup:
    # Take a snapshot of /mnt/mirrored, referring to it by its full ZFS path, not by its mountpoint.
    # IMPORTANT: If a snapshot of the same name is already in place, due to a still-running backup *or* to an interrupted prior backup, this hook will throw an error, and ask the user to clean up the mess.
    - zfs snapshot pool0/data@borg-ongoing  

  after_backup:
    - zfs destroy pool0/data@borg-ongoing    
Author
Owner

Where would we declare the bind mount in the config file so that it carries to the call to borg? Would it carry from the hooks to the borg call?

Today, with hooks, you could make the bind mount in the before_backup hook and unmount it in the after_backup hook. In a future world where ZFS snapshotting support is built into borgmatic, it would hopefully not even require configuration: borgmatic would just do it transparently for any source directory enabled for snapshotting.

I'm glad to hear that there's a way to get the ZFS path from the mount path! Thanks for describing that.

The zfs destroy in the pre- hook was a bad idea: if there is already a borgmatic/borg invocation that has not yet finished, zfs destroy will ruin it, because the hook (and hence the snapshot destruction) will be called before borg fails to obtain a lock on the repository.

This ticket may be relevant to your interests: witten/borgmatic#250

So no preemptive clean-up should be done. If someone kills borgmatic mid-way without the post- hook being called, that person is responsible for destroying the snapshot too. Otherwise, the pre- hook will just fail and throw an error.

Note that there is a borgmatic on_error hook too. You could put cleanup there.


Mount: Mmm, I'm not sure I understand, sorry. How do you arrange for unshare in the before_backup hook to make the bind mount visible only to borgmatic and the programs it calls? And how do you deactivate it in the after_backup hook before calling zfs destroy?

Locking: #250 would solve it indeed, thanks! That'll be very useful when it's merged :)

Cleanup in on_error: I think that has the same problem as after_backup. Let's say borgmatic instance B1 creates the snapshot and starts a long run. Meanwhile, instance B2 starts, fails to create a snapshot (or even ignores that error), and then fails to acquire the lock on the repo. It then goes into its on_error hook, and there destroys the snapshot. Then B1 is suddenly left without a source. #250-like locking is really the only mechanism I can think of.

Author
Owner

Mount: Mmm, I'm not sure I understand, sorry. How do you arrange for unshare in the before_backup hook to make the bind mount visible only to borgmatic and the programs it calls? And how do you deactivate it in the after_backup hook before calling zfs destroy?

Ah, yes... This approach won't work with hooks as written without making the bind mount also visible to other processes (which would defeat the purpose). It would have to be baked into an eventual borgmatic filesystem snapshotting feature, such that when borgmatic invokes Borg, it wraps it with an unshare call to create any bind mounts.

In regards to cleanup: I'm not that familiar with ZFS, but would it be possible to create uniquely named snapshots, so that no borgmatic invocation impacts another?


Thanks for confirming about the hooks, that's what I feared.

For cleanup: yes, totally feasible and what I was suggesting when I mentioned UUIDs above: just generate a unique name (from a timestamp or anything, really) and change the name of the snapshot after the @ in the zfs calls.

However, I cannot find a way in YAML to generate and store the unique name at runtime so that both the before- and after- hooks can use that same unique name. Do you know how to? Otherwise, this also might need to be done in the eventual borgmatic filesystem snapshotting feature :/

Author
Owner

Yeah, I think that's getting beyond what you can do easily with YAML. I can imagine that you could call out to a shell script in the before_backup hook that writes its unique name to a temporary file, and then another script in the after_backup hook that reads from that file. But at that point, might as well implement the feature for real...
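
For the record, such a pair of scripts could be as small as the sketch below; the dataset name and state-file path are made up for illustration:

```bash
#!/usr/bin/env bash
# Hypothetical helper called as "zfs-snap pre" from before_backup and
# "zfs-snap post" from after_backup: the pre step writes the unique snapshot
# name to a state file, and the post step reads it back to destroy it.
set -e
dataset="pool0/data"                               # assumption: dataset to snapshot
state_file="/run/borgmatic-zfs-snapshot.name"      # assumption: where the name is stored

case "$1" in
  pre)
    snap="borgmatic-$(date -u +%Y%m%d%H%M%S)-$$"   # unique per run
    zfs snapshot "${dataset}@${snap}"
    echo "$snap" > "$state_file"
    ;;
  post)
    snap=$(cat "$state_file")
    zfs destroy "${dataset}@${snap}"
    rm -f "$state_file"
    ;;
  *)
    echo "usage: $0 pre|post" >&2
    exit 1
    ;;
esac
```

Of course, source_directories would still have to point at the changing snapshot path, which is exactly the YAML limitation discussed above, so this mainly illustrates why doing it inside borgmatic would be cleaner.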


Makes a lot of sense! How can I help in that regard?

Author
Owner

Thanks for asking. :)

Before implementation: Design discussion, as we've been doing here. On that front: Do you have an opinion on how the feature should determine which ZFS volumes should be snapshotted? Options include automatically introspecting source_directories to see if they reside on ZFS volumes (and using that fact to determine whether to snapshot), or just requiring the user to explicitly provide a list of volumes to snapshot. Discussed in other contexts here: witten/borgmatic#80 and witten/borgmatic#251.

During implementation: I'm more than happy to field PRs. :)

After implementation: Testing and feedback. I don't use ZFS myself, so getting feedback from actual users would be invaluable.


Introspecting source_directories would be great :) But if it is simpler to have the user specify it manually, then I'd vote for having the manual version earlier rather than a dream solution in an unspecified future. Automatic introspection can always be added later. A bird in the hand etc etc.

Happy to test too!


Sorry for the messy/bad code, but here is a small script that has worked fine so far when I tried it out.
I've added a few comments about what is going on; I hope that gives you an idea of how a ZFS feature could be implemented.

#!/bin/bash

# Name used for the recursive ZFS snapshot.
snapshotname="borgmatic-snapshot"
# Root directory under which the snapshots are exposed (via bind mounts) for borg.
mountpoint="/mnt"
zfs_path="/usr/sbin/zfs"

# Collect all ZFS mountpoints (header stripped, leading/trailing slash removed).
# This assumes datasets are mounted at paths matching their names (e.g. pool/data
# at /pool/data), since the entries double as dataset names below. Note that
# readarray keeps a trailing newline in each element, hence the ${var/$'\n'/} below.
readarray zfs_mountpoints <<< $(${zfs_path} list -o mountpoint | tail -n +2 | sed 's/\/$//; s/^\///')


if [ "${1}" == "pre" ]
then
        # Recursively snapshot the first entry (the pool), then bind mount each
        # dataset's snapshot under ${mountpoint}, mirroring the original layout.
        ${zfs_path} snapshot -r "${zfs_mountpoints[0]/$'\n'/}@${snapshotname}"
        for i in ${zfs_mountpoints[@]}
        do
                mkdir --parent "${mountpoint}/${i}"
                mount --bind "/${i}/.zfs/snapshot/${snapshotname}" "${mountpoint}/${i}"
        done
elif [ "${1}" == "post" ]
then
        # Unmount the snapshot bind mounts, remove the created directories and
        # destroy the recursive snapshot. (Note: `umount -r` retries read-only
        # on failure; `umount -R` is the recursive variant.)
        umount -r "${mountpoint}/${zfs_mountpoints[0]/$'\n'/}"
        for i in ${zfs_mountpoints[@]}
        do
                rm -r "${mountpoint}/${i}"
        done
        ${zfs_path} destroy -r "${zfs_mountpoints[0]/$'\n'/}@${snapshotname}"
fi

Note: for some reason the delete won't work when the pre job is run twice without a delete in between.
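
If it helps to debug that case, one thing worth checking (just a guess: leftover or stacked mounts keeping the snapshot busy) is whether anything is still mounted from the snapshot before the destroy runs:

```bash
# A snapshot cannot be destroyed while it is still mounted somewhere, so list
# everything mounted under the bind-mount root and any remaining snapshot mounts.
findmnt -R /mnt
mount | grep '@borgmatic-snapshot' || echo "no snapshot mounts left"
```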


Hey, anything I can do to progress this? :)

Author
Owner

Pull requests are always appreciated. 😄 But short of that, vetting the script that @floriplum posted for suitable "before" and "after" commands would aid development. Does it look like it would work for your use case, or is it missing some edge cases? I don't use ZFS myself, so review by actual ZFS users is helpful.


Yes, this would work for my use case, command-wise.

Something as simple as:

Pre hook - create a ZFS snapshot. Store the name in a variable, or in addition create a static mount to a separate directory.

Backup - pass the snapshot name to borgmatic via an environment variable to help in building the path inside the special .zfs directory, or just specify the new static snapshot mount.

Post hook - delete the ZFS snapshot using the variable, or unmount and then delete.

Hope this makes sense. I like the variable option better, as it means borgmatic can just dynamically use the new snapshot no matter the name, but I'm not sure how I would pass this to borgmatic.

witten added the new feature area label 2023-06-28 18:53:22 +00:00

Hi,

I'm just testing the following setup, which seems to work quite well (Debian 12.2 bookworm, borgmatic 1.7.7, package version 1.7.7-1):

On my (source) machine I have three ZFS pools: rpool (root FS /), bpool (for /boot) and dpool (production data, say /srv). Each ZFS pool defines a number of datasets. The backup plan is: back up everything from these pools as a single hierarchy of files, as it's seen on the source machine.

I have created a bash script, call it borgmatic-hook, which gets invoked for every hook (the hook name is passed as a positional argument to the script). ZFS snapshotting and mounting is handled by the before_everything hook; snapshot unmounting and deletion is handled by the after_everything hook. Note that these hooks are invoked only if the borg create task is included in the list of tasks to be executed by the given borgmatic run (job). Also, after_everything seems to be invoked even in case of a backup error, so there is no special need for cleaning up snapshots in the on_error hook.

Every borgmatic run (job run) gets assigned an individual job-id. It also has a job-name. I use relative source paths + working_directory. The job-id is generated for every run (based on date-time + a random number). The job-name is defined by the admin for a given job type (e.g. nightly-backup-of-everything). During a single job run, each hook receives the assigned job-id, so we have a context. All the snapshots for a given job run are named after its job-id, for example rpool/var/lib@borgmatic-${job-id}. The working_directory is named after the job-name (I've observed that changing the working_directory from one run to another caused the backup to take much longer than when the same name was used in all runs).

So, my cron entry looks like this:

25 20 * * * root PATH=$PATH:/usr/bin:/usr/local/bin:/usr/sbin:/usr/local/sbin BORGMATIC_JOB_NAME='nightly-all' BORGMATIC_JOB_ID="`/usr/local/sbin/borgmatic-new-job-id`" /usr/bin/borgmatic --config /etc/borgmatic/backup-all.yaml --verbosity -1 --syslog-verbosity 1

The borgmatic-new-job-id looks like this:

#!/usr/bin/env sh

set -e

rand=$(nawk 'BEGIN { srand(); print int(0x1000000 * rand()); }')

printf '%s-%06x\n' "$(date -u +%Y%m%d-%H%M%S-%N)" "$rand"

Example ID generated by this script:

ptomulik@barakus:$ borgmatic-new-job-id 
20231112-215506-522047870-cb81fd

Essential parts of /etc/borgmatic/backup-all.yaml:

location:
    source_directories:
        - .

    repositories:
        - ssh://storagebox/./path/to/repo

    working_directory: "/borgmatic/jobs/${BORGMATIC_JOB_NAME:-backup}"

# ...

hooks:
    before_actions:
        - borgmatic-hook -z "rpool,bpool,dpool" -w "/borgmatic/jobs/${BORGMATIC_JOB_NAME:-backup}" -n "${BORGMATIC_JOB_NAME:-backup}" -i "${BORGMATIC_JOB_ID:-default}" -c "{configuration_filename}" -r "{repository}" before_actions

    before_backup:
        - borgmatic-hook -z "rpool,bpool,dpool" -w "/borgmatic/jobs/${BORGMATIC_JOB_NAME:-backup}" -n "${BORGMATIC_JOB_NAME:-backup}" -i "${BORGMATIC_JOB_ID:-default}" -c "{configuration_filename}" -r "{repository}" before_backup

# and so on, for all the other hooks, including these two, that are of our main interest:

    before_everything:
        - borgmatic-hook -z "rpool,bpool,dpool" -w "/borgmatic/jobs/${BORGMATIC_JOB_NAME:-backup}" -n "${BORGMATIC_JOB_NAME:-backup}" -i "${BORGMATIC_JOB_ID:-default}" -c "{configuration_filename}" before_everything

    after_everything:
        - borgmatic-hook -z "rpool,bpool,dpool" -w "/borgmatic/jobs/${BORGMATIC_JOB_NAME:-backup}" -n "${BORGMATIC_JOB_NAME:-backup}" -i "${BORGMATIC_JOB_ID:-default}" -c "{configuration_filename}" after_everything

The borgmatic-hook script, in its current form, is the following:

#!/usr/bin/env bash

set -e

help() {
  cat <<! >&2
Usage: borgmatic-hook [OPTIONS] HOOKNAME

  Borgmatic hook script.

Options:

  -i|--job-id ID

      The identifier of the running borgmatic job. The provided ID shall be
      different for different borgmatic runs (shall be unique across all runs),
      but shall remain the same for all hooks invoked during a given run.

      You may use the bash script named "borgmatic-new-job-id" to generate a new ID.

      The allowed syntax for ID is quite restrictive: it must consist of
      alphanumeric strings separated by one of '-', '_', or '.'.

  -n|--job-name JOBNAME

      The name of the running borgmatic job. The job name shall remain the same
      for all runs for a given config.

      The allowed syntax for JOBNAME is quite restrictive: it must consist of
      alphanumeric strings separated by one of '-', '_', or '.'.

  -c|--config PATH

      The borgmatic config file used for the current run.

  -w|--working-directory PATH

      If given, the script will maintain a working directory. The working
      directory specified in PATH will be created in the "before_everything" hook
      and deleted (if empty) in the "after_everything" hook.

  -m|--mount-point PATH

      Root directory for recursive mounts. This is used, for example, as a root
      directory for a hierarchy of filesystems to be backed up as a whole.
      For example, if '--zfs-pools' is provided, then all mountable (canmount
      != 'off') datasets from these pools are snapshotted and mounted
      recursively under PATH to resemble the original hierarchy from snapshots.

      If --mount-point is not given, it defaults to the --working-directory PATH,
      if given, or to "/borgmatic/jobs/\${JOBNAME}", with \${JOBNAME} being
      the value of the --job-name option.

  -r|--repo REPOSITORY

      Borg backup repository name.

  -z|--zfs-pools POOLS

      Comma or space separated list of ZFS pools to be recursively mounted to
      the path specified by --mount-point (or to the working dir, by default).

  -V|--verbose

      Print info messages to stdout.

  -C|--color

      Use colors for info, warning and error messages.

  -h|--help

      Print this help

Supported HOOKNAME values:

    * before_actions, after_actions
    * before_backup, after_backup
    * before_prune, after_prune
    * before_compact, after_compact
    * before_check, after_check
    * before_extract, after_extract
    * before_everything, after_everything
    * on_error

!
}

info() {
  if $opt_verbose; then
    if $opt_color; then
      echo -e "\033[0;92mborgmatic-hook $hook: info: $@\033[0m"
    else
      echo "borgmatic-hook $hook: info: $@"
    fi
  fi
}

warning() {
  if $opt_color; then
    echo -e "\033[0;33mborgmatic-hook $hook: warning: $@\033[0m" >&2
  else
    echo "borgmatic-hook $hook: warning: $@" >&2
  fi
}

error() {
  if $opt_color; then
    echo -e "\033[0;31mborgmatic-hook $hook: error: $@\033[0m" >&2
  else
    echo "borgmatic-hook $hook: error: $@" >&2
  fi
}

exec_lines() {
  while IFS= read line; do info "$line" && $line; done
}

zfs_pools_snap_create_and_mount() {
  [ $# -gt 0 ] || return

  for pool in $@; do
    info "ensure that '${pool}' has no datasets colliding with '@${snap_id}'"
    if datasets=$(zfs list -r -H -o name -t all $pool | grep "@${snap_id}\$"); then
      for ds in $datasets; do
        error "ZFS dataset '$ds' already exists"
      done
      return 1;
    fi
  done

  if [ -e "${mount_point}" ]; then
    if [ ! -d "${mount_point}" ]; then
      error "'${mount_point}': not a directory"
      return 1
    fi
    if [ ! -z "$(ls -A ${mount_point})" ]; then
      error "'${mount_point}': not an empty directory"
      return 1
    fi
    if mounted=$(findmnt -n -osource "${mount_point}"); then
      error "'${mounted}' already mounted to '${mount_point}'"
      error "previous backup for still in progress?"
      return 1
    fi
  fi

  for pool in $@; do
    info "zfs snapshot -r '${pool}@${snap_id}'"
    zfs snapshot -r "${pool}@${snap_id}"
  done

  # The mount_point dir gets created after ZFS snapshot.
  # This way we avoid including our (possibly temporary) directory in the
  # snapshots and backups.
  if [ ! -e "${mount_point}" ]; then
    info "mkdir -p '${mount_point}' && chmod 700 '${mount_point}'"
    mkdir -p "${mount_point}" && chmod 700 "${mount_point}"
  fi

  zfs list -rH -o name,mountpoint,canmount -s mountpoint "$@" | \
    awk -v snap_id="${snap_id}" \
        -v mount_point="${mount_point}" \
        '$3 != "off" {print "mount -t zfs -o ro "$1"@"snap_id" "mount_point""$2}' | \
    exec_lines
}

zfs_pools_snap_umount_and_delete() {
  if mount_source=$(findmnt -n -osource "${mount_point}"); then
    if dataset_type=$(zfs get -H -ovalue type "${mount_source}"); then
      if [ "${dataset_type}" == "snapshot" ]; then
        info "umount -R ${mount_point}"
        umount -R "${mount_point}"
      else
        warning "${mount_source} is not a zfs snapshot, skipping 'umount -R ${mount_point}'"
      fi
    fi
  fi

  [ $# -gt 0 ] || return

  for pool in $@; do
    info "zfs destroy -r '${pool}@${snap_id}'"
    zfs destroy -r "${pool}@${snap_id}" || true
  done
}

shortopts=i:n:c:r:w:m:z:hCV
longopts=job-id:,job-name:,config:,repo:,working-directory:,mount-point:,zfs-pools:,help,color,verbose
if ! options=$(getopt -o "${shortopts}"  -l "${longopts}" -- "$@"); then
  exit 1
fi
unset shortopts
unset longopts

eval set -- "$options"

opt_job_id=''
opt_job_id_re='^\([a-zA-Z0-9]\+\)\([_\.-][a-zA-Z0-9]\+\)*$'
opt_job_name=''
opt_job_name_re='^\([a-zA-Z0-9]\+\)\([_\.-][a-zA-Z0-9]\+\)*$'
opt_config=''
opt_repo=''
opt_working_directory=''
opt_mount_point=''
opt_zfs_pools=''
opt_zfs_pools_re='^\([a-zA-Z][a-zA-Z0-9_:\.-]*\)\(\(\(,\? *\)\| \+\)[a-zA-Z][a-zA-Z0-9_:\.-]\+\)*$'
opt_verbose=false
opt_color=false

while [ $# -gt 0 ]; do
  case "$1" in
    -h|--help)
      help; exit 0;;
    -i|--job-id)
      if ! echo "$2" | grep -q "${opt_job_id_re}"; then
        error "malformed value for $1: '$2'"
        exit 1
      fi
      opt_job_id="$2"; shift 2;;
    -n|--job-name)
      if ! echo "$2" | grep -q "${opt_job_name_re}"; then
        error "malformed value for $1: '$2'"
        exit 1
      fi
      opt_job_name="$2"; shift 2;;
    -c|--config)
      opt_config="$2"; shift 2;;
    -r|--repo)
      opt_repo="$2"; shift 2;;
    -w|--working-directory)
      opt_working_directory="$2"; shift 2;;
    -m|--mount-point)
      opt_mount_point="$2"; shift 2 ;;
    -z|--zfs-pools)
      if ! echo "$2" | grep -q "${opt_zfs_pools_re}"; then
        error "malformed value for $1: '$2'"
        exit 1
      fi
      opt_zfs_pools="$2"; shift 2;
      ;;
    -V|--verbose)
      opt_verbose=true; shift;;
    -C|--color)
      opt_color=true; shift;;
    --)
      shift; break;;
    -*)
      help; echo ""; error "unrecognized option $1"; exit 1;;
    *)
      break;;
  esac
done

if ! [ $# -eq 1 ]; then
  help; exit 1
fi

job_id=${opt_job_id:-default}
job_name=${opt_job_name:-backup}

default_mount_point="/borgmatic/jobs/${job_name}"

config=${opt_config}
repo=${opt_repo}
working_directory=${opt_working_directory}
mount_point=${opt_mount_point:-${working_directory:-${default_mount_point}}}
zfs_pools=$(echo "${opt_zfs_pools}" | sed -e 's/[^a-zA-Z0-9_:\.-]\+/ /g')

hook=$1
snap_id="borgmatic-${job_id}"

info "script started"

case "$hook" in
    before_actions)
        info "nothing to be done"
    ;;
    before_backup)
        info "nothing to be done"
    ;;
    before_prune)
        info "nothing to be done"
    ;;
    before_compact)
        info "nothing to be done"
    ;;
    before_check)
        info "nothing to be done"
    ;;
    before_extract)
        info "nothing to be done"
    ;;
    after_backup)
        info "nothing to be done"
    ;;
    after_compact)
        info "nothing to be done"
    ;;
    after_prune)
        info "nothing to be done"
    ;;
    after_check)
        info "nothing to be done"
    ;;
    after_extract)
        info "nothing to be done"
    ;;
    after_actions)
        info "nothing to be done"
    ;;
    on_error)
        info "nothing to be done"
    ;;

    before_everything)
      if [ ! -z "$zfs_pools" ]; then
        zfs_pools_snap_create_and_mount $zfs_pools
      fi

      # If the working directory does not exist, create it. We first take our
      # ZFS snapshots, then make the working directory, to prevent an empty working
      # directory from being included in the snapshot (and thus in the backup).
      if [ ! -z "${working_directory}" ] && [ ! -e "${working_directory}" ]; then
        info "mkdir -p '${working_directory}'";
        mkdir -p "${working_directory}";
      fi
      ;;

    after_everything)

      if [ ! -z "$zfs_pools" ]; then
        zfs_pools_snap_umount_and_delete $zfs_pools
      fi

      # If the script is provided with the --working-directory value, then
      # we're instructed to maintain the working directory (create it in
      # "before_everything", and remove it in "after_everything").
      #
      # Note that these hooks only get called when the working directory is
      # needed by borgmatic (i.e. when there is a "borg create" command among
      # the commands being executed by borgmatic during the run).
      if [ ! -z "${working_directory}" ] && [ -d "${working_directory}" ] && [ -z "$(ls -A ${working_directory})" ]; then
        info "rmdir '${working_directory}'";
        rmdir "${working_directory}";
      fi
      ;;

  *)
    warning "hook $hook not supported, exiting..."
esac

info "script finished"

# vim: set ft=bash:

The script uses mount -t zfs to mount snapshots. After everything, they are unmounted recursively with umount -R. The list of volumes to be snapshotted is determined automatically: all mountable datasets (canmount != "off") from the ZFS pool list provided via the -z option get mounted.

Snapshots get unique names (as long as job-ids are unique), so there is no risk of collision if several jobs run at the same time. The problematic part, however, may be two jobs running with the same job-name, in which case the second instance will fail because the working_directory is already in use by the first job. The schedule should allow enough time for a single job run to finish, with some margin.

Hope this may help someone.

Contributor

It could also work something like this (assuming this is turned into a proper hook like the db backups are):

  • Create zfs snapshots
  • Unshare with mount namespace
  • Create a new directory
  • Create all the paths to the volume mounts
  • Bind mount all the snapshots to their volume mountpoints
  • Call borg to backup the directory

We would essentially change the snapshot contents to be the actual directory contents for borg. Benefits I see to this approach: The original mount namespace is never touched, everything stays the same. The files are in the right place for a restore from the repo.
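
A rough sketch of that flow as a single wrapper process, so the private mount namespace lives until borg exits (the pool/dataset, mountpoint and repository below are placeholders, not borgmatic's actual behavior):

```bash
#!/usr/bin/env bash
# Sketch only: snapshot, then do the bind mount and the borg call inside one
# unshared mount namespace; the host's mount table is never touched.
# Assumes a util-linux unshare that defaults to private mount propagation.
set -e
snap="borgmatic-$(date -u +%Y%m%d%H%M%S)"

zfs snapshot pool0/data@"$snap"

unshare --mount sh -c "
  mount --bind /mnt/mirrored/.zfs/snapshot/$snap /mnt/mirrored &&
  borg create ssh://backup-host/./repo::{hostname}-{now} /mnt/mirrored
"

zfs destroy pool0/data@"$snap"
```

The snapshot creation and destruction still happen outside the namespace, since snapshots are pool-level objects and not scoped to a mount namespace.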


It could also work something like this (assuming this is turned into a proper hook like the db backups are):

  • Create zfs snapshots
  • Unshare with mount namespace
  • Create a new directory
  • Create all the paths to the volume mounts
  • Bind mount all the snapshots to their volume mountpoints
  • Call borg to backup the directory

We would essentially change the snapshot contents to be the actual directory contents for borg. Benefits I see to this approach: The original mount namespace is never touched, everything stays the same. The files are in the right place for a restore from the repo.

Nice idea, of course. The question to be answered is whether all the hooks are run in the same process, or rather as a sequence of separate processes. In the latter case, unshare will probably be scoped to the single call of a single hook (say "before_everything") and the mounts will disappear before the actual backup job starts.

Well... in the meantime I've found that, for example, Debian offers to run borgmatic jobs using systemd facilities instead of a bare crontab. The benefit (say) is that it's configured by default to run in a quite restricted environment. I managed to find settings where things work similarly to the above scenario; at least the mounts seem to be cleaned up by the runner (systemd? I don't know). Actually, to be able to create snapshots and make mounts, one even has to loosen the default restrictions defined by the package, and then some kind of magic happens :).

Here is an example override for systemd configuration (one can put it under /etc/systemd/system/borgmatic.service.d/override.conf):

[Unit]                                                                          
Requires=zfs.target                                                             
                                                                                
[Service]                                                                       
Environment=BORGMATIC_JOB_NAME=nightly-all                                      
                                                                                
PrivateDevices=no                                                               
SystemCallFilter=@system-service @mount                                         
                                                                                
DevicePolicy=closed                                                             
DeviceAllow=/dev/zfs rw                                                         
                                                                                
CapabilityBoundingSet=CAP_SYS_ADMIN                                             
                                                                                
ExecStart=                                                                      
ExecStart=systemd-inhibit --who="borgmatic" --what="sleep:shutdown" --why="Prevent interrupting scheduled backup" /usr/local/sbin/borgmatic-run-job --verbosity -1 --syslog-verbosity 1 --monitoring-verbosity 1 --stats

and the contents of the borgmatic-run-job script:

#!/usr/bin/env bash

set -e

BORGMATIC_JOB_ID="`/usr/local/sbin/borgmatic-new-job-id`" /usr/bin/borgmatic "$@"
Contributor

Well, we kinda have to run the zfs backups as a separate step or document the behavior very well, but as of right now (and if @witten allows it) I would be very interested in a solution that spawns borg in a separate mount namespace, as that would be 1. very robust, 2. allow for mounting the backup, and 3. be expandable to other zfs-like snapshotting filesystems.
It does, however, have some downsides (these are all I can think of):

  1. Might be conflicting with certain container tools (unsure; docker should be fine due to its socket-based design, podman might not be)
  2. We have to ensure that all of the directories that need to be writable for borg to work stay writable (this would have to be an upfront check, which might be very hard)
Author
Owner

We would essentially change the snapshot contents to be the actual directory contents for borg. Benefits I see to this approach: The original mount namespace is never touched, everything stays the same. The files are in the right place for a restore from the repo.

I'm a fan of bind mounts for exactly this reason. (See some of my old comments above.) Your general approach makes sense to me.

Nice idea, of course. The question to be answered is whether all the hooks run in the same process, or rather as a sequence of separate processes.

Sequence of separate processes. Specifically, whenever a borgmatic hook runs an external binary (borg, pg_dump, etc.), that binary runs in its own process. But any of the surrounding code runs in the borgmatic process.

If we have the latter case, then unshare will probably be scoped to the single call of a single hook (say "before_everything") and the mounts will disappear before the actual backup job starts.

I might be missing something, but couldn't it just be scoped to the actual create action (and the resulting call to Borg)? Then Borg would "see" those snapshotted files as if they were in their normal locations.

Might be conflicting with certain container tools (unsure; docker should be fine due to its socket-based design, podman might not be)

If you're running borgmatic in a container, then presumably you're not using zfs in the container as well. Put another way, if you want borgmatic to work with zfs, it probably needs to be run on the host. Or am I missing something here?

We have to ensure that all of the directories that need to be writable for borg to work stay writable (this would have to be an upfront check, which might be very hard)

Would it be as simple as not including them in the snapshot?

Contributor

Well, what I had in mind would use unshare() to create a private mount namespace, unmount the old mount and mount the snapshot (I know this works with zfs at least). But what if someone would like to snapshot /bin? This would mean we would take a snapshot of the zfs dataset that's mounted on /bin, unshare into a new mount namespace, unmount the directory and then replace it with the snapshot. Mounting the new snapshot will then fail, because /bin/zfs suddenly no longer exists. We should be able to circumvent this with a chroot inside the private namespace where we mount all the snapshots under a single directory tree and spawn borg inside.
The interesting question is:
Can we do it all in one call to borg? I have not read all of the code in the repo but I assume we call borg once for every db dump hook. For zfs it would be really nice to call borg and pass the snapshots AND the directories that should be backed up at the same time so they all go into one filesystem and can be mounted/looked at from borg.

Author
Owner

Mounting the new snapshot will then fail, because /bin/zfs suddenly no longer exists. We should be able to circumvent this with a chroot inside the private namespace where we mount all the snapshots under a single directory tree and spawn borg inside.

Ah, gotcha.

I have not read all of the code in the repo but I assume we call borg once for every db dump hook.

Actually, no! Each database hook dumps its configured database(s) to individual named pipes on the filesystem. Then, a single Borg call backs up all configured source directories and implicitly reads from each of those named pipes.
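
For illustration, that named-pipe pattern looks roughly like this (the paths and the pg_dump invocation are simplified stand-ins, not borgmatic's actual internals):

import os
import subprocess

pipe_path = '/root/.borgmatic/postgresql_databases/localhost/mydb'  # illustrative path
os.makedirs(os.path.dirname(pipe_path), exist_ok=True)
os.mkfifo(pipe_path, mode=0o600)

# The dump process writes into the named pipe in the background...
dump = subprocess.Popen(f'pg_dump mydb > {pipe_path}', shell=True)

# ...while a single borg call backs up the regular source directories and,
# because the pipe lives inside one of them, streams the dump as it reads it.
subprocess.run(
    ['borg', 'create', '--read-special', 'repo::demo-archive', '/etc', '/root/.borgmatic'],
    check=True,
)

dump.wait()
os.remove(pipe_path)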

For zfs it would be really nice to call borg and pass the snapshots AND the directories that should be backed up at the same time so they all go into one filesystem and can be mounted/looked at from borg.

I believe that should be possible today.

Contributor

So a bit of a technical question: How will we call unshare()? I did not consider that to be a problem, but after looking at os in Python, where I expected to find it, it does not look like Python has support at all. We could of course use util-linux and make borgmatic depend on it, or use a native Python package, but I'm honestly not sure what the correct way to do this would be here.

Well, what I had in mind would use unshare() to create a private mount namespace, unmount the old mount and mount the snapshot (I know this works with zfs at least). But what if someone would like to snapshot /bin? This would mean we would take a snapshot of the zfs dataset that's mounted on /bin, unshare into a new mount namespace, unmount the directory and then replace it with the snapshot. Mounting the new snapshot will then fail, because /bin/zfs suddenly no longer exists. We should be able to circumvent this with a chroot inside the private namespace where we mount all the snapshots under a single directory tree and spawn borg inside.

What about not unmounting any part of the original FS, and instead mounting the whole hierarchy of zfs snapshots in a subdirectory and then backing up just that subdirectory using "relative directory" stuff? IMHO the final result (files in the backup) is exactly the same. You can use the original runtime from the running OS, including "/bin", and back up "/my/subdir/bin" from the snapshot mounted under /my/subdir.

The interesting question is:
Can we do it all in one call to borg? I have not read all of the code in the repo but I assume we call borg once for every db dump hook. For zfs it would be really nice to call borg and pass the snapshots AND the directories that should be backed up at the same time so they all go into one filesystem and can be mounted/looked at from borg.

Author
Owner

So a bit of a technical question: How will we call unshare()? I did not consider that to be a problem, but after looking at os in Python, where I expected to find it, it does not look like Python has support at all. We could of course use util-linux and make borgmatic depend on it, or use a native Python package, but I'm honestly not sure what the correct way to do this would be here.

I'm not sure what the correct way is either, but here is one way that calls the underlying C implementation directly (about as "directly" as you can get in Python): #80 (comment)
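
For reference, a sketch along those lines that calls unshare(2) through ctypes (newer Python versions also expose os.unshare() directly):

import ctypes
import ctypes.util
import os

CLONE_NEWNS = 0x00020000  # new mount namespace, from <sched.h>

libc = ctypes.CDLL(ctypes.util.find_library('c'), use_errno=True)
if libc.unshare(CLONE_NEWNS) != 0:
    errno = ctypes.get_errno()
    raise OSError(errno, os.strerror(errno))

# From here on, mounts made by this process (and its children, e.g. a borg
# subprocess) live only in the private namespace. On systems where mounts are
# shared by default, a `mount --make-rprivate /` is typically also needed so
# changes don't propagate back to the host.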

What about not unmounting any part of the original FS, and instead mounting the whole hierarchy of zfs snapshots in a subdirectory and then backing up just that subdirectory using "relative directory" stuff? IMHO the final result (files in the backup) is exactly the same. You can use the original runtime from the running OS, including "/bin", and back up "/my/subdir/bin" from the snapshot mounted under /my/subdir.

That's an interesting idea. Are you suggesting using something like borgmatic's existing working_directory option and setting it implicitly to the ZFS snapshot directory so that the paths that are stored in the archive omit the containing ZFS snapshot directory? If so, the main challenge I can see with that approach is that relative paths couldn't also be used for any of the source_directories when a ZFS hook is in use. Maybe that's an okay limitation though? borgmatic could even error if working_directory is specified explicitly when the ZFS hook is used. EDIT: Also, you'd be limited to backing up a single ZFS snapshot at a time with this approach unless you wanted Borg to run multiple times (and create multiple archives).

What about not unmounting any part of the original FS, and instead mounting the whole hierarchy of zfs snapshots in a subdirectory and then backing up just that subdirectory using "relative directory" stuff? IMHO the final result (files in the backup) is exactly the same. You can use the original runtime from the running OS, including "/bin", and back up "/my/subdir/bin" from the snapshot mounted under /my/subdir.

That's an interesting idea. Are you suggesting using something like borgmatic's existing working_directory option and setting it implicitly to the ZFS snapshot directory so that the paths that are stored in the archive omit the containing ZFS snapshot directory? If so, the main challenge I can see with that approach is that relative paths couldn't also be used for any of the source_directories when a ZFS hook is in use. Maybe that's an okay limitation though? borgmatic could even error if working_directory is specified explicitly when the ZFS hook is used. EDIT: Also, you'd be limited to backing up a single ZFS snapshot at a time with this approach unless you wanted Borg to run multiple times (and create multiple archives).

More or less. In fact, I'm describing my setup that I currently have, using borgmatic 1.7 and a few dozen lines of bash as borgmatic hooks, and it works quite well so far.

I don't think in terms of "snapshot directories", at least not meaning the hidden ".zfs/snapshot/stuff". Rather, I mount the whole snapshot hierarchy recursively under my own directory, thus recreating the original filesystem hierarchy therein, and then back up the content of that directory. Something along these lines:

First: Snapshot creation (recursive):

zfs snapshot -r 'rpool@borgmatic-20231128-230705-798676169-32477c'
zfs snapshot -r 'bpool@borgmatic-20231128-230705-798676169-32477c'
...

Then: Snapshot mounting:

mkdir -p '/borgmatic/jobs/nightly-all' && chmod 700 '/borgmatic/jobs/nightly-all'
mount -t zfs -o ro rpool/ROOT/debian@borgmatic-20231128-230705-798676169-32477c /borgmatic/jobs/nightly-all/
mount -t zfs -o ro bpool/BOOT/debian@borgmatic-20231128-230705-798676169-32477c /borgmatic/jobs/nightly-all/boot
mount -t zfs -o ro rpool/home@borgmatic-20231128-230705-798676169-32477c /borgmatic/jobs/nightly-all/home
mount -t zfs -o ro rpool/home/root@borgmatic-20231128-230705-798676169-32477c /borgmatic/jobs/nightly-all/root
...

After that, I have my whole filesystem available for backup under this /borgmatic/jobs/nightly-all directory. The backup is configured like this:

location:
    source_directories:
        - .

    repositories:
        - ssh://storagebox/./path/to/repo

    working_directory: "/borgmatic/jobs/nightly-all"
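
A sketch of how such a mount list could be generated automatically rather than written out by hand (pool names, snapshot name, and job directory are taken from the example above):

import os
import subprocess

snap = 'borgmatic-20231128-230705-798676169-32477c'
job_dir = '/borgmatic/jobs/nightly-all'

listing = subprocess.run(
    ['zfs', 'list', '-H', '-t', 'filesystem', '-o', 'name,mountpoint,canmount',
     '-r', 'rpool', 'bpool'],
    capture_output=True, text=True, check=True,
).stdout

mounts = []
for line in listing.splitlines():
    name, mountpoint, canmount = line.split('\t')
    if canmount == 'off' or not mountpoint.startswith('/'):
        continue  # skip unmountable datasets and legacy/none mountpoints
    mounts.append((mountpoint, name))

# Mount parents before children so nested mountpoints land inside them.
for mountpoint, name in sorted(mounts):
    target = os.path.join(job_dir, mountpoint.lstrip('/'))
    os.makedirs(target, mode=0o700, exist_ok=True)
    subprocess.run(['mount', '-t', 'zfs', '-o', 'ro', f'{name}@{snap}', target], check=True)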
Author
Owner

Got it, thanks. That makes sense to me. I think there would still be the limitation of a single working directory per configuration file. Also, there would be the same caveats around interactions with relative source_directories. But that aside, this sounds like it could potentially be easier than the unshare / bind mount approach. Then again, with either approach, you're doing a mount (in one case it's -t zfs and in the other it's --bind), so maybe the two approaches aren't really that different.

Got it, thanks. That makes sense to me. I think there would still be the limitation of a single working directory per configuration file.

Well, not necessarily an issue. After all, it's just a working directory. :)

Also, there would be the same caveats around interactions with relative source_directories.

Relative source dirs are not a must. You can use absolute dirs as well. In any case, we should take the recovery scenario into account in the first place, and if I understand the docs correctly, the borgmatic recovery scenario is "write all the recovered files to their original locations". If that's the case, then, well, how is it supposed to work with a backup of "/" as an absolute source_directory? Will this always attempt to overwrite the currently running OS (rescue OS/live CD)?

With nested path, such as "/borgmatic/nightly-backup" we can use absolute source_directory "/borgmatic/nightly-backup" and the recovery scenario with the files being recovered into "/borgmatic/nightly-backup" directory, while running borgmatic from some rescue OS or live CD mounted to "/". With relative source_dirs, we can recover to any directory, am I right?

But that aside, this sounds like it could potentially be easier than the unshare / bind mount approach. Then again, with either approach, you're doing a mount (in one case it's -t zfs and in the other it's --bind), so maybe the two approaches aren't really that different.

That seems reasonable. And the bind approach may be applicable to other scenarios, not specific to zfs, so the code may be reusable.

Author
Owner

Also, there would be the same caveats around interactions with relative source_directories.

Relative source dirs are not a must. You can use absolute dirs as well.

I just meant that any user expecting to use a relative source directory in conjunction with their own working_directory wouldn't be able to do that, as this proposed ZFS feature would co-opt the working_directory. But yes, absolute source directories will continue to work just fine.

In any case, we should take the recovery scenario into account in the first place, and if I understand the docs correctly, the borgmatic recovery scenario is "write all the recovered files to their original locations". If that's the case, then, well, how is it supposed to work with a backup of "/" as an absolute source_directory? Will this always attempt to overwrite the currently running OS (rescue OS/live CD)?

By default, borgmatic extract extracts to the current directory, just like borg extract does. (However IIRC, for purposes of extract, this might be distinct from borgmatic's working_directory option.) But you can always override the extract destination, so even if an archive was backed up with an absolute source directory like /, you can extract it anywhere you like (including /). The limitation is that the path stored into the Borg archive will influence where it gets restored.

For instance, if you back up with / in source_directories and that stores /bin/bash in the archive, then when you go to extract, you can restore that file to /bin/bash or /some/prefix/bin/bash. You can even strip path components during extract with borgmatic extract --strip-components, but that's a heavy-handed tool for many use cases.

With nested path, such as "/borgmatic/nightly-backup" we can use absolute source_directory "/borgmatic/nightly-backup" and the recovery scenario with the files being recovered into "/borgmatic/nightly-backup" directory, while running borgmatic from some rescue OS or live CD mounted to "/". With relative source_dirs, we can recover to any directory, am I right?

Yes, but you don't necessarily need relative source directories to be able to do that. (See above.)

Author
Owner

Some relevant discussion for a potential feature in Borg 1.4 that would make backing up ZFS snapshots and extracting them much easier (no working_directory needed): https://github.com/borgbackup/borg/discussions/7975#discussioncomment-8293248

Contributor

We could implement this for zfs at least by doing the following for each dataset:

  1. Create a snapshot
  2. Mount said snapshot under /tmp/borgmatic_zfs_fakeroot/MOUNTPOINT
  3. Pass path as /tmp/borgmatic_zfs_fakeroot/./MOUNTPOINT to borg
  4. Clean up filesystem state

I don't think there is a better option right now, so I think this is what I will roll as soon as borg 1.4 is released.
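
A sketch of those four steps for a single dataset, assuming the Borg 1.4 path handling referenced earlier in which only the portion after /./ ends up in the archive (the dataset name and mountpoint are hypothetical):

import os
import subprocess

dataset = 'DATA/test'   # hypothetical dataset
mountpoint = '/test'    # its mountpoint
snapshot = f'{dataset}@borgmatic'
fakeroot = '/tmp/borgmatic_zfs_fakeroot'
target = fakeroot + mountpoint

# 1. Create a snapshot.
subprocess.run(['zfs', 'snapshot', snapshot], check=True)

# 2. Mount said snapshot under the fake root.
os.makedirs(target, exist_ok=True)
subprocess.run(['mount', '-t', 'zfs', '-o', 'ro', snapshot, target], check=True)

# 3. Pass the /./-style path to borg so the archive stores /test/... paths.
subprocess.run(['borg', 'create', 'repo::demo-archive', f'{fakeroot}/.{mountpoint}'], check=True)

# 4. Clean up filesystem state.
subprocess.run(['umount', target], check=True)
subprocess.run(['zfs', 'destroy', snapshot], check=True)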
Author
Owner

That sounds like a good plan, thank you! You may have already meant this, but you don't have to wait for a stable Borg 1.4 release to implement this. The next beta should probably be fine as long as nobody uses it in production.

One caveat is that I'd recommend using Python's tempfile.mkdtemp() or similar to create that temporary directory.
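
A minimal sketch of that suggestion (the mountpoint name is hypothetical):

import os
import tempfile

# A unique, private temporary directory per run instead of a fixed /tmp path,
# so concurrent borgmatic runs can't collide.
fakeroot = tempfile.mkdtemp(prefix='borgmatic-zfs-')   # e.g. /tmp/borgmatic-zfs-x1y2z3
mount_target = os.path.join(fakeroot, 'test')          # hypothetical mountpoint
os.makedirs(mount_target)
# ... mount the snapshot at mount_target, run borg, then unmount and remove
# the directory afterwards.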

Contributor

One more thing: We have to check that we do not back up anything twice, so having a dataset DATA/test mounted at /test while /test is also listed in source_directories should either result in an error or be resolved by removing /test from source_directories.
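
A sketch of such an overlap check (the helper name is made up; the paths follow the DATA/test example):

import os

def overlapping_sources(source_directories, dataset_mountpoints):
    '''Return configured source directories already covered by a snapshotted dataset.'''
    covered = []
    for source in source_directories:
        for mountpoint in dataset_mountpoints:
            if os.path.commonpath([os.path.abspath(source), mountpoint]) == mountpoint:
                covered.append(source)
                break
    return covered

# A dataset DATA/test mounted at /test covers both /test and anything below it.
print(overlapping_sources(['/test', '/test/photos', '/etc'], ['/test']))
# ['/test', '/test/photos']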

Contributor

@witten Architectural question: Where in the code would you like this to be implemented? Afaik the hook system is not really powerful enough to handle this as we have to manipulate which directories get included right? So this has to either be a new hook or become completely integrated into borgmatic.

Author
Owner

One more thing: We have to check that we do not back up anything twice, so having a dataset DATA/test mounted at /test while /test is also listed in source_directories should either result in an error or be resolved by removing /test from source_directories.

One approach for that would be to simply disallow anything user-specified in source_directories when the ZFS hook is enabled. That way, you shouldn't get any colliding paths stored into the Borg archive.

@witten Architectural question: Where in the code would you like this to be implemented? Afaik the hook system is not really powerful enough to handle this as we have to manipulate which directories get included right? So this has to either be a new hook or become completely integrated into borgmatic.

I'd recommend putting the hook itself in borgmatic/hooks/zfs.py. And then, yes, the internal hook "API" will probably have to be expanded to support manipulation of source directories. Today, borgmatic automatically includes ~/.borgmatic in the set of directories that get backed up within collect_borgmatic_source_directories(), but that won't perform the /./ hack. So it might make sense to call out to hooks from that collect function, and then any hooks that implement it can return additional source directories. (In fact that might be a cleaner way for all hooks to work, ultimately.)

Contributor

Well, I was thinking of having the zfs hook work as follows:

  • Have a zfs option in the config file (only a simple on/off)
  • Have a zfs user property like com.borgmatic:backup that the hook checks for on all datasets
  • Check for any overlaps between source_directories and the mount paths of the zfs datasets, snapshot all matching datasets, mount them, add the original path to the excludes and the new path to the includes
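
A sketch of the dataset-discovery part of that idea (the property name follows the comment above; the exact namespace is still open):

import subprocess

# List the user property for all filesystems; datasets that opted in get snapshotted.
output = subprocess.run(
    ['zfs', 'get', '-H', '-t', 'filesystem', '-o', 'name,value', 'com.borgmatic:backup'],
    capture_output=True, text=True, check=True,
).stdout

datasets_to_snapshot = [
    line.split('\t')[0]
    for line in output.splitlines()
    if line.split('\t')[1] == 'on'
]
print(datasets_to_snapshot)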

Author
Owner

That approach generally makes sense to me! (I might suggest org.torsion.borgmatic:backup for the user property though.)

Contributor

So I think the general implementation of a snapshot hook should be pretty simple: We need a snapshot function that allows the hook to create snapshots and return a list of paths to be added to the borg call, plus a cleanup function like remove_data_source_dumps but for snapshots. AFAIK this will go in borgmatic/actions/create.py since all the other create-only hooks are also there. Would you agree to this general approach?

Author
Owner

The overall approach sounds good, but here are some suggested specifics:

  • dump_data_sources() was originally intended to be the generic interface for preparing all data sources for Borg consumption: dumping databases, taking snapshots, etc. Now I realize in the ZFS snapshot hook case, you don't want to produce a subprocess.Popen instance for a named pipe to stream data to Borg like dump_data_sources() does today, because that's not how snapshots work. What you want is to produce a path (or list of paths) to inject into the borg create paths to backup. So what if dump_data_sources() could do that too? E.g., return a subprocess.Popen instance or a path/paths to backup. Then it would be up to the caller (in this case borgmatic/actions/create.py:run_create()) to do the appropriate thing with the returned value: Either pass it to create_archive() as extra paths or pass it to create_archive() as stream_processes.
  • As for removing the snapshots, would it be possible to do that with remove_data_source_dumps()? I think that should already have all the data it needs to find the snapshots, given that the config is already passed in.

My main goal here is to avoid having a proliferation of different data source setup/teardown functions for every type of data source (database, filesystem snapshot, etc.) or every data source that has slightly different requirements from the others. I'm of course open to discussion here if you disagree or just want to riff on this further. Thanks!
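
A sketch of how the caller could dispatch on whatever each hook's dump_data_sources() returns (the helper name is made up; create_archive() and stream_processes are as described above):

import subprocess

def split_hook_results(results):
    '''Separate streaming dump processes from extra paths to back up.'''
    stream_processes, extra_paths = [], []
    for result in results:
        if isinstance(result, subprocess.Popen):
            stream_processes.append(result)   # database dump writing to a named pipe
        else:
            extra_paths.append(str(result))   # e.g. a path to a mounted ZFS snapshot
    return stream_processes, extra_paths

# Hypothetical usage inside run_create():
#   stream_processes, extra_paths = split_hook_results(hook_results)
#   create_archive(..., source_directories + extra_paths, stream_processes=stream_processes)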

Contributor

Seems valid. We might just have the call return a list of either pathlib or Popen objects and check for the type of all of them. That way a hook could just return a list of whoknowswhat in the future and the type can be checked and processed.

Author
Owner

Yup, that sounds like a good idea!
