borgmatic fails to unmount and destroy zfs snapshot #1295

Closed
opened 2026-04-14 23:42:05 +00:00 by mondjef · 16 comments

What I'm trying to do and why

I have a configuration that backs up the following paths...

/warehouse/music (zpool=rpool/DATA/music)
/warehouse/images (zpool=rpool/DATA/images)
/warehouse/videos (zpool=dpool1/DATA/videos)
/warehouse/documents (zpool=dpool1/DATA/documents)

Backup process starts just fine, snapshots are created for zfs datasets, snapshots are mounted, backups are created and then all snapshots are unmounted and destroyed with the exception of /warehouse/music. The error reported is that it can't destroy the snapshot because it is busy. From what I can tell it is because it never actually got unmounted, though if I manually umount and destroy the snapshot I can do so without any issues. Not sure where to look or how to fix.

Steps to reproduce

No response

Actual behavior

No response

Expected behavior

all zfs snapshots are unmounted and all snapshots are destroyed

Other notes / implementation ideas

No response

borgmatic version

No response

borgmatic installation method

No response

Borg version

2.0.7

Python version

Python 3.13.12

Database version (if applicable)

No response

Operating system and version

NAME='Gentoo' ID='gentoo' PRETTY_NAME='Gentoo Linux' VERSION='2.18' VERSION_ID='2.18' HOME_URL='https://www.gentoo.org/' SUPPORT_URL='https://www.gentoo.org/support/' BUG_REPORT_URL='https://bugs.gentoo.org/' ANSI_COLOR='1;32'

mondjef changed title from borgmatic fails to unmount zfs snapshot and destroy snapshot to borgmatic fails to unmount and destroy zfs snapshot 2026-04-14 23:42:32 +00:00
Owner

A few thoughts on this one:

  • Are you using borgmatic version 2.0.7? If so, there have been a few ZFS-related fixes since then that may be related to what you're seeing. So if you can upgrade, I'd recommend it. I realize there may not be a newer version in Gentoo, but there is now a stand-alone Linux binary you could try.
  • Can I see the full borgmatic logs from when you encounter this problem, including the borgmatic command you're running? Feel free to redact as necessary.
  • Can I see your borgmatic configuration as well? Also redacted as necessary.
Author

@witten wrote in #1295 (comment):

A few thoughts on this one:

* Are you using borgmatic version 2.0.7? If so, there have been a few ZFS-related fixes since then that _may_ be related to what you're seeing. So if you can upgrade, I'd recommend it. I realize there may not be a newer version in Gentoo, but there is now a [stand-alone Linux binary](https://torsion.org/borgmatic/how-to/install-borgmatic/#other-ways-to-install) you could try.

* Can I see the full borgmatic logs from when you encounter this problem, including the borgmatic command you're running? Feel free to redact as necessary.

* Can I see your borgmatic configuration as well? Also redacted as necessary.

I am using version 2.0.7. Gentoo does have version 2.1.4 available (still masked, though), so I could unmask it, upgrade, and see whether that version fixes my issue. Or I could try the stand-alone binary.

Command I am using: borgmatic -c /etc/borgmatic.d/dpool1.yaml

The log from the backup job is uploaded, as well as the configuration.

Thanks for your assistance with this.

Owner

Thanks for providing those details. Yeah, it looks like for /warehouse/music, borgmatic is attempting an unmount of the snapshot, but it's doing the unmount with the wrong path and therefore the unmount fails. Which then leads to the error around destroying the snapshot, as you discovered.

So if you can, please try an upgrade and then running borgmatic again. If the problem doesn't repro, great. But if it does repro, then please post an updated log with that version (and let me know what version it is). That should at least eliminate a number of previously fixed ZFS issues. Thanks!

Author

Yup, I see where in the log you noticed that it seems to be referencing an incorrect mount path when it tries unmounting that one snapshot... weird that it only has an issue with this one mounted snapshot and not the others.

I upgraded to version 2.1.4 and tried again...no luck, exact same issue. Log attached.

Owner

Thanks for testing that out with the new version and including the log as well. At least now we've eliminated a number of previously fixed issues and can start "fresh." I'll dig into this and see if I can get a repro locally or at minimum an explanation of what might be going on with your machine.

Author

As a test, I created a bogus text file in the /warehouse/music directory, since the only difference between this backup directory and the others is that it is empty. I am currently rerunning the backup to see if anything different happens. I did briefly look over zfs.py and did not see anything that stood out that would explain the incorrect path being used in the umount command, but I am also not familiar with the borgmatic code base at all...

Owner

Oh! So when borgmatic was run to produce that most recent log, /warehouse/music was a completely empty directory? If so, I think that might be triggering this code in zfs.py, which short-circuits unmounting—and could explain what you're seeing:

            # If the snapshot mount path is empty, this is probably just a "shadow" of a nested
            # dataset and therefore there's nothing to unmount.
            if not os.path.isdir(snapshot_mount_path) or not os.listdir(snapshot_mount_path):
                continue
Author

Ha, I did see that piece of code and paused on it for a bit, suspecting it might be the cause, but quickly decided it wasn't, as it's not a shadow of a nested directory. Given that I have a reasonable use case of backing up a directory that is empty, at least for the moment, do you see a way to accommodate this in the code base? As a workaround, I can simply keep an empty file in the directory for now. I will confirm when the backup job completes whether it fails or succeeds with the empty file there.

Owner

Yeah, this code is basically using a pretty bad heuristic: This snapshot mount path is an empty directory, and therefore it's probably a shadow of a nested dataset within the snapshot for a parent dataset, and therefore its unmount should be skipped. But I suspect that this code also happens to trigger for plain old datasets that happen to be empty, such as /warehouse/music in your case.

The underlying problem is that there's no good cross-platform way to probe for whether a directory is mounted, so you can see the code in this function sort of dancing around that by doing several tangentially related checks before the actual unmount. That includes this empty directory check.

Anyway, I'll have to think about whether I can modify this to accommodate your use case without breaking the nested dataset shadow use case.

And yes, as a workaround for now you should be able to put an empty file in the otherwise empty dataset directory.
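
To make the misfire concrete, here is a minimal sketch of the empty-directory check in isolation. The helper name `looks_like_nested_shadow` is hypothetical and not part of borgmatic; only the condition itself is taken from the `zfs.py` excerpt quoted earlier in this thread. Temporary directories stand in for snapshot mount paths, and the file name `placeholder.txt` is an assumption for illustration.

```python
import os
import tempfile


def looks_like_nested_shadow(snapshot_mount_path):
    """Hypothetical helper wrapping the same condition as the quoted zfs.py check."""
    return not os.path.isdir(snapshot_mount_path) or not os.listdir(snapshot_mount_path)


with tempfile.TemporaryDirectory() as root:
    # Stand-ins for two snapshot mount paths; no real ZFS is involved.
    empty_dataset = os.path.join(root, 'music')
    populated_dataset = os.path.join(root, 'images')
    os.mkdir(empty_dataset)
    os.mkdir(populated_dataset)

    # A genuinely empty dataset looks the same as a shadow directory, so its
    # unmount would be skipped by this heuristic.
    empty_result = looks_like_nested_shadow(empty_dataset)

    # The workaround: any file at all makes the check fall through to the unmount.
    open(os.path.join(populated_dataset, 'placeholder.txt'), 'w').close()
    populated_result = looks_like_nested_shadow(populated_dataset)

print(empty_result, populated_result)
```

The check only sees "directory with no entries", so it cannot tell a mounted but empty dataset from an unmounted shadow directory. That is why the placeholder file changes the outcome.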

Author

I believe this subsequent backup run with the bogus text file worked as far as unmounting and destroying the snapshot goes, but I encountered another error and ultimately the backup still failed. Not sure if this new error is related, or whether it comes from borgmatic or borg itself...

Author

I can now confirm that the backup job completes successfully once the problematic backup path is no longer empty. So for now I will keep this empty file in the directory to work around the issue. The error I encountered in my last comment was an I/O error raised by borg itself, which seems to have come from a corrupt backup repository; I ended up just recreating the repository to fix it. Just need to figure out a smarter way to import/export my zfs pool that I specifically use for the backup repositories so that concurrent jobs do not try to export/unmount while the other is still using...
If you do make any enhancements to the zfs handling with respect to what I experienced here I would be more than willing to do some testing.

Owner

Glad to hear that you managed to get around the Borg I/O issue with a repo recreate.

Just need to figure out a smarter way to import/export my zfs pool that I specifically use for the backup repositories so that concurrent jobs do not try to export/unmount while the other is still using...

I don't know if it's helpful here, but there is this: https://borgbackup.readthedocs.io/en/stable/usage/lock.html

If you do make any enhancements to the zfs handling with respect to what I experienced here I would be more than willing to do some testing.

Great, I'll let you know!
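
One possible way to coordinate the pool export between concurrent jobs is an advisory file lock: every job holds a shared lock while it uses the pool, and the export step only proceeds if it can take an exclusive lock without blocking. The sketch below is an assumption-laden illustration, not something borgmatic provides. The helper names and the lock file location are hypothetical, and `fcntl.flock` is only available on Unix-like systems.

```python
import fcntl
import os
import tempfile


def acquire(lock_path, exclusive):
    """Try to lock the file without blocking.

    Returns an open file descriptor on success, or None if a conflicting
    lock is held through another open file description.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    try:
        fcntl.flock(fd, mode | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        return None
    return fd


def release(fd):
    """Drop the lock and close the descriptor."""
    fcntl.flock(fd, fcntl.LOCK_UN)
    os.close(fd)


# Hypothetical lock file; in practice it must live outside the pool being exported.
lock_path = os.path.join(tempfile.mkdtemp(), 'backups-pool.lock')

job_a = acquire(lock_path, exclusive=False)  # first job starts using the pool
job_b = acquire(lock_path, exclusive=False)  # second job runs concurrently
release(job_a)                               # first job finishes...

# ...but must not export while the second job still holds its shared lock.
export_while_busy = acquire(lock_path, exclusive=True)

release(job_b)

# With no users left, the exclusive lock succeeds and the export could run here.
export_when_idle = acquire(lock_path, exclusive=True)
if export_when_idle is not None:
    release(export_when_idle)
```

Shared locks can be held by any number of jobs at once, while the exclusive lock succeeds only when nobody else holds the file. That gives the "last one out exports the pool" behavior without inspecting open files.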

Owner

Okay, I believe this should be fixed in main! It'll be part of the next release. If you have an easy way to test this, please feel free and let me know how it works for you. The new zfs.py should be a drop-in replacement on top of the zfs.py in borgmatic 2.1.4.

witten closed this issue and added the bug label 2026-04-18 18:49:04 +00:00
Author

I can confirm that the fixes you made no longer raise the errors when a zfs dataset contains no files/directories, and that the snapshot created against such a dataset gets properly destroyed as well. Thanks for your efforts on this!

As a side note for others, below is what I ended up doing regarding dynamically importing (mounting)/exporting (unmounting) the 'backups' zfs pool prior to running backups that use repositories on that pool... This snippet of yaml is in a separate yaml file that is then included in any other configuration files that use this pool and the repositories that reside on it.

Note: I only need to worry about importing/exporting the pool and not any actual mounting/unmounting as the backup dataset is configured to auto mount/unmount when the pool is imported or exported.

commands:
    - before: repository
      run:
          # Import the backup pool if not already imported
          - |
            if [[ "{repository}" = "/mnt/backups"* ]]; then
              zpool list backups > /dev/null 2>&1 || zpool import backups;
              while ! zpool list backups > /dev/null 2>&1; do
                echo -e "Waiting for backups zpool to import..."
                sleep 2
              done
              echo -e "zpool backups imported."
            fi
    - after: repository
      run:
          # Export (unmount) the backup pool only if no other process
          # is accessing (i.e. another borg backup/restore or any other)
          - |
            if [[ "{repository}" = "/mnt/backups"* ]]; then
              if zpool list backups > /dev/null 2>&1; then
                echo -e "zpool backups currently imported, checking if can be exported..."
                if ! lsof -t /mnt/backups >/dev/null; then
                  echo -e "/mnt/backups not being accessed. Exporting zpool backups..."
                  zpool export backups
                else
                  echo -e "/mnt/backups is being accessed. Skipping export of zpool backups."
                fi
              else
                echo -e "zpool backups not currently imported, skipping export." 
              fi
            fi

zfs:
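
The `before: repository` hook above waits in an unbounded loop, so a pool that never imports would stall the job. A bounded variant of the same polling idea can be sketched in Python as follows. This is only an illustration: `wait_until` and `pool_is_imported` are hypothetical helpers, the pool name `backups` is taken from the config above, and the timeout values are arbitrary.

```python
import subprocess
import time


def wait_until(predicate, timeout_seconds, interval_seconds=2.0):
    """Poll predicate() until it returns True or the timeout elapses.

    Returns True if the predicate succeeded, False on timeout.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_seconds)


def pool_is_imported(pool_name='backups'):
    """Hypothetical check mirroring `zpool list backups > /dev/null 2>&1`."""
    result = subprocess.run(
        ['zpool', 'list', pool_name],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        check=False,
    )
    return result.returncode == 0


# Demonstration with stand-in predicates, so no ZFS tooling is needed to run it.
attempts = iter([False, False, True])
imported = wait_until(lambda: next(attempts), timeout_seconds=5, interval_seconds=0.01)
gave_up = wait_until(lambda: False, timeout_seconds=0.05, interval_seconds=0.01)
print(imported, gave_up)
```

On a real system the call would look like `wait_until(pool_is_imported, timeout_seconds=60)`, and the hook would fail with a non-zero exit status if it returns False.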

Owner

Thanks for reporting back. I'm glad to hear the fixes are working for you! And also that you've got the dynamic import/export integrated as well.

Owner

Released in borgmatic 2.1.5!
