Provide more influence over the decision whether a backup is considered a success or a failure. #447

Closed
opened 2021-09-07 07:13:02 +00:00 by ams_tschoening · 2 comments

What I'm trying to do and why

I'm backing things up using SSHFS and ran into two different but related problems:

Dial-up DSL

Sometimes the SSHFS source is reached over a dial-up DSL internet connection, and that connection might simply reconnect while a backup is running. The SSHFS mount on my host of interest seems to disappear in those cases. When this happens, Borg apparently only notices missing files and treats that as a warning instead of an error.

[2021-09-03 01:53:32,906] INFO: /mnt/[...]/root_and_some_zfs/root/vmlinuz.old: stat: [Errno 2] No such file or directory: '/mnt/[...]/root_and_some_zfs/root/vmlinuz.old'
[2021-09-03 01:53:32,906] INFO: E /mnt/[...]/root_and_some_zfs/root/vmlinuz.old
[...]
[2021-09-03 01:53:33,575] INFO: security: saving state for [...] to /home/ams_d_bak_borg/.config/borg/security/[...]
[2021-09-03 01:53:34,918] INFO: security: current location   ssh://[...]
[2021-09-03 01:53:34,918] INFO: security: key type           0
[2021-09-03 01:53:34,918] INFO: security: manifest timestamp 2021-09-02T23:53:32.947038
[2021-09-03 01:53:34,919] INFO: RemoteRepository: 3.22 MB bytes sent, 3.21 kB bytes received, 116 messages sent
[2021-09-03 01:53:34,948] INFO: terminating with warning status, rc 1
[2021-09-03 01:53:34,984] INFO: [...]: Running consistency checks

Expected missing files

I'm also backing up from other devices over SSHFS, such as a NAS. That NAS stores backups from some Linux hosts, and those backups contain symlinks with absolute paths. Those symlinks can't be resolved properly on the NAS, so it reports missing files when Borg tries to read them. That is an implementation detail of the SFTP server used on the NAS. In contrast to the former problem, missing files are OK this time, because I know things can't work differently, so Borg aborting with errors wouldn't make much sense.

[2021-08-25 09:17:45,430] INFO: /mnt/[...]/root/cdrom: readlink: [Errno 2] No such file or directory: '/mnt/[...]/root/cdrom'
[2021-08-25 09:17:45,430] INFO: E /mnt/[...]/root/cdrom

Result code of Borg can't distinguish different use-cases

The problem with those two use cases is that borgmatic considers the backup successful in both of them, because Borg exits with only a warning, not an error. Hence it calls the after_backup hook instead of on_error, and my current hook implementations assume SUCCESS for the former and ERROR for the latter. I don't have further logic in after_backup to inspect archives, logs, etc. myself, because in theory borgmatic has already decided between SUCCESS and ERROR at that point. If I detected an ERROR in after_backup, e.g. by additionally looking into logs, I would have to duplicate what on_error already does. Remember that borgmatic itself is able to trigger webhooks and the like in case of errors.

Error code mapping by Borg?

I've already asked on the Borg mailing list whether there's some kind of additional error code mapping, but that doesn't seem to be the case. With such a mapping, one could tell Borg to treat missing files as an error for one backup use case and keep them as a warning for the other. This could easily be configured with borgmatic, because I already use a separate YAML file for each backup.

https://mail.python.org/pipermail/borgbackup/2021q3/001912.html
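
A minimal sketch, in Python, of the kind of mapping described above. It is purely illustrative: the `warnings_are_errors` setting and the `classify` function are hypothetical names and exist in neither Borg nor borgmatic. The only fact it relies on is Borg's documented return codes (0 = success, 1 = warning, 2 = error).

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    ERROR = "error"


# Borg's documented return codes.
BORG_SUCCESS = 0
BORG_WARNING = 1


def classify(return_code: int, warnings_are_errors: bool) -> Outcome:
    """Map a Borg return code to an outcome for one backup configuration.

    `warnings_are_errors` is a hypothetical per-configuration switch: set it
    for the dial-up DSL backup, leave it unset for the NAS backup.
    """
    if return_code == BORG_SUCCESS:
        return Outcome.SUCCESS
    if return_code == BORG_WARNING and not warnings_are_errors:
        return Outcome.SUCCESS
    # Real errors (rc 2 and above) and promoted warnings both end up here.
    return Outcome.ERROR
```

With one YAML file per backup, the switch could be set individually: the same `rc 1` would then count as a failure for the DSL host and as a success for the NAS.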

Other notes / implementation ideas

It seems unlikely that Borg itself will get such a mapping, so I would like to propose an enhancement to how borgmatic decides which hook to call after backups. In the end, it's only about whether to call after_backup or on_error. borgmatic checks Borg's exit code for that decision, but it could take other facts into account as well, e.g. the log files created. I'm currently logging at DEBUG level, so the warning and error messages printed by Borg are available in a file after the backup has finished.

In theory, borgmatic could provide an additional hook/callback that lets me calculate the overall result of the preceding backup attempt myself, and then decide based on my exit code whether after_backup or on_error should be called. I would decide e.g. by grepping the logs for warning/error messages like the ones quoted above, but of course everyone is free to decide however they like.
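
The log check could look roughly like the following Python sketch. It assumes the log format of the excerpts quoted above, where Borg reports a file it could not read as `INFO: E <path>`. The function names and the `expected` parameter are made up for illustration, and the pattern may need adjusting for other versions or log settings.

```python
import re
from typing import Iterable, List

# Matches lines such as:
#   [2021-08-25 09:17:45,430] INFO: E /mnt/[...]/root/cdrom
ERROR_STATUS = re.compile(r"INFO: E (.+)$")


def find_error_paths(log_text: str) -> List[str]:
    """Return every path that was logged with the error status 'E'."""
    paths = []
    for line in log_text.splitlines():
        match = ERROR_STATUS.search(line)
        if match:
            paths.append(match.group(1))
    return paths


def unexpected_errors(log_text: str, expected: Iterable[str] = ()) -> List[str]:
    """Return the error paths that are not listed as expected."""
    allowed = set(expected)
    return [path for path in find_error_paths(log_text) if path not in allowed]


def exit_code(log_text: str, expected: Iterable[str] = ()) -> int:
    """Return 0 if only expected paths failed, otherwise 1."""
    return 1 if unexpected_errors(log_text, expected) else 0
```

For the NAS backup, the known-broken symlinks would be passed as `expected`. For the DSL backup, nothing is expected to be missing, so any `E` line yields a non-zero exit code.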

This might mean extending the currently available hooks with one additional hook each, using a prefix like calc_if_*. That hook would be called before the similarly named hook and return an exit code, and that exit code would decide whether the hook is actually called or not.

  • calc_if_after_backup -> 0 -> after_backup
  • calc_if_after_backup -> 1 -> on_error

This is pretty much how hooks are already handled right now: errors while running Borg and errors while running hooks both end up in on_error. It's only that there's currently no way to influence the decision whether to run after_backup at all. That decision is based entirely on Borg's result, while some people might want to take additional sources into account.
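
The proposed dispatch can be sketched in Python as follows. This is not borgmatic code: `calc_if_after_backup` is only the hook name proposed in this ticket, and `choose_hook` is a hypothetical helper that shows the intended decision.

```python
import subprocess
import sys
from typing import Optional, Sequence


def choose_hook(borg_succeeded: bool, calc_command: Optional[Sequence[str]]) -> str:
    """Return the name of the hook to run: 'after_backup' or 'on_error'.

    `calc_command` stands in for the proposed calc_if_after_backup hook.
    """
    if not borg_succeeded:
        # Unchanged: a failed Borg run always leads to on_error.
        return "on_error"
    if calc_command is None:
        # No calc hook configured: keep today's behavior.
        return "after_backup"
    # Exit code 0 confirms the success, anything else turns it into an error.
    result = subprocess.run(calc_command)
    return "after_backup" if result.returncode == 0 else "on_error"


if __name__ == "__main__":
    # Stand-in for a user script that found unexpected warnings in the logs.
    failing_check = [sys.executable, "-c", "raise SystemExit(1)"]
    print(choose_hook(True, failing_check))
```

The calc hook only ever downgrades a success. A run that Borg itself reports as failed never reaches it, so the existing error handling stays untouched.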

Thanks for considering!

Environment

borgmatic version: 1.5.15
borgmatic installation method: PIP, system wide
Borg version: 1.1.16
Python version: 3.8.10
operating system and version: Ubuntu 20.04

Owner

Again, I apologize for the lengthy delay with these tickets. There are two existing borgmatic features that might help here:

There's a soon-to-be-released "source_directories_must_exist" option (#501) that tells borgmatic to fail if any source directories are missing—before even calling out to Borg. I don't know if any of your missing directories are source directories, but if so, this option could handle that.
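
As a rough illustration of what such a check amounts to (a sketch only, not borgmatic's actual implementation; both function names are hypothetical), in Python:

```python
import os
from typing import Iterable, List


def missing_source_directories(source_directories: Iterable[str]) -> List[str]:
    """Return the configured source paths that do not exist right now."""
    return [path for path in source_directories if not os.path.exists(path)]


def check_sources_or_fail(source_directories: Iterable[str]) -> None:
    """Raise before Borg is ever invoked if any source path is missing."""
    missing = missing_source_directories(source_directories)
    if missing:
        raise ValueError("Source directories do not exist: " + ", ".join(missing))
```

A check like this only catches a mount that is already gone when the backup starts. It would not catch an SSHFS connection that drops in the middle of a run.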

> In theory, BorgMatic could provide me some additional hook/callback to calculate the overall result of the former backup attempt myself and decide based on my error code if after_backup or on_error should be called.

There is a feature similar to this. borgmatic supports running tests (https://torsion.org/borgmatic/docs/how-to/backup-to-a-removable-drive-or-an-intermittent-server/) before a backup starts, which let it turn a presumed failure state into a soft failure and "early out". That doesn't sound like it'll quite work for your use case, though, because you want to do the opposite: take a warning and turn it into an error state.

Anyway, if this is still an active need, let me know your thoughts and we can proceed from there.

Owner

I'm closing this one for now due to inactivity, but I'd be happy to open it up again if you have further thoughts. Thank you!

Reference: borgmatic-collective/borgmatic#447