Feature request for on_backup_error #270

Closed
opened 2019-12-05 10:12:34 +00:00 by NielsH · 4 comments

Hello!

Currently the `on_error` hook also runs when something goes wrong with a restore, e.g.:

```
borgmatic extract --archive myserver-2019-12-03T13:31:04.485009 --path srv/backups/mysql/db.sql.gz --destination /root/restoretest/
```

This fails with the following error if `/root/restoretest` does not yet exist:

```
ssh://borg@<ip>:22/./cfb95fef-65f2-4c28-b68f-93e798c8ec71: Error running actions for repository
[Errno 2] No such file or directory: '/root/restoretest/'
/etc/borgmatic.d/backup02-cfb95fef-65f2-4c28-b68f-93e798c8ec71.yaml: Error running on-error hook
Command 'test -x /usr/lib/nagios/plugins/custom/check_borgmatic.sh && /usr/lib/nagios/plugins/custom/check_borgmatic.sh report 'Error during backup creation' 2 > /dev/null' returned non-zero exit status 2
/etc/borgmatic.d/backup02-cfb95fef-65f2-4c28-b68f-93e798c8ec71.yaml: Error running configuration file

summary:
/etc/borgmatic.d/backup02-cfb95fef-65f2-4c28-b68f-93e798c8ec71.yaml: Error running configuration file
ssh://borg@<ip>:22/./cfb95fef-65f2-4c28-b68f-93e798c8ec71: Error running actions for repository
[Errno 2] No such file or directory: '/root/restoretest/'
/etc/borgmatic.d/backup02-cfb95fef-65f2-4c28-b68f-93e798c8ec71.yaml: Error running on-error hook
Command 'test -x /usr/lib/nagios/plugins/custom/check_borgmatic.sh && /usr/lib/nagios/plugins/custom/check_borgmatic.sh report 'Error during backup creation' 2 > /dev/null' returned non-zero exit status 2

Need some help? https://torsion.org/borgmatic/#issues
```

The fact that `/root/restoretest` does not yet exist is my fault, of course; I should create it first. But I would rather not have the `on_error` hook trigger because of it, since the hook sends a message to Nagios that something went wrong with backup creation. I only want that to happen when we create, prune, or check backups, not for restore tasks.
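
For the missing-directory part, a minimal sketch of creating the destination before extracting might look like the following. The `restore_to` helper name is hypothetical; the `borgmatic extract` flags are the ones from the command above.

```shell
# Sketch: create the destination directory before extracting.
# restore_to is a hypothetical helper name, not part of borgmatic.
restore_to() {
    archive="$1"
    path="$2"
    destination="$3"

    # mkdir -p succeeds whether or not the directory already exists.
    mkdir -p "$destination" || return 1

    borgmatic extract --archive "$archive" --path "$path" --destination "$destination"
}

# Example call (not run here):
# restore_to myserver-2019-12-03T13:31:04.485009 srv/backups/mysql/db.sql.gz /root/restoretest/
```

This only avoids the `[Errno 2]` failure; it does not change when the `on_error` hook fires.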

Is it possible to filter this?

Thank you!
Niels

witten added this to the per-action hooks milestone 2019-12-05 21:17:13 +00:00
Owner

There have been a couple of other tickets recently about making hooks more granular / per-action, so this fits right in with that. I'll have to think about the migration path (if any) for the existing `on_error` hook if we add `on_backup_error` that just fires on creates, for instance.

Another option would be to support filtering out the error hook when an interactive command like `extract` is run. See witten/borgmatic#249 for more discussion on this approach.

Side note: I'd support a more formal Nagios integration with borgmatic if that'd be useful.

Author

Cheers,

If there are already different discussions feel free to close this one :)
Another option would be to expose (environment) variables to the `on_error` script (e.g. as a parameter to the script) that I could use within the script itself to determine what action it has to take. That parameter could include something like the type of task (backup, prune, etc.), an optional error message, and so on.
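
A minimal sketch of what such a script could look like, assuming borgmatic passed the failed action name as the first argument and an optional error message as the second. That calling convention and the `handle_error` name are hypothetical; they illustrate the proposal, not an existing borgmatic interface.

```shell
# Hypothetical on_error hook script. It assumes borgmatic would pass the name
# of the failed action as the first argument and an optional error message as
# the second. That interface is only the proposal above, not an existing one.
handle_error() {
    action="$1"
    message="${2:-no error message given}"

    case "$action" in
        create|prune|check)
            # This is where the monitoring system would be notified.
            echo "report: $action failed: $message"
            ;;
        *)
            # Interactive actions such as extract are ignored.
            echo "skip: $action"
            ;;
    esac
}
```

With this shape, `handle_error extract "..."` would print a skip line instead of alerting, while failures in `create`, `prune`, or `check` would still be reported.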

Formal Nagios integration would be great but perhaps would be tricky to implement since people have different use-cases?
For us, after the backup is done we take the `borgmatic list` output, parse it for successful backups, and report the most recent backup name/date to the monitoring system through its API.
We specifically use Icinga rather than Nagios, and the reporting is done as a passive Icinga check.
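
As a rough illustration of the parsing step, the sketch below picks the last archive name from a listing on stdin. It assumes one archive per line, oldest first, with the archive name as the first whitespace-separated field; the real `borgmatic list` output may contain extra lines that would need filtering first. `latest_archive` is a hypothetical helper name.

```shell
# Print the name of the last archive in a listing read from stdin.
# Assumption: one archive per line, oldest first, name in the first field.
latest_archive() {
    awk 'NF { name = $1 } END { if (name != "") print name }'
}

# Example call (not run here):
# borgmatic list | latest_archive
```

Reporting the result to Icinga is left out, since it depends on the local monitoring setup.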

Owner

I'll leave this open to make sure I address the need, even if the same solution ends up satisfying multiple tickets. Point taken about the parameter to the script. That could certainly work.

Understood about Nagios/Icinga. Feel free to file tickets if anything in borgmatic could make that monitoring integration easier.

Owner

This is released now in borgmatic 1.4.21. I went with the approach of simply restricting the `on_error` hook to trigger only for `prune`, `create`, and `check` actions. Please let me know how it works out for you.

Reference: borgmatic-collective/borgmatic#270