Make borgbackup context available in on_error hook #174

Closed
opened 2019-05-14 11:43:43 +00:00 by decentral1se · 6 comments
Contributor

So, here is my current on_error definition:

on_error:
  - cat failed.txt | msmtp -a default infra@foo.com

Now, I run backups on multiple archives, and I include and reuse this hook for all of them. The issue is that when an error happens on different archives, I get the same mail with no context about which archive failed. I could stop including and reusing this hook, but then I would have to duplicate my hooks definition.

So, I am wondering how to pass some information like archive name, repository, time of archiving, etc. into the on_error context so that I can have it available to use for my emailing.

This could be considered feature creep. However, I already experience a lot of duplication in my borgmatic configurations (because I run multiple redundant remote repositories and different application groups), and I think it is something borgmatic could improve on.

Thoughts?

Owner

This does sound like feature creep, but also reasonable feature creep. (Almost like a variant of Zawinski's Law except for sending email.)

In the particular location in the code that the on_error hook is executed now, its "context" doesn't have everything you're looking for. It has: configuration filename, the loaded config, and any command-line arguments. In theory, with some refactoring, it could have access to more context. It's also possible that some of that information could ride on the exception object itself.

It's probably worth thinking about exactly what you'd want to see in an ideal email, because that might drive what context gets plumbed through.

I will say that it may be worth asking what the high-level feature / use case is that you're trying to solve for. For instance, is it more broadly "I want borgmatic error notifications, so I find out when my backups are busted and can fix them"? I say that because starting from the high-level use case may lead to a different solution than the (relatively) more surgical change you mention here.

Related issue (old): witten/borgmatic#39

Also related: witten/borgmatic#126

Author
Contributor

Righto! Let's then bring this discussion to witten/borgmatic#126.

Owner

So, I lied. I think it may be worth implementing this even with the more holistic solution described in #126. My thinking is that there are a couple of different levels of backup alerting/monitoring one might want:

  • A backup just failed. Send me an alert about it so I can look at the problem and fix it! In other words, this ticket.
  • I don't trust the backup process to monitor itself. Also check that backups are running in general via some separate mechanism. This is more #126 territory.
witten reopened this issue 2019-10-01 19:07:30 +00:00
Owner

Okay, I went ahead and implemented this, released in borgmatic 1.3.23. There are now several context variables available in the on_error hook, although perhaps not all the ones you're looking for. We can always iterate. I'll post a link to the new docs when they finish deploying.
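For readers following along, here is a rough sketch of what a shared hook using those variables might look like. It assumes the variable names `{configuration_filename}`, `{repository}`, and `{error}` as I understand them from the borgmatic error-alerting docs; please check the docs for your version before relying on them. The `msmtp` command and address are carried over from the original post.

```yaml
hooks:
    on_error:
        # Assumed variable names; verify against the docs for your borgmatic version.
        # {repository} may be blank if the error happened outside a repository action.
        - echo "borgmatic failed for {repository} using {configuration_filename} with {error}" | msmtp -a default infra@foo.com
```

Because the repository and configuration filename are filled in at the time of the error, the same included hook definition should produce a distinct message for each failing archive, without duplicating the hook per configuration file.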

Owner

The docs that were promised: https://torsion.org/borgmatic/docs/how-to/inspect-your-backups/#error-alerting

Let me know how this works out for you.

Author
Contributor

Excellent, thank you, I will take a look at this and integrate it into my setup.

Reference: borgmatic-collective/borgmatic#174