For me, the critical points of my backup strategy are that:
1) it needs to be fully automated
2) I need to be informed if anything goes wrong
3) I need to be able to check the current status easily
1) is fulfilled by borgmatic.
2) explicitly does not mean being informed of every successful run, because one does not notice when a single message is missing.
This is currently done for me by the crontab output. For this to work, it is important that there is no output at all when everything is fine. It works, but I would prefer this not to depend on crontab and instead be built into borgmatic.
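To illustrate the "no output when everything is fine" behaviour that cron mail relies on, here is a minimal sketch of a wrapper that swallows a command's output on success and emits it only on failure. The function name `run_quietly` and the idea of wrapping the backup command this way are assumptions for illustration, not something borgmatic provides.

```python
import subprocess
import sys


def run_quietly(command):
    """Run command and return (exit_code, text_to_print).

    text_to_print is empty on success, so cron has nothing to mail.
    On failure it holds the combined stdout/stderr of the command.
    """
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    if result.returncode == 0:
        return 0, ""
    # Never stay silent on failure, even if the command printed nothing.
    fallback = f"command exited with status {result.returncode}\n"
    return result.returncode, result.stdout or fallback


if __name__ == "__main__" and len(sys.argv) > 1:
    # Everything after the script name is treated as the command to run.
    code, text = run_quietly(sys.argv[1:])
    if text:
        print(text, end="")
    sys.exit(code)
```

Called from a crontab entry, a wrapper like this produces mail only for failed runs, which is the behaviour described above.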
3) is still missing. I am spoiled by CrashPlan:
But I am not asking for a GUI here… It can all be command-line output.
I looked at the borg documentation for the available JSON output, and I suggest this structure:
Stats (appears once per repository):
- percentage of successful backup attempts
- average speed of the last 10 backups
Last successful backup:
- speed of last backup
Last unsuccessful backup:
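As a rough sketch of how the figures in this structure could be computed, assume each backup attempt is kept as a small record. The record keys used here (`ok`, `end`, `bytes`, `seconds`) are hypothetical and are not borg's JSON field names. "Last 10 backups" is interpreted as the last 10 successful ones, which is also an assumption.

```python
def summarize(runs):
    """Summarize backup attempts for one repository.

    runs: list of dicts, oldest first, with hypothetical keys
    'ok' (bool), 'end' (timestamp string), 'bytes' (int), 'seconds' (float).
    Returns None if there are no runs at all.
    """
    if not runs:
        return None

    successes = [r for r in runs if r["ok"]]
    failures = [r for r in runs if not r["ok"]]

    def speed(run):
        # Bytes per second; guard against zero-length durations.
        return run["bytes"] / run["seconds"] if run["seconds"] > 0 else 0.0

    recent = successes[-10:]
    return {
        "percent_successful": 100.0 * len(successes) / len(runs),
        "avg_speed_last_10": (
            sum(speed(r) for r in recent) / len(recent) if recent else None
        ),
        "last_success": successes[-1]["end"] if successes else None,
        "last_success_speed": speed(successes[-1]) if successes else None,
        "last_failure": failures[-1]["end"] if failures else None,
    }
```

The result is a plain dict, so it can be printed as text or dumped as JSON without any GUI.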
It would require a GUI or a web application, no?
I am just showing an example in the screenshot above.
It could be just text output: no user interaction and nothing graphical.
It could feasibly be implemented with some logging and a separate command that just parses that log.
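A minimal sketch of that idea, assuming one JSON object per line is appended to a log file after each run and a separate command reads it back. The log format, the function names, and the record keys (`repository`, `ok`, `end`) are all hypothetical.

```python
import json


def append_run(log_path, record):
    """Append one run record (a dict) to the log as a single JSON line."""
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")


def read_runs(log_path):
    """Return all records in the log, oldest first, skipping blank lines."""
    with open(log_path, encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]


def status_text(runs):
    """Render a plain-text status with one line per repository.

    Later records overwrite earlier ones, so each repository shows
    the outcome of its most recent run.
    """
    latest = {}
    for run in runs:
        latest[run["repository"]] = run
    return "\n".join(
        f"{repo}: {'OK' if run['ok'] else 'FAILED'} at {run['end']}"
        for repo, run in sorted(latest.items())
    )
```

Keeping the log append-only means the status command never needs to talk to borg or the repository; it only parses what earlier runs wrote.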
Some related discussion on #174.
Hey, moving over from #174, I am interested in number 2 of this issue.
I’ve seen netdata’s alerts handler (netdata is primarily a monitoring tool) and it has a shitload of them: https://github.com/netdata/netdata/blob/master/health/notifications/alarm-notify.sh.in. If borgmatic implements one handler, someone will come and ask for another.
So what is acceptable for borgmatic (not feature creep) and still works for us?
My feeling is that we should provide more context to on_error (see #174). The basic requirements are to know which archive was being backed up and at what time the failure happened, and then to have this available to pass to some handler (pushbullet, twilio, etc.).
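To make the ask concrete, here is a sketch of what a handler could do if it were given that context. How on_error would actually pass these values is not settled in this thread, so the argument order and the `build_alert` name are purely hypothetical; the output is just a message that any notifier (pushbullet, twilio, mail) could forward.

```python
import sys
from datetime import datetime, timezone


def build_alert(repository, archive, error, failed_at=None):
    """Format a failure notification from the context a hook might supply."""
    # Fall back to "now" (UTC) when the caller gives no failure time.
    failed_at = failed_at or datetime.now(timezone.utc)
    return (
        "Backup FAILED\n"
        f"repository: {repository}\n"
        f"archive: {archive}\n"
        f"time: {failed_at.isoformat(timespec='seconds')}\n"
        f"error: {error}"
    )


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical invocation: handler.py <repository> <archive> <error>
    print(build_alert(*sys.argv[1:4]))
```

With this split, borgmatic would only need to supply the context, and the choice of notification channel stays outside the tool.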
However, there is a niggling feeling that this is simply a documentation issue and this could be declared out of scope for the tool … thoughts!?
Some thoughts on this:
One thing that would be helpful is: When do y’all intend to use / look at the cockpit output? On every backup? Only when things go wrong, you get alerted, and you need to dig in? Or some other time?
Yes, clearly option 2 is safer, but it also means ‘starting from scratch’…
On the ‘when’: the notifications should clearly follow a ‘lights-out’ philosophy: no message means everything is good. If one gets flooded with daily status mails, one will start ignoring them.
This requires, of course, that the system runs reliably.
So maybe a watchdog (did borgmatic actually do something?) would be good. Otherwise, the lights-out philosophy can go very wrong.
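A watchdog of this kind can be very small. The sketch below assumes the time of the last successful backup is recorded somewhere; the function name and the 36-hour default are arbitrary choices for illustration.

```python
from datetime import datetime, timedelta


def backup_is_stale(last_success, now, max_age=timedelta(hours=36)):
    """Return True if no backup ever succeeded or the last one is too old.

    last_success: datetime of the last successful run, or None.
    now: current time, passed in so the check is easy to test.
    """
    return last_success is None or now - last_success > max_age
```

The important design point is that the check has to run independently of the backup job (a separate cron entry, or better, another machine), since a watchdog that dies together with the backup stays just as silent.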
Yup, that philosophy makes sense to me.
FYI, I reopened and implemented #174 with the idea that it carves off a piece of the ask in this ticket (#126): More immediate alerting when a backup fails.
Still to do: Separate monitoring + cockpit.
Note that #86 is now implemented. That feature supports one approach to the “separate monitoring” ask, which is why I’m mentioning it here.
Okay, I implemented #223 (dead man’s switch via Healthchecks integration), and I also wrote up docs on a number of options for borgmatic monitoring and alerting. Feedback is welcome, tickets on new variants of monitoring/alerting are welcome, but I’m going to consider the “separate monitoring” ask in this ticket to be done for now.
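For readers unfamiliar with the dead man’s switch idea: the backup job pings a check URL on every run, and the monitoring service raises an alert when the pings stop arriving. The sketch below shows that pattern on its own, outside borgmatic; it is not how the borgmatic integration is implemented. It assumes the Healthchecks convention of appending `/start` or `/fail` to the ping URL, and the URL in the usage note is a placeholder.

```python
import urllib.request


def ping_url(base_url, state="success"):
    """Build the URL to request for a given job state.

    state: 'start', 'success', or 'fail'. Success uses the bare URL.
    """
    suffix = {"start": "/start", "success": "", "fail": "/fail"}[state]
    return base_url.rstrip("/") + suffix


def ping(base_url, state="success", timeout=10):
    """Send the ping and return the HTTP status code."""
    with urllib.request.urlopen(ping_url(base_url, state), timeout=timeout) as response:
        return response.status
```

A job would call something like `ping("https://hc-ping.com/your-uuid-here", "start")` before the backup and then ping again with `"success"` or `"fail"` afterwards; if the job never runs at all, the missing ping is what triggers the alert.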
Still to do: Cockpit.
I will try. Thanks!