#126 Create a cockpit for borgmatic

Open
opened 11 months ago by henfri · 13 comments

Hello,

for me, the critical points of my backup strategy are that 1) it needs to be fully automated, 2) I need to be informed if anything goes wrong, and 3) I need to be able to check the current status easily.

1) is fulfilled by borgmatic

2) explicitly does not mean that I want to be informed of every successful run, because one does not notice when a single message is missing. Currently I get this from the crontab output, so it is important that there is no output at all when everything is fine. It works, but I would prefer this not to depend on crontab and instead be built into borgmatic.
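
A minimal sketch of the "no output when everything is fine" behaviour described above, written as a generic wrapper rather than anything borgmatic provides. The function name `run_quietly` is hypothetical; the only assumption about the wrapped command is that it signals failure with a non-zero exit code.

```python
import subprocess


def run_quietly(command):
    """Run a command and return (exit_code, report).

    The report is an empty string when the command succeeds, so a caller
    that only prints non-empty reports stays silent on success.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    if result.returncode == 0:
        return 0, ""
    # On failure, keep everything the command printed so it can be forwarded.
    report = (
        f"command failed with exit code {result.returncode}: {' '.join(command)}\n"
        f"{result.stdout}{result.stderr}"
    )
    return result.returncode, report
```

A scheduler-independent caller could run `run_quietly(["borgmatic"])` and forward the report to mail or chat only when it is non-empty.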

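3) is still missing. I am spoiled by Crashplan (screenshot: https://i1.wp.com/www.accuratereviews.com/wordpress/wp-content/uploads/2015/09/crashplan.jpg). But I am not requesting a GUI here; it can all be command-line output. I looked at the borg documentation of the available JSON and I suggest this structure: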

during create:

archive_progress->
 archive_progress
 compressed_size
 deduplicated_size
 path

progress_percent
 message

Stats - appears once per repository:

 -percent successful backup attempts
 -average speed of last 10 backups
 -compressed_size, original_size

 last successful backup:
   -speed of last backup
   -compressed size
   -deduplicated_size
   -original_size
 
 last unsuccessful backup:
   -reason
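
A rough sketch of how the per-repository stats listed above could be computed from a history of runs. The `BackupRun` record and its field names are hypothetical, not borg or borgmatic output. "Speed" is assumed here to mean original size divided by duration, averaged over the last 10 successful runs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BackupRun:
    # Hypothetical record of one backup attempt; field names are illustrative.
    succeeded: bool
    duration_seconds: float
    original_size: int = 0       # bytes
    compressed_size: int = 0     # bytes
    deduplicated_size: int = 0   # bytes
    failure_reason: Optional[str] = None


def summarize(runs: List[BackupRun]) -> dict:
    """Compute per-repository stats; runs are ordered oldest first."""
    successes = [r for r in runs if r.succeeded]
    failures = [r for r in runs if not r.succeeded]
    # Assumed definition of speed: bytes of original data per second.
    recent = [r for r in successes[-10:] if r.duration_seconds > 0]
    speeds = [r.original_size / r.duration_seconds for r in recent]
    return {
        "percent_successful": 100.0 * len(successes) / len(runs) if runs else None,
        "average_speed_last_10": sum(speeds) / len(speeds) if speeds else None,
        "last_successful": successes[-1] if successes else None,
        "last_failure_reason": failures[-1].failure_reason if failures else None,
    }
```

The sizes of the last successful backup can then be read directly from `summarize(runs)["last_successful"]`.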

Related issues: #53 #86

Regards, Hendrik

It would require a gui or a web application no?

henfri commented 11 months ago
Author

Hello,

The screenshot above is only an example. It could be plain text output, with no user interaction and nothing graphical.

Greetings, Hendrik

It could feasibly be implemented with some logging and a separate command that just parses that log.
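
One way to picture the "logging plus a separate command that parses the log" idea, as a sketch only. The log format is an assumption: one JSON object per line with the hypothetical keys `repository`, `finished` and `succeeded`. borgmatic does not write such a log as far as this thread states.

```python
import json


def parse_run_log(lines):
    """Return the most recent record per repository from a JSON-lines log.

    Each record is assumed to look like
    {"repository": "...", "finished": "<ISO timestamp>", "succeeded": true}.
    Blank lines and lines that are not JSON objects are skipped.
    """
    latest = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(record, dict) or "repository" not in record:
            continue
        latest[record["repository"]] = record  # later lines win
    return latest


def format_status(latest):
    """Render one plain-text status line per repository."""
    rows = []
    for repository in sorted(latest):
        record = latest[repository]
        state = "OK" if record.get("succeeded") else "FAILED"
        rows.append(f"{repository}: {state} at {record.get('finished', 'unknown')}")
    return "\n".join(rows)
```

A console "cockpit" command could then be little more than `print(format_status(parse_run_log(open(path))))` for some log path of your choosing.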

witten commented 7 months ago
Owner

Some related discussion on #174.

Hey, moving over from #174, I am interested in number 2 of this issue.

I’ve seen netdata’s alert handler (netdata is primarily a monitoring tool) and it has a shitload of them: https://github.com/netdata/netdata/blob/master/health/notifications/alarm-notify.sh.in. If borgmatic implements one handler, someone will come and ask for another.

So what is acceptable for borgmatic (not feature creep) and still works for us?

I think my feeling is to provide more context to on_error (see #174). The basic requirements are to know which archive was being backed up and at what time when the failure happened and then to have this available to pass to some handler (pushbullet, twilio, etc.).
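
To make the requirement concrete, here is a small sketch of passing failure context to interchangeable handlers. All names are hypothetical and this does not reflect how borgmatic's `on_error` hook actually passes information. The handlers are plain callables standing in for pushbullet, twilio, and so on.

```python
from datetime import datetime, timezone


def build_error_message(archive, error, when=None):
    """Format the failure context (what, when, why) into one notification line."""
    when = when or datetime.now(timezone.utc)
    return f"[{when.isoformat()}] backup of {archive} failed: {error}"


def notify(handlers, message):
    """Pass the message to every handler; return the names of handlers that raised."""
    failed = []
    for name, send in handlers.items():
        try:
            send(message)
        except Exception:
            # One broken notification channel should not block the others.
            failed.append(name)
    return failed
```

Keeping the handlers outside the tool, as callables or external scripts, is one way to avoid the "someone will ask for another handler" problem raised above.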

However, there is a niggling feeling that this is simply a documentation issue and this could be declared out of scope for the tool … thoughts!?

witten commented 7 months ago
Owner

Some thoughts on this:

  • I think this is a great idea, and something that users would greatly benefit from. Today, borgmatic is good at making backups, but honestly pretty bad at making sure backups happen. And that last part is really the last mile of a holistic backup solution.
  • The “need to be informed if anything goes wrong” feature seems like a good place to start, and perhaps should be broken off into another ticket. Then, once that’s done, we can focus on making the actual cockpit you look at when things go wrong. I agree that a console cockpit may be a good first step on that front.
  • There are basically two distinct models for backup failure notifications: 1. The backup process itself is responsible for alerting the administrator (email, SMS, whatever) when a backup fails, or 2. Something completely separate from the backup process is responsible for monitoring what backups appear, and for alerting the administrator (email, SMS, whatever) if it looks like backups are failing or not happening for any reason.
  • Option number 2 may be safer in theory, because if your backups start silently breaking, you’ll still find out. It also has the benefit of more cleanly separating the monitoring/alerting functionality from the backup code. However, it is more moving parts, and may be more work to build than option number 1.
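
As an illustration of option number 2, a separate checker only needs to know when each repository last received an archive. The sketch below assumes the caller has already collected those timestamps somehow; the function name and the input mapping are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def find_stale(last_backup_times, max_age, now=None):
    """Return repositories whose newest archive is missing or older than max_age.

    last_backup_times maps a repository name to the datetime of its newest
    archive, or to None if no archive was found at all.
    """
    now = now or datetime.now(timezone.utc)
    stale = []
    for repository, finished in sorted(last_backup_times.items()):
        if finished is None or now - finished > max_age:
            stale.append(repository)
    return stale
```

Because this check does not depend on the backup process reporting anything, it also catches the case where backups silently stop running.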

Thoughts/reactions?

witten commented 7 months ago
Owner

One thing that would be helpful is: When do y’all intend to use / look at the cockpit output? On every backup? Only when things go wrong, you get alerted, and you need to dig in? Or some other time?

henfri commented 7 months ago
Author

Hello,

yes, clearly option 2 is safer, but it also means ‘starting from scratch’…

On the ‘when’: The notifications should clearly follow a ‘lights out philosophy’: no message means everything is good. If one gets flooded by daily status mails, one will start ignoring them. This requires, of course, that the system runs reliably. So maybe a watchdog (did borgmatic actually do something?) would be good. Otherwise, the lights out philosophy can go very wrong.

Greetings, Hendrik

witten commented 6 months ago
Owner

Yup, that philosophy makes sense to me.

witten commented 2 months ago
Owner

FYI, I reopened and implemented #174 with the idea that it carves off a piece of the ask in this ticket (#126): More immediate alerting when a backup fails.

Still to do: Separate monitoring + cockpit.

witten commented 2 months ago
Owner

Note that #86 is now implemented. That feature supports one approach to the “separate monitoring” ask, which is why I’m mentioning it here.

witten commented 1 month ago
Owner

Okay, I implemented #223 (dead man’s switch via Healthchecks integration, https://healthchecks.io/), and I also wrote up docs on a number of options for borgmatic monitoring and alerting (https://torsion.org/borgmatic/docs/how-to/monitor-your-backups/). Feedback is welcome, tickets on new variants of monitoring/alerting are welcome, but I’m going to consider the “separate monitoring” ask in this ticket to be done for now.

Still to do: Cockpit.

henfri commented 1 month ago
Author

Great! I will try. Thanks!
