Feature Request: Additional information for healthchecks integration #241
Reference: borgmatic-collective/borgmatic#241
What I'm trying to do and why
Include borgmatic output in healthchecks events.
Other notes / implementation ideas
Currently, with the Healthchecks integration hook, borgmatic sends a GET request on start, success and failure.
If instead of GET a POST is sent, the request body provided in the POST can be viewed in the healthchecks event log.
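A minimal sketch of the idea, not borgmatic's actual code: the helper names, the example URL, and the choice to keep the tail of the output are assumptions made for illustration. The 10,000 byte limit reflects the 10 kilobyte cap mentioned later in this thread.

```python
import urllib.request

# Assumption: stored request bodies are capped at roughly 10 kilobytes.
MAX_BODY_BYTES = 10_000


def build_ping_request(ping_url, state="success", output=""):
    """Build a POST request for a Healthchecks-style ping URL.

    "start" and "fail" are appended to the check's base URL as path
    suffixes; a success ping uses the base URL itself.
    """
    suffix = {"start": "/start", "fail": "/fail", "success": ""}[state]
    body = output.encode("utf-8")
    # Keep the tail of the output, since errors usually appear at the end.
    body = body[-MAX_BODY_BYTES:]
    return urllib.request.Request(
        ping_url.rstrip("/") + suffix, data=body, method="POST"
    )


def send_ping(ping_url, state="success", output="", timeout=10):
    """Send the ping over the network and return the HTTP status code."""
    request = build_ping_request(ping_url, state, output)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status
```

For example, `send_ping("https://hc.example.com/ping/your-uuid", "fail", error_text)` would post the error text to the check's failure endpoint, where `error_text` is whatever output the caller has collected.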
Per the healthchecks documentation:
I appreciate you filing this. Could you say a little more about why you'd like to see each of the borgmatic success and error output within Healthchecks? For instance, with errors I can imagine it's something like: When Healthchecks notifies me of a borgmatic error, I want to be able to quickly dig into the error in the Healthchecks UI and see exactly what went wrong, instead of having to go track down the corresponding error messages in the borgmatic logs. Is it something like that?
For successes, what's the general motivation? In other words, what's the situation in which you'd want to see success logs in Healthchecks UI, and what success information is most important to see there? I ask because that may be less straight-forward to implement than only plumbing errors through.
Thanks!
You're absolutely correct about the error case: that's exactly the type of information I want to be able to access quickly.
In the case of a success, ideally I would like to see a summary of the run, i.e. the output I would see if I ran with the option -v 1, though I'm happy for this to be opt-in/configurable. Realistically, I want to easily be able to see a summary of what was changed/added.
If we wanted to break this down even further, perhaps in the POST for the /start endpoint we see the prune, i.e.:
Then in the actual success endpoint we see the archive summary:
Though I don't particularly mind if the whole output is just shown in the success. Happy to hear your thoughts on this.
That makes sense, although in practice that information may not be readily available (currently). Here's why: All of the borgmatic/Borg output is spewed to logs (console + syslog) as it's produced. It's sort of fire-and-forget, right now. That means that by the time a backup is complete and we're ready to tell Healthchecks of the success, that log information has already been shipped off to the logging system. It's possible it could be buffered up for purposes of this ticket, but that may be a big change.
Which does make me wonder: Might using a centralized logging system be more suitable to viewing success logs than trying to shoehorn that functionality into Healthchecks? I honestly don't know the answer to that, but that's my first reaction and I thought it worth mentioning.
Also, here's maybe a more pertinent question: What's the scenario in which you'd want to dig into these success logs? When and why would you want to see a summary of what was changed/added? That would help inform the shape of the solution.
As for the error output, I think that the error itself is readily available, because we already use it for other hooks.
Mine might be an odd use case but I'll explain.
The various instances of borgmatic on my servers are executed via cron. In that setup, stdout is passed along in the mail that cron generates, at hourly, daily or whatever intervals I've set for the jobs.
I've effectively replaced the cron mailer with Healthchecks, so I can get the notification in other forms (namely Pushover in my case).
It's much less common for me to look at the success output than to look for errors. However, sometimes I might need to go back and see if a specific file was changed during a window.
If the FR isn't feasible or considered to be too complex/over-complicating, then that's okay too.
I can alternatively change the shell script which runs borgmatic to actually perform all the required healthchecks calls and just remove the hook inside borgmatic, I just preferred it all to be in one place (the borgmatic configuration) if possible.
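As a rough illustration of that wrapper approach (written in Python rather than shell, and not taken from any real script), the sketch below runs a command and reports its start, then its success or failure together with the captured output. The `ping` callback is a hypothetical caller-supplied function that would perform the actual Healthchecks request.

```python
import subprocess


def run_with_pings(command, ping):
    """Run a command, calling ping(state, output) before and after it.

    ping is a hypothetical callback; state is "start", "success" or "fail".
    """
    ping("start", "")
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so errors appear in the output
        text=True,
    )
    state = "success" if result.returncode == 0 else "fail"
    ping(state, result.stdout)
    return result.returncode
```

A cron job could then call something like `run_with_pings(["borgmatic", "-v", "1"], ping)`, at the cost of keeping the monitoring logic outside the borgmatic configuration.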
I'll have a look to see how feasible this sort of thing would be. There's an outside chance borgmatic could "log" to Healthchecks via a log handler. It is possible though that the Healthchecks service is not intended to be used in this way!
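One way such a log handler could look, as a sketch only: the class name, the byte limit, and the drop-oldest policy below are assumptions for illustration, not a description of how borgmatic does it. The handler buffers formatted log messages in memory and forgets the oldest ones once a size limit is exceeded, so the most recent output is still available when the backup finishes.

```python
import logging


class TailBufferHandler(logging.Handler):
    """Hypothetical handler that keeps only the most recent log output."""

    def __init__(self, byte_limit=10_000):
        super().__init__()
        self.byte_limit = byte_limit
        self.chunks = []
        self.size = 0

    def emit(self, record):
        chunk = (self.format(record) + "\n").encode("utf-8")
        self.chunks.append(chunk)
        self.size += len(chunk)
        # Forget the oldest messages once the buffer exceeds the limit,
        # always keeping at least the newest message.
        while self.size > self.byte_limit and len(self.chunks) > 1:
            self.size -= len(self.chunks.pop(0))

    def payload(self):
        """Return the buffered output, trimmed to the byte limit."""
        return b"".join(self.chunks)[-self.byte_limit:]
```

Attaching the handler with `logging.getLogger().addHandler(TailBufferHandler())` at startup would let a monitoring hook read `payload()` at the end of the run and send it as the request body.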
Not sure why you might think that's the case. Its exact purpose, per their documentation, is:
There's obviously a reason why it's capped at 10 kilobytes. Having said that, Healthchecks can also be self-hosted, so how you decide to use your own instance shouldn't really be a concern, I guess.
The self-hosted thing is a good point.. I had forgotten it's open source!
A little teaser..
Okay, this is implemented and released in borgmatic 1.4.11. Docs here: https://torsion.org/borgmatic/docs/how-to/monitor-your-backups/#healthchecks-hook
Please let me know if you have any other ideas for the Healthchecks integration, or any other features for that matter.