Feature Request: Additional information for healthchecks integration #241


nightah · 2019-11-11T06:28:54Z

nightah commented

2019-11-11 06:28:54 +00:00

What I'm trying to do and why

Include borgmatic output in healthchecks events.

Other notes / implementation ideas

Currently, with the healthchecks integration hook, borgmatic sends a GET request on start, success and failure.

If a POST is sent instead of a GET, the request body provided in the POST can be viewed in the healthchecks event log (https://nerv.cx/fx1C3).

Per the healthchecks documentation (https://healthchecks.io/docs/):

Request method can be GET, POST or HEAD

For HTTP POST requests, you can include additional diagnostic information for your own reference in the request body. If the request body looks like a UTF-8 string, Healthchecks.io will log the first 10 kilobytes of the request body, so you can inspect it later.
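
A minimal sketch of the difference between the two kinds of ping, using only the Python standard library. This is not borgmatic's code: the helper names `build_ping_request` and `send_ping` are hypothetical, and the ping URL is whatever your own check provides.

```python
import urllib.request


def build_ping_request(ping_url, log_text=""):
    """Build a ping request: a plain GET, or a POST carrying log_text as the body."""
    if not log_text:
        return urllib.request.Request(ping_url, method="GET")
    return urllib.request.Request(
        ping_url,
        data=log_text.encode("utf-8"),  # the body Healthchecks may log
        method="POST",
        headers={"Content-Type": "text/plain; charset=utf-8"},
    )


def send_ping(ping_url, log_text=""):
    """Send the ping and return the HTTP status code (hypothetical helper)."""
    request = build_ping_request(ping_url, log_text)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Calling `send_ping(url, "backup finished")` against a real check URL would then attach that text to the event, subject to the 10 kilobyte limit quoted above.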


witten commented

2019-11-11 17:17:47 +00:00

I appreciate you filing this. Could you say a little more about why you'd like to see each of the borgmatic success and error output within Healthchecks? For instance, with errors I can imagine it's something like: When Healthchecks notifies me of a borgmatic error, I want to be able to quickly dig into the error in the Healthchecks UI and see exactly what went wrong, instead of having to go track down the corresponding error messages in the borgmatic logs. Is it something like that?

For successes, what's the general motivation? What's the situation in which you'd want to see success logs in the Healthchecks UI? I ask because that may be less straightforward to implement than only plumbing errors through.

Thanks!


witten commented

2019-11-11 17:20:27 +00:00

I appreciate you filing this. Could you say a little more about why you'd like to see each of the borgmatic success and error output within Healthchecks? For instance, with errors I can imagine it's something like: When Healthchecks notifies me of a borgmatic error, I want to be able to quickly dig into the error in the Healthchecks UI and see exactly what went wrong, instead of having to go track down the corresponding error messages in the borgmatic logs. Is it something like that?

For successes, what's the general motivation? In other words, what's the situation in which you'd want to see success logs in the Healthchecks UI, and what success information is most important to see there? I ask because that may be less straightforward to implement than only plumbing errors through.

Thanks!


witten closed this issue

2019-11-11 17:20:27 +00:00

witten reopened this issue

2019-11-11 17:20:33 +00:00

nightah commented

2019-11-12 01:26:35 +00:00

You're absolutely correct about the error case: that is exactly the type of information I want to be able to access quickly.

In the case of a success, ideally I would like to see a summary of the run, i.e. the output I would see if I ran with the option -v 1, though I'm happy for this to be opt-in/configurable.

Realistically I want to easily be able to see a summary of what was changed/added.

If we wanted to break this down even further, perhaps the POST to the /start endpoint could include the prune output, e.g.:

Mon 11 Nov 2019 01:05:50 PM AEDT - Starting a backup job.
/mnt/nerv01/.backups/borg/arch/: Pruning archives
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
Deleted data:             -346.96 GB           -270.59 GB           -632.75 MB
All archives:               63.00 TB             48.92 TB            637.62 GB
                       Unique chunks         Total chunks
Chunk index:                 1900290            136771059
------------------------------------------------------------------------------

Then in the actual success endpoint we see the archive summary:

/mnt/nerv01/.backups/borg/arch/: Creating archive
Creating archive at "/mnt/nerv01/.backups/borg/arch/::{hostname} {now:%d-%m-%Y %H:%M:%S}"
M /etc/.etckeeper
M /etc/.git/index
M /etc/.git/refs/heads/master
A /etc/.git/objects/7a/ab50fcece95508d25c7e07ce787e97bcba8a8e
A /etc/.git/objects/84/337598fd3d4baa40fe27114898945bae61587e
A /etc/.git/objects/3d/2357bbc3c1d48181e42af7184723bdd8a4fdf3
M /etc/.git/objects/d5/52a70b879de8fe49c2d47f8a403eb4fae7cc3e
A /etc/.git/objects/4a/24172bd4a033a71356b24347be4c089d24a0ac
A /etc/.git/objects/b6/9b261626e9cd3f3514cb7ac410461b8aa4b48d
A /etc/.git/objects/16/f20ef5fa628b60c793f818048dbea60dd51da0
M /etc/.git/COMMIT_EDITMSG
M /etc/.git/logs/HEAD
M /etc/.git/logs/refs/heads/master
------------------------------------------------------------------------------
Archive name: nerv 11-11-2019 13:10:27
Archive fingerprint: 922f4264cac0478a711c392f5813287fffb67a013da92cf6a77fb7d85d6cd86a
Time (start): Mon, 2019-11-11 13:10:27
Time (end):   Mon, 2019-11-11 13:15:40
Duration: 5 minutes 12.84 seconds
Number of files: 532042
Utilization of max. archive size: 0%
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
This archive:              351.81 GB            274.42 GB            382.79 MB
All archives:               63.36 TB             49.20 TB            638.01 GB
                       Unique chunks         Total chunks
Chunk index:                 1901152            137418308
------------------------------------------------------------------------------
/mnt/nerv01/.backups/borg/arch/: Running consistency checks
/etc/borgmatic/config.yaml: Running command for post-backup hook
Mon 11 Nov 2019 01:15:46 PM AEDT - Backup created.

Though I don't particularly mind if the whole output is just shown in the success. Happy to hear your thoughts on this.


witten commented

2019-11-12 17:08:24 +00:00

That makes sense, although in practice that information may not be readily available (currently). Here's why: All of the borgmatic/Borg output is spewed to logs (console + syslog) as it's produced. It's sort of fire-and-forget, right now. That means that by the time a backup is complete and we're ready to tell Healthchecks of the success, that log information has already been shipped off to the logging system. It's possible it could be buffered up for purposes of this ticket, but that may be a big change.

Which does make me wonder: Might using a centralized logging system be more suitable to viewing success logs than trying to shoehorn that functionality into Healthchecks? I honestly don't know the answer to that, but that's my first reaction and I thought it worth mentioning.

Also, here's maybe a more pertinent question: What's the scenario in which you'd want to dig into these success logs? When and why would you want to see a summary of what was changed/added? That would help inform the shape of the solution.

As for the error output, I think that the error itself is readily available, because we already use it for other hooks.
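
The buffering idea mentioned above could look roughly like the following. It is only a sketch under assumptions, not how borgmatic is implemented: the class name `PayloadBufferHandler` and the logger name `backup-demo` are made up for illustration. The handler sits alongside the usual console/syslog handlers and keeps a copy of each formatted line so it is still available when the run finishes.

```python
import logging


class PayloadBufferHandler(logging.Handler):
    """Keep formatted log lines in memory so they can be sent once the run ends."""

    def __init__(self, level=logging.NOTSET):
        super().__init__(level)
        self.lines = []

    def emit(self, record):
        # Store the formatted line instead of writing it anywhere.
        self.lines.append(self.format(record))

    def payload(self):
        """Return everything logged so far as a single string."""
        return "\n".join(self.lines)


logger = logging.getLogger("backup-demo")  # hypothetical logger name
logger.setLevel(logging.INFO)
logger.propagate = False  # keep the demo output out of the root logger

buffer_handler = PayloadBufferHandler()
buffer_handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
logger.addHandler(buffer_handler)

logger.info("Creating archive")
logger.info("Running consistency checks")
```

At the end of a run, `buffer_handler.payload()` would be the text to place in the body of the success or failure ping.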


nightah commented

2019-11-12 23:30:08 +00:00

Mine might be an odd use case but I'll explain.

The various instances of borgmatic on my servers are executed via cron. In this case the stdout is passed along in the mail that cron generates, either hourly, daily or at whatever intervals I've set for the jobs.

I've effectively replaced the cron mailer with healthchecks, so I can get the notification in other forms (namely pushover in my case).

It's much less common for me to look at the success output than to look for errors. However, sometimes I might need to go back and see if a specific file was changed during a window.

If the FR isn't feasible or considered to be too complex/over-complicating, then that's okay too.

Alternatively, I can change the shell script which runs borgmatic to perform all the required healthchecks calls and just remove the hook inside borgmatic. I just preferred it all to be in one place (the borgmatic configuration) if possible.
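
The wrapper alternative described above might be sketched like this, in Python rather than shell. The function `run_and_report` and its `ping` callback are hypothetical; the callback is assumed to take a URL suffix and a body and perform the actual HTTP request. The suffixes follow the Healthchecks convention of `/start` for start, the bare ping URL for success and `/fail` for failure.

```python
import subprocess
import sys


def run_and_report(command, ping):
    """Run command and report start, then success or failure, with its output.

    ping(suffix, body) is a caller-supplied function (an assumption of this
    sketch) that sends the request to the check's ping URL plus suffix.
    """
    ping("/start", "")
    result = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so errors appear in the body
        text=True,
    )
    suffix = "" if result.returncode == 0 else "/fail"
    ping(suffix, result.stdout)
    return result.returncode
```

A cron job could then call something like `run_and_report(["borgmatic", "-v", "1"], ping)`, at the cost of keeping the monitoring logic outside the borgmatic configuration.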


witten commented

2019-11-13 16:49:37 +00:00

I'll have a look to see how feasible this sort of thing would be. There's an outside chance borgmatic could "log" to Healthchecks via a log handler (https://docs.python.org/3/library/logging.handlers.html#httphandler). It's possible, though, that the Healthchecks service is not intended to be used in this way!


nightah commented

2019-11-13 23:48:48 +00:00

Not sure why you might think that's the case. Its exact purpose, per their documentation, is:

For HTTP POST requests, you can include additional diagnostic information for your own reference in the request body. If the request body looks like a UTF-8 string, Healthchecks.io will log the first 10 kilobytes of the request body, so you can inspect it later.

There's obviously a reason why it's capped at 10 kilobytes. Having said that Healthchecks can also be self-hosted, so how you decide to use your own instance I guess shouldn't really be a concern.
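
Because only the first 10 kilobytes of a body are logged, a sender may want to trim long output itself. The sketch below keeps the end of the log, on the assumption that the final summary and any error are the most useful part. It is an illustration, not a description of any existing implementation: the function name, the `...` marker and the reading of "10 kilobytes" as 10,000 bytes are all assumptions.

```python
PAYLOAD_LIMIT_BYTES = 10_000  # assumption: "10 kilobytes" read as 10,000 bytes
TRUNCATION_MARKER = "...\n"   # hypothetical marker showing earlier lines were cut


def truncate_payload(text, limit=PAYLOAD_LIMIT_BYTES):
    """Return text as UTF-8 bytes, keeping only the tail if it exceeds limit."""
    data = text.encode("utf-8")
    if len(data) <= limit:
        return data
    marker = TRUNCATION_MARKER.encode("utf-8")
    tail = data[-(limit - len(marker)):]
    # The cut may land inside a multi-byte character; drop the broken fragment
    # so the result is still valid UTF-8.
    tail = tail.decode("utf-8", errors="ignore").encode("utf-8")
    return marker + tail
```

On a self-hosted instance with a different limit, the `limit` argument could be adjusted to match.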


witten commented

2019-11-14 04:11:11 +00:00

The self-hosted thing is a good point.. I had forgotten it's open source!

witten commented

2019-11-16 00:43:59 +00:00

A little teaser..

teaser.png

88 KiB

witten referenced this issue from a commit

2019-11-18 00:56:44 +00:00

When using the Healthchecks monitoring hook, include borgmatic logs in the payloads for completion and failure pings (#241).

witten commented

2019-11-18 01:20:13 +00:00

Okay, this is implemented and released in borgmatic 1.4.11. Docs here: https://torsion.org/borgmatic/docs/how-to/monitor-your-backups/#healthchecks-hook

Please let me know if you have any other ideas for the Healthchecks integration, or any other features for that matter.


witten closed this issue

2019-11-18 01:20:13 +00:00

nightah referenced this issue

2020-02-18 07:47:18 +00:00

Configurable json body size for healthchecks hooks (don't truncate) #294
