Logs missing from healthchecks failures? #764

Closed
opened 2023-10-04 16:57:09 +00:00 by Daniel · 8 comments

What I'm trying to do and why

I'm monitoring my backups using a self-hosted instance of healthchecks.io. I'd like to be able to see the reason why a backup failed directly within healthchecks.

Steps to reproduce

Include a healthchecks ping_url in the Borgmatic config

Actual behavior

This is the entire output logged in healthchecks:

ssh://backup-{hostname}@backups02.example.com/data/backup/{hostname}: Excluding special files to prevent Borg from hanging: /etc/systemd/system/iscsid.service, /etc/systemd/system/open-iscsi.service
ssh://backup-{hostname}@backups02.example.com/data/backup/{hostname}: Error running actions for repository
Command 'borg create --exclude-from /tmp/tmp39756hf3 --exclude-if-present .nobackup --compression zstd --one-file-system --read-special ssh://backup-{hostname}@backups02.example.com/data/backup/{hostname}::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f} /etc /home /root /usr/local/bin /usr/local/sbin /var/backups /var/lib/docker/volumes /var/lib/dpkg/status /var/local /var/spool/cron/crontabs /var/www --exclude-from /tmp/tmp39756hf3 --exclude-if-present .nobackup --info' returned non-zero exit status 2.

However, if I view the logs on the system itself, there are more details:

Oct  4 00:27:13 la03 borgmatic: INFO Remote: ssh: Could not resolve hostname backups02.example.com: Name or service not known
Oct  4 00:27:13 la03 borgmatic: INFO Connection closed by remote host. Is borg working on the server?
Oct  4 00:27:13 la03 borgmatic: CRITICAL ssh://backup-{hostname}@backups02.example.com/data/backup/{hostname}: Error running actions for repository
Oct  4 00:27:13 la03 borgmatic: CRITICAL Command 'borg create --exclude-from /tmp/tmp39756hf3 --exclude-if-present .nobackup --compression zstd --one-file-system --read-special ssh://backup-{hostname}@backups02.example.com/data/backup/{hostname}::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f} /etc /home /root /usr/local/bin /usr/local/sbin /var/backups /var/lib/docker/volumes /var/lib/dpkg/status /var/local /var/spool/cron/crontabs /var/www --exclude-from /tmp/tmp39756hf3 --exclude-if-present .nobackup --info' returned non-zero exit status 2.
Oct  4 00:27:13 la03 borgmatic: CRITICAL /etc/borgmatic/config.yaml: Error running configuration file
Oct  4 00:27:13 la03 borgmatic: CRITICAL
Oct  4 00:27:13 la03 borgmatic: CRITICAL summary:
Oct  4 00:27:13 la03 borgmatic: CRITICAL /etc/borgmatic/config.yaml: Error running configuration file
Oct  4 00:27:13 la03 borgmatic: CRITICAL ssh://backup-{hostname}@backups02.example.com/data/backup/{hostname}: Error running actions for repository
Oct  4 00:27:13 la03 borgmatic: CRITICAL Remote: ssh: Could not resolve hostname backups02.example.com: Name or service not known#012Connection closed by remote host. Is borg working on the server?
Oct  4 00:27:13 la03 borgmatic: CRITICAL Command 'borg create --exclude-from /tmp/tmp39756hf3 --exclude-if-present .nobackup --compression zstd --one-file-system --read-special ssh://backup-{hostname}@backups02.example.com/data/backup/{hostname}::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f} /etc /home /root /usr/local/bin /usr/local/sbin /var/backups /var/lib/docker/volumes /var/lib/dpkg/status /var/local /var/spool/cron/crontabs /var/www --exclude-from /tmp/tmp39756hf3 --exclude-if-present .nobackup --info' returned non-zero exit status 2.
Oct  4 00:27:13 la03 borgmatic: CRITICAL
Oct  4 00:27:13 la03 borgmatic: CRITICAL Need some help? https://torsion.org/borgmatic/#issues

Expected behavior

The actual cause of the failure should be logged to healthchecks:

Oct  4 00:27:13 la03 borgmatic: CRITICAL Remote: ssh: Could not resolve hostname backups02.example.com: Name or service not known#012Connection closed by remote host. Is borg working on the server?

Other notes / implementation ideas

No response

borgmatic version

1.7.4

borgmatic installation method

Debian package

Borg version

1.2.3

Python version

3.9.2

Database version (if applicable)

MySQL 8.0.33

Operating system and version

Debian 11 (Bullseye)

Author

Another example from a different system:

healthchecks log:

ssh://backup-{hostname}@backups01.example.com/data/backup/{hostname}: Error running actions for repository
Command 'borg prune --keep-within 90d --keep-daily 7 --keep-weekly 4 --keep-monthly -1 --keep-yearly -1 --glob-archives {hostname}-* --info ssh://backup-{hostname}@backups01.example.com/data/backup/{hostname}' returned non-zero exit status 2.
OK
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed

  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0

Actual error logs on the system:

Oct  4 00:19:17 sjc01 borgmatic: CRITICAL /etc/borgmatic/config.yaml: Error running configuration file
Oct  4 00:19:17 sjc01 borgmatic: CRITICAL
Oct  4 00:19:17 sjc01 borgmatic: CRITICAL summary:
Oct  4 00:19:17 sjc01 borgmatic: CRITICAL /etc/borgmatic/config.yaml: Error running configuration file
Oct  4 00:19:17 sjc01 borgmatic: CRITICAL ssh://backup-{hostname}@backups01.example.com/data/backup/{hostname}: Error running actions for repository
Oct  4 00:19:17 sjc01 borgmatic: CRITICAL Remote: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@#012Remote: @    WARNING: REMOTE HOST IDENTIFICATION HAS CHANGED!     @#012Remote: @@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@@#012Remote: IT IS POSSIBLE THAT SOMEONE IS DOING SOMETHING NASTY!#012Remote: Someone could be eavesdropping on you right now (man-in-the-middle attack)!#012Remote: It is also possible that a host key has just been changed.#012Remote: The fingerprint for the ECDSA key sent by the remote host is#012Remote: SHA256:........#012Remote: Please contact your system administrator.#012Remote: Add correct host key in /root/.ssh/known_hosts to get rid of this message.#012Remote: Offending ECDSA key in /root/.ssh/known_hosts:1#012Remote:   remove with:#012Remote:   ssh-keygen -f "/root/.ssh/known_hosts" -R "backups01.example.com"#012Remote: ECDSA host key for backups01.example.com has changed and you have requested strict checking.#012Remote: Host key verification failed.#012Connection closed by remote host. Is borg working on the server?
Oct  4 00:19:17 sjc01 borgmatic: CRITICAL Command 'borg prune --keep-within 90d --keep-daily 7 --keep-weekly 4 --keep-monthly -1 --keep-yearly -1 --glob-archives {hostname}-* --info ssh://backup-{hostname}@backups01.example.com/data/backup/{hostname}' returned non-zero exit status 2.
Oct  4 00:19:17 sjc01 borgmatic: CRITICAL
Oct  4 00:19:17 sjc01 borgmatic: CRITICAL Need some help? https://torsion.org/borgmatic/#issues

(host key change was expected since I had to rebuild the backup server)

Owner

Hmm.. I'm not sure what's going on here. Can I get a look at the command you're using to run borgmatic? I'm particularly interested in the --verbosity and --monitoring-verbosity values. Thanks!

Author

> Hmm.. I'm not sure what's going on here. Can I get a look at the command you're using to run borgmatic? I'm particularly interested in the --verbosity and --monitoring-verbosity values. Thanks!

@witten I'm just using the systemd timer/service that comes bundled with the Debian package. I haven't customized it at all. Looks like it runs:

/usr/bin/borgmatic --verbosity -1 --syslog-verbosity 1

Do I need to add a monitoring-verbosity setting to that? I'm not sure why by default it enables console logging and syslog logging but not monitoring logging.

Owner

Yup, try adding --monitoring-verbosity 1. With the version of borgmatic you're using, that value defaults to 0. With newer versions (1.8.3), it defaults to 1.
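
As a minimal sketch of that suggestion, the command quoted earlier in this thread would gain one extra flag. The variable names `base_cmd` and `new_cmd` are only illustrative; the flags themselves are the ones mentioned in this thread.

```shell
# Command run by the bundled systemd unit, as quoted earlier in this thread.
base_cmd="/usr/bin/borgmatic --verbosity -1 --syslog-verbosity 1"

# Append the monitoring verbosity so INFO-level lines (such as the
# "Remote: ssh: Could not resolve hostname" message) are also sent to healthchecks.
new_cmd="$base_cmd --monitoring-verbosity 1"

# Print the resulting command line for inspection; nothing is executed here.
echo "$new_cmd"
```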

Author

Perfect, thanks. That works well. Now I just need to make the same change on all of my servers.

> With newer versions (1.8.3), it defaults to 1.

Unfortunately Debian is a bit behind... They haven't uploaded new Borgmatic package versions in 7 months: https://tracker.debian.org/pkg/borgmatic
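
One way to make the change survive package upgrades on each server is a systemd drop-in, roughly as sketched below. This rests on assumptions: the unit is taken to be named `borgmatic.service`, and its `ExecStart` is taken to be exactly the command quoted earlier in this thread. If the packaged unit wraps the command in anything else, that wrapper would need to be kept; `systemctl cat borgmatic.service` shows the real line. The `DROPIN_DIR` variable is hypothetical and defaults to a scratch directory so the sketch touches nothing under /etc.

```shell
# Hypothetical target directory. On a real host this would typically be
# /etc/systemd/system/borgmatic.service.d (unit name assumed, not confirmed here).
dropin_dir="${DROPIN_DIR:-$(mktemp -d)/borgmatic.service.d}"
mkdir -p "$dropin_dir"

# The empty ExecStart= line clears the packaged command before the
# replacement is set; systemd requires this when overriding ExecStart.
cat > "$dropin_dir/override.conf" <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/bin/borgmatic --verbosity -1 --syslog-verbosity 1 --monitoring-verbosity 1
EOF

# Show the generated drop-in for review.
cat "$dropin_dir/override.conf"
```

After copying the file into place on a host, `systemctl daemon-reload` makes systemd pick up the override.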

Owner

Glad to hear that works! I guess the trade-off for Debian's stability is that it doesn't always have the latest and greatest versions of things. You can always install borgmatic without a Debian package (https://torsion.org/borgmatic/docs/how-to/set-up-backups/#installation) if you want a newer version.
witten added the "question / support" label 2023-10-04 17:50:29 +00:00
Author

If I do want the latest version, I'd probably end up either using the Docker container or forking and updating the Debian package myself. I don't like installing software through random ad-hoc package managers since I always forget to update them. I'm using the Docker container on my Unraid server at home and it's working well there.

Owner

Totally understood. I use the Docker container as well on several machines.

Reference: borgmatic-collective/borgmatic#764