Backup failure #910

Closed
opened 2024-08-31 05:01:46 +00:00 by modem7 · 6 comments

What I'm trying to do and why

Backing up

Steps to reproduce

  borgmatic:
    image: modem7/borgmatic-docker
    container_name: Borgmatic
    environment:
      TZ: $TZ
      BORG_PASSPHRASE: $BORG_PASSPHRASE
      BORG_SOURCE_1: $BORG_SOURCE_1
      BORG_SOURCE_2: $BORG_SOURCE_2
      BORG_REPO: $BORG_REPO
      BORG_HEALTHCHECK_URL: $BORG_HEALTHCHECK_URL
      # EXTRA_PKGS: postgresql16-client rclone coreutils jq
      DOCKERCLI: true
      CRON: $BORG_CRON
      CRON_COMMAND: $BORG_CRON_COMMAND
      # EXTRA_CRON: |-
      #   @daily borgmatic check -v 1 2>&1 > tee /mnt/log/check-$(date +\%Y-\%m-\%dT\%H:\%M:\%S).log
      #   0 7 1 * * command2
    logging:
      driver: "local"
      options:
        max-size: 10m
        max-file: "3"
    volumes:
      - $BORGHOMESOURCEDIR:/mnt/source/
      # - $CRONTAB:/mnt/source/Cron
      # - Pihole:/mnt/source/Pihole/Pihole
      # - Dnsmasq:/mnt/source/Pihole/Dnsmasq
      - $BORGSERVBACKUPDIR/Database:/mnt/borg-DBrepository
      - $BORGSERVBACKUPDIR/Docker:/mnt/borg-repository
      - $RAMDRIVEBACKUP/borg:/mnt/ramdrive
      - $USERDIR:/mnt/source/DockerApps/
      - $USERDIR/Borgmatic/borgmatic.d/:/etc/borgmatic.d/
      - $USERDIR/Borgmatic/.config/borg/:/root/.config/borg
      - $USERDIR/Borgmatic/.ssh/:/root/.ssh
      - $USERDIR/Borgmatic/.state/:/root/.borgmatic
      - $USERDIR/Borgmatic/.cache/borg/:/root/.cache/borg
      - $DOCKERDIR/HDA/.env:/mnt/docker/HDA/.env:ro
      - $DOCKERDIR/HDA/docker-compose.yml:/mnt/docker/HDA/docker-compose.yml:ro
      - $BORGSCRIPTS:/borgscripts:ro
      - /var/run/docker.sock:/var/run/docker.sock # So we can run scripts
      # - /mnt/downloads/script/:/custom-cont-init.d:ro
    networks:
      isonet:
      isolated:
    restart: always
source_directories:
     - ${BORG_SOURCE_1}
     - ${BORG_SOURCE_2}
repositories:
     - path: ${BORG_REPO}
       label: Borgbase
one_file_system: true
exclude_caches: true

#storage:
#   Passphase is set in variable $BORG_PASSPHRASE
compression: lz4
archive_name_format: 'backup-{now}'

keep_hourly: 0
keep_daily: 7
keep_weekly: 4
keep_monthly: 12
keep_yearly: 1

checks:
  - name: repository
    frequency: 2 weeks
  - name: archives
    frequency: always
  - name: extract
    frequency: 2 weeks
  - name: data
    frequency: 1 month

before_everything:
    - borgmatic break-lock
    - echo "Starting a backup job."
    - echo "Stopping containers."
    - exec /borgscripts/docker-stop.sh
after_everything:
    - echo "Starting containers."
    - exec /borgscripts/docker-start.sh
    - echo "Backup created."
on_error:
    - echo "Error while creating a backup."
    - exec /borgscripts/docker-start.sh
    # https://torsion.org/borgmatic/docs/how-to/backup-your-databases/

Actual behavior

--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 621, in log_error_records
    raise error
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 145, in run_configuration
    yield from run_actions(
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 371, in run_actions
    borgmatic.actions.check.run_check(
  File "/usr/local/lib/python3.12/site-packages/borgmatic/actions/check.py", line 660, in run_check
    repository_id = borgmatic.borg.check.get_repository_id(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/borgmatic/borg/check.py", line 97, in get_repository_id
    rinfo.display_repository_info(
  File "/usr/local/lib/python3.12/site-packages/borgmatic/borg/rinfo.py", line 55, in display_repository_info
    return execute_command_and_capture_output(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/site-packages/borgmatic/execute.py", line 348, in execute_command_and_capture_output
    output = subprocess.check_output(
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '('borg', 'info', '--json', 'ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo')' returned non-zero exit status 81.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1163, in emit
    stream.write(msg + self.terminator)
                 ~~~~^~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'CalledProcessError' and 'str'
Call stack:
  File "/usr/local/bin/borgmatic", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 922, in main
    or list(collect_configuration_run_summary_logs(configs, config_paths, arguments))
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 804, in collect_configuration_run_summary_logs
    results = list(run_configuration(config_filename, config, config_paths, arguments))
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 176, in run_configuration
    yield from log_error_records(
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 642, in log_error_records
    yield log_record(levelno=levelno, levelname=level_name, msg=error)
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 599, in log_record
    logger.handle(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1700, in handle
    self.callHandlers(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1762, in callHandlers
    hdlr.handle(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1028, in handle
    self.emit(record)
  File "/usr/local/lib/python3.12/site-packages/borgmatic/logger.py", line 76, in emit
    self.log_level_to_handler[record.levelno].emit(record)
Message: CalledProcessError(81, ('borg', 'info', '--json', 'ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo'))
Arguments: ()
/etc/borgmatic.d/config.yml: Pinging Healthchecks log
/etc/borgmatic.d/config.yml: Running 2 commands for on-error hook
Error while creating a backup.
/etc/borgmatic.d/config.yml: Pinging Healthchecks fail
/etc/borgmatic.d/config.yml: An error occurred
/etc/borgmatic.d/config.yml: Running 3 commands for post-everything hook
Starting containers.
Backup created.

summary:
/etc/borgmatic.d/config.yml: An error occurred
Borgbase: Error running actions for repository
--- Logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1163, in emit
    stream.write(msg + self.terminator)
                 ~~~~^~~~~~~~~~~~~~~~~
TypeError: unsupported operand type(s) for +: 'CalledProcessError' and 'str'
Call stack:
  File "/usr/local/bin/borgmatic", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 935, in main
    logger.handle(log)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1700, in handle
    self.callHandlers(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1762, in callHandlers
    hdlr.handle(record)
  File "/usr/local/lib/python3.12/logging/__init__.py", line 1028, in handle
    self.emit(record)
  File "/usr/local/lib/python3.12/site-packages/borgmatic/logger.py", line 76, in emit
    self.log_level_to_handler[record.levelno].emit(record)
Message: CalledProcessError(81, ('borg', 'info', '--json', 'ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo'))
Arguments: ()

Expected behavior

No response

Other notes / implementation ideas

No response

borgmatic version

1.8.14

borgmatic installation method

Docker

Borg version

1.4.0

Python version

3.12.4

Database version (if applicable)

No response

Operating system and version

24.04.1

### What I'm trying to do and why Backing up ### Steps to reproduce ```yaml borgmatic: image: modem7/borgmatic-docker container_name: Borgmatic environment: TZ: $TZ BORG_PASSPHRASE: $BORG_PASSPHRASE BORG_SOURCE_1: $BORG_SOURCE_1 BORG_SOURCE_2: $BORG_SOURCE_2 BORG_REPO: $BORG_REPO BORG_HEALTHCHECK_URL: $BORG_HEALTHCHECK_URL # EXTRA_PKGS: postgresql16-client rclone coreutils jq DOCKERCLI: true CRON: $BORG_CRON CRON_COMMAND: $BORG_CRON_COMMAND # EXTRA_CRON: |- # @daily borgmatic check -v 1 2>&1 > tee /mnt/log/check-$(date +\%Y-\%m-\%dT\%H:\%M:\%S).log # 0 7 1 * * command2 logging: driver: "local" options: max-size: 10m max-file: "3" volumes: - $BORGHOMESOURCEDIR:/mnt/source/ # - $CRONTAB:/mnt/source/Cron # - Pihole:/mnt/source/Pihole/Pihole # - Dnsmasq:/mnt/source/Pihole/Dnsmasq - $BORGSERVBACKUPDIR/Database:/mnt/borg-DBrepository - $BORGSERVBACKUPDIR/Docker:/mnt/borg-repository - $RAMDRIVEBACKUP/borg:/mnt/ramdrive - $USERDIR:/mnt/source/DockerApps/ - $USERDIR/Borgmatic/borgmatic.d/:/etc/borgmatic.d/ - $USERDIR/Borgmatic/.config/borg/:/root/.config/borg - $USERDIR/Borgmatic/.ssh/:/root/.ssh - $USERDIR/Borgmatic/.state/:/root/.borgmatic - $USERDIR/Borgmatic/.cache/borg/:/root/.cache/borg - $DOCKERDIR/HDA/.env:/mnt/docker/HDA/.env:ro - $DOCKERDIR/HDA/docker-compose.yml:/mnt/docker/HDA/docker-compose.yml:ro - $BORGSCRIPTS:/borgscripts:ro - /var/run/docker.sock:/var/run/docker.sock # So we can run scripts # - /mnt/downloads/script/:/custom-cont-init.d:ro networks: isonet: isolated: restart: always ``` ```yaml source_directories: - ${BORG_SOURCE_1} - ${BORG_SOURCE_2} repositories: - path: ${BORG_REPO} label: Borgbase one_file_system: true exclude_caches: true #storage: # Passphase is set in variable $BORG_PASSPHRASE compression: lz4 archive_name_format: 'backup-{now}' keep_hourly: 0 keep_daily: 7 keep_weekly: 4 keep_monthly: 12 keep_yearly: 1 checks: - name: repository frequency: 2 weeks - name: archives frequency: always - name: extract frequency: 2 weeks - name: data frequency: 1 month before_everything: - borgmatic break-lock - echo "Starting a backup job." - echo "Stopping containers." - exec /borgscripts/docker-stop.sh after_everything: - echo "Starting containers." - exec /borgscripts/docker-start.sh - echo "Backup created." on_error: - echo "Error while creating a backup." - exec /borgscripts/docker-start.sh # https://torsion.org/borgmatic/docs/how-to/backup-your-databases/ ``` ### Actual behavior ``` --- Logging error --- Traceback (most recent call last): File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 621, in log_error_records raise error File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 145, in run_configuration yield from run_actions( File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 371, in run_actions borgmatic.actions.check.run_check( File "/usr/local/lib/python3.12/site-packages/borgmatic/actions/check.py", line 660, in run_check repository_id = borgmatic.borg.check.get_repository_id( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/borgmatic/borg/check.py", line 97, in get_repository_id rinfo.display_repository_info( File "/usr/local/lib/python3.12/site-packages/borgmatic/borg/rinfo.py", line 55, in display_repository_info return execute_command_and_capture_output( ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/site-packages/borgmatic/execute.py", line 348, in execute_command_and_capture_output output = subprocess.check_output( ^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/subprocess.py", line 466, in check_output return run(*popenargs, stdout=PIPE, timeout=timeout, check=True, ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/usr/local/lib/python3.12/subprocess.py", line 571, in run raise CalledProcessError(retcode, process.args, subprocess.CalledProcessError: Command '('borg', 'info', '--json', 'ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo')' returned non-zero exit status 81. During handling of the above exception, another exception occurred: Traceback (most recent call last): File "/usr/local/lib/python3.12/logging/__init__.py", line 1163, in emit stream.write(msg + self.terminator) ~~~~^~~~~~~~~~~~~~~~~ TypeError: unsupported operand type(s) for +: 'CalledProcessError' and 'str' Call stack: File "/usr/local/bin/borgmatic", line 8, in <module> sys.exit(main()) File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 922, in main or list(collect_configuration_run_summary_logs(configs, config_paths, arguments)) File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 804, in collect_configuration_run_summary_logs results = list(run_configuration(config_filename, config, config_paths, arguments)) File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 176, in run_configuration yield from log_error_records( File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 642, in log_error_records yield log_record(levelno=levelno, levelname=level_name, msg=error) File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 599, in log_record logger.handle(record) File "/usr/local/lib/python3.12/logging/__init__.py", line 1700, in handle self.callHandlers(record) File "/usr/local/lib/python3.12/logging/__init__.py", line 1762, in callHandlers hdlr.handle(record) File "/usr/local/lib/python3.12/logging/__init__.py", line 1028, in handle self.emit(record) File "/usr/local/lib/python3.12/site-packages/borgmatic/logger.py", line 76, in emit self.log_level_to_handler[record.levelno].emit(record) Message: CalledProcessError(81, ('borg', 'info', '--json', 'ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo')) Arguments: () /etc/borgmatic.d/config.yml: Pinging Healthchecks log /etc/borgmatic.d/config.yml: Running 2 commands for on-error hook Error while creating a backup. /etc/borgmatic.d/config.yml: Pinging Healthchecks fail /etc/borgmatic.d/config.yml: An error occurred /etc/borgmatic.d/config.yml: Running 3 commands for post-everything hook Starting containers. Backup created. summary: /etc/borgmatic.d/config.yml: An error occurred Borgbase: Error running actions for repository --- Logging error --- Traceback (most recent call last): File "/usr/local/lib/python3.12/logging/__init__.py", line 1163, in emit stream.write(msg + self.terminator) ~~~~^~~~~~~~~~~~~~~~~ TypeError: unsupported operand type(s) for +: 'CalledProcessError' and 'str' Call stack: File "/usr/local/bin/borgmatic", line 8, in <module> sys.exit(main()) File "/usr/local/lib/python3.12/site-packages/borgmatic/commands/borgmatic.py", line 935, in main logger.handle(log) File "/usr/local/lib/python3.12/logging/__init__.py", line 1700, in handle self.callHandlers(record) File "/usr/local/lib/python3.12/logging/__init__.py", line 1762, in callHandlers hdlr.handle(record) File "/usr/local/lib/python3.12/logging/__init__.py", line 1028, in handle self.emit(record) File "/usr/local/lib/python3.12/site-packages/borgmatic/logger.py", line 76, in emit self.log_level_to_handler[record.levelno].emit(record) Message: CalledProcessError(81, ('borg', 'info', '--json', 'ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo')) Arguments: () ``` ### Expected behavior _No response_ ### Other notes / implementation ideas _No response_ ### borgmatic version 1.8.14 ### borgmatic installation method Docker ### Borg version 1.4.0 ### Python version 3.12.4 ### Database version (if applicable) _No response_ ### Operating system and version 24.04.1
Owner

So it looks like borgmatic is handling this Borg error poorly, but there is an underlying Borg error. What happens if you run the borg info command directly?

Example: borg info --json ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo

The logs seem to indicate that's returning exit code 81, which might correspond to a remote connection closed.

So it looks like borgmatic is handling this Borg error poorly, but there _is_ an underlying Borg error. What happens if you run the `borg info` command directly? Example: `borg info --json ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo` The logs seem to indicate that's returning exit code 81, which might correspond to a remote connection closed.
Author

Heya,

Thanks for the fast reply!

❯ docker exec -it Borgmatic /bin/bash
fc4881062a50:/# borg info --json ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo
{
    "cache": {
        "path": "/root/.cache/borg/f08b9415e920d64477f1609f1fbe089bd20fab57528ab4672b8bb622b4c1edb0",
        "stats": {
            "total_chunks": 3191250,
            "total_csize": 189061436104,
            "total_size": 450762656921,
            "total_unique_chunks": 728453,
            "unique_csize": 43938031138,
            "unique_size": 121780855685
        }
    },
    "encryption": {
        "mode": "repokey-blake2"
    },
    "repository": {
        "id": "f08b9415e920d64477f1609f1fbe089bd20fab57528ab4672b8bb622b4c1edb0",
        "last_modified": "2024-08-31T05:47:05.000000",
        "location": "ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo"
    },
    "security_dir": "/root/.config/borg/security/f08b9415e920d64477f1609f1fbe089bd20fab57528ab4672b8bb622b4c1edb0"
}
fc4881062a50:/#
Heya, Thanks for the fast reply! ```console ❯ docker exec -it Borgmatic /bin/bash fc4881062a50:/# borg info --json ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo { "cache": { "path": "/root/.cache/borg/f08b9415e920d64477f1609f1fbe089bd20fab57528ab4672b8bb622b4c1edb0", "stats": { "total_chunks": 3191250, "total_csize": 189061436104, "total_size": 450762656921, "total_unique_chunks": 728453, "unique_csize": 43938031138, "unique_size": 121780855685 } }, "encryption": { "mode": "repokey-blake2" }, "repository": { "id": "f08b9415e920d64477f1609f1fbe089bd20fab57528ab4672b8bb622b4c1edb0", "last_modified": "2024-08-31T05:47:05.000000", "location": "ssh://jv7q0jm5@jv7q0jm5.repo.borgbase.com/./repo" }, "security_dir": "/root/.config/borg/security/f08b9415e920d64477f1609f1fbe089bd20fab57528ab4672b8bb622b4c1edb0" } fc4881062a50:/# ```
Owner

Huh, and does that occur reliably if you run it several times? Especially if you run it around the same time that borgmatic's running normally? I can certainly try to fix borgmatic's handling of the error to display it more nicely, but I'm not sure what I can do about a remote connection closed.

One idea though: I can't imagine that it would impact a borg info (as opposed to, say, a borg create), but there is this SSH-level work-around for connections being closed prematurely: https://torsion.org/borgmatic/docs/how-to/deal-with-very-large-backups/#broken-pipe-with-remote-repository ... Might be worth looking into if you have a repro.

Huh, and does that occur reliably if you run it several times? Especially if you run it around the same time that borgmatic's running normally? I can certainly try to fix borgmatic's handling of the error to display it more nicely, but I'm not sure what I can do about a remote connection closed. One idea though: I can't imagine that it would impact a `borg info` (as opposed to, say, a `borg create`), but there is this SSH-level work-around for connections being closed prematurely: https://torsion.org/borgmatic/docs/how-to/deal-with-very-large-backups/#broken-pipe-with-remote-repository ... Might be worth looking into if you have a repro.
Author

Seems to be reliable currently (will need to check at 5am when the server backs up).

In my ssh config file, I seem to already have:

Host *
  ServerAliveInterval 300
Seems to be reliable currently (will need to check at 5am when the server backs up). In my ssh config file, I seem to already have: ``` Host * ServerAliveInterval 300 ```
Owner

Gotcha. And if you run borgmatic now instead of around 5 a.m. (assuming that's not too inconvenient to do), do you get the same error?

Gotcha. And if you run borgmatic now instead of around 5 a.m. (assuming that's not too inconvenient to do), do you get the same error?
witten added the waiting for response label 2024-09-09 20:36:00 +00:00
Owner

I'm closing this due to inactivity, but I'd be happy to reopen if you're still experiencing issues here. Thanks!

I'm closing this due to inactivity, but I'd be happy to reopen if you're still experiencing issues here. Thanks!
witten removed the waiting for response label 2024-10-09 16:27:27 +00:00
Sign in to join this conversation.
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: borgmatic-collective/borgmatic#910