PostgreSQL backup running infinitely #940

Open
opened 2024-11-18 10:41:58 +00:00 by User-26 · 2 comments

What I'm trying to do and why

Hello, I'm trying to back up PostgreSQL. It usually takes about 20 minutes, but sometimes it hangs indefinitely. It's been about 10 hours since the last backup started, and it is still running. There is no disk or network activity, and it seems like nothing is happening. Here is the log:

/home/borgmatic/borgmatic_postgres.example.com.yaml: No commands to run for pre-everything hook
borg --version --debug --show-rc
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Running actions for repository
/home/borgmatic/borgmatic_postgres.example.com.yaml: No commands to run for pre-actions hook
/home/borgmatic/borgmatic_postgres.example.com.yaml: Running command for pre-backup hook
echo "`date` - Starting backup"
Mon Nov 18 03:30:00 UTC 2024 - Starting backup
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Creating archive
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling mariadb_databases hook function remove_data_source_dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing MariaDB data source dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling mysql_databases hook function remove_data_source_dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing MySQL data source dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling mongodb_databases hook function remove_data_source_dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing MongoDB data source dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling postgresql_databases hook function remove_data_source_dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing PostgreSQL data source dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling sqlite_databases hook function remove_data_source_dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing SQLite data source dumps
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling postgresql_databases hook function dump_data_sources
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL databases
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Querying for "all" PostgreSQL databases to dump
psql --list --no-password --no-psqlrc --csv --tuples-only --host postgres.example.com --port 5432 --username borgmatic
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "DB_1" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_1
pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain DB_1 > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_1
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "DB_2" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_2
pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain DB_2 > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_2
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "DB_3" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_3
pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain DB_3 > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_3
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "DB_4" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_4
pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain DB_4 > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_4
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "_it_n8n_monitoring_db" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/_it_n8n_monitoring_db
pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain _it_n8n_monitoring_db > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/_it_n8n_monitoring_db
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "postgres" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/postgres
pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain postgres > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/postgres
ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Collecting special file paths
borg create --compression auto,zstd --one-file-system --read-special --list --filter AMEx- ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo::{hostname}-{now} /home/borgmatic/.borgmatic /home/borgmatic/borgmatic_postgres.example.com.yaml --dry-run --list
borg create --compression auto,zstd --one-file-system --read-special --list --filter AMEx- --stats --debug --show-rc ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo::{hostname}-{now} /home/borgmatic/.borgmatic /home/borgmatic/borgmatic_postgres.example.com.yaml
using builtin fallback logging configuration
33 self tests completed in 0.08 seconds
SSH command line: ['ssh', 'h7q4fb7f@h2q4fb3f.repo.borgbase.com', 'borg', 'serve', '--debug']
Remote: using builtin fallback logging configuration
Remote: borg selftest disabled via BORG_SELFTEST env variable
Remote: using builtin fallback logging configuration
Remote: Initialized logging system for JSON-based protocol
Remote: Resolving repository path b'/./repo'
Remote: Resolved repository path to '/srv/repos/h4q4fb7f/repo'
Remote: Verified integrity of /srv/repos/h4q4fb7f/repo/index.1498
TAM-verified manifest
security: read previous location 'ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo'
security: read manifest timestamp '2024-11-17T07:34:20.435213'
security: determined newest manifest timestamp as 2024-11-17T07:34:20.435213
security: repository checks ok, allowing access
Creating archive at "ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo::server-1-2024-11-18T03:30:01"
Verified integrity of /home/borgmatic/.cache/borg/0fd3718f74e7d15efc652cfbcfe619c7d34f48798929eec8c09d9908f72541ab/chunks
Reading files cache ...
Verified integrity of /home/borgmatic/.cache/borg/0fd3718f74e7d15efc652cfbcfe619c7d34f48798929eec8c09d9908f72541ab/files
security: read previous location 'ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo'
security: read manifest timestamp '2024-11-17T07:34:20.435213'
security: determined newest manifest timestamp as 2024-11-17T07:34:20.435213
security: repository checks ok, allowing access
Processing files ...
A /home/borgmatic/.borgmatic/bootstrap/manifest.json
A /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_1
Remote: Cleaned up 0 uncommitted segment files (== everything after segment 1498).
Remote: Verified integrity of /srv/repos/h4q4fb7f/repo/hints.1498
Remote: Storage quota: 84.45 GB out of 3.20 TB used.

Steps to reproduce

borgmatic create --verbosity 2 --list --stats --conf borgmatic_postgres.example.com.yaml

location:
    repositories:
        - ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo

storage:
    compression: auto,zstd
    encryption_passphrase: <PASSWORD> 
    archive_name_format: '{hostname}-{now}'

retention:
    keep_within: 100y

hooks:
    before_backup:
        - echo "`date` - Starting backup"
    after_backup:
        - echo "`date` - Finished backup"
    postgresql_databases:
        - name: all
          hostname: postgres.example.com
          port: 5432
          username: borgmatic
          password: <PASSWORD>
          format: plain
          ssl_mode: require

Actual behavior

After 10 hours backup is still not finished

Expected behavior

Backup finished in 20 minutes

Other notes / implementation ideas

No response

borgmatic version

1.8.3

borgmatic installation method

pipx install borgmatic

Borg version

1.2.8

Python version

Python 3.12.3

Database version (if applicable)

PostgreSQL 14.6

Operating system and version

Ubuntu 22.04.5 LTS

### What I'm trying to do and why Hello, I'm trying to back up PostgreSQL. It usually takes about 20 minutes, but sometimes it hangs indefinitely. It's been about 10 hours since the last backup started, and it is still running. There is no disk or network activity, and it seems like nothing is happening. Here is the log: ``` /home/borgmatic/borgmatic_postgres.example.com.yaml: No commands to run for pre-everything hook borg --version --debug --show-rc ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Running actions for repository /home/borgmatic/borgmatic_postgres.example.com.yaml: No commands to run for pre-actions hook /home/borgmatic/borgmatic_postgres.example.com.yaml: Running command for pre-backup hook echo "`date` - Starting backup" Mon Nov 18 03:30:00 UTC 2024 - Starting backup ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Creating archive ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling mariadb_databases hook function remove_data_source_dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing MariaDB data source dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling mysql_databases hook function remove_data_source_dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing MySQL data source dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling mongodb_databases hook function remove_data_source_dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing MongoDB data source dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling postgresql_databases hook function remove_data_source_dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing PostgreSQL data source dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling sqlite_databases hook function remove_data_source_dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Removing SQLite data source dumps ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Calling postgresql_databases hook function dump_data_sources ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL databases ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Querying for "all" PostgreSQL databases to dump psql --list --no-password --no-psqlrc --csv --tuples-only --host postgres.example.com --port 5432 --username borgmatic ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "DB_1" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_1 pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain DB_1 > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_1 ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "DB_2" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_2 pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain DB_2 > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_2 ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "DB_3" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_3 pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain DB_3 > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_3 ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "DB_4" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_4 pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain DB_4 > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_4 ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "_it_n8n_monitoring_db" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/_it_n8n_monitoring_db pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain _it_n8n_monitoring_db > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/_it_n8n_monitoring_db ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Dumping PostgreSQL database "postgres" to /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/postgres pg_dump --no-password --clean --if-exists --host postgres.example.com --port 5432 --username borgmatic --format plain postgres > /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/postgres ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo: Collecting special file paths borg create --compression auto,zstd --one-file-system --read-special --list --filter AMEx- ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo::{hostname}-{now} /home/borgmatic/.borgmatic /home/borgmatic/borgmatic_postgres.example.com.yaml --dry-run --list borg create --compression auto,zstd --one-file-system --read-special --list --filter AMEx- --stats --debug --show-rc ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo::{hostname}-{now} /home/borgmatic/.borgmatic /home/borgmatic/borgmatic_postgres.example.com.yaml using builtin fallback logging configuration 33 self tests completed in 0.08 seconds SSH command line: ['ssh', 'h7q4fb7f@h2q4fb3f.repo.borgbase.com', 'borg', 'serve', '--debug'] Remote: using builtin fallback logging configuration Remote: borg selftest disabled via BORG_SELFTEST env variable Remote: using builtin fallback logging configuration Remote: Initialized logging system for JSON-based protocol Remote: Resolving repository path b'/./repo' Remote: Resolved repository path to '/srv/repos/h4q4fb7f/repo' Remote: Verified integrity of /srv/repos/h4q4fb7f/repo/index.1498 TAM-verified manifest security: read previous location 'ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo' security: read manifest timestamp '2024-11-17T07:34:20.435213' security: determined newest manifest timestamp as 2024-11-17T07:34:20.435213 security: repository checks ok, allowing access Creating archive at "ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo::server-1-2024-11-18T03:30:01" Verified integrity of /home/borgmatic/.cache/borg/0fd3718f74e7d15efc652cfbcfe619c7d34f48798929eec8c09d9908f72541ab/chunks Reading files cache ... Verified integrity of /home/borgmatic/.cache/borg/0fd3718f74e7d15efc652cfbcfe619c7d34f48798929eec8c09d9908f72541ab/files security: read previous location 'ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo' security: read manifest timestamp '2024-11-17T07:34:20.435213' security: determined newest manifest timestamp as 2024-11-17T07:34:20.435213 security: repository checks ok, allowing access Processing files ... A /home/borgmatic/.borgmatic/bootstrap/manifest.json A /home/borgmatic/.borgmatic/postgresql_databases/postgres.example.com/DB_1 Remote: Cleaned up 0 uncommitted segment files (== everything after segment 1498). Remote: Verified integrity of /srv/repos/h4q4fb7f/repo/hints.1498 Remote: Storage quota: 84.45 GB out of 3.20 TB used. ``` ### Steps to reproduce borgmatic create --verbosity 2 --list --stats --conf borgmatic_postgres.example.com.yaml ``` location: repositories: - ssh://h7q4fb7f@h2q4fb3f.repo.borgbase.com/./repo storage: compression: auto,zstd encryption_passphrase: <PASSWORD> archive_name_format: '{hostname}-{now}' retention: keep_within: 100y hooks: before_backup: - echo "`date` - Starting backup" after_backup: - echo "`date` - Finished backup" postgresql_databases: - name: all hostname: postgres.example.com port: 5432 username: borgmatic password: <PASSWORD> format: plain ssl_mode: require ``` ### Actual behavior After 10 hours backup is still not finished ### Expected behavior Backup finished in 20 minutes ### Other notes / implementation ideas _No response_ ### borgmatic version 1.8.3 ### borgmatic installation method pipx install borgmatic ### Borg version 1.2.8 ### Python version Python 3.12.3 ### Database version (if applicable) PostgreSQL 14.6 ### Operating system and version Ubuntu 22.04.5 LTS
Owner

Thanks for the detailed ticket. Normally I'd presume that Borg is hanging on a special file unrelated to the database dump, but it looks like you don't even have any source_directories in your configuration. So if there is a hang related to a special file, it's got to be on one of the database dumps itself. So two ideas:

  • After stopping borgmatic, you can try nuking the temporary runtime directory where dumps are stored in case there are any stale named pipes in there. E.g.: rm -fr ~/.borgmatic/postgresql_databases. There shouldn't be any stale named pipes, because they get cleaned up automatically. But there could be a bug preventing that from happening.
  • Or you can try narrowing down the problem to individual databases. For instance, instead of dumping all, try dumping one or a handful of databases at a time to see if the hang starts happening on a particular database.
  • Or you could try upgrading borgmatic (and/or Borg). I don't see anything in the borgmatic changelog since 1.8.3 that might impact or fix the issue you're having, but you are on a pretty old version at this point and there could still potentially be fixes since then.
Thanks for the detailed ticket. Normally I'd presume that Borg is [hanging on a special file](https://torsion.org/borgmatic/docs/how-to/backup-your-databases/#borgmatic-hangs-during-backup) unrelated to the database dump, but it looks like you don't even have any `source_directories` in your configuration. So if there is a hang related to a special file, it's got to be on one of the database dumps itself. So two ideas: * After stopping borgmatic, you can try nuking the temporary runtime directory where dumps are stored in case there are any stale named pipes in there. E.g.: `rm -fr ~/.borgmatic/postgresql_databases`. There *shouldn't* be any stale named pipes, because they get cleaned up automatically. But there could be a bug preventing that from happening. * Or you can try narrowing down the problem to individual databases. For instance, instead of dumping `all`, try dumping one or a handful of databases at a time to see if the hang starts happening on a particular database. * Or you could try upgrading borgmatic (and/or Borg). I don't see anything in the borgmatic changelog since 1.8.3 that might impact or fix the issue you're having, but you are on a pretty old version at this point and there could still potentially be fixes since then.
witten added the
question / support
label 2024-11-18 18:04:31 +00:00
witten added the
waiting for response
label 2024-11-30 19:23:52 +00:00
Owner

Another idea: Is it possible that the borgmatic runtime directory happens to be listed in your exclude patterns? This was the issue discovered in #947, and the fix for it will be in the next release. But in the meantime, if this is the issue you're having, you can work around it by removing the borgmatic runtime directory from your excludes.

To find out what borgmatic runtime directory is being used, run borgmatic with --verbosity 2 and look for something like the following:

test: Using runtime directory /this/is/the/path

And then make sure that path (or a parent directory of it) isn't listed in, for instance, exclude_patterns in borgmatic's configuration.

Another idea: Is it possible that the borgmatic runtime directory happens to be listed in your exclude patterns? This was the issue discovered in #947, and the fix for it will be in the next release. But in the meantime, if this is the issue you're having, you can work around it by removing the borgmatic runtime directory from your excludes. To find out what borgmatic runtime directory is being used, run borgmatic with `--verbosity 2` and look for something like the following: ``` test: Using runtime directory /this/is/the/path ``` And then make sure that path (or a parent directory of it) isn't listed in, for instance, `exclude_patterns` in borgmatic's configuration.
Sign in to join this conversation.
No Milestone
2 Participants
Notifications
Due Date
No due date set.
Dependencies

No dependencies set.

Reference: borgmatic-collective/borgmatic#940
No description provided.