pg_dumpall hangs (again) #468
Reference: borgmatic-collective/borgmatic#468
What I'm trying to do and why
Back up a filesystem, including PostgreSQL dumps.
Steps to reproduce (if a bug)
Run borgmatic with the attached config.
Actual behavior (if a bug)
/root/.borgmatic/postgresql_databases/localhost/all stays empty
Expected behavior (if a bug)
A dump in /root/.borgmatic/postgresql_databases/localhost/all which gets backed up.
Other notes / implementation ideas
Environment
borgmatic version: 1.5.20
Use
sudo borgmatic --version
or
sudo pip show borgmatic | grep ^Version
borgmatic installation method: pip
Borg version: 1.1.16
Use
sudo borg --version
Python version: 3.9.2
Use
python3 --version
Database version (if applicable): 11.13
Use
psql --version
or
mysql --version
on client and server.
operating system and version: Debian Bullseye
Thanks for filing this! The
/root/.borgmatic/postgresql_databases/localhost/all
path is only used to create a named pipe for streaming the dump directly from Postgres to Borg without consuming additional disk space. So you shouldn't expect to see the contents of the database dump show up there. Are you sure that borgmatic / pg_dumpall are hanging, or is it perhaps just taking a while for the dump to stream to Borg? What size of database(s) are we talking about here? If it is hanging, have you tried nuking
/root/.borgmatic
before running borgmatic?
Note: I'm not seeing the config attached here!
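To illustrate why the path stays "empty", here is a minimal Python sketch of streaming through a named pipe. It is not borgmatic's actual code: the `stream_through_fifo` helper and the temporary path are made up for the example, the writer thread merely stands in for pg_dumpall, and the reader stands in for Borg.

```python
import os
import stat
import tempfile
import threading


def stream_through_fifo(payload: bytes) -> tuple[bytes, int, bool]:
    """Send payload through a named pipe.

    Returns (bytes received by the reader, size reported on disk, is it a FIFO).
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for /root/.borgmatic/postgresql_databases/localhost/all
        fifo_path = os.path.join(tmp, "all")
        os.mkfifo(fifo_path, 0o600)

        def producer() -> None:
            # Stand-in for pg_dumpall: this open() blocks until a reader opens the pipe.
            with open(fifo_path, "wb") as pipe:
                pipe.write(payload)

        writer = threading.Thread(target=producer)
        writer.start()

        # Stand-in for Borg: read the pipe as if it were a regular file.
        with open(fifo_path, "rb") as pipe:
            received = pipe.read()
        writer.join()

        info = os.lstat(fifo_path)
        return received, info.st_size, stat.S_ISFIFO(info.st_mode)


if __name__ == "__main__":
    data, size, is_fifo = stream_through_fifo(b"-- fake dump --\n")
    print(f"received {len(data)} bytes, size on disk {size}, is FIFO: {is_fifo}")
```

The data only ever passes through the kernel's pipe buffer, so listing the path (on Linux at least) shows a zero-size FIFO both during and after the run.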
Here's the config file uploaded again
Uploading does not work, config inline (sensitive content replaced with dummy data):
Regarding your questions, a pg_dumpall command executed on the shell takes a couple of seconds, so we're not talking about a lot of data / a long runtime of the
pg_dumpall
command. Indeed, the backup runs into the runtime limit of 12h configured in systemd when I let it continue. And I see that the backup hangs at/after the
openat
syscall.
I have nuked
/root/.borgmatic
in between runs.
Maybe there are problems in the consumer of the named pipe?
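One way an open call on a named pipe can block indefinitely is when nothing ever opens the other end. The following Python sketch checks whether a FIFO currently has a reader by attempting a non-blocking open for writing, which fails with ENXIO when there is none. The `fifo_has_reader` name is invented for this example, and the probe is only meant as a diagnostic on a test pipe: opening and closing a write end can make a waiting reader see end-of-file, so it may disturb a dump that is actually in progress.

```python
import errno
import os
import tempfile


def fifo_has_reader(path: str) -> bool:
    """Return True if some process has the FIFO at path open for reading."""
    try:
        # Without O_NONBLOCK this open would block until a reader shows up.
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError as error:
        if error.errno == errno.ENXIO:
            # No process has the pipe open for reading.
            return False
        raise
    os.close(fd)
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "all")  # hypothetical test pipe
        os.mkfifo(path)
        print("reader present before opening read end:", fifo_has_reader(path))
        # Opening the read end non-blocking returns immediately.
        reader = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        print("reader present after opening read end:", fifo_has_reader(path))
        os.close(reader)
```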
This backup config used to work, but stopped working shortly after a Debian upgrade from Buster to Bullseye. Might be correlation, might be causation.
For reference, there was an issue a while ago that had the same symptoms: #316
Couple more questions on this:
--verbosity 2
on? Feel free to redact.
borg create
command that shows up there, because I'm wondering if a source path is erroneously getting passed to Borg twice (which, as per #316, can cause the kind of hang you're seeing).
source directory? Does it contain the
/root/.borgmatic
path?
First two questions I will come back to, but the answer to the third question is it's an LVM snapshot of /. Therefore it does not contain the /root/.borgmatic path that is actively used for the db hook AFAICS.
Did something change about database hook limitation #4 (read_special)?
Because the backups used to work before the system upgrade, but now they don't, because the snapshots contain a lot of special files.
I think the easiest solution could be to separate the database backup and the snapshot backup into two sets/configs.
Thanks for your guidance.
Separating the database backup from the snapshot backup works with 1.5.20. I sadly don't know which version of borgmatic was used before the upgrade (which worked backing up snapshot & database in one config).
If it helps, here's the create command from the verbose logs:
borg create --exclude-from /tmp/tmpxsft8_na --exclude-caches --compression lz4 --one-file-system --read-special --lock-wait 60 --debug --show-rc ssh://host1/./backups/borg::{hostname}-{now:%Y-%m-%dT%H:%M:%S} /root/.borgmatic
Feel free to close this or investigate further.
The database streaming behavior was introduced in borgmatic 1.5.3 (including that
read_special
limitation) and there have been various fixes/tweaks since then. So my guess is that you were using a pre-1.5.3 version of borgmatic before your upgrade. Alternately, maybe you already had a 1.5.3+ version of borgmatic, but your system upgrade introduced some new special files that weren't present previously.
One "fix" you could make is to exclude those special files in borgmatic's configuration.
Closing this for now, but please feel free to continue the discussion here.
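For anyone wanting to find the special files to exclude, here is a rough Python sketch that walks a directory tree and lists FIFOs and character/block devices, which, as far as I understand Borg's documentation, are the file types --read-special opens and reads like regular files. The `find_special_files` helper is hypothetical and not part of borgmatic or Borg. It also ignores symlinks that point at special files and does not stop at filesystem boundaries, so treat the output as a starting point for exclude patterns rather than a complete list.

```python
import os
import stat
import sys


def classify(mode: int) -> str:
    """Return a label for special file types, or '' for everything else."""
    if stat.S_ISFIFO(mode):
        return "fifo"
    if stat.S_ISCHR(mode):
        return "char device"
    if stat.S_ISBLK(mode):
        return "block device"
    return ""


def find_special_files(root: str) -> list[tuple[str, str]]:
    """Walk root (without following symlinks) and list special files as (path, kind)."""
    special = []
    for directory, _subdirs, filenames in os.walk(root, followlinks=False):
        for name in filenames:
            path = os.path.join(directory, name)
            try:
                # lstat() inspects the entry itself and never opens it,
                # so this cannot block on a FIFO.
                mode = os.lstat(path).st_mode
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            kind = classify(mode)
            if kind:
                special.append((path, kind))
    return sorted(special)


if __name__ == "__main__":
    # Usage: python3 find_special.py /path/to/mounted/snapshot
    for path, kind in find_special_files(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(f"{kind}\t{path}")
```

Each printed path is a candidate for an exclude pattern in the config that backs up the snapshot.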