Database backups fail and cause stalling in borg #509

Closed
opened 2022-03-12 13:04:28 +00:00 by fschrempf · 8 comments

What I'm trying to do and why

Doing backups of a remote Postgres database (though other types also seem affected).

Steps to reproduce (if a bug)

Add database hook to working config.yaml:

hooks:
    postgresql_databases:
        - name: postgres
          hostname: postgres-host
          username: postgres
          password: postgres

Run backup:

/usr/bin/borgmatic --stats -v 2

Actual behavior (if a bug)

Ensuring legacy configuration is upgraded
/etc/borgmatic.d/config.yaml: No commands to run for pre-everything hook
/etc/borgmatic.d/config.yaml: No commands to run for pre-prune hook
/etc/borgmatic.d/config.yaml: No commands to run for pre-backup hook
/etc/borgmatic.d/config.yaml: No commands to run for pre-check hook
user@borghost:backups: Pruning archives
borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6 --keep-yearly 1 --prefix {hostname}- --stats --debug --show-rc user@borghost:backups
using builtin fallback logging configuration
35 self tests completed in 0.29 seconds
SSH command line: ['ssh', '-p', '23', 'user@borghost', 'borg', 'serve', '--umask=077', '--debug']
TAM-verified manifest
security: read previous location 'ssh://user@borghost/./backups'
security: read manifest timestamp '2022-03-12T01:00:32.269805'
security: determined newest manifest timestamp as 2022-03-12T01:00:32.269805
security: repository checks ok, allowing access
security: read previous location 'ssh://user@borghost/./backups'
security: read manifest timestamp '2022-03-12T01:00:32.269805'
security: determined newest manifest timestamp as 2022-03-12T01:00:32.269805
security: repository checks ok, allowing access
Synchronizing chunks cache...
Archives: 3, w/ cached Idx: 0, w/ outdated Idx: 0, w/o cached Idx: 3.
Fetching and building archive index for 8bbb522e0457-2022-03-11T15:09:36.291355 ...
Merging into master chunks index ...
Fetching and building archive index for a858ce7b118d-2022-03-11T20:00:29.318475 ...
Merging into master chunks index ...
Fetching and building archive index for a858ce7b118d-2022-03-12T02:00:20.555360 ...
Merging into master chunks index ...
Cache sync: had to fetch 0 B (0 chunks) because no archive had a csize set for them (due to --no-cache-sync)
Cache sync: processed 22.42 MB (135 chunks) of metadata
Cache sync: compact chunks.archive.d storage saved 4.88 MB bytes
Done.
RepositoryCache: current items 58, size 3.30 MB / 1.93 GB, 77 hits, 61 misses, 0 slow misses (+0.0s), 0 evictions, 0 ENOSPC hit
security: saving state for c5f4cd326511fc60df0df013c276b98dd8faa4a4ca77b9053a11a69a971e5cbf to /root/.config/borg/security/c5f4cd326511fc60df0df013c276b98dd8faa4a4ca77b9053a11a69a971e5cbf
security: current location   ssh://user@borghost/./backups
security: key type           5
security: manifest timestamp 2022-03-12T01:00:32.269805
------------------------------------------------------------------------------
                       Original size      Compressed size    Deduplicated size
Deleted data:                    0 B                  0 B                  0 B
All archives:                6.58 GB              6.33 GB              2.16 GB
                       Unique chunks         Total chunks
Chunk index:                   28678                85793
------------------------------------------------------------------------------
RemoteRepository: 3.34 kB bytes sent, 3.31 MB bytes received, 67 messages sent
terminating with success status, rc 0
user@borghost:backups: Creating archive
user@borghost:backups: Calling postgresql_databases hook function remove_database_dumps
user@borghost:backups: Removing PostgreSQL database dumps
user@borghost:backups: Calling postgresql_databases hook function dump_databases
user@borghost:backups: Dumping PostgreSQL databases
user@borghost:backups: Dumping PostgreSQL database postgres to /root/.borgmatic/postgresql_databases/postgres-host/postgres
pg_dump --no-password --clean --if-exists --host postgres-host --username postgres --format custom postgres > /root/.borgmatic/postgresql_databases/postgres-host/postgres
borg create --one-file-system --read-special --stats --debug --show-rc user@borghost:backups::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f} <redacted> /root/.borgmatic
using builtin fallback logging configuration
35 self tests completed in 0.28 seconds
SSH command line: ['ssh', '-p', '23', 'user@borghost', 'borg', 'serve', '--umask=077', '--debug']
TAM-verified manifest
security: read previous location 'ssh://user@borghost/./backups'
security: read manifest timestamp '2022-03-12T01:00:32.269805'
security: determined newest manifest timestamp as 2022-03-12T01:00:32.269805
security: repository checks ok, allowing access
Creating archive at "user@borghost:backups::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f}"
Verified integrity of /root/.cache/borg/c5f4cd326511fc60df0df013c276b98dd8faa4a4ca77b9053a11a69a971e5cbf/chunks
Reading files cache ...
security: read previous location 'ssh://user@borghost/./backups'
security: read manifest timestamp '2022-03-12T01:00:32.269805'
security: determined newest manifest timestamp as 2022-03-12T01:00:32.269805
security: repository checks ok, allowing access
Processing files ...

The log hangs at this point and there doesn't seem to be any more progress.

Looking at the created database dump, the file is empty:

/ # ls -la /root/.borgmatic/postgresql_databases/postgres-host/
total 8
drwx------    2 root     root          4096 Mar 12 13:50 .
drwxr-xr-x    3 root     root          4096 Mar 12 13:12 ..
prw-------    1 root     root             0 Mar 12 13:12 postgres

Running the dump manually works:

/ # PGPASSWORD=postgres pg_dump --no-password --clean --if-exists --host postgres-host --username postgres --format custom postgres > /root/.borgmatic/postgresql_databases/postgres-host/postgres1
/ # ls -la /root/.borgmatic/postgresql_databases/postgres-host/
total 12
drwx------    2 root     root          4096 Mar 12 13:52 .
drwxr-xr-x    3 root     root          4096 Mar 12 13:12 ..
prw-------    1 root     root             0 Mar 12 13:12 postgres
-rw-r--r--    1 root     root          1041 Mar 12 13:52 postgres1

Running the dump manually without password, we get an error as expected and also an empty dump file:

/ # pg_dump --no-password --clean --if-exists --host postgres-host --username postgres --format custom postgres > /root/.borgmatic/postgresql_databases/postgres-host/postgres2
pg_dump: error: connection to server at "postgres-host" (172.27.0.2), port 5432 failed: fe_sendauth: no password supplied
/ # ls -la /root/.borgmatic/postgresql_databases/postgres-host/
total 12
drwx------    2 root     root          4096 Mar 12 13:53 .
drwxr-xr-x    3 root     root          4096 Mar 12 13:12 ..
prw-------    1 root     root             0 Mar 12 13:12 postgres
-rw-r--r--    1 root     root          1041 Mar 12 13:52 postgres1
-rw-r--r--    1 root     root             0 Mar 12 13:53 postgres2

Expected behavior (if a bug)

  1. Borg shouldn't hang and stall the whole backup (that's not the root issue here and probably not an issue within borgmatic).

  2. Database dump should work properly or at least catch the error.

Other notes / implementation ideas

  • I had this working before on another host. Now I see this on both hosts (old and new) and downgrading docker-borgmatic doesn't seem to help.
  • Maybe it's related to some updated dependency or to changes in pg_dump?
  • Maybe related to #399?

Environment

borgmatic version: 1.5.23 (I tried 1.5.22 and 1.5.21 and got the same result)

borgmatic installation method: Docker (https://hub.docker.com/r/b3vis/borgmatic)

Borg version: 1.2.0

Python version: Python 3.8.10

Database version (if applicable): psql (PostgreSQL) 12.9 (Debian 12.9-1.pgdg110+1)

operating system and version: Docker on Ubuntu 20.04.4 LTS

Owner

Thanks for reporting this. A little context: borgmatic streams database backups to Borg via a named pipe. Specifically, that postgres file (notice the "p" at the start of the line), so it makes sense that it's empty. Consuming from a named pipe can sometimes cause hangs with Borg if the producing end of the pipe (in this case, Postgres) isn't filling it with data.
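As a rough illustration of the named-pipe behavior described above, here is a minimal shell sketch. It does not use borgmatic or Borg; `workdir` and `demo_pipe` are hypothetical names, a background `printf` stands in for `pg_dump`, and `cat` stands in for Borg.

```shell
# Create a scratch directory and a named pipe (FIFO) inside it.
workdir=$(mktemp -d)
mkfifo "$workdir/demo_pipe"

# The mode string starts with "p" for a pipe, and the listed size is typically 0,
# because the pipe itself never stores the data on disk.
ls -l "$workdir/demo_pipe"

# Producer: writes into the pipe in the background (stand-in for pg_dump).
printf 'dump data\n' > "$workdir/demo_pipe" &

# Consumer: reads from the pipe (stand-in for Borg reading the dump).
received=$(cat "$workdir/demo_pipe")
wait
echo "consumer received: $received"

rm -rf "$workdir"
```

The data only flows while both ends are open, which is why the `postgres` entry looks empty in a directory listing.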

I'm not sure what's going on here though. One thing you can try is to delete the entirety of the postgresql_databases directory before running borgmatic, in case there's an old named pipe leftover from a prior run. Generally, though, these should get deleted automatically.

If that doesn't help, you could try adding the --files flag to your borgmatic invocation, with the idea that might show which file is causing the hang. What paths are in your source directories? Do any of them include named pipes or other special files?

Author

> Thanks for reporting this. A little context: borgmatic streams database backups to Borg via a named pipe. Specifically, that postgres file (notice the "p" at the start of the line), so it makes sense that it's empty. Consuming from a named pipe can sometimes cause hangs with Borg if the producing end of the pipe (in this case, Postgres) isn't filling it with data.

Thanks for the explanation. I admit I didn't come across named pipes before and I totally missed the p and so I didn't notice that this isn't a regular file.

> I'm not sure what's going on here though. One thing you can try is to delete the entirety of the postgresql_databases directory before running borgmatic, in case there's an old named pipe leftover from a prior run. Generally, though, these should get deleted automatically.

I tried deleting the leftover pipes after an aborted/hanging run, but the same issue occurred again.

> If that doesn't help, you could try adding the --files flag to your borgmatic invocation, with the idea that might show which file is causing the hang. What paths are in your source directories? Do any of them include named pipes or other special files?

Thanks a lot for this hint! Using --files I can see that it hangs while processing a file in one of the other source directories. So this is completely unrelated to the database backups. I was misled by not noticing the usage of named pipes.

I recently added the volume of a Postfix MTA container (mailcow-dockerized) and this seems to be causing the lockup:

Processing files ...
M /backup-sources/mailcow-crypt/ecprivkey.pem
M /backup-sources/mailcow-crypt/ecpubkey.pem
M /backup-sources/mailcow-postfix/incoming/DEB4DBFB15
M /backup-sources/mailcow-postfix/pid/master.pid
/ # ls -la /backup-sources/mailcow-postfix/pid/master.pid
-rw-------    1 root     root            33 Mar 12 13:42 /backup-sources/mailcow-postfix/pid/master.pid

All other containers using this volume are stopped before the backup, so there shouldn't be any parallel access to the file. But still master.pid seems to be problematic for some reason although it's just a regular file. A pid-file in a backup doesn't really make sense anyway, but the official mailcow backup script includes this volume, too.

Owner

It's possible Borg only prints the name of a file after it's done being backed up. In which case the actual file causing the hang might be another file that's not listed. You could try sudo find /your/source/path -type b,c,p,s which should give you the paths to any special files that could be causing this hang. I don't imagine that a regular pid file, for instance, causes any problems here.
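A note on portability: the comma-separated `-type b,c,p,s` list is a GNU find extension, so it may not be accepted by every `find` implementation. The following sketch shows an equivalent search spelled out with `-o`, run against a throwaway directory; `src`, `regular.txt`, and `queue` are made-up names for the demonstration.

```shell
# Build a scratch tree with one regular file and one named pipe.
src=$(mktemp -d)
mkdir "$src/data"
echo "hello" > "$src/data/regular.txt"
mkfifo "$src/data/queue"

# List block devices, character devices, named pipes, and sockets.
# The grouped -o form is the portable equivalent of -type b,c,p,s.
special_files=$(find "$src" \( -type b -o -type c -o -type p -o -type s \) -print)
echo "$special_files"

rm -rf "$src"
```

Only the pipe is reported; the regular file is skipped.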

> So this is completely unrelated to the database backups.

It actually is related! When you configure a database with borgmatic, that turns on Borg's --read-special flag, which instructs Borg to try to read from special files (like the named pipes used to stream database backups).
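To get a feel for why reading a special file can stall, here is a small shell sketch that tries to read a named pipe nobody writes to. It is plain shell, not Borg, so it only approximates what happens under `--read-special`; `orphan_pipe` is a hypothetical name and the one-second limit is arbitrary.

```shell
workdir=$(mktemp -d)
mkfifo "$workdir/orphan_pipe"

# Nothing ever writes to this pipe, so the reader blocks while opening it.
# `timeout` stops the attempt after 1 second instead of waiting forever.
if timeout 1 cat "$workdir/orphan_pipe"; then
    read_result="finished"
else
    read_result="gave up"
fi
echo "read attempt: $read_result"

rm -rf "$workdir"
```

Without the `timeout` wrapper, the `cat` would wait indefinitely, which resembles the stall at "Processing files ...".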

witten added the question / support label 2022-03-12 19:56:40 +00:00
Author

Edit: Sorry, didn't yet read your previous reply.

Ok, this is still very strange. I can reliably solve the issue when I remove all the database hooks from the config.yaml even though the issue seems to be elsewhere.

@witten Can you come up with an explanation or further debugging hints for this behavior?

And thanks for your prompt help by the way!

Author

> It's possible Borg only prints the name of a file after it's done being backed up. In which case the actual file causing the hang might be another file that's not listed. You could try sudo find /your/source/path -type b,c,p,s which should give you the paths to any special files that could be causing this hang. I don't imagine that a regular pid file, for instance, causes any problems here.

Thanks, will try.

> > So this is completely unrelated to the database backups.

> It actually is related! When you configure a database with borgmatic, that turns on Borg's --read-special flag, which instructs Borg to try to read from special files (like the named pipes used to stream database backups).

Ok, once again thanks for explaining. Really learning a lot today! 😉

Author

> > It's possible Borg only prints the name of a file after it's done being backed up. In which case the actual file causing the hang might be another file that's not listed. You could try sudo find /your/source/path -type b,c,p,s which should give you the paths to any special files that could be causing this hang. I don't imagine that a regular pid file, for instance, causes any problems here.

> Thanks, will try.

It turns out there are a bunch of sockets and pipes contained in the mailcow volumes that cause this. I will close this issue now that it is clear how it happened. I guess I will have to look at how to exclude these files.

/backup-sources # find . -type b,c,p,s -exec ls -l {} +
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/anvil
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/bounce
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/defer
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/discard
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/dnsblog
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/error
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/lmtp
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/local
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/maildrop
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/proxymap
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/proxywrite
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/relay
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/retry
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/rewrite
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/scache
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/smtp
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/smtp_enforced_tls
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/smtp_via_transport_maps
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/smtpd
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/tlsmgr
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/tlsproxy
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/trace
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/verify
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/virtual
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/watchdog_discard
srw-rw-rw-    1 101      102              0 Mar 12 21:07 ./mailcow-postfix/private/watchdog_rewrite
srw-rw-rw-    1 101      103              0 Mar 12 21:07 ./mailcow-postfix/public/cleanup
srw-rw-rw-    1 101      103              0 Mar 12 21:07 ./mailcow-postfix/public/flush
prw--w--w-    1 101      103              0 Mar 12 21:27 ./mailcow-postfix/public/pickup
prw--w--w-    1 101      103              0 Mar 12 21:27 ./mailcow-postfix/public/qmgr
srw-rw-rw-    1 101      103              0 Mar 12 21:07 ./mailcow-postfix/public/showq
srw-rw-rw-    1 101      103              0 Mar 12 21:07 ./mailcow-postfix/public/smtp_sender_cleanup
srw-rw-rw-    1 101      103              0 Mar 12 21:07 ./mailcow-postfix/public/watchdog_cleanup
prw--w--w-    1 101      103              0 Mar 12 21:27 ./mailcow-postfix/public/watchdog_qmgr
srw-rw-rw-    1 nobody   nobody           0 Mar 12 21:08 ./mailcow-rspamd/rspamd.sock

Still, the "silent" failure is somewhat frustrating. It would be nice to have some way for borgmatic or borg to catch and/or ignore these kinds of issues instead of breaking the whole backup. But I also see this might not be easily implemented.

Owner

Yeah, there's an open Borg issue on this. I'll see if I can mention this "silent" failure more prominently in the borgmatic documentation though.

In terms of excludes, you should be able to just add to the exclude patterns in borgmatic's configuration file. Might be obnoxious to exclude those independently if there are other things in that directory that need to be backed up...

Alternatively, you could create separate borgmatic configuration files: one for database backups and one for backing up all your other source files. That would allow you to back up these Mailcow socket files without worrying about the database settings enabling --read-special.
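For the exclude-pattern route, one possible way to avoid typing every path by hand is to let `find` produce the list. The sketch below is an assumption-laden helper, not something borgmatic provides: it builds a scratch tree (`src`, `postfix/public/pickup`, and `postfix/pid/master.pid` are stand-ins for the mailcow volume) and prints each special file as a YAML list item that could be pasted under `exclude_patterns`. The indentation is a guess that must match your own config file, and paths containing glob characters may need escaping.

```shell
# Scratch tree standing in for a backup source with one pipe and one regular file.
src=$(mktemp -d)
mkdir -p "$src/postfix/public" "$src/postfix/pid"
mkfifo "$src/postfix/public/pickup"
echo "33" > "$src/postfix/pid/master.pid"

# Find special files and turn each path into an indented YAML list item.
exclude_lines=$(
    find "$src" \( -type b -o -type c -o -type p -o -type s \) -print |
        sort |
        sed 's/^/        - /'
)

# Print a fragment to review and paste into the configuration by hand.
printf '    exclude_patterns:\n%s\n' "$exclude_lines"

rm -rf "$src"
```

The output is only a snapshot: sockets or pipes created after the list was generated would not be covered, so excluding the containing directories may be more robust where nothing else in them needs backing up.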

Author

@witten Thanks for the pointers. Especially the latter sounds like a nice and easy workaround.

Reference: borgmatic-collective/borgmatic#509