read_special defaults to true if unset #315

Closed
opened 2020-05-19 12:21:02 +00:00 by bjo · 13 comments

What I'm trying to do and why

Steps to reproduce (if a bug)

Run borgmatic with the attached config, with read_special left unset.
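
Roughly, the relevant parts of the config (a sketch matching the borg invocation shown below, not the verbatim attached config):

```yaml
location:
    source_directories:
        - /
        - /data
    repositories:
        - backup@backup.domain.org:mail
    one_file_system: true
    # read_special is not set anywhere

storage:
    compression: zstd,16

hooks:
    postgresql_databases:
        - name: all
```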

Actual behavior (if a bug)

The backup is created with --read-special enabled (which caused a /dev/random device inside a Collabora Online installation to be read until the destination was full).

Edit: Even with read_special: false, borgmatic passes --read-special as a parameter.

backup@backup.domain.org:mail: Creating archive
backup@backup.domain.org:mail: Calling postgresql_databases hook function dump_databases
backup@backup.domain.org:mail: Dumping PostgreSQL databases
backup@backup.domain.org:mail: Dumping PostgreSQL database all to /root/.borgmatic/postgresql_databases/localhost/all
pg_dumpall --no-password --clean --if-exists --username postgres > /root/.borgmatic/postgresql_databases/localhost/all
borg create --exclude-from /tmp/tmpr1j0qzcd --exclude-caches --exclude-if-present .nobackup --compression zstd,16 --one-file-system --read-special --files-cache ctime,size,inode --debug --show-rc backup@backup.domain.org:mail::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f} / /data /root/.borgmatic
using builtin fallback logging configuration
35 self tests completed in 0.14 seconds
SSH command line: ['ssh', 'backup@backup.domain.org', 'borg', 'serve', '--umask=077', '--debug']
Remote: using builtin fallback logging configuration
Remote: 35 self tests completed in 0.31 seconds
Remote: using builtin fallback logging configuration
Remote: Initialized logging system for JSON-based protocol
Remote: Resolving repository path b'mail'
Remote: Resolved repository path to '/data/backup/mail'
Remote: Verified integrity of /data/backup/mail/index.1865
TAM-verified manifest
security: read previous location 'ssh://backup@backup.domain.org/./mail'
security: read manifest timestamp '2020-05-19T10:32:48.688079'
security: determined newest manifest timestamp as 2020-05-19T10:32:48.688079
security: repository checks ok, allowing access
Creating archive at "backup@backup.domain.org:mail::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f}"
Verified integrity of /root/.cache/borg/aeae4d53efd4d4de3385d59de18a63b9513ae628783d792dd052335a44651299/chunks

Expected behavior (if a bug)

Special devices should not be included when read_special is not set to true.

Environment

borgmatic version: 1.5.4

borgmatic installation method: Arch Linux package

Borg version: 1.1.11

Python version: 3.8.2

operating system and version: Arch Linux

bjo changed title from "read_special defaults to true" to "read_special defaults to true if unset" 2020-05-19 12:27:15 +00:00
Owner

Ugh! Thank you for filing this. Here's what I think is going on: The new database integration uses named pipes on the filesystem to stream dumps directly to Borg. This means that whenever a database hook is enabled, borgmatic automatically/implicitly enables Borg's --read-special, so that reading from named pipes works.

Obviously, this has some pretty bad side effects, like --read-special being active when reading /dev/random, as you discovered.

So, brainstorming potential solutions... What do you think of: 1. better documentation on this, so users are aware of it, and 2. perhaps borgmatic automatically excluding paths like /dev, at least when a database hook is in use, to hopefully sidestep this.

In the meantime, you could try adding a manual exclude for /dev to see if that sidesteps the issue.
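
E.g., something along these lines in the location section (an untested sketch):

```yaml
location:
    exclude_patterns:
        - /dev
```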

witten added the bug label 2020-05-19 17:03:30 +00:00
Author

/dev/random did not get backed up here, as I use one_file_system = true. But Collabora Online somehow puts a random and an urandom device into /opt/lool/child-roots/<somehash>/dev, which isn't a separately mounted devfs, so it gets included when backing up /.
I think solution 2 would be the better one, maybe with an extra documented option for someone who backs up / and really wants to back up their devfs. Off the top of my head, I can't think of a use case where somebody would want to do that.

Owner

It sounds like solution 2 (borgmatic auto-excluding /dev) won't address your issue though, because /opt/lool/child-roots/<somehash>/dev isn't located there. I suppose you could manually exclude /opt/lool/child-roots/, but that's annoying for you to have to do. Thoughts?

I agree about someone wanting to back up /dev; it seems pretty low-utility.

Owner

FYI I tried backing up / on my host (Manjaro Linux) with --read-special implicitly enabled, and I had to exclude all of the following directories to get the backup not to hang:

  • /dev
  • /sys
  • /proc
  • /run

That was without one_file_system, which would take care of most of that. Which leads me to another idea: maybe borgmatic should automatically enable one_file_system whenever it automatically enables --read-special? That might "solve" most of this, at the cost of limiting what you can back up when database hooks are enabled. And of course it wouldn't actually solve your /opt/.../dev issue.
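
In config terms, the idea is that whenever a database hook implicitly turns on --read-special, borgmatic would also behave as though you had set this (a sketch of the equivalent manual setting, which also covers the mounts listed above):

```yaml
location:
    # skip other mounted filesystems such as /dev, /sys, /proc, /run
    one_file_system: true
```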

Author

> It sounds like solution 2 (borgmatic auto-excluding /dev) won't address your issue though, because /opt/lool/child-roots/<somehash>/dev isn't located there. I suppose you could manually exclude /opt/lool/child-roots/, but that's annoying for you to have to do. Thoughts?

It's excluded manually now, which is fine.

Owner

Okay. In that case I'll go with the implicit one_file_system approach, along with documenting the hell out of this behavior. I'll leave this ticket open until that's done. Thanks again for the bug report.

Author

Would it be a big task to make the DB backup behaviour switchable, turning backup via pipe on/off? I know streaming database backups were a wish from #258, but if space is not such an issue, the "usual" behaviour without the read-special side effects is IMHO sufficient. Using that as the default, along with a big warning that piped database backups have the side effect that special devices aren't excluded any more (unless one_file_system is set), would be nice.

Owner

> Would it be a big task to make the DB backup behaviour switchable, turning backup via pipe on/off?

I think that's certainly a backstop if there are too many more issues with the auto-streaming behavior. But I was hoping that I could get away with just one supported database dump/restore approach, and a single code path for it. (Although I've already had to stray from that somewhat to continue supporting PostgreSQL's directory dump format, which by its nature can't stream.)
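
(For reference, that directory format is selected per database in the config; a sketch with a hypothetical database name:)

```yaml
hooks:
    postgresql_databases:
        - name: users        # hypothetical database name
          format: directory  # dumped to disk via pg_dump -Fd, so it can't stream through a pipe
```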

Owner

Okay, the "fix" of forcing one_file_system is released with borgmatic 1.5.5!

Okay, the "fix" of forcing `one_file_system` is released with borgmatic 1.5.5!
Author

Strange issue now with the storage dir of Knot DNS (https://www.knot-dns.cz/):
It contains some certs and LMDB files, and borg seems to run into a timeout. Are LMDB files considered special, or is it another issue?

bjo reopened this issue 2020-06-18 22:12:03 +00:00
Owner

I'm not familiar with LMDB, but here's something that might be relevant from http://www.lmdb.tech/doc/:

> Do not use LMDB databases on remote filesystems, even between processes on the same host. This breaks flock() on some OSes, possibly memory map sync, and certainly sync between programs on different hosts.

So the way I'm reading that is that one shouldn't access an LMDB database from multiple different processes. Although that's contradicted by the Wikipedia entry (https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database). In any case, borgmatic is just reading from the filesystem and not "connecting" to LMDB, of course.

Another idea is that LMDB has some named pipes or other devices on its filesystem that borgmatic can't read without hanging.

What does the directory listing look like where the problem is occurring?

When you say that Borg gets a timeout, what does that look like? Is it an actual remote timeout error?

Author

If I remember correctly, the output shows the last file included? So lock.mdb seems to be included:

SSH command line: ['ssh', '12345@ch-s012.rsync.net', 'borg1', 'serve', '--umask=077', '--debug']
Remote: using builtin fallback logging configuration
Remote: using builtin fallback logging configuration
Remote: Initialized logging system for JSON-based protocol
Remote: Resolving repository path b'mail'
Remote: Resolved repository path to '/data2/home/12345/mail'
Remote: Verified integrity of /data2/home/12345/mail/index.1650
TAM-verified manifest
security: read previous location 'ssh://12345@ch-s012.rsync.net/./mail'
security: read manifest timestamp '2020-06-19T07:26:46.328936'
security: determined newest manifest timestamp as 2020-06-19T07:26:46.328936
security: repository checks ok, allowing access
Creating archive at "12345@ch-s012.rsync.net:mail::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f}"
Verified integrity of /root/.cache/borg/806c460e59a4dcc340f6a3ab2e76d2102ab899f468189d77dbb17ec4bf5cfb00/chunks
Reading files cache ...
Verified integrity of /root/.cache/borg/806c460e59a4dcc340f6a3ab2e76d2102ab899f468189d77dbb17ec4bf5cfb00/files
security: read previous location 'ssh://12345@ch-s012.rsync.net/./mail'
security: read manifest timestamp '2020-06-19T07:26:46.328936'
security: determined newest manifest timestamp as 2020-06-19T07:26:46.328936
security: repository checks ok, allowing access
Processing files ...
Remote: Cleaned up 0 uncommitted segment files (== everything after segment 1650).0876dd46af6f9efc6d4a20b9                                                                                                         
Remote: Verified integrity of /data2/home/12345/mail/hints.1650
35.28 GB O 33.14 GB C 22.56 MB D 27622 N var/lib/knot/journal/lock.mdb  

But it seems borg continued after some time, so it's not a real lockup.

Owner

So is there still a problem here? How big is that lock file?

It might be the case that the preferred way of backing up an LMDB database is by exporting/dumping it to file, just like with PostgreSQL and MySQL databases. Here's a command I found: http://www.lmdb.tech/doc/man1/mdb_dump_1.html

It's possible that you could exclude the entire LMDB database directory, and then back it up via that dump command instead.
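
Sketched as borgmatic config, with a hypothetical dump destination path:

```yaml
location:
    exclude_patterns:
        - /var/lib/knot/journal              # the live LMDB environment
hooks:
    before_backup:
        # hypothetical dump path; mdb_dump writes a plain-text dump of the environment
        - mdb_dump -f /var/backups/knot-journal.dump /var/lib/knot/journal
```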

witten added the waiting for response label 2020-06-23 18:05:57 +00:00
bjo closed this issue 2021-02-26 23:23:56 +00:00