Make source_directories optional if roots are specified in patterns file #542

Closed
opened 2022-06-05 18:02:01 +00:00 by rcdailey · 8 comments

Suppose I have the following YAML:

location:
  repositories:
    - fubar.repo.borgbase.com:repo
  patterns_from:
    - /volume2/BorgBackup/patterns.lst

And my patterns.lst file has:

# Root directories
R /volume2/homes/user1
R /volume2/nextcloud/user1/files
R /volume2/nextcloud/user2/files

# Exclusion filters
- /volume2/homes/user1/Google Drive Backup

My understanding is that the 3 lines beginning with R are effectively source directories. I'm taking this from the borg documentation on patterns.
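
To make that reading concrete, here is a minimal Python sketch that separates the `R` lines from the other lines of the file shown above. The function name `split_patterns` is hypothetical, and the parsing is deliberately simplified; it is not Borg's actual parser, which handles more syntax than this.

```python
def split_patterns(text):
    """Split a Borg-style patterns file into root paths and other pattern lines.

    Simplified illustration only: treats "R <path>" as a root and keeps
    every other non-comment, non-blank line as-is.
    """
    roots, others = [], []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        prefix, _, rest = line.partition(" ")
        if prefix == "R":
            roots.append(rest.strip())
        else:
            others.append(line)
    return roots, others


PATTERNS = """\
# Root directories
R /volume2/homes/user1
R /volume2/nextcloud/user1/files
R /volume2/nextcloud/user2/files

# Exclusion filters
- /volume2/homes/user1/Google Drive Backup
"""

roots, others = split_patterns(PATTERNS)
print(roots)   # the three "R" paths, which I expect to act as source directories
print(others)  # the remaining exclusion line
```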

Owner

I'd be happy to make source_directories optional, not just for your use case but for a variety of others (like having a machine that only runs check, not create). Two notes on that, though:

  • You can make source_directories "optional" today by setting it to an empty list: [ ]
  • The behavior you're describing with --patterns-from might take some work with borgmatic, because borgmatic always injects ~/.borgmatic into source_directories behind the scenes. The rationale is that if database hooks are used, ~/.borgmatic is where the database dump streaming occurs, and that needs to get backed up. Maybe borgmatic could be a little smarter and only inject ~/.borgmatic when database hooks are enabled? But then your desired --patterns-from behavior would break whenever databases are used...

I'm interested in hearing your thoughts on this!

Author

Could you take an approach where if source_directories is missing, you initialize it to an empty array, just as if I had done it manually (as explained in your first bullet point)?

That way, the logic you have today would still be able to "inject" that path into the empty array.

Not sure if it's that simple, but hopefully so. Thanks for the quick response and for being open-minded!
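
Roughly what I have in mind, as a small Python sketch. The function name `effective_source_directories` and the shape of the `location` dict are assumptions for illustration; I have not checked how borgmatic actually structures this internally.

```python
def effective_source_directories(location, borgmatic_dir="~/.borgmatic"):
    """Return the directories to back up for a parsed "location" section.

    Hypothetical helper: a missing source_directories key is treated exactly
    like an explicit empty list, so the existing injection step still works.
    """
    configured = location.get("source_directories") or []
    return list(configured) + [borgmatic_dir]


# Key missing entirely (the config from my original post).
print(effective_source_directories({"repositories": ["fubar.repo.borgbase.com:repo"]}))

# Key present but empty (the workaround available today).
print(effective_source_directories({"source_directories": []}))
```

Both calls should produce the same list, which is the point: omitting the key and setting it to `[ ]` would behave identically.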

Owner

That could totally work for source_directories, but the problem is that when there's anything passed to Borg for source paths, then --patterns-from root directories aren't treated as source directories. At least, that's the behavior I'm seeing locally with Borg 1.2.0. Meaning that if borgmatic implicitly injects the ~/.borgmatic path into source directories, your --patterns-from root paths will be ignored. I think that's because, as per the Borg docs (https://borgbackup.readthedocs.io/en/stable/usage/help.html#borg-patterns):

You can specify recursion roots either on the command line or in a patternfile:

The key word there being either.

BTW, is there a reason you want to be able to specify the "source directories" in your patterns file rather than in borgmatic's config file?

Author

I took a peek at the borg source code. It looks like it just parses the R paths out of the patterns file and appends them to the list holding the values passed to the PATH positional parameter. So they are interchangeable, and putting R in a patterns file is exactly the same as passing the path on the CLI.

Source:

  1. args.paths variable defined here (for the positional argument): https://github.com/borgbackup/borg/blob/0e8c9941bb560c02aa29561d8112e270f3d282bf/src/borg/archiver.py#L3843
  2. See args.paths being passed to the patterns parsing method invoked when --patterns-from is specified: https://github.com/borgbackup/borg/blob/0e8c9941bb560c02aa29561d8112e270f3d282bf/src/borg/patterns.py#L64 (same logic is shared by --pattern: https://github.com/borgbackup/borg/blob/0e8c9941bb560c02aa29561d8112e270f3d282bf/src/borg/patterns.py#L45).
  3. Pattern roots are appended here: https://github.com/borgbackup/borg/blob/0e8c9941bb560c02aa29561d8112e270f3d282bf/src/borg/patterns.py#L19 and parsed from here: https://github.com/borgbackup/borg/blob/0e8c9941bb560c02aa29561d8112e270f3d282bf/src/borg/patterns.py#L371 and here: https://github.com/borgbackup/borg/blob/0e8c9941bb560c02aa29561d8112e270f3d282bf/src/borg/patterns.py#L391

I think we might be splitting hairs here, but I think the usage of "or" in the docs you quoted is an inclusive or, not an exclusive or. In other words, I think it also is saying you can specify both. Put another way, it didn't say we weren't allowed to.

Like all things... a simple test should be able to prove or disprove this behavior.
Author

To answer your question at the end: I'm aiming for separation of concerns. I'd like the YAML to define configuration-specific settings (not paths) and have all paths centralized in one patterns file. That way I'm not jumping back and forth between two files when I want to add a root path plus specific include/exclude patterns.

Not a huge deal but this is just my personal preference for setting it up.

Owner

That totally makes sense as a use case. Thanks for explaining it.

The reason I'm fixated on the "either" is that's the behavior I'm seeing when running a local test. I'm not sure how that squares with the Borg source code appends you're seeing though.

Here's my local test with Borg 1.2.0...

With patterns only:

patterns.lst:

# Root directories
R /root/tmp

# Exclusion filters
- /root/tmp/tmp

Test run:

# borg create --patterns-from patterns.lst test.borg::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f}
# borgmatic -c test.yaml --archive latest list
test.borg: Listing archives
flux-2022-06-05T15:59:01.076374
drwxr-xr-x root   root          0 Sun, 2022-06-05 15:58:54 root/tmp
[... snipped for brevity ...]
-rw-r--r-- root   root         68 Sun, 2022-06-05 13:25:58 root/tmp/patterns.lst

(I'm running create here without borgmatic so as to verify Borg's own behavior.)

With source directories and patterns:

patterns.lst: Same as above.

Test run:

# borg create --patterns-from patterns.lst test.borg::{hostname}-{now:%Y-%m-%dT%H:%M:%S.%f} /root/.borgmatic
# borgmatic -c test.yaml --archive latest list
test.borg: Listing archives
flux-2022-06-05T16:07:25.508214
drwxr-xr-x root   root          0 Tue, 2022-05-31 10:13:46 root/.borgmatic
[... snipped for brevity ...]
-rw-r--r-- root   root          0 Sun, 2022-06-05 00:38:59 root/.borgmatic/checks/50341d1930dfc9a918e7ff70b3fbfa3a7b7fb824dad7c5d38267e1562a1239ea/archives

Notice how in the second test, only ~/.borgmatic gets backed up, and patterns are ignored.

Now I'd love to find out that I'm doing something wrong here, and that source directories and patterns aren't actually mutually exclusive like this. But that does appear to be how Borg is acting.
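
To spell out the two competing readings, here is a small Python sketch that models each one and applies it to the two test runs above. Both function names are hypothetical and neither function is Borg's code; they only encode the "append" reading of the source and the "either" behavior I'm observing.

```python
def roots_if_appended(cli_paths, pattern_roots):
    """"Append" reading: pattern roots are added to the command-line paths."""
    return list(cli_paths) + list(pattern_roots)


def roots_if_either(cli_paths, pattern_roots):
    """"Either" reading: command-line paths, when present, replace pattern roots."""
    return list(cli_paths) if cli_paths else list(pattern_roots)


pattern_roots = ["/root/tmp"]  # the single R line in patterns.lst

# Test 1: patterns only, no positional path.
print(roots_if_appended([], pattern_roots))
print(roots_if_either([], pattern_roots))

# Test 2: patterns plus /root/.borgmatic on the command line.
print(roots_if_appended(["/root/.borgmatic"], pattern_roots))
print(roots_if_either(["/root/.borgmatic"], pattern_roots))
```

The two models agree on the first test and differ on the second. The archive listing from my second run contains only root/.borgmatic, which is what the "either" model predicts.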

Owner

Related: #574.

Also, the Borg dev seems to think the Borg patterns behavior described above could be a bug, so I've filed it here: https://github.com/borgbackup/borg/issues/6994

Owner

source_directories is now optional, released in borgmatic 1.7.1! And as of #574, ~/.borgmatic gets automatically converted into a pattern behind the scenes if patterns are specified.
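
As a rough illustration of the idea (not the actual borgmatic code), the Python sketch below turns directories into `R` pattern lines whenever patterns are configured, so that nothing is passed positionally alongside them. The function name `plan_create_arguments` is hypothetical, and converting every source directory rather than only `~/.borgmatic` is an assumption made to keep the example short.

```python
def plan_create_arguments(source_directories, patterns):
    """Return (positional_paths, pattern_lines) for a hypothetical create call.

    Assumption for illustration: when any patterns are configured, every
    directory is expressed as an "R <path>" root pattern instead of being
    passed as a positional path.
    """
    if patterns:
        root_lines = [f"R {path}" for path in source_directories]
        return [], root_lines + list(patterns)
    return list(source_directories), []


# With patterns configured, ~/.borgmatic becomes a root pattern.
print(plan_create_arguments(["~/.borgmatic"], ["R /root/tmp", "- /root/tmp/tmp"]))

# Without patterns, directories stay positional.
print(plan_create_arguments(["~/.borgmatic"], []))
```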

Thanks for filing this!

Reference: borgmatic-collective/borgmatic#542