Unable to extract when using one repository across multiple configuration files #722

Open
opened 2023-07-02 23:55:59 +00:00 by davidtorosyan · 7 comments

What I'm trying to do and why

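I want to back up multiple apps to one repo, following the instructions in: How to make per-application backups (https://torsion.org/borgmatic/docs/how-to/make-per-application-backups/#multiple-backup-configurations).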

Backup works, but extraction doesn't (without having to specify the config with -c).

Steps to reproduce (if a bug)

Here are my config files:

common: /etc/borgmatic/common.yaml

location:
    repositories:
        - path: <my server>
          label: remote

storage:
    encryption_passcommand: <my command>
    ssh_command: ssh -i <my file>

app1: /etc/borgmatic.d/app1.yaml

<<: !include /etc/borgmatic/common.yaml
location:
    source_directories:
        - <some path>/dir1

storage:
    archive_name_format: 'dir1-{now}'

app2: /etc/borgmatic.d/app2.yaml

<<: !include /etc/borgmatic/common.yaml
location:
    source_directories:
        - <some path>/dir2

storage:
    archive_name_format: 'dir2-{now}'

Create the backup using:

sudo borgmatic init --encryption repokey
sudo borgmatic create --verbosity 1 --list --stats

Verify the files are backed up:

sudo borgmatic list

Try to extract one of the archives:

sudo borgmatic extract --archive <archive name> --verbosity 2

Actual behavior (if a bug)

Can't determine which repository to use. Use --repository to disambiguate

summary:
/etc/borgmatic.d/app1.yaml: Loading configuration file
/etc/borgmatic.d/app2.yaml: Loading configuration file
Can't determine which repository to use. Use --repository to disambiguate

Need some help? https://torsion.org/borgmatic/#issues

Obviously, using --repository doesn't help since it's the same repo in both files. If I do include it, I get:

Repository remote found in multiple configuration files

Expected behavior (if a bug)

I'm expecting it to successfully extract the archive, since (a) it's the same repo in both cases, (b) even if they weren't, my archive is only in one of them.

Other notes / implementation ideas

I can work around this by supplying -c /etc/borgmatic.d/app1.yaml, but that's a little annoying since the other commands work so well with multiple configuration files.

Not the end of the world since I'd presumably not be using extract too often... but still.

Environment

borgmatic version: 1.7.15

borgmatic installation method: Pacman

Borg version: borg 1.2.4

Python version: Python 3.11.3

Database version (if applicable): N/A

operating system and version: Arch Linux

Owner

Thanks so much for taking the time to file this one! I've confirmed the behavior you're seeing locally, and I suspect a little de-duplication in the code throwing that error would solve things. I'll have a look.

witten added the bug label 2023-07-03 05:47:08 +00:00
Owner

Okay, looking at this in a little more detail, here's the rationale for borgmatic erroring with "Repository remote found in multiple configuration files": When a repo exists in multiple configuration files, borgmatic doesn't know which configuration file to "use"; it relies on a particular configuration file to determine how to invoke Borg (with ssh_command, local_path, remote_path, lock_wait, etc.). In your particular case, the two config files look like they're identical from that perspective. But if they weren't, borgmatic would either have to arbitrarily pick one and use its connection parameters or do what it does now: throw up its hands and error.

So a couple of ideas I can think of:

  • Just make a better error message to instruct the user to use --config as you did to manually select from among configuration files.
  • Or, somehow allow config files with the same repo as long as the resulting Borg commands are / would be identical. This would probably be tough to do though given that everything in the code is done so serially now. It's not like there's a phase where the potential Borg commands for all config files are collected for validation/comparison....
  • Or, somehow use archive_name_format or match_archives from each config file to match against the given --archive and implicitly select the right configuration file that way? Maybe error then if there's no archive_name_format or similar to use for disambiguation?

Thoughts? Other ideas?

witten removed the bug label 2023-07-03 17:00:26 +00:00
Author

Thanks @witten for the quick and detailed response!

Some thoughts on your ideas:

> Just make a better error message

This would be nice. I'll admit it took me a while to figure out the -c workaround.

> Or, somehow allow config files with the same repo as long as the resulting Borg commands are / would be identical.

I could think of two ways to do this:

  1. Identify repos not just by their name, but also other info like their path. That way you can figure out that the two configs are talking about the same repo.
  2. Detect this special case where all the repo info is in a common file.

My sense is that (1) would be more generally useful but also harder to get right, while (2) is more limited but would solve my particular (maybe common?) use case.

> It's not like there's a phase where the potential Borg commands for all config files are collected for validation/comparison....

Really? I thought there must be, since this error message exists: "Repository remote found in multiple configuration files"

> Or, somehow use archive_name_format or match_archives from each config file

This sounds potentially error prone / hard. However, this other thing you said gives me another idea:

> In your particular case, the two config files look like they're identical from that perspective. But if they weren't, borgmatic would either have to arbitrarily pick one and use its connection parameters or do what it does now: throw up its hands and error.

There's a third option: execute it on each config. That's what you do for other commands like delete.

I actually was considering posting a different issue about that. If I do borgmatic borg delete <some archive name> with my setup, I see an error message that says something like "archive <some archive name>" doesn't exist.

This happens because borgmatic loads the first config, deletes the archive, then loads the second config and can't find the archive.

This is annoying, but also kind of sensible. You could take the same approach with extract and just run it for each config. It shouldn't cause an issue as long as you also filter archives based on archive_name_format... which I'm realizing you probably don't for extract. So it might get double extracted. Mmm...

Owner

> I could think of two ways to do this:
>
>   1. Identify repos not just by their name, but also other info like their path. That way you can figure out that the two configs are talking about the same repo.

IIRC, this is already how repos are compared.

>   2. Detect this special case where all the repo info is in a common file.

Interesting idea. Yeah, I'm not sure how feasible that would be given the current includes machinery.

> > It's not like there's a phase where the potential Borg commands for all config files are collected for validation/comparison....
>
> Really? I thought there must be since this error message exists: "Repository remote found in multiple configuration files"

That's accomplished simply by looking for the given repository in each configuration file. That code doesn't have access to the (eventually constructed) Borg commands at that point. For instance, if one config file results in borg extract --lock-wait 5 user@remote:repo and the other results in borg extract --remote-path borg1 user@remote:repo, there's currently no global insight into those two commands.
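To illustrate the kind of comparison being discussed, here is a minimal Python sketch that checks whether several configuration files would invoke Borg with the same connection-related options. It is not borgmatic's code: the helper names (`CONNECTION_OPTIONS`, `connection_signature`, `configs_agree`) are hypothetical, the option list is just the one mentioned earlier in this thread, and it assumes each configuration has already been parsed into a flat dict.

```python
# Options named in this thread as affecting how Borg is invoked.
# Assumption: this list is illustrative, not borgmatic's complete set.
CONNECTION_OPTIONS = ('ssh_command', 'local_path', 'remote_path', 'lock_wait')


def connection_signature(config):
    """Return a tuple of the connection-related values in one parsed config.

    Assumes `config` is a flat dict of option name to value; options that
    are absent are represented as None.
    """
    return tuple(config.get(option) for option in CONNECTION_OPTIONS)


def configs_agree(configs):
    """Return True if every config would use identical connection options."""
    signatures = {connection_signature(config) for config in configs}
    return len(signatures) <= 1


if __name__ == '__main__':
    same = [{'ssh_command': 'ssh -i key'}, {'ssh_command': 'ssh -i key'}]
    different = [{'lock_wait': 5}, {'remote_path': 'borg1'}]
    print(configs_agree(same))
    print(configs_agree(different))
```

A check like this would compare configuration values rather than the constructed Borg commands, so it only approximates the "identical commands" idea.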

> > Or, somehow use archive_name_format or match_archives from each config file
>
> This sounds potentially error prone / hard.

Maybe. It might be as simple though as fnmatch.filter(user_provided_archive_name, make_match_archive_flags(...)) or something similar.
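As a rough sketch of that matching idea (not borgmatic's actual implementation), the following Python turns each config file's archive_name_format into a glob and picks the single config whose pattern matches the requested archive. The helpers `format_to_glob` and `select_config` are hypothetical names, and the sketch assumes every `{placeholder}` can be treated as a `*` wildcard. Note that `fnmatch` also gives special meaning to `?` and `[`, so formats containing those characters would need escaping.

```python
import fnmatch
import re


def format_to_glob(archive_name_format):
    """Replace each {placeholder} (such as {now}) with a '*' wildcard."""
    return re.sub(r'\{[^}]*\}', '*', archive_name_format)


def select_config(archive_name, formats_by_config):
    """Return the one config path whose archive name format matches.

    `formats_by_config` maps a config file path to its archive_name_format.
    Raises ValueError when zero or several config files match, since the
    choice would then still be ambiguous.
    """
    matches = [
        path
        for path, name_format in formats_by_config.items()
        if fnmatch.fnmatchcase(archive_name, format_to_glob(name_format))
    ]
    if len(matches) != 1:
        raise ValueError(
            f'{len(matches)} configuration files match archive {archive_name}'
        )
    return matches[0]


if __name__ == '__main__':
    formats = {
        '/etc/borgmatic.d/app1.yaml': 'dir1-{now}',
        '/etc/borgmatic.d/app2.yaml': 'dir2-{now}',
    }
    print(select_config('dir1-2023-07-02T23:55:59', formats))
```

With the two config files from this issue, an archive named like dir1-2023-07-02T23:55:59 would select app1.yaml, while two configs sharing a format such as {hostname}-{now} would still be ambiguous and raise an error.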

> However, this other thing you said gives me another idea:
>
> There's a third option - execute it on each config. That's what you do for other commands like delete.
>
> I actually was considering posting a different issue about that. If I do borgmatic borg delete <some archive name> with my setup, I see an error message that says something like "archive <some archive name>" doesn't exist.
>
> This happens because borgmatic loads the first config, deletes the archive, then loads the second config and can't find the archive.
>
> This is annoying, but also kind of sensible. You could take the same approach with extract and just run it for each config. It shouldn't cause an issue as long as you also filter archives based on archive_name_format... which I'm realizing you probably don't for extract. So it might get double extracted. Mmm...

I like the idea. But yeah, borg extract doesn't accept anything like an archive name format (or --match-archives / --glob-archives) because it only operates on the specific provided archive. But maybe borgmatic could do a first pass and match the archive against the archive_name_format before passing anything to Borg. Oh wait, now we're back to "somehow use archive_name_format or match_archives from each config file" above. 😃


I have a problem related to this.

I have an automated system that has been backing up data using 50 dynamically generated config files. Yesterday the service that generated the backup configs went down and we lost all of them. Yay for not backing those up! Is there a way I can look at the data in the repo? Based on the archive names, I should be able to recover it, or at least know what is stored there. Thank you.

Owner

A couple of points:

  • If you were using borgmatic 1.7.15 or above when you made your backup, then you can use the borgmatic bootstrap action (https://torsion.org/borgmatic/docs/how-to/extract-a-backup/#extract-the-configuration-files-used-to-create-an-archive) to recover your configuration files from within the repository, even if you didn't explicitly include them in your source directories to back up!
  • borgmatic won't operate without a config file, but you can always use borg directly to list, mount, or extract an archive from a repository made with borgmatic.
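For anyone scripting the second point, here is a small Python sketch that builds the corresponding Borg 1.x command lines and runs them with the passphrase command and SSH command supplied through Borg's BORG_PASSCOMMAND and BORG_RSH environment variables. The helper names (`borg_commands`, `run_borg`) and the repository, archive, and mount point values are placeholders, not anything from this thread, and Borg 2 uses a different command syntax.

```python
import os
import subprocess


def borg_commands(repository, archive, mount_point):
    """Build Borg 1.x argument lists for inspecting a repository directly."""
    return {
        # List the archives in the repository.
        'list': ['borg', 'list', repository],
        # Mount one archive as a FUSE filesystem at mount_point.
        'mount': ['borg', 'mount', f'{repository}::{archive}', mount_point],
        # Extract one archive into the current working directory.
        'extract': ['borg', 'extract', f'{repository}::{archive}'],
    }


def run_borg(command, passcommand=None, ssh_command=None):
    """Run a Borg command, optionally setting passphrase and SSH commands."""
    env = dict(os.environ)
    if passcommand:
        env['BORG_PASSCOMMAND'] = passcommand
    if ssh_command:
        env['BORG_RSH'] = ssh_command
    return subprocess.run(command, env=env, check=True)


if __name__ == '__main__':
    # Placeholder values; substitute your own repository and archive.
    commands = borg_commands('user@host:repo', 'dir1-example', '/mnt/borg')
    print(commands['list'])
```

Running `borg list` first shows the archive names, which is enough to tell what each of the lost configs was backing up.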

This is great. Thank you very much.

Reference: borgmatic-collective/borgmatic#722