Retention policy appears to have no effect #772

Closed
opened 2023-10-16 15:35:45 +00:00 by kalsan · 12 comments

What I'm trying to do and why

I'm attempting to test borgmatic's prune capability. I created a config file and tried the configs listed under Steps to reproduce. Running borgmatic prune had no effect, and I'm wondering what I'm doing wrong.

Also, the documentation of borgmatic (e.g. https://torsion.org/borgmatic/) states that the sections keep_within etc. are part of the root namespace, but they're actually under retention.

Steps to reproduce

Relevant extract from config (and the config is applied, writing nonsense causes the commands below to fail):

retention:
    # Keep all archives within this time interval.
    # keep_within: 3H

    # Number of secondly archives to keep.
    keep_secondly: 2

    # Number of minutely archives to keep.
    # keep_minutely: 60

    # Number of hourly archives to keep.
    # keep_hourly: 2

    # Number of daily archives to keep.
    # keep_daily: 2

    # Number of weekly archives to keep.
    # keep_weekly: 4

    # Number of monthly archives to keep.
    # keep_monthly: 6

    # Number of yearly archives to keep.
    # keep_yearly: 1

    # When pruning, only consider archive names starting with this
    # prefix.  Borg placeholders can be used. See the output of
    # "borg help placeholders" for details. Defaults to
    # "{hostname}-". Use an empty value to disable the default.
    # prefix: sourcehostname

Alternatively, I've also tried to comment out secondly and have yearly set to 1 or 2 instead.

borgmatic list returns:

borgmatic list
ssh://borg@my-backup/./repo: Listing archives
host-2023-10-16T16:51:06.425159 Mon, 2023-10-16 16:51:07 [31f28562fefe052a341bef3552d05501a15ae3ffe66acf404dbd6f1e1ca3e14f]
host-2023-10-16T16:51:51.689093 Mon, 2023-10-16 16:51:52 [a6c7ef9cbfdfc440cf79bbc62d1476fcda71a199e3e04163174f5c5a40371273]
host-2023-10-16T16:52:02.204982 Mon, 2023-10-16 16:52:03 [02b92a613ab002bcb506d7726c5bb86e9a973a8293a14452607d31241eb133d1]

Steps:

  • borgmatic list (returns the output above)
  • borgmatic check prune
  • borgmatic list (returns the output above)

Actual behavior

There are still 3 archives after pruning.

Expected behavior

There should be 2 archives (or, with the alternative config, 1).

Other notes / implementation ideas

Not sure if this is a me-problem or a bug. Any help would be appreciated.

borgmatic version

1.7.7

borgmatic installation method

Debian 12 via apt and official sources

Borg version

1.2.4

Python version

3.11.2

Database version (if applicable)

Operating system and version

Debian 12.2

Owner

> Also, the documentation of borgmatic (e.g. https://torsion.org/borgmatic/) states that the sections keep_within etc. are part of the root namespace, but they're actually under retention.

These schema changes were introduced in borgmatic 1.8.0, so you'd need a newer version for keep_within to be at the top level of the configuration.

Anyway, as to your main issue: My understanding of Borg's pruning logic is that a keep_secondly value of 2 literally means: For archives made in the same second (so, sharing the exact same timestamp down to a second), keep at most two of them. And the three archives you're seeing all seem to satisfy that ask because none of them share the same second.

So my question is: What exactly are you trying to do in terms of retaining archives? Maybe keep_within or one of the other keep_* options would better suit your needs?

> Alternatively, I've also tried to comment out secondly and have yearly set to 1 or 2 instead.

This I would expect to result in archive pruning, given that your three archives share the same year. My recommendation would be to run borgmatic with --verbosity 2, which will in turn cause Borg to log what it's doing with each archive during pruning. We can look at that output and try to figure out what might be going on.

For the yearly pruning, is it possible you ran borgmatic without any command-line action arguments (prune, etc.)? Because that would actually create a new archive even after pruning takes place, resulting in three archives.
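
For comparison, Borg's prune documentation describes the keep_* options as keeping the latest archive in each of the N most recent periods that contain archives. The following is a minimal sketch of that reading, not Borg's actual implementation: keep_per_period is a hypothetical helper, it handles a single rule only, and the timestamps are the three from the listing in this issue.

```python
from datetime import datetime


def keep_per_period(timestamps, n, period_format):
    """Keep the newest archive in each of the n most recent periods.

    Hypothetical helper for illustration; a period is identified by
    formatting the timestamp with period_format (e.g. "%Y" for yearly).
    """
    if n <= 0:
        return []
    kept, seen_periods = [], set()
    # Walk from newest to oldest so the first archive seen in a period
    # is the latest one in that period.
    for ts in sorted(timestamps, reverse=True):
        period = ts.strftime(period_format)
        if period in seen_periods:
            continue
        seen_periods.add(period)
        kept.append(ts)
        if len(kept) == n:
            break
    return kept


archives = [
    datetime(2023, 10, 16, 16, 51, 7),
    datetime(2023, 10, 16, 16, 51, 52),
    datetime(2023, 10, 16, 16, 52, 3),
]

secondly = keep_per_period(archives, 2, "%Y-%m-%d %H:%M:%S")
yearly = keep_per_period(archives, 1, "%Y")
print(len(secondly), len(yearly))
```

Under this reading, a secondly rule of 2 would keep the two newest archives and a yearly rule of 1 would keep only the newest, provided all three archives are actually considered by the prune run.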

witten added the question / support label 2023-10-16 16:16:44 +00:00
Author

Hi Witten and thank you for your very fast answer!

About documentation and secondly, all good. About yearly, here is the command and output:

root@my-backup:~# borgmatic -c /root/borgmatic-configs/some-client.yaml prune -v 2
Ensuring legacy configuration is upgraded
borg --version --debug --show-rc
/root/borgmatic-configs/some-client.yaml: No commands to run for pre-actions hook
/root/borgmatic-configs/some-client.yaml: No commands to run for pre-prune hook
ssh://borg@my-backup/./repos/some-client: Pruning archives
borg prune --keep-secondly 2 --glob-archives {hostname}-* --debug --show-rc ssh://borg@my-backup/./repos/some-client
using builtin fallback logging configuration
33 self tests completed in 0.06 seconds
SSH command line: ['ssh', 'borg@my-backup', 'borg', 'serve', '--debug']
Remote: using builtin fallback logging configuration
Remote: 33 self tests completed in 0.06 seconds
Remote: using builtin fallback logging configuration
Remote: Initialized logging system for JSON-based protocol
Remote: Resolving repository path b'/./repos/some-client'
Remote: Resolved repository path to '/home/borg/repos/some-client'
Remote: Verified integrity of /home/borg/repos/some-client/index.33
TAM-verified manifest
security: read previous location 'ssh://borg@my-backup/./repos/some-client'
security: read manifest timestamp '2023-10-16T14:52:07.234675'
security: determined newest manifest timestamp as 2023-10-16T14:52:07.234675
security: repository checks ok, allowing access
Verified integrity of /root/.cache/borg/f353af009c2f06fa7e6afc23b79bb42a80acae65a0bba2fd7defd5f362798e5f/chunks
security: read previous location 'ssh://borg@my-backup/./repos/some-client'
security: read manifest timestamp '2023-10-16T14:52:07.234675'
security: determined newest manifest timestamp as 2023-10-16T14:52:07.234675
security: repository checks ok, allowing access
RemoteRepository: 227 B bytes sent, 3.35 kB bytes received, 5 messages sent
terminating with success status, rc 0
/root/borgmatic-configs/some-client.yaml: No commands to run for post-prune hook
/root/borgmatic-configs/some-client.yaml: No commands to run for post-actions hook

summary:
/root/borgmatic-configs/some-client.yaml: Successfully ran configuration file

Best,
Kalsan

Owner

Maybe you didn't save your configuration file changes with keep_yearly? Because the Borg command that's actually getting run according to that output is:

borg prune --keep-secondly 2 --glob-archives {hostname}-* --debug --show-rc ssh://borg@my-backup/./repos/some-client

So it's still using a keep_secondly value!

Author

Sorry for posting the wrong log. When switching to yearly, the line is:

borg prune --keep-yearly 1 --glob-archives {hostname}-* --debug --show-rc ssh://borg@my-backup/./repos/some-client

No backup gets deleted.

Best,
Kalsan

Author

...which makes the whole log:

root@my-backup:~# borgmatic -c /root/borgmatic-configs/some-client.yaml prune -v 2
Ensuring legacy configuration is upgraded
borg --version --debug --show-rc
/root/borgmatic-configs/some-client.yaml: No commands to run for pre-actions hook
/root/borgmatic-configs/some-client.yaml: No commands to run for pre-prune hook
ssh://borg@my-backup/./repos/some-client: Pruning archives
borg prune --keep-yearly 1 --glob-archives {hostname}-* --debug --show-rc ssh://borg@my-backup/./repos/some-client
using builtin fallback logging configuration
33 self tests completed in 0.06 seconds
SSH command line: ['ssh', 'borg@my-backup', 'borg', 'serve', '--debug']
Remote: using builtin fallback logging configuration
Remote: 33 self tests completed in 0.06 seconds
Remote: using builtin fallback logging configuration
Remote: Initialized logging system for JSON-based protocol
Remote: Resolving repository path b'/./repos/some-client'
Remote: Resolved repository path to '/home/borg/repos/some-client'
Remote: Verified integrity of /home/borg/repos/some-client/index.33
TAM-verified manifest
security: read previous location 'ssh://borg@my-backup/./repos/some-client'
security: read manifest timestamp '2023-10-16T14:52:07.234675'
security: determined newest manifest timestamp as 2023-10-16T14:52:07.234675
security: repository checks ok, allowing access
Verified integrity of /root/.cache/borg/f353af009c2f06fa7e6afc23b79bb42a80acae65a0bba2fd7defd5f362798e5f/chunks
security: read previous location 'ssh://borg@my-backup/./repos/some-client'
security: read manifest timestamp '2023-10-16T14:52:07.234675'
security: determined newest manifest timestamp as 2023-10-16T14:52:07.234675
security: repository checks ok, allowing access
RemoteRepository: 227 B bytes sent, 3.35 kB bytes received, 5 messages sent
terminating with success status, rc 0
/root/borgmatic-configs/some-client.yaml: No commands to run for post-prune hook
/root/borgmatic-configs/some-client.yaml: No commands to run for post-actions hook

summary:
/root/borgmatic-configs/some-client.yaml: Successfully ran configuration file
Owner

Hmm.. I'm not sure what might be going on. Is it possible your hostname has changed recently? Because the --glob-archives value limits Borg's consideration to only archives matching that pattern. You could try running the borg prune command directly without --glob-archives and see if that changes what gets pruned.

If that doesn't help, then this is either a Borg bug or at minimum a misunderstanding on our part in how Borg does pruning. You might give the docs (https://borgbackup.readthedocs.io/en/stable/usage/prune.html) a skim and see if there's anything we're missing. I just did that but didn't come up with anything. Worst case, you can file a Borg ticket about this, assuming it repros with just Borg.
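
To illustrate the hostname question, here is a minimal sketch of how a {hostname}-* pattern narrows the set of archives that pruning considers. It only approximates the matching with Python's fnmatch and is not Borg's code: matching_archives is a hypothetical helper, and the hostnames "host" and "my-backup" are assumptions taken from the archive names and shell prompt shown in this issue.

```python
from fnmatch import fnmatchcase


def matching_archives(archive_names, pattern, hostname):
    """Return the archive names matched by pattern on the given host.

    Hypothetical helper: expands the {hostname} placeholder, then applies
    shell-style glob matching as an approximation of --glob-archives.
    """
    expanded = pattern.replace("{hostname}", hostname)
    return [name for name in archive_names if fnmatchcase(name, expanded)]


archives = [
    "host-2023-10-16T16:51:06.425159",
    "host-2023-10-16T16:51:51.689093",
    "host-2023-10-16T16:52:02.204982",
]

# Same pattern, evaluated on the machine that created the archives
# versus the machine that runs the prune.
on_client = matching_archives(archives, "{hostname}-*", "host")
on_server = matching_archives(archives, "{hostname}-*", "my-backup")
print(len(on_client), len(on_server))
```

If the pattern expands to a hostname that none of the archive names start with, no archives are candidates for pruning, and the prune run finishes successfully without deleting anything.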

Author

Bingo. I'm backing up from one host (append-only) and pruning from the other. While this had worked with my custom borg script (before I discovered borgmatic which does all of that, but much better and more), obviously borgmatic uses the current hostname.

So the problem is my stupidity.

And the solution is a single config change:

retention:
    keep_yearly: 1
    prefix: "my-client-"

Thank you for your help, much appreciated. Hopefully this post will help someone in the future and cause a similar facepalm.

Best,
Kalsan

Owner

It's not your stupidity if the software doesn't make it clear what's going on! A couple more notes on this:

  • prefix will totally work for now, but it is deprecated. (In Borg as well, not just in borgmatic.) So you might consider using match_archives: "my-client-*" instead once you upgrade to a newer version of borgmatic.
  • You might be interested in this ticket which, if it had been implemented, would have made your issue more clear with an explicit warning: #748.
Author

I like your spirit @witten ! Also thank you for the deprecation warning, I'll switch to the newer option.

Indeed, #748 would have had me understand this much faster.

BTW since we're already in discussion: do you have an opinion about multi-repo, multi-client, single-backup-server setups? The idea is that there are many clients that don't trust each other, and each has a different repo and key. I feel like a reasonable setup to achieve this would be to have the clients share a single SSH user (but with different keys and different path restrictions), each being restricted to append-only and backing up into their own subdirectory. The path restriction makes sure they can't escape the cage (so a single SSH user should be enough) and append-only prevents them from compacting. In this scenario, borgmatic is used on the server to periodically prune. Caveats are that multiple borgmatic configs are needed (and thus multiple cronjobs) due to the different repo keys and prefixes, and that clients can still perform deletions that would get pruned by the backup server on the next cleanup.

Owner

A single shared SSH user with the restrictions you describe sounds like it would work, but just as a counterexample: Both BorgBase and rsync.net (commercial Borg hosting providers) use a different SSH user account for every customer. My guess is that's because it provides even more isolation, in theory, than a single shared account. But I think which approach you go with depends on your threat model and risk tolerance.

As for the append-only + server-side pruning, yes, that could totally work and seems reasonable to me. And in fact that's an option that BorgBase provides as part of their offering.

Author

Thank you for your inputs and have an excellent day!

Owner

You too!

Reference: borgmatic-collective/borgmatic#772