Archive checks are incorrectly skipped if you have multiple config files with different archive name formats #688
Loading…
x
Reference in New Issue
Block a user
No description provided.
Delete Branch "%!s()"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
What I'm trying to do and why
Steps to reproduce (if a bug)
Create multiple config files with different
archive_name_format
s (a la https://torsion.org/borgmatic/docs/how-to/make-per-application-backups/) and the samefrequency
setting for thearchive
check.Run
borgmatic check
twice.Actual behavior (if a bug)
The first config file will run the archive check with the correct globbing argument and then touch the
~/.borgmatic/checks/<UUID>/archives
file. The remaining config files will look at the archives check timestamp, see that it's recent, and not run at all. This is despite the fact that the archives were not checked, since thearchive_name_format
is different.Expected behavior (if a bug)
All archives should be checked per the frequency setting.
Other notes / implementation ideas
It seems that the check timestamps are stored per repository. While that makes sense for the repository check, it doesn't work for the archive checks. There needs to be some way to track the archive check time per archive. A simple approach might be to just use a hash of
archive_name_format
.Environment
borgmatic version: 1.7.12
borgmatic installation method: pip
Borg version: 1.2.4
Python version: 3.9.2
Database version (if applicable): N/A
operating system and version: Debian 11
Good catch! Yeah, I think you're right that the path for recording the archives check needs to include a hash of the
archive_name_format
value or similar for this to work. The trick will be "upgrading" an existing directory of checks in place.My thought there was that you could leave the
<repo_hash>/archives
file as is, and add new files like<repo_hash>/archive_<hash>
, since you'd still want them per-repo. And then you can check to see which if any exist and run the checks accordingly.A fix for this is implemented now in main and will be released soon. I ended up going with a format like
<repo_hash>/archives/<archive_filters_hash>
(for a subset of archives) and<repo_hash>/archives/all
(for all archives). The archive filters hash isn't derived fromarchive_name_format
directly, but rather from the downstream archive filter flags produced fromarchive_name_format
(and a variety of other relevant options/flags that filter down the checked archives to a subset).The old check paths should get upgraded automatically.
Awesome! I will give this a shot when its released.
Makes sense.
Why is this needed?
This was just released in borgmatic 1.7.13! Let me know how it works out for you or if it needs any tweaking.
It's written to in the case that archives are unfiltered (by
archive_name_format
or anything else) and therefore all archives are being checked. It's also read in the case that anarchive_name_format
is used but there's no more specific archives check file on disk. The idea is that if all archives were checked at some point, then that means that any given subset of archives was implicitly checked as well.I did just make one tweak that'll have to wait for the next release: Instead of taking the first check time when multiple are present (e.g.,
<repo_hash>/archives/<archive_filters_hash>
and<repo_hash>/archives/all
, take the maximum. That way, if for some reason theall
file is newer, that more recent timestamp will get used even if you've since setarchive_name_format
. Kind of an edge case, but more correct I think.Based on some quick initial testing this seems to work great! I just deleted the checks dir and started borgmatic check then I'll rerun it to make sure everything looks good.
One thing that still doesn't work great for me, though, is the fact that when the checks do actually run, my health checks go down because it takes so much longer than a normal run.
This why I'd previously disabled checks in my config files, then had a separate cron job with an override to run the checks and call a different healthchecks url. (Which didn't previously work because of this bug).
I'm curious if you have any suggestions on how to best handle this?
#518 is probably the feature you're looking for to solve this. It would let you define different Healthchecks URLs for different borgmatic actions.
Glad to hear initial testing is working out for you though!
That's exactly it! FYI - the second round of testing worked as well.
Awesome!
Just released as part of borgmatic 1.7.14!