Archive checks are incorrectly skipped if you have multiple config files with different archive name formats
Closedopened 4 weeks ago by kaysond · 11 comments
Reference in New Issue
There is no content yet.
Delete Branch '%!s(<nil>)'
Deleting a branch is permanent. It CANNOT be undone. Continue?
What I'm trying to do and why
Steps to reproduce (if a bug)
Create multiple config files with different
archive_name_formats (a la https://torsion.org/borgmatic/docs/how-to/make-per-application-backups/) and the same
frequencysetting for the
Actual behavior (if a bug)
The first config file will run the archive check with the correct globbing argument and then touch the
~/.borgmatic/checks/<UUID>/archivesfile. The remaining config files will look at the archives check timestamp, see that it's recent, and not run at all. This is despite the fact that the archives were not checked, since the
Expected behavior (if a bug)
All archives should be checked per the frequency setting.
Other notes / implementation ideas
It seems that the check timestamps are stored per repository. While that makes sense for the repository check, it doesn't work for the archive checks. There needs to be some way to track the archive check time per archive. A simple approach might be to just use a hash of
borgmatic version: 1.7.12
borgmatic installation method: pip
Borg version: 1.2.4
Python version: 3.9.2
Database version (if applicable): N/A
operating system and version: Debian 11
Good catch! Yeah, I think you're right that the path for recording the archives check needs to include a hash of the
archive_name_formatvalue or similar for this to work. The trick will be "upgrading" an existing directory of checks in place.
My thought there was that you could leave the
<repo_hash>/archivesfile as is, and add new files like
<repo_hash>/archive_<hash>, since you'd still want them per-repo. And then you can check to see which if any exist and run the checks accordingly.
A fix for this is implemented now in main and will be released soon. I ended up going with a format like
<repo_hash>/archives/<archive_filters_hash>(for a subset of archives) and
<repo_hash>/archives/all(for all archives). The archive filters hash isn't derived from
archive_name_formatdirectly, but rather from the downstream archive filter flags produced from
archive_name_format(and a variety of other relevant options/flags that filter down the checked archives to a subset).
The old check paths should get upgraded automatically.
Awesome! I will give this a shot when its released.
Why is this needed?
This was just released in borgmatic 1.7.13! Let me know how it works out for you or if it needs any tweaking.
It's written to in the case that archives are unfiltered (by
archive_name_formator anything else) and therefore all archives are being checked. It's also read in the case that an
archive_name_formatis used but there's no more specific archives check file on disk. The idea is that if all archives were checked at some point, then that means that any given subset of archives was implicitly checked as well.
I did just make one tweak that'll have to wait for the next release: Instead of taking the first check time when multiple are present (e.g.,
<repo_hash>/archives/all, take the maximum. That way, if for some reason the
allfile is newer, that more recent timestamp will get used even if you've since set
archive_name_format. Kind of an edge case, but more correct I think.
Based on some quick initial testing this seems to work great! I just deleted the checks dir and started borgmatic check then I'll rerun it to make sure everything looks good.
One thing that still doesn't work great for me, though, is the fact that when the checks do actually run, my health checks go down because it takes so much longer than a normal run.
This why I'd previously disabled checks in my config files, then had a separate cron job with an override to run the checks and call a different healthchecks url. (Which didn't previously work because of this bug).
I'm curious if you have any suggestions on how to best handle this?
#518 is probably the feature you're looking for to solve this. It would let you define different Healthchecks URLs for different borgmatic actions.
Glad to hear initial testing is working out for you though!
That's exactly it! FYI - the second round of testing worked as well.