Q: Spot check schedule with config.yml & requesting clarification of spot check parameters #868

Closed
opened 2024-05-15 17:29:47 +00:00 by michaeltoohig · 2 comments

What I'm trying to do and why

I want to confirm spot checks are working as intended. I am defining how to run spot checks inside config.yml. Once a month it should spot check an archive. Given issue #866 I am seeing an error occur due to xxh64sum which sends me a fail alert; however, new archives are still being created on my remote repo. Is this the expected behavior?

I already restarted the container and lost the log output, but I recall seeing the output of --stats in the logs before the xxh64sum error. Together with the new archives in my repo, that means an archive is created and then the checks are run, which seems counter to the documentation, which says to run the spot check on a separate schedule from create. Is this expected behavior too, or should I manually add the spot check to my crontab.txt instead?

Also for clarification here is an example spot check.

checks:
  - name: spot
    frequency: 1 week
    count_tolerance_percentage: 10
    data_sample_percentage: 10
    data_tolerance_percentage: 0.5

My second question is about the meaning of the parameters, which I find a little unclear. If I have 1000 files, will 100 files be spot checked, and if 5 of those 100 are found to be different, will the check fail? Is that the correct way to read this? In addition, given count_tolerance_percentage, will the check fail if there is more than a 10% change in the total number of files?

Steps to reproduce

No response

Actual behavior

No response

Expected behavior

No response

Other notes / implementation ideas

No response

borgmatic version

1.8.11

borgmatic installation method

container

Borg version

1.2.8

Python version

3.12.3

Database version (if applicable)

mysql Ver 15.1 Distrib 10.11.6-MariaDB, for Linux (x86_64) using readline 5.1

Operating system and version

Alpine 3.19

Owner

> I want to confirm spot checks are working as intended. I am defining how to run spot checks inside config.yml. Once a month it should spot check an archive. Given issue #866 I am seeing an error occur due to xxh64sum which sends me a fail alert; however, new archives are still being created on my remote repo. Is this the expected behavior?

Yes. The create and check actions are run independently, and create comes before check by default. So if the create succeeds and the check fails, a new archive will still be created. This applies to all checks, not just the spot check.

> I already restarted the container and lost the log output, but I recall seeing the output of --stats in the logs before the xxh64sum error. Together with the new archives in my repo, that means an archive is created and then the checks are run, which seems counter to the documentation, which says to run the spot check on a separate schedule from create. Is this expected behavior too, or should I manually add the spot check to my crontab.txt instead?

Yeah, create runs before check by default, and that is expected behavior. The documentation (which perhaps needs clarification!) suggests that it may make more sense to run the spot check in particular on a separate schedule, so that it's not simply checking that the archive that was just created has the exact same files that were just put into it. There is some value in doing that, because it could catch files modified during backup or even bugs in archive creation within Borg itself. (Also see #656, which has more about a use case where running a spot check immediately after a create is actually useful.) But there may be more value in running the spot check on a separate schedule from the create, so that, for instance, it can actually detect unwanted drift or major changes between backed-up archives and files on disk. It really depends on what you want to get out of the check.

However, given that this is all beta and very new, I'm open to: 1. suggestions on ways of clarifying this documentation, and 2. maybe changing the recommendation altogether if it doesn't really make sense for common use cases.

> My second question is about the meaning of the parameters, which I find a little unclear. If I have 1000 files, will 100 files be spot checked, and if 5 of those 100 are found to be different, will the check fail? Is that the correct way to read this? In addition, given count_tolerance_percentage, will the check fail if there is more than a 10% change in the total number of files?

That is all correct!
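
To make the arithmetic in the question concrete, here is a rough Python sketch that turns the three percentages from the example config into file counts. This is not borgmatic's code: the function and class names are hypothetical, and it assumes the data tolerance is measured against the total file count (the reading under which 0.5% of 1000 files comes out to 5 files). It only computes the thresholds and makes no claim about how borgmatic treats a value that lands exactly on a threshold.

```python
from dataclasses import dataclass


@dataclass
class SpotCheckThresholds:
    """Hypothetical container for the file counts implied by the percentages."""

    sampled_files: float          # files whose contents get compared
    data_tolerance_files: float   # mismatching files the check tolerates
    count_tolerance_files: float  # tolerated change in the total file count


def spot_check_thresholds(
    total_files,
    count_tolerance_percentage,
    data_sample_percentage,
    data_tolerance_percentage,
):
    """Translate spot check percentages into file counts (illustration only)."""
    return SpotCheckThresholds(
        sampled_files=total_files * data_sample_percentage / 100,
        # Assumption: tolerance is a percentage of all files, not of the sample.
        data_tolerance_files=total_files * data_tolerance_percentage / 100,
        count_tolerance_files=total_files * count_tolerance_percentage / 100,
    )


# Values taken from the example config in the question, with 1000 source files.
thresholds = spot_check_thresholds(
    total_files=1000,
    count_tolerance_percentage=10,
    data_sample_percentage=10,
    data_tolerance_percentage=0.5,
)
print(thresholds)
```

With these inputs the sketch gives 100 sampled files, a data tolerance of 5 files, and a count tolerance of 100 files, which matches the numbers in the question.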

witten added the question / support label 2024-06-09 22:30:53 +00:00
witten added the waiting for response label 2024-06-24 19:30:26 +00:00
Owner

I'm closing this due to inactivity, but please feel free to file a new ticket if you still have questions.

witten removed the waiting for response label 2024-10-09 16:26:39 +00:00

Reference: borgmatic-collective/borgmatic#868