diff --git a/NEWS b/NEWS index 22d0952c..7e50965f 100644 --- a/NEWS +++ b/NEWS @@ -1,8 +1,8 @@ 1.8.10.dev0 - * #656: Add a "spot" consistency check that compares file counts and contents between your source - files and the latest archive, ensuring they fall within configured tolerances. This can catch - problems like incorrect excludes, inadvertent deletes, files changed by malware, etc. See the - documentation for more information: + * #656 (beta): Add a "spot" consistency check that compares file counts and contents between your + source files and the latest archive, ensuring they fall within configured tolerances. This can + catch problems like incorrect excludes, inadvertent deletes, files changed by malware, etc. See + the documentation for more information: https://torsion.org/borgmatic/docs/how-to/deal-with-very-large-backups/#spot-check * #842: When a command hook exits with a soft failure, ping the log and finish states for any configured monitoring hooks. diff --git a/borgmatic/actions/check.py b/borgmatic/actions/check.py index 2c215706..60d8f05d 100644 --- a/borgmatic/actions/check.py +++ b/borgmatic/actions/check.py @@ -410,7 +410,7 @@ def compare_spot_check_hashes( # source directories. spot_check_config = next(check for check in config['checks'] if check['name'] == 'spot') sample_count = max( - int(len(source_paths) * (spot_check_config['data_sample_percentage'] / 100)), 1 + int(len(source_paths) * (min(spot_check_config['data_sample_percentage'], 100) / 100)), 1 ) source_sample_paths = tuple(random.sample(source_paths, sample_count)) existing_source_sample_paths = { diff --git a/docs/how-to/deal-with-very-large-backups.md b/docs/how-to/deal-with-very-large-backups.md index b4ba3fdd..5331398a 100644 --- a/docs/how-to/deal-with-very-large-backups.md +++ b/docs/how-to/deal-with-very-large-backups.md @@ -108,23 +108,23 @@ for more information. The various consistency checks all have trade-offs around speed and thoroughness, but most of them don't even look at your original source files—arguably one important way to ensure your backups contain the files -you'll ultimately want to restore in the case of catastrophe (or just an -accidentally deleted file). Because if something goes wrong with your source -files, most consistency checks will still pass with flying colors and you -won't discover there's a problem until you go to restore. +you'll want to restore in the case of catastrophe (or just an accidentally +deleted file). Because if something goes wrong with your source files, most +consistency checks will still pass with flying colors and you won't discover +there's a problem until you go to restore. New in version 1.8.10 Beta feature That's where the spot -check comes in. This check actually compares your source files counts and data +check comes in. This check actually compares your source file counts and data against those in the latest archive, potentially catching problems like incorrect excludes, inadvertent deletes, files changed by malware, etc. However, because an exhaustive comparison of all source files against the -latest archive might be too slow, the spot check supports sampling a +latest archive might be too slow, the spot check supports *sampling* a percentage of your source files for the comparison, ensuring it falls within configured tolerances. -Here's how to use it. Start by installing the `xxhash` OS package if you don't +Here's how it works. Start by installing the `xxhash` OS package if you don't already have it, so the spot check can run the `xxh64sum` command and efficiently hash files for comparison. Then add something like the following to your borgmatic configuration: @@ -140,40 +140,44 @@ checks: The `count_tolerance_percentage` is the percentage delta between the source directories file count and the latest backup archive file count that is allowed before the entire consistency check fails. For instance, if the spot -check runs and finds 100 source files and 105 files in the latest archive, -that would be within a 10% count tolerance and the check would succeed. But if -there were 100 source files and 200 archive files, the check would fail. (100 -source files and only 50 archive files would also fail.) +check runs and finds 100 source files on disk and 105 files in the latest +archive, that would be within the configured 10% count tolerance and the check +would succeed. But if there were 100 source files and 200 archive files, the +check would fail. (100 source files and only 50 archive files would also +fail.) The `data_sample_percentage` is the percentage of total files in the source directories to randomly sample and compare to their corresponding files in the -latest backup archive. The comparison is performed by hashing the selected -files in each of the source paths and the backup archive and counting hashes -that don't match. For instance, if you have 1,000 source files and your sample -percentage is 1%, then only 10 source files will be compared against the -latest archive. These sampled files are selected randomly each time, so in -effect the spot check is a probabilistic check. +latest backup archive. A higher value allows a more accurate check—and a +slower one. The comparison is performed by hashing the selected files in each +of the source paths and counting hashes that don't match the latest archive. +For instance, if you have 1,000 source files and your sample percentage is 1%, +then only 10 source files will be compared against the latest archive. These +sampled files are selected randomly each time, so in effect the spot check is +probabilistic. The `data_tolerance_percentage` is the percentage of total files in the source directories that can fail a spot check data comparison without failing the entire consistency check. The value must be lower than or equal to the `contents_sample_percentage`. -All three options are required when using the spot check. And because the spot +All three options are required when using the spot check. And because the check relies on these configured tolerances, it may not be a set-it-and-forget-it type of consistency check, at least until you get the -tolerances dialed in so there are minimal false positives or negatives. For -certain workloads where your source files experience wild swings of changed -data or file counts, the spot check may not suitable at all. +tolerances dialed in so there are minimal false positives or negatives. It is +recommended you run `borgmatic check` several times after configuring the spot +check, tweaking your tolerances as needed. For certain workloads where your +source files experience wild swings of file contents or counts, the spot check +may not suitable at all. -What if you change or add or delete a bunch of your source files and you don't +What if you add, delete, or change a bunch of your source files and you don't want the spot check to fail the next time it's run? Run `borgmatic create` to -create a new backup, thereby allowing the spot check to run against an archive -that contains your source file changes. +create a new backup, thereby allowing the next spot check to run against an +archive that contains your recent changes. -While the spot check feature is currently in beta, it may be subject to -breaking changes. But feel free to use it in production if you're okay with -that caveat, and please [provide any +As long as the spot check feature is in beta, it may be subject to breaking +changes. But feel free to use it in production if you're okay with that +caveat, and please [provide any feedback](https://torsion.org/borgmatic/#issues) you have on this feature. diff --git a/test_requirements.txt b/test_requirements.txt index 0b63bb87..24adc2b3 100644 --- a/test_requirements.txt +++ b/test_requirements.txt @@ -13,7 +13,7 @@ flake8-quotes==3.3.2 flake8-use-fstring==1.4 flake8-variables-names==0.0.5 flexmock==0.11.3 -idna==3.4 +idna==3.7 isort==5.12.0 jsonschema==4.17.3 Markdown==3.4.1 diff --git a/tests/unit/actions/test_check.py b/tests/unit/actions/test_check.py index 8246ec5e..48627205 100644 --- a/tests/unit/actions/test_check.py +++ b/tests/unit/actions/test_check.py @@ -635,6 +635,42 @@ def test_compare_spot_check_hashes_returns_paths_having_failing_hashes(): ) == ('/bar',) +def test_compare_spot_check_hashes_handles_data_sample_percentage_above_100(): + flexmock(module.random).should_receive('sample').replace_with( + lambda population, count: population[:count] + ) + flexmock(module.os.path).should_receive('exists').and_return(True) + flexmock(module.borgmatic.execute).should_receive( + 'execute_command_and_capture_output' + ).with_args(('xxh64sum', '/foo', '/bar')).and_return('hash1 /foo\nhash2 /bar') + flexmock(module.borgmatic.borg.list).should_receive('capture_archive_listing').and_return( + ['nothash1 /foo', 'nothash2 /bar'] + ) + + assert module.compare_spot_check_hashes( + repository={'path': 'repo'}, + archive='archive', + config={ + 'checks': [ + { + 'name': 'archives', + 'frequency': '2 weeks', + }, + { + 'name': 'spot', + 'data_sample_percentage': 1000, + }, + ] + }, + local_borg_version=flexmock(), + global_arguments=flexmock(), + local_path=flexmock(), + remote_path=flexmock(), + log_label='repo', + source_paths=('/foo', '/bar'), + ) == ('/foo', '/bar') + + def test_compare_spot_check_hashes_uses_xxh64sum_command_option(): flexmock(module.random).should_receive('sample').replace_with( lambda population, count: population[:count]