WIP: Add locking of borgmatic config file #254

Closed
drewkett wants to merge 9 commits from (deleted):configlock into main
1 changed file with 16 additions and 0 deletions
Showing only changes of commit 1d1df99ee8


@@ -566,9 +566,25 @@ def main(): # pragma: no cover
logger.debug('Ensuring legacy configuration is upgraded')
convert.guard_configuration_upgraded(LEGACY_CONFIG_PATH, config_filenames)
locks = []

In terms of where to put this functionality, I'd recommend moving it to the `run_configuration()` function (same file). The idea is that the function is responsible for running all the actions for a particular borgmatic configuration file, and therefore it can also perform locking of it. That would also let you eliminate the loop, as `run_configuration()` is already per-config-file!

Alternately, this could be put in a separate function that's called from `run_configuration()`, just to factor it out a bit.

One implication of either approach is that you'd be locking configuration files one at a time instead of all at once. That may actually be a good thing, as it's locking a smaller surface area. (Generally best practice, even if it's not strictly needed here.)


I haven't looked enough to understand the structure of the code. But for my use case I wanted to make sure it was locked for the entirety of the time from before `prebackup` through `postbackup`. Would `run_configuration()` accomplish that?


Another note is that the file handle needs to stay open for as long as the lock is held. I've done it in the past where I set up a context manager so it's a little more Pythonic: you use `with hold_lock():` and then indent for the duration.

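A minimal sketch of what such a context manager could look like (`hold_lock` is just the hypothetical name from the comment, not an existing borgmatic function), assuming an flock-based lock as in the diff above:

```python
import contextlib
import fcntl


@contextlib.contextmanager
def hold_lock(filename):
    '''Hold an exclusive, non-blocking flock on the given file for the
    duration of the with-block. The handle stays open (and therefore the
    lock stays held) until the block exits, even if an action raises.'''
    with open(filename) as handle:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        try:
            yield handle
        finally:
            fcntl.flock(handle.fileno(), fcntl.LOCK_UN)
```

The calling code then becomes `with hold_lock(config_filename): ...`, and there's no separate `locks` list to remember to hold onto or clear.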

> I haven't looked enough to understand the structure of the code. But for my use case I wanted to make sure it was locked for the entirety of the time from before `prebackup` through `postbackup`. Would `run_configuration()` accomplish that?

Nope, it wouldn't! It would lock each config file individually. So if you instead want to lock for the entirety of the time from before pre-backup through post-backup, then I'd suggest putting the locking near the existing `before_everything` and `after_everything` hook invocations (in that same source file). That's because `before_everything`/`after_everything` are not configuration-file specific. They trigger before/after all config files run. Unlike `before_backup`/`after_backup`, which are per-config-file.

Anyway, if the locking is not per-config file, then have you considered making a separate dedicated lock file instead of having to lock n config files? I suppose by locking config files, you implicitly allow separate simultaneous borgmatic runs if they use disjoint configuration files. Is that a desired property?
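For the single-dedicated-lock-file variant, a sketch of what that might look like (the lock path and function name are hypothetical, not existing borgmatic code):

```python
import fcntl


def acquire_global_lock(lock_path='/var/run/borgmatic.lock'):
    '''Create/open a single dedicated lock file and take an exclusive,
    non-blocking flock on it. Return the open handle on success (it must
    stay open for as long as the lock should be held), or None if another
    run already holds the lock.'''
    handle = open(lock_path, 'w')
    try:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        handle.close()
        return None
    return handle
```

This trades away the disjoint-config-files property: any two runs contend on the one lock file, regardless of which configuration files they use.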


Sorry, I read "per config file" as "per backup destination". That was my confusion. But now I'm not sure whether `before_backup` or `before_everything`, since I didn't realize you could run borgmatic with multiple config files. I'm honestly not sure which makes more sense. For my use case it would be `before_backup`, but I guess someone could be using `before_everything` similarly. For `before_backup` it makes sense to me just to lock the config file, but a single lock file for `before_everything` seems more logical. I guess it could be configurable, but I'm not sure if that is worth the complexity.


On the question of locking around `before_backup` vs. `before_everything`: let's start by solving your use case, and then we can always expand the feature later or make it configurable as needed. But I'll also put a note on the original ticket (#250) in case any watchers there want to weigh in.

for config_filename, config in configs.items():
    lock_config = config.get("location", dict()).get("lock_config", False)
    if lock_config:
        f = open(config_filename)
        try:
            fcntl.flock(f.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        except IOError:
            logger.critical("Failed to acquire lock for {}".format(config_filename))
            sys.exit(0)
        locks.append(f)
summary_logs = parse_logs + list(collect_configuration_run_summary_logs(configs, arguments))
summary_logs_max_level = max(log.levelno for log in summary_logs)
# This drops the references to the open files so that Python will garbage
# collect the file objects and close them, releasing the locks.
locks = None

Is this needed if you're okay with the garbage collector kicking in at the end of the function when the variable just goes out of scope?


No preference for this. I mostly put it in as a reminder that there are locks being held in case further changes to the code are made down the line. Though maybe the context manager is the way to go since it accomplishes this as well.


Got it. I don't feel super strongly either way.


> Another note is that the file handle needs to stay open for as long as the lock is held. I've done it in the past where I set up a context manager so it's a little more Pythonic: you use `with hold_lock():` and then indent for the duration.

I suppose one minor benefit to the context manager is that it provides a nice point to factor out the actual locking (e.g. `hold_lock()`).


An annoyance with the context manager in this case would be if it's configurable whether locks are used. It's not a big deal, but it makes for a weird API, since you basically need to pass an argument to the function to make it a no-op if it's turned off (rather than wrapping the code in an if statement and duplicating what's inside the context). I guess `hold_lock(disabled=True)` isn't terrible.

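One way to keep a single unconditional with-statement without threading a `disabled` flag through the lock itself is to pick the context manager at the call site, using `contextlib.nullcontext()` (Python 3.7+) as the no-op stand-in. A sketch, with hypothetical names:

```python
import contextlib
import fcntl


@contextlib.contextmanager
def hold_lock(filename):
    # Sketch of the flock-based locking context manager discussed above
    # (hypothetical name, not existing borgmatic code).
    with open(filename) as handle:
        fcntl.flock(handle.fileno(), fcntl.LOCK_EX | fcntl.LOCK_NB)
        yield handle


def maybe_lock(config_filename, lock_config):
    '''Return the real locking context manager when lock_config is truthy,
    or a do-nothing stand-in otherwise, so callers can always write
    "with maybe_lock(...):" regardless of configuration.'''
    return hold_lock(config_filename) if lock_config else contextlib.nullcontext()
```

The caller then does `with maybe_lock(config_filename, lock_config):` either way, and the enable/disable decision stays out of the locking function itself.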

True. Although if we punt on the configurability, then that becomes a problem for our future selves to solve. :D

for message in ('', 'summary:'):
log_record(
levelno=summary_logs_max_level,