WIP: Separate endpoint for each primary action for monitor hooks #660
|
@ -66,13 +66,10 @@ def run_configuration(config_filename, config, arguments):
|
|||
error_repository = ''
|
||||
using_primary_action = {'create', 'prune', 'compact', 'check'}.intersection(arguments)
|
||||
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
|
||||
action_name = next(iter(arguments))
|
||||
skip_monitoring = global_arguments.skip_monitoring
|
||||
|
||||
# action_names = [action for action in arguments.keys() if action != 'global' and isinstance(action, str)]
|
||||
|
||||
# print (action_names)
|
||||
# return
|
||||
skip_monitoring = getattr(global_arguments, 'skip_monitoring', False)
|
||||
jetchirag marked this conversation as resolved
Outdated
|
||||
action_names = [
|
||||
action for action in arguments.keys() if action != 'global' and isinstance(action, str)
|
||||
jetchirag marked this conversation as resolved
Outdated
witten
commented
On the On the `isinstance()` call: Shouldn't they always be strings? Or is something else sneaking in there?
|
||||
]
|
||||
|
||||
try:
|
||||
local_borg_version = borg_version.local_borg_version(storage, local_path)
|
||||
|
@ -90,17 +87,18 @@ def run_configuration(config_filename, config, arguments):
|
|||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
)
|
||||
if using_primary_action and not skip_monitoring:
|
||||
dispatch.call_hooks(
|
||||
'ping_monitor',
|
||||
hooks,
|
||||
config_filename,
|
||||
monitor.MONITOR_HOOK_NAMES,
|
||||
monitor.State.START,
|
||||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
action_name,
|
||||
)
|
||||
|
||||
for action_name in action_names:
|
||||
dispatch.call_hooks(
|
||||
'ping_monitor',
|
||||
hooks,
|
||||
config_filename,
|
||||
monitor.MONITOR_HOOK_NAMES,
|
||||
monitor.State.START,
|
||||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
action_name,
|
||||
)
|
||||
witten
commented
I know we've going back and forth on the subject of looping over
I think that means with the actions loop in place and So, the question is what to do about this. One idea is to remove the above normalization code so that you can retain the prior behavior when I know we've going back and forth on the subject of looping over `ping_monitor` calls, but something occurs to me: Are there cases where this will ping the monitor multiple times even if it's only configured with a single `ping_url` option? I'm thinking of this code in `normalize.py`:
```
config['hooks']['cronitor'] = {
'create': cronitor['ping_url'],
'prune': cronitor['ping_url'],
'compact': cronitor['ping_url'],
'check': cronitor['ping_url'],
}
```
I think that means with the actions loop in place and `ping_url` set, Cronitor will get pinged *four* times for a standard borgmatic run. But shouldn't it ideally only get pinged *once* on startโunless the user explicitly configures ping endpoints for multiple actions?
So, the question is what to do about this. One idea is to remove the above normalization code so that you can retain the prior behavior when `ping_url` is set: Only ping *once* on start. Another idea is to drop support for `ping_url` entirely. There are probably other options as well. I'm open to discussion on this point of course! I didn't realize this would be such a thorny ticket.
jetchirag
commented
@witten I had this concern too since I started working on this. I was also wondering if someone wants to receive notification for a failed event in any of 4 actions, he will have to set the endpoints in each of them and same for future update. And suppose someone wants to get notified for failure in all of these actions, it will send alert 4 times if all of them fail, where probability of other act So what if we keep the ping_url but add a boolean configuration so user can select which action to receive notification on. I don't think people will use different cronitor endpoint for different actions of same job. This should still fulfil the problem raised in this PR's issue. It's been a while since I worked on this so I might be misunderstanding something here. So these are just the thoughts I gathered going over the code again. Let me know what you think. @witten I had this concern too since I started working on this. I was also wondering if someone wants to receive notification for a failed event in any of 4 actions, he will have to set the endpoints in each of them and same for future update.
And suppose someone wants to get notified for failure in all of these actions, it will send alert 4 times if all of them fail, where probability of other act
So what if we keep the ping_url but add a boolean configuration so user can select which action to receive notification on. I don't think people will use different cronitor endpoint for different actions of same job.
This should still fulfil the problem raised in this PR's issue.
It's been a while since I worked on this so I might be misunderstanding something here. So these are just the thoughts I gathered going over the code again. Let me know what you think.
witten
commented
It's also been a while since I've looked at this code! Could you maybe provide a configuration example to illustrate what you're thinking here? For instance, I'm unsure how an additional boolean configuration option would allow a user to select which borgmatic action to receive notifications on. I would think that any action with a Cronitor URL set would receive a notification. Or is the distinction you're drawing that, for a given action, URLs might only be notified for some of begin/start/fail and not all three?
Looking at #518, I think a common case is probably where the user wants one URL for, say, the For the current (released borgmatic) use case of pinging Cronitor only once for begin/start/fail when the action is create, prune, compact, or check.. What do you think of including some logic such that if a URL is specified for multiple different actions (whether explicitly.. or implicitly via I've given no thought to how that might be implemented, but such de-duplication seems nice from a user perspectiveโand it lines up with the currently released behavior. Let me know though if this doesn't sound feasible or desirable. It's also been a while since I've looked at this code! Could you maybe provide a configuration example to illustrate what you're thinking here? For instance, I'm unsure how an additional boolean configuration option would allow a user to select which borgmatic action to receive notifications on. I would think that any action with a Cronitor URL set would receive a notification. Or is the distinction you're drawing that, for a given action, URLs might only be notified for some of begin/start/fail and not all three?
> I don't think people will use different cronitor endpoint for different actions of same job.
Looking at #518, I think a common case *is* probably where the user wants one URL for, say, the `create` action and a completely different URL if any for, say, the `check` action. (See #366 as well.) In that case it wouldn't be a problem that each such URL would get pinged once upon failure, as each action is hitting a different Cronitor endpoint.
For the current (released borgmatic) use case of pinging Cronitor only once for begin/start/fail when the action is create, prune, compact, or check.. What do you think of including some logic such that if a URL is specified for multiple different actions (whether explicitly.. or implicitly via `ping_url`), then borgmatic is smart enough to only ping Cronitor once before/after all the actions and once on failure.
I've given no thought to how that might be implemented, but such de-duplication seems nice from a user perspectiveโand it lines up with the currently released behavior. Let me know though if this doesn't sound feasible or desirable.
jetchirag
commented
It makes sense to include logic to remove duplicate endpoints. I'll add that. I've fixed the latest merge conflicts and will start working on the rest of suggestions. Development might still be slow unfortunately since classes are starting but I'll keep at it. Since merge conflicts are fixed, I can solve the rest in small pieces. Thank you for the help and patience! It makes sense to include logic to remove duplicate endpoints. I'll add that.
I've fixed the latest merge conflicts and will start working on the rest of suggestions. Development might still be slow unfortunately since classes are starting but I'll keep at it. Since merge conflicts are fixed, I can solve the rest in small pieces.
Thank you for the help and patience!
|
||||
except (OSError, CalledProcessError) as error:
|
||||
if command.considered_soft_failure(config_filename, error):
|
||||
return
|
||||
|
@ -111,7 +109,9 @@ def run_configuration(config_filename, config, arguments):
|
|||
if not encountered_error:
|
||||
repo_queue = Queue()
|
||||
for repo in location['repositories']:
|
||||
repo_queue.put((repo, 0),)
|
||||
repo_queue.put(
|
||||
(repo, 0),
|
||||
)
|
||||
|
||||
while not repo_queue.empty():
|
||||
repository_path, retry_num = repo_queue.get()
|
||||
|
@ -135,7 +135,9 @@ def run_configuration(config_filename, config, arguments):
|
|||
)
|
||||
except (OSError, CalledProcessError, ValueError) as error:
|
||||
if retry_num < retries:
|
||||
repo_queue.put((repository_path, retry_num + 1),)
|
||||
repo_queue.put(
|
||||
(repository_path, retry_num + 1),
|
||||
)
|
||||
tuple( # Consume the generator so as to trigger logging.
|
||||
log_error_records(
|
||||
f'{repository_path}: Error running actions for repository',
|
||||
|
@ -161,16 +163,17 @@ def run_configuration(config_filename, config, arguments):
|
|||
try:
|
||||
if using_primary_action and not skip_monitoring:
|
||||
# send logs irrespective of error
|
||||
dispatch.call_hooks(
|
||||
'ping_monitor',
|
||||
hooks,
|
||||
config_filename,
|
||||
monitor.MONITOR_HOOK_NAMES,
|
||||
monitor.State.LOG,
|
||||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
action_name,
|
||||
)
|
||||
for action_name in action_names:
|
||||
dispatch.call_hooks(
|
||||
'ping_monitor',
|
||||
hooks,
|
||||
config_filename,
|
||||
monitor.MONITOR_HOOK_NAMES,
|
||||
monitor.State.LOG,
|
||||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
action_name,
|
||||
)
|
||||
except (OSError, CalledProcessError) as error:
|
||||
if command.considered_soft_failure(config_filename, error):
|
||||
return
|
||||
|
@ -181,16 +184,17 @@ def run_configuration(config_filename, config, arguments):
|
|||
if not encountered_error:
|
||||
try:
|
||||
if using_primary_action and not skip_monitoring:
|
||||
dispatch.call_hooks(
|
||||
'ping_monitor',
|
||||
hooks,
|
||||
config_filename,
|
||||
monitor.MONITOR_HOOK_NAMES,
|
||||
monitor.State.FINISH,
|
||||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
action_name,
|
||||
)
|
||||
for action_name in action_names:
|
||||
dispatch.call_hooks(
|
||||
'ping_monitor',
|
||||
hooks,
|
||||
config_filename,
|
||||
monitor.MONITOR_HOOK_NAMES,
|
||||
monitor.State.FINISH,
|
||||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
action_name,
|
||||
)
|
||||
dispatch.call_hooks(
|
||||
'destroy_monitor',
|
||||
hooks,
|
||||
|
@ -218,16 +222,17 @@ def run_configuration(config_filename, config, arguments):
|
|||
error=encountered_error,
|
||||
output=getattr(encountered_error, 'output', ''),
|
||||
)
|
||||
dispatch.call_hooks(
|
||||
'ping_monitor',
|
||||
hooks,
|
||||
config_filename,
|
||||
monitor.MONITOR_HOOK_NAMES,
|
||||
monitor.State.FAIL,
|
||||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
action_name,
|
||||
)
|
||||
for action_name in action_names:
|
||||
dispatch.call_hooks(
|
||||
'ping_monitor',
|
||||
hooks,
|
||||
config_filename,
|
||||
monitor.MONITOR_HOOK_NAMES,
|
||||
monitor.State.FAIL,
|
||||
monitoring_log_level,
|
||||
global_arguments.dry_run,
|
||||
action_name,
|
||||
)
|
||||
dispatch.call_hooks(
|
||||
'destroy_monitor',
|
||||
hooks,
|
||||
|
@ -417,19 +422,39 @@ def run_actions(
|
|||
)
|
||||
elif action_name == 'rlist':
|
||||
yield from borgmatic.actions.rlist.run_rlist(
|
||||
repository, storage, local_borg_version, action_arguments, local_path, remote_path,
|
||||
repository,
|
||||
storage,
|
||||
local_borg_version,
|
||||
action_arguments,
|
||||
local_path,
|
||||
remote_path,
|
||||
)
|
||||
elif action_name == 'list':
|
||||
yield from borgmatic.actions.list.run_list(
|
||||
repository, storage, local_borg_version, action_arguments, local_path, remote_path,
|
||||
repository,
|
||||
storage,
|
||||
local_borg_version,
|
||||
action_arguments,
|
||||
local_path,
|
||||
remote_path,
|
||||
)
|
||||
elif action_name == 'rinfo':
|
||||
yield from borgmatic.actions.rinfo.run_rinfo(
|
||||
repository, storage, local_borg_version, action_arguments, local_path, remote_path,
|
||||
repository,
|
||||
storage,
|
||||
local_borg_version,
|
||||
action_arguments,
|
||||
local_path,
|
||||
remote_path,
|
||||
)
|
||||
elif action_name == 'info':
|
||||
yield from borgmatic.actions.info.run_info(
|
||||
repository, storage, local_borg_version, action_arguments, local_path, remote_path,
|
||||
repository,
|
||||
storage,
|
||||
local_borg_version,
|
||||
action_arguments,
|
||||
local_path,
|
||||
remote_path,
|
||||
)
|
||||
elif action_name == 'break-lock':
|
||||
borgmatic.actions.break_lock.run_break_lock(
|
||||
|
@ -442,7 +467,12 @@ def run_actions(
|
|||
)
|
||||
elif action_name == 'borg':
|
||||
borgmatic.actions.borg.run_borg(
|
||||
repository, storage, local_borg_version, action_arguments, local_path, remote_path,
|
||||
repository,
|
||||
storage,
|
||||
local_borg_version,
|
||||
action_arguments,
|
||||
local_path,
|
||||
remote_path,
|
||||
)
|
||||
|
||||
command.execute_hook(
|
||||
|
@ -635,7 +665,8 @@ def collect_configuration_run_summary_logs(configs, arguments):
|
|||
logger.info(f"Unmounting mount point {arguments['umount'].mount_point}")
|
||||
try:
|
||||
borg_umount.unmount_archive(
|
||||
mount_point=arguments['umount'].mount_point, local_path=get_local_path(configs),
|
||||
mount_point=arguments['umount'].mount_point,
|
||||
local_path=get_local_path(configs),
|
||||
)
|
||||
except (CalledProcessError, OSError) as error:
|
||||
yield from log_error_records('Error unmounting mount point', error)
|
||||
|
|
|
@ -35,9 +35,11 @@ def ping_monitor(hook_config, config_filename, state, monitoring_log_level, dry_
|
|||
|
||||
dry_run_label = ' (dry run; not actually pinging)' if dry_run else ''
|
||||
try:
|
||||
ping_url = f"{hook_config['sss']}/{MONITOR_STATE_TO_CRONITOR[state]}"
|
||||
ping_url = f"{hook_config[action_name]}/{MONITOR_STATE_TO_CRONITOR[state]}"
|
||||
except KeyError:
|
||||
jetchirag marked this conversation as resolved
witten
commented
Nice, succinct way to skip unconfigured actions. You could optionally Nice, succinct way to skip unconfigured actions. You could optionally `logger.debug()` in the `except` so that it's clear what's being skipped and why.
|
||||
logger.debug(f'{config_filename}: Skipping Cronitor {state.name.lower()} ping due to unconfigured action: {action_name}')
|
||||
logger.debug(
|
||||
jetchirag marked this conversation as resolved
Outdated
diivi
commented
Missed a debug log here ๐ Missed a debug log here ๐
jetchirag
commented
Thanks! :) Thanks! :)
|
||||
f'{config_filename}: Skipping Cronitor {state.name.lower()} ping due to unconfigured action: {action_name}'
|
||||
)
|
||||
return
|
||||
|
||||
logger.info(f'{config_filename}: Pinging Cronitor {state.name.lower()}{dry_run_label}')
|
||||
|
|
nit(pythonic):
next(iter(arguments))
Also, if we are only using this variable when
using_primary_action
, it might make sense to rename it toprimary_aciton_name
and/or move it where it's needed too.Thanks, that's definitely more effecient.
If I'm reading this correctly, this will just grab the first action. But there could very well be multiple actions at once, and I'd expect Cronitor to be pinged for each one (if configured for each one).
@witten I unexpectedly missed that multiple actions. Will fix it.
Why
getattr()
? Doesn't the attribute always exist now that you've added it to the global arguments? So would justglobal_arguments.skip_monitoring
work? I don't even think you need theFalse
here becauseaction='store_true'
above defaults to False if not specified.