WIP: Separate endpoint for each primary action for monitor hooks #660
No reviewers
Labels
No Label
bug
data loss
design finalized
good first issue
new feature area
question / support
security
waiting for response
No Milestone
No Assignees
3 Participants
Notifications
Due Date
No due date set.
Dependencies
No dependencies set.
Reference: borgmatic-collective/borgmatic#660
Loading…
Reference in New Issue
No description provided.
Delete Branch "jetchirag/borgmatic:action-based-monitor-endpoints"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Fixes #518
Defines new schema for cronitor to allow configuring endpoint per primary action.
Signed-off-by: jetchirag thechiragaggarwal@gmail.com
@ -66,6 +66,7 @@ def run_configuration(config_filename, config, arguments):
error_repository = ''
using_primary_action = {'create', 'prune', 'compact', 'check'}.intersection(arguments)
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
action_name = list(arguments.keys())[0]
nit(pythonic):
next(iter(arguments))
Also, if we are only using this variable when
using_primary_action
, it might make sense to rename it toprimary_aciton_name
and/or move it where it's needed too.Thanks, that's definitely more effecient.
@ -1235,3 +1258,2 @@
description: |
Cronitor ping URL to notify when a backup begins,
ends, or errors.
If this is set, other properties will be ignored
A description should probably still be here (followed by what you added). somethine like
Default ping url for cronitor...
@ -38,0 +37,4 @@
try:
ping_url = '{}/{}'.format(hook_config[action_name], MONITOR_STATE_TO_CRONITOR[state])
except KeyError:
print('KeyError')
Missed a debug log here 😄
Thanks! :)
@ -1227,20 +1227,38 @@ properties:
service. See borgmatic monitoring documentation for details.
cronitor:
type: object
required: ['ping_url']
Not sure about this, but now that we don't have the
required
validation for anything, what will happen when I do this:Because even in
normalize.py
we only do checks for (cronitor == str and cronitor == dict).I think that should be covered by default yaml configuration. It should error out cause cronitor is now None.
An error occurred while parsing a configuration file at /XXX/borgmatic/config.yaml:
At 'hooks.cronitor': None is not of type 'object'
@ -90,4 +104,3 @@
def test_ping_monitor_with_unsupported_monitoring_state():
hook_config = {'ping_url': 'https://example.com'}
Maybe one of these tests (or an additional one) can have the
ping_url
instead ofcreate
as a key, now that we support it.Yes, would be working on tests once I complete basic functions.
My bad, you can mark the PR as draft on gitea too btw.
Just found it. Marked!
@jetchirag does this PR add support for multiple actions only for cronitor?
Every monitoring hook recieving the action_name and doing nothing with it seems a little off.
Maybe @witten can decide the scope though.
@diivi No, full scope of this PR includes all 5 supported monitors. This Initial PR is just to verify the method.
Initial schema modification and ping_monitor params changesto Seperate endpoint for each primary action for monitor hooksSeperate endpoint for each primary action for monitor hooksto WIP: Seperate endpoint for each primary action for monitor hooksWIP: Seperate endpoint for each primary action for monitor hooksto WIP: Separate endpoint for each primary action for monitor hooksThe overall approach seems like it'd work! The main thing missing to my eye though is support for multiple borgmatic actions run at once. E.g.:
borgmatic create check
or even justborgmatic
which implicitly runs four actions. Presumably a monitor should be pinged once for each such action, assuming it's configured?@ -66,6 +66,7 @@ def run_configuration(config_filename, config, arguments):
error_repository = ''
using_primary_action = {'create', 'prune', 'compact', 'check'}.intersection(arguments)
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
action_name = next(iter(arguments))
If I'm reading this correctly, this will just grab the first action. But there could very well be multiple actions at once, and I'd expect Cronitor to be pinged for each one (if configured for each one).
@witten I unexpectedly missed that multiple actions. Will fix it.
@ -30,0 +33,4 @@
'prune': cronitor['ping_url'],
'compact': cronitor['ping_url'],
'check': cronitor['ping_url'],
}
Makes sense to me. You could even do something like this if you wanted to:
In any case, I'd expect to see a unit test covering this new normalization.
Added unit test
@ -1247,2 +1270,2 @@
Cronitor ping URL to notify when a backup begins,
ends, or errors.
If this is set, other properties will be ignored
and replaced by this value.
Did oneOf end up not working for this use case? Seems like it might be a little bit cleaner.
@ -23,3 +23,3 @@
def ping_monitor(hook_config, config_filename, state, monitoring_log_level, dry_run):
def ping_monitor(hook_config, config_filename, state, monitoring_log_level, dry_run, action_name):
To support the multiple actions thing mentioned above, a couple of options I can think of:
ping_monitor()
in a loop over the action names.I decided to go with calling ping_monitor in a loop.
Sounds good. Please let me know when this is ready for me to have another look!
I think it's ready for a review. Only tests should be missing.
@ -37,1 +37,3 @@
ping_url = f"{hook_config['ping_url']}/{MONITOR_STATE_TO_CRONITOR[state]}"
try:
ping_url = f"{hook_config[action_name]}/{MONITOR_STATE_TO_CRONITOR[state]}"
except KeyError:
Nice, succinct way to skip unconfigured actions. You could optionally
logger.debug()
in theexcept
so that it's clear what's being skipped and why.@witten How do you want to go about adding flag to turn off monitor hook in interactive mode? Does borgmatic already have a method to detect cron or interactive mode?
If not, I think it'd better to make this flag
--skip-monitors
i.e. opt-out instead of opt-in like--ping-monitor
to avoid breaking monitoring for existing configs.There's not currently a way to disable the monitoring hook in interactive mode, but it only fires for certain actions:
create
,prune
,compact
, andcheck
. So if you're doing an interactivelist
orinfo
or whatever, that (currently) won't trigger the monitors.--skip-monitors
(or perhaps--skip-monitoring
, since that's what it's called elsewhere, e.g.--monitoring-verbosity
) as a new opt-out flag makes sense to me.I left another round of feedback on this! Mostly minor stuff except for the one annoying issue with
ping_url
which you can see in comments. I'm interested in your thoughts on that.@ -146,2 +146,4 @@
help='Go through the motions, but do not actually write to any repositories',
)
global_group.add_argument(
'-nm',
I'd prefer not including short flags for an option like this one. I generally only add short flags for the more frequently used options.
But if it's just too annoying to type
--skip-monitoring
all the time, you could remove the option entirely and do what borgmatic does for logging: If borgmatic is run interactively, it implicitly disables syslog logging. So it could also disable monitoring when at an interactive console. Up to you! (Although, a related issue: #665.)Noted. I would personally prefer the skip-monitoring flag if I am configuring monitoring. I've removed the short flag but let me know if you rather want to go through your alternate suggestion.
Nope, it's fine as-is IMO.
@ -148,0 +148,4 @@
global_group.add_argument(
'-nm',
'--skip-monitoring',
dest='skip_monitoring',
I think you can omit this
dest
line, because Python will automatically use this name based on the flag name.@ -66,6 +66,10 @@ def run_configuration(config_filename, config, arguments):
error_repository = ''
using_primary_action = {'create', 'prune', 'compact', 'check'}.intersection(arguments)
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
skip_monitoring = getattr(global_arguments, 'skip_monitoring', False)
Why
getattr()
? Doesn't the attribute always exist now that you've added it to the global arguments? So would justglobal_arguments.skip_monitoring
work? I don't even think you need theFalse
here becauseaction='store_true'
above defaults to False if not specified.@ -68,1 +68,4 @@
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
skip_monitoring = getattr(global_arguments, 'skip_monitoring', False)
action_names = [
action for action in arguments.keys() if action != 'global' and isinstance(action, str)
On the
isinstance()
call: Shouldn't they always be strings? Or is something else sneaking in there?@ -96,0 +98,4 @@
monitoring_log_level,
global_arguments.dry_run,
action_name,
)
I know we've going back and forth on the subject of looping over
ping_monitor
calls, but something occurs to me: Are there cases where this will ping the monitor multiple times even if it's only configured with a singleping_url
option? I'm thinking of this code innormalize.py
:I think that means with the actions loop in place and
ping_url
set, Cronitor will get pinged four times for a standard borgmatic run. But shouldn't it ideally only get pinged once on start—unless the user explicitly configures ping endpoints for multiple actions?So, the question is what to do about this. One idea is to remove the above normalization code so that you can retain the prior behavior when
ping_url
is set: Only ping once on start. Another idea is to drop support forping_url
entirely. There are probably other options as well. I'm open to discussion on this point of course! I didn't realize this would be such a thorny ticket.@witten I had this concern too since I started working on this. I was also wondering if someone wants to receive notification for a failed event in any of 4 actions, he will have to set the endpoints in each of them and same for future update.
And suppose someone wants to get notified for failure in all of these actions, it will send alert 4 times if all of them fail, where probability of other act
So what if we keep the ping_url but add a boolean configuration so user can select which action to receive notification on. I don't think people will use different cronitor endpoint for different actions of same job.
This should still fulfil the problem raised in this PR's issue.
It's been a while since I worked on this so I might be misunderstanding something here. So these are just the thoughts I gathered going over the code again. Let me know what you think.
It's also been a while since I've looked at this code! Could you maybe provide a configuration example to illustrate what you're thinking here? For instance, I'm unsure how an additional boolean configuration option would allow a user to select which borgmatic action to receive notifications on. I would think that any action with a Cronitor URL set would receive a notification. Or is the distinction you're drawing that, for a given action, URLs might only be notified for some of begin/start/fail and not all three?
Looking at #518, I think a common case is probably where the user wants one URL for, say, the
create
action and a completely different URL if any for, say, thecheck
action. (See #366 as well.) In that case it wouldn't be a problem that each such URL would get pinged once upon failure, as each action is hitting a different Cronitor endpoint.For the current (released borgmatic) use case of pinging Cronitor only once for begin/start/fail when the action is create, prune, compact, or check.. What do you think of including some logic such that if a URL is specified for multiple different actions (whether explicitly.. or implicitly via
ping_url
), then borgmatic is smart enough to only ping Cronitor once before/after all the actions and once on failure.I've given no thought to how that might be implemented, but such de-duplication seems nice from a user perspective—and it lines up with the currently released behavior. Let me know though if this doesn't sound feasible or desirable.
It makes sense to include logic to remove duplicate endpoints. I'll add that.
I've fixed the latest merge conflicts and will start working on the rest of suggestions. Development might still be slow unfortunately since classes are starting but I'll keep at it. Since merge conflicts are fixed, I can solve the rest in small pieces.
Thank you for the help and patience!
Hey @jetchirag, just checking in on this PR. I know you've got lots of other work going on with GSoC, but do you intend to follow-up on this one? Thanks, and let me know if I can answer any questions on the feedback above.
@witten Hey, with start of GSoC and completing the other PR, this one slipped my radar.
So Sorry for the delay! I do intend to complete this and will get right back as soon as possible. I dread resolving all the conflicts after such inactivity though :(
No worries.. and take your time. I just wanted to make sure you still intended to work on this. If you need help resolving any of the conflicts, please let me know!
And thank you for your persistence here! Please let me know when it's ready for review.
Closing this for now due to inactivity, but I'd be happy to revisit this if you're still game!
Pull request closed