WIP: Separate endpoint for each primary action for monitor hooks #660

Closed
jetchirag wants to merge 8 commits from jetchirag/borgmatic:action-based-monitor-endpoints into main
Contributor

Fixes #518

Defines new schema for cronitor to allow configuring endpoint per primary action.

  • Unit test for new schema and normalization

Signed-off-by: jetchirag thechiragaggarwal@gmail.com

Fixes #518 Defines new schema for cronitor to allow configuring endpoint per primary action. - [ ] Unit test for new schema and normalization Signed-off-by: jetchirag <thechiragaggarwal@gmail.com>
jetchirag added 1 commit 2023-03-25 13:38:04 +00:00
e87ebab625 Initial schema modification and ping_monitor params changes
Signed-off-by: jetchirag <thechiragaggarwal@gmail.com>
diivi reviewed 2023-03-25 14:00:32 +00:00
@ -66,6 +66,7 @@ def run_configuration(config_filename, config, arguments):
error_repository = ''
using_primary_action = {'create', 'prune', 'compact', 'check'}.intersection(arguments)
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
action_name = list(arguments.keys())[0]
Collaborator

nit(pythonic): next(iter(arguments))

nit(pythonic): `next(iter(arguments))`
Collaborator

Also, if we are only using this variable when using_primary_action, it might make sense to rename it to primary_aciton_name and/or move it where it's needed too.

Also, if we are only using this variable when `using_primary_action`, it might make sense to rename it to `primary_aciton_name` and/or move it where it's needed too.
Author
Contributor

Thanks, that's definitely more effecient.

Thanks, that's definitely more effecient.
jetchirag marked this conversation as resolved
diivi reviewed 2023-03-25 14:01:10 +00:00
@ -1235,3 +1258,2 @@
description: |
Cronitor ping URL to notify when a backup begins,
ends, or errors.
If this is set, other properties will be ignored
Collaborator

A description should probably still be here (followed by what you added). somethine like Default ping url for cronitor...

A description should probably still be here (followed by what you added). somethine like `Default ping url for cronitor...`
diivi reviewed 2023-03-25 14:01:20 +00:00
diivi reviewed 2023-03-25 14:01:28 +00:00
@ -38,0 +37,4 @@
try:
ping_url = '{}/{}'.format(hook_config[action_name], MONITOR_STATE_TO_CRONITOR[state])
except KeyError:
print('KeyError')
Collaborator

Missed a debug log here 😄

Missed a debug log here 😄
Author
Contributor

Thanks! :)

Thanks! :)
jetchirag marked this conversation as resolved
diivi reviewed 2023-03-25 14:13:43 +00:00
@ -1227,20 +1227,38 @@ properties:
service. See borgmatic monitoring documentation for details.
cronitor:
type: object
required: ['ping_url']
Collaborator

Not sure about this, but now that we don't have the required validation for anything, what will happen when I do this:

cronitor:
next_item:

Because even in normalize.py we only do checks for (cronitor == str and cronitor == dict).

Not sure about this, but now that we don't have the `required` validation for anything, what will happen when I do this: ```yaml cronitor: next_item: ``` Because even in `normalize.py` we only do checks for (cronitor == str and cronitor == dict).
Author
Contributor

I think that should be covered by default yaml configuration. It should error out cause cronitor is now None.

I think that should be covered by default yaml configuration. It should error out cause cronitor is now None.
Author
Contributor

An error occurred while parsing a configuration file at /XXX/borgmatic/config.yaml:
At 'hooks.cronitor': None is not of type 'object'

An error occurred while parsing a configuration file at /XXX/borgmatic/config.yaml: At 'hooks.cronitor': None is not of type 'object'
jetchirag marked this conversation as resolved
diivi reviewed 2023-03-25 14:33:27 +00:00
@ -90,4 +104,3 @@
def test_ping_monitor_with_unsupported_monitoring_state():
hook_config = {'ping_url': 'https://example.com'}
Collaborator

Maybe one of these tests (or an additional one) can have the ping_url instead of create as a key, now that we support it.

Maybe one of these tests (or an additional one) can have the `ping_url` instead of `create` as a key, now that we support it.
Author
Contributor

Yes, would be working on tests once I complete basic functions.

Yes, would be working on tests once I complete basic functions.
Collaborator

My bad, you can mark the PR as draft on gitea too btw.

My bad, you can mark the PR as draft on gitea too btw.
Author
Contributor

Just found it. Marked!

Just found it. Marked!
jetchirag marked this conversation as resolved
jetchirag added 1 commit 2023-03-25 14:34:36 +00:00
Collaborator

@jetchirag does this PR add support for multiple actions only for cronitor?
Every monitoring hook recieving the action_name and doing nothing with it seems a little off.

Maybe @witten can decide the scope though.

@jetchirag does this PR add support for multiple actions only for cronitor? Every monitoring hook recieving the action_name and doing nothing with it seems a little off. Maybe @witten can decide the scope though.
Author
Contributor

@diivi No, full scope of this PR includes all 5 supported monitors. This Initial PR is just to verify the method.

@diivi No, full scope of this PR includes all 5 supported monitors. This Initial PR is just to verify the method.
jetchirag changed title from Initial schema modification and ping_monitor params changes to Seperate endpoint for each primary action for monitor hooks 2023-03-25 14:51:19 +00:00
jetchirag changed title from Seperate endpoint for each primary action for monitor hooks to WIP: Seperate endpoint for each primary action for monitor hooks 2023-03-25 14:51:31 +00:00
jetchirag changed title from WIP: Seperate endpoint for each primary action for monitor hooks to WIP: Separate endpoint for each primary action for monitor hooks 2023-03-25 14:55:50 +00:00
jetchirag added 1 commit 2023-03-26 10:16:49 +00:00
1ee3b89e99 Merge with master
Signed-off-by: jetchirag <thechiragaggarwal@gmail.com>
witten requested changes 2023-03-26 21:16:53 +00:00
witten left a comment
Owner

The overall approach seems like it'd work! The main thing missing to my eye though is support for multiple borgmatic actions run at once. E.g.: borgmatic create check or even just borgmatic which implicitly runs four actions. Presumably a monitor should be pinged once for each such action, assuming it's configured?

The overall approach seems like it'd work! The main thing missing to my eye though is support for multiple borgmatic actions run at once. E.g.: `borgmatic create check` or even just `borgmatic` which implicitly runs four actions. Presumably a monitor should be pinged once for each such action, assuming it's configured?
@ -66,6 +66,7 @@ def run_configuration(config_filename, config, arguments):
error_repository = ''
using_primary_action = {'create', 'prune', 'compact', 'check'}.intersection(arguments)
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
action_name = next(iter(arguments))
Owner

If I'm reading this correctly, this will just grab the first action. But there could very well be multiple actions at once, and I'd expect Cronitor to be pinged for each one (if configured for each one).

If I'm reading this correctly, this will just grab the first action. But there could very well be multiple actions at once, and I'd expect Cronitor to be pinged for each one (if configured for each one).
Author
Contributor

@witten I unexpectedly missed that multiple actions. Will fix it.

@witten I unexpectedly missed that multiple actions. Will fix it.
@ -30,0 +33,4 @@
'prune': cronitor['ping_url'],
'compact': cronitor['ping_url'],
'check': cronitor['ping_url'],
}
Owner

Makes sense to me. You could even do something like this if you wanted to:

config['hooks']['cronitor'] = {
    action_name: config['hooks']['cronitor']
    for action_name in ('create', 'prune', 'compact', 'check')
}

In any case, I'd expect to see a unit test covering this new normalization.

Makes sense to me. You could even do something like this if you wanted to: ```python config['hooks']['cronitor'] = { action_name: config['hooks']['cronitor'] for action_name in ('create', 'prune', 'compact', 'check') } ``` In any case, I'd expect to see a unit test covering this new normalization.
Author
Contributor

Added unit test

Added unit test
jetchirag marked this conversation as resolved
@ -1247,2 +1270,2 @@
Cronitor ping URL to notify when a backup begins,
ends, or errors.
If this is set, other properties will be ignored
and replaced by this value.
Owner

Did oneOf end up not working for this use case? Seems like it might be a little bit cleaner.

Did [oneOf](https://json-schema.org/understanding-json-schema/reference/combining.html#oneof) end up not working for this use case? Seems like it might be a little bit cleaner.
@ -23,3 +23,3 @@
def ping_monitor(hook_config, config_filename, state, monitoring_log_level, dry_run):
def ping_monitor(hook_config, config_filename, state, monitoring_log_level, dry_run, action_name):
Owner

To support the multiple actions thing mentioned above, a couple of options I can think of:

  • Change this to take multiple action names and internally loop accordingly.
  • Or, call ping_monitor() in a loop over the action names.
To support the multiple actions thing mentioned above, a couple of options I can think of: * Change this to take multiple action names and internally loop accordingly. * Or, call `ping_monitor()` in a loop over the action names.
Author
Contributor

I decided to go with calling ping_monitor in a loop.

I decided to go with calling ping_monitor in a loop.
Owner

Sounds good. Please let me know when this is ready for me to have another look!

Sounds good. Please let me know when this is ready for me to have another look!
Author
Contributor

I think it's ready for a review. Only tests should be missing.

I think it's ready for a review. Only tests should be missing.
jetchirag marked this conversation as resolved
@ -37,1 +37,3 @@
ping_url = f"{hook_config['ping_url']}/{MONITOR_STATE_TO_CRONITOR[state]}"
try:
ping_url = f"{hook_config[action_name]}/{MONITOR_STATE_TO_CRONITOR[state]}"
except KeyError:
Owner

Nice, succinct way to skip unconfigured actions. You could optionally logger.debug() in the except so that it's clear what's being skipped and why.

Nice, succinct way to skip unconfigured actions. You could optionally `logger.debug()` in the `except` so that it's clear what's being skipped and why.
jetchirag marked this conversation as resolved
Author
Contributor

@witten How do you want to go about adding flag to turn off monitor hook in interactive mode? Does borgmatic already have a method to detect cron or interactive mode?

If not, I think it'd better to make this flag --skip-monitors i.e. opt-out instead of opt-in like --ping-monitor to avoid breaking monitoring for existing configs.

@witten How do you want to go about adding flag to turn off monitor hook in interactive mode? Does borgmatic already have a method to detect cron or interactive mode? If not, I think it'd better to make this flag `--skip-monitors` i.e. opt-out instead of opt-in like `--ping-monitor` to avoid breaking monitoring for existing configs.
Owner

There's not currently a way to disable the monitoring hook in interactive mode, but it only fires for certain actions: create, prune, compact, and check. So if you're doing an interactive list or info or whatever, that (currently) won't trigger the monitors.

--skip-monitors (or perhaps --skip-monitoring, since that's what it's called elsewhere, e.g. --monitoring-verbosity) as a new opt-out flag makes sense to me.

There's not currently a way to disable the monitoring hook in interactive mode, but it only fires for certain actions: `create`, `prune`, `compact`, and `check`. So if you're doing an interactive `list` or `info` or whatever, that (currently) won't trigger the monitors. `--skip-monitors` (or perhaps `--skip-monitoring`, since that's what it's called elsewhere, e.g. `--monitoring-verbosity`) as a new opt-out flag makes sense to me.
jetchirag added 3 commits 2023-03-29 10:25:23 +00:00
6ce6367a26 --skip-monitoring flag
Signed-off-by: Chirag Aggarwal <thechiragaggarwal@gmail.com>
2d728d57d7 minor; log on skipping cronitor due to KeyError
Signed-off-by: Chirag Aggarwal <thechiragaggarwal@gmail.com>
aa6f1e6623 Support multiple actions with ping_monitor
Signed-off-by: Chirag Aggarwal <thechiragaggarwal@gmail.com>
jetchirag requested review from witten 2023-03-31 18:57:34 +00:00
witten reviewed 2023-04-03 02:40:31 +00:00
witten left a comment
Owner

I left another round of feedback on this! Mostly minor stuff except for the one annoying issue with ping_url which you can see in comments. I'm interested in your thoughts on that.

I left another round of feedback on this! Mostly minor stuff except for the one annoying issue with `ping_url` which you can see in comments. I'm interested in your thoughts on that.
@ -146,2 +146,4 @@
help='Go through the motions, but do not actually write to any repositories',
)
global_group.add_argument(
'-nm',
Owner

I'd prefer not including short flags for an option like this one. I generally only add short flags for the more frequently used options.

But if it's just too annoying to type --skip-monitoring all the time, you could remove the option entirely and do what borgmatic does for logging: If borgmatic is run interactively, it implicitly disables syslog logging. So it could also disable monitoring when at an interactive console. Up to you! (Although, a related issue: #665.)

I'd prefer not including short flags for an option like this one. I generally only add short flags for the more frequently used options. But if it's just too annoying to type `--skip-monitoring` all the time, you could remove the option entirely and do what borgmatic does for logging: If borgmatic is run interactively, it implicitly disables syslog logging. So it could also disable monitoring when at an interactive console. Up to you! (Although, a related issue: #665.)
Author
Contributor

Noted. I would personally prefer the skip-monitoring flag if I am configuring monitoring. I've removed the short flag but let me know if you rather want to go through your alternate suggestion.

Noted. I would personally prefer the skip-monitoring flag if I am configuring monitoring. I've removed the short flag but let me know if you rather want to go through your alternate suggestion.
Owner

Nope, it's fine as-is IMO.

Nope, it's fine as-is IMO.
jetchirag marked this conversation as resolved
@ -148,0 +148,4 @@
global_group.add_argument(
'-nm',
'--skip-monitoring',
dest='skip_monitoring',
Owner

I think you can omit this dest line, because Python will automatically use this name based on the flag name.

I think you can omit this `dest` line, because Python will automatically use this name based on the flag name.
jetchirag marked this conversation as resolved
@ -66,6 +66,10 @@ def run_configuration(config_filename, config, arguments):
error_repository = ''
using_primary_action = {'create', 'prune', 'compact', 'check'}.intersection(arguments)
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
skip_monitoring = getattr(global_arguments, 'skip_monitoring', False)
Owner

Why getattr()? Doesn't the attribute always exist now that you've added it to the global arguments? So would just global_arguments.skip_monitoring work? I don't even think you need the False here because action='store_true' above defaults to False if not specified.

Why `getattr()`? Doesn't the attribute always exist now that you've added it to the global arguments? So would just `global_arguments.skip_monitoring` work? I don't even think you need the `False` here because `action='store_true'` above defaults to False if not specified.
jetchirag marked this conversation as resolved
@ -68,1 +68,4 @@
monitoring_log_level = verbosity_to_log_level(global_arguments.monitoring_verbosity)
skip_monitoring = getattr(global_arguments, 'skip_monitoring', False)
action_names = [
action for action in arguments.keys() if action != 'global' and isinstance(action, str)
Owner

On the isinstance() call: Shouldn't they always be strings? Or is something else sneaking in there?

On the `isinstance()` call: Shouldn't they always be strings? Or is something else sneaking in there?
jetchirag marked this conversation as resolved
@ -96,0 +98,4 @@
monitoring_log_level,
global_arguments.dry_run,
action_name,
)
Owner

I know we've going back and forth on the subject of looping over ping_monitor calls, but something occurs to me: Are there cases where this will ping the monitor multiple times even if it's only configured with a single ping_url option? I'm thinking of this code in normalize.py:

        config['hooks']['cronitor'] = {
            'create': cronitor['ping_url'],
            'prune': cronitor['ping_url'],
            'compact': cronitor['ping_url'],
            'check': cronitor['ping_url'],
        }

I think that means with the actions loop in place and ping_url set, Cronitor will get pinged four times for a standard borgmatic run. But shouldn't it ideally only get pinged once on start—unless the user explicitly configures ping endpoints for multiple actions?

So, the question is what to do about this. One idea is to remove the above normalization code so that you can retain the prior behavior when ping_url is set: Only ping once on start. Another idea is to drop support for ping_url entirely. There are probably other options as well. I'm open to discussion on this point of course! I didn't realize this would be such a thorny ticket.

I know we've going back and forth on the subject of looping over `ping_monitor` calls, but something occurs to me: Are there cases where this will ping the monitor multiple times even if it's only configured with a single `ping_url` option? I'm thinking of this code in `normalize.py`: ``` config['hooks']['cronitor'] = { 'create': cronitor['ping_url'], 'prune': cronitor['ping_url'], 'compact': cronitor['ping_url'], 'check': cronitor['ping_url'], } ``` I think that means with the actions loop in place and `ping_url` set, Cronitor will get pinged *four* times for a standard borgmatic run. But shouldn't it ideally only get pinged *once* on start—unless the user explicitly configures ping endpoints for multiple actions? So, the question is what to do about this. One idea is to remove the above normalization code so that you can retain the prior behavior when `ping_url` is set: Only ping *once* on start. Another idea is to drop support for `ping_url` entirely. There are probably other options as well. I'm open to discussion on this point of course! I didn't realize this would be such a thorny ticket.
Author
Contributor

@witten I had this concern too since I started working on this. I was also wondering if someone wants to receive notification for a failed event in any of 4 actions, he will have to set the endpoints in each of them and same for future update.

And suppose someone wants to get notified for failure in all of these actions, it will send alert 4 times if all of them fail, where probability of other act

So what if we keep the ping_url but add a boolean configuration so user can select which action to receive notification on. I don't think people will use different cronitor endpoint for different actions of same job.

This should still fulfil the problem raised in this PR's issue.

It's been a while since I worked on this so I might be misunderstanding something here. So these are just the thoughts I gathered going over the code again. Let me know what you think.

@witten I had this concern too since I started working on this. I was also wondering if someone wants to receive notification for a failed event in any of 4 actions, he will have to set the endpoints in each of them and same for future update. And suppose someone wants to get notified for failure in all of these actions, it will send alert 4 times if all of them fail, where probability of other act So what if we keep the ping_url but add a boolean configuration so user can select which action to receive notification on. I don't think people will use different cronitor endpoint for different actions of same job. This should still fulfil the problem raised in this PR's issue. It's been a while since I worked on this so I might be misunderstanding something here. So these are just the thoughts I gathered going over the code again. Let me know what you think.
Owner

It's also been a while since I've looked at this code! Could you maybe provide a configuration example to illustrate what you're thinking here? For instance, I'm unsure how an additional boolean configuration option would allow a user to select which borgmatic action to receive notifications on. I would think that any action with a Cronitor URL set would receive a notification. Or is the distinction you're drawing that, for a given action, URLs might only be notified for some of begin/start/fail and not all three?

I don't think people will use different cronitor endpoint for different actions of same job.

Looking at #518, I think a common case is probably where the user wants one URL for, say, the create action and a completely different URL if any for, say, the check action. (See #366 as well.) In that case it wouldn't be a problem that each such URL would get pinged once upon failure, as each action is hitting a different Cronitor endpoint.

For the current (released borgmatic) use case of pinging Cronitor only once for begin/start/fail when the action is create, prune, compact, or check.. What do you think of including some logic such that if a URL is specified for multiple different actions (whether explicitly.. or implicitly via ping_url), then borgmatic is smart enough to only ping Cronitor once before/after all the actions and once on failure.

I've given no thought to how that might be implemented, but such de-duplication seems nice from a user perspective—and it lines up with the currently released behavior. Let me know though if this doesn't sound feasible or desirable.

It's also been a while since I've looked at this code! Could you maybe provide a configuration example to illustrate what you're thinking here? For instance, I'm unsure how an additional boolean configuration option would allow a user to select which borgmatic action to receive notifications on. I would think that any action with a Cronitor URL set would receive a notification. Or is the distinction you're drawing that, for a given action, URLs might only be notified for some of begin/start/fail and not all three? > I don't think people will use different cronitor endpoint for different actions of same job. Looking at #518, I think a common case *is* probably where the user wants one URL for, say, the `create` action and a completely different URL if any for, say, the `check` action. (See #366 as well.) In that case it wouldn't be a problem that each such URL would get pinged once upon failure, as each action is hitting a different Cronitor endpoint. For the current (released borgmatic) use case of pinging Cronitor only once for begin/start/fail when the action is create, prune, compact, or check.. What do you think of including some logic such that if a URL is specified for multiple different actions (whether explicitly.. or implicitly via `ping_url`), then borgmatic is smart enough to only ping Cronitor once before/after all the actions and once on failure. I've given no thought to how that might be implemented, but such de-duplication seems nice from a user perspective—and it lines up with the currently released behavior. Let me know though if this doesn't sound feasible or desirable.
Author
Contributor

It makes sense to include logic to remove duplicate endpoints. I'll add that.

I've fixed the latest merge conflicts and will start working on the rest of suggestions. Development might still be slow unfortunately since classes are starting but I'll keep at it. Since merge conflicts are fixed, I can solve the rest in small pieces.

Thank you for the help and patience!

It makes sense to include logic to remove duplicate endpoints. I'll add that. I've fixed the latest merge conflicts and will start working on the rest of suggestions. Development might still be slow unfortunately since classes are starting but I'll keep at it. Since merge conflicts are fixed, I can solve the rest in small pieces. Thank you for the help and patience!
Owner

Hey @jetchirag, just checking in on this PR. I know you've got lots of other work going on with GSoC, but do you intend to follow-up on this one? Thanks, and let me know if I can answer any questions on the feedback above.

Hey @jetchirag, just checking in on this PR. I know you've got lots of other work going on with GSoC, but do you intend to follow-up on this one? Thanks, and let me know if I can answer any questions on the feedback above.
Author
Contributor

@witten Hey, with start of GSoC and completing the other PR, this one slipped my radar.

So Sorry for the delay! I do intend to complete this and will get right back as soon as possible. I dread resolving all the conflicts after such inactivity though :(

@witten Hey, with start of GSoC and completing the other PR, this one slipped my radar. So Sorry for the delay! I do intend to complete this and will get right back as soon as possible. I dread resolving all the conflicts after such inactivity though :(
Owner

No worries.. and take your time. I just wanted to make sure you still intended to work on this. If you need help resolving any of the conflicts, please let me know!

No worries.. and take your time. I just wanted to make sure you still intended to work on this. If you need help resolving any of the conflicts, please let me know!
jetchirag added 1 commit 2023-07-28 19:08:57 +00:00
08e8262ee7 Minor suggestion fixes
Signed-off-by: jetchirag <thechiragaggarwal@gmail.com>
jetchirag added 1 commit 2023-07-28 20:09:34 +00:00
29e718813d Resolved all merge conflicts
Signed-off-by: jetchirag <thechiragaggarwal@gmail.com>
Owner

Thank you for the help and patience!

And thank you for your persistence here! Please let me know when it's ready for review.

> Thank you for the help and patience! And thank you for your persistence here! Please let me know when it's ready for review.
Owner

Closing this for now due to inactivity, but I'd be happy to revisit this if you're still game!

Closing this for now due to inactivity, but I'd be happy to revisit this if you're still game!
witten closed this pull request 2024-03-12 04:17:37 +00:00

Pull request closed

Sign in to join this conversation.
No reviewers
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: borgmatic-collective/borgmatic#660
No description provided.