Check on specific day of month #785

Open
opened 2023-11-07 19:44:26 +00:00 by 2rs2ts · 3 comments

What I'd like to do and why

I'd like the ability to specify check frequencies based not on when the last check was done, but on the calendar. This would not replace the current frequency system; rather, it would be an alternative, perhaps specified with a different key (e.g., schedule instead of frequency).

Examples

  • The 1st/2nd/etc. day of every month

    It should be possible to select multiple of these. For example, "every 1st and 15th day of the month" gives you something like "2 weeks," but on dates you can actually plan around.

  • The 1st/2nd/etc. Monday/Tuesday/etc. of each month.

    Like with the last option, you should be able to choose multiple of these, so, like, "every 1st and 3rd Monday." It would also be nice, although not necessary for my use case, to be able to say stuff like "the 1st Monday and the 3rd Wednesday."

  • Every Monday/Tuesday/etc., not of each month, but just in absolute terms.

    You should also be able to choose stuff like "every other Saturday" or "every third Friday." Also, you should be able to do something like "Every Monday and every Thursday."
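For illustration, the month-based examples above could be matched against a date roughly like this. It is only a minimal Python sketch, not borgmatic code: the rule keys (`day_of_month`, `weekday`, `occurrence`) and both function names are hypothetical, made up here to show how a list of rules might be evaluated. It does not cover "every other Saturday," which would also need an anchor date to count from.

```python
import datetime

# Fixed English names, indexed by datetime.date.weekday() (Monday is 0).
WEEKDAYS = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')


def nth_weekday_of_month(day: datetime.date) -> int:
    """Return which occurrence of its weekday this date is within its month (1-based)."""
    return (day.day - 1) // 7 + 1


def matches_schedule(day: datetime.date, rules: list) -> bool:
    """Return True if the date satisfies at least one rule in the list.

    Each rule is a dict with optional, hypothetical keys:
      day_of_month: int, e.g. 15
      weekday: str, e.g. "Monday"
      occurrence: int, e.g. 3 for "the 3rd <weekday> of the month"
    Every key present in a rule must match for that rule to match.
    """
    for rule in rules:
        if 'day_of_month' in rule and day.day != rule['day_of_month']:
            continue
        if 'weekday' in rule and WEEKDAYS[day.weekday()] != rule['weekday']:
            continue
        if 'occurrence' in rule and nth_weekday_of_month(day) != rule['occurrence']:
            continue
        return True
    return False


# "Every 1st and 3rd Monday" expressed as a list of two rules.
first_and_third_monday = [
    {'weekday': 'Monday', 'occurrence': 1},
    {'weekday': 'Monday', 'occurrence': 3},
]
# "Every 1st and 15th day of the month."
first_and_fifteenth = [{'day_of_month': 1}, {'day_of_month': 15}]
```

A mixed request such as "the 1st Monday and the 3rd Wednesday" would just be two rules with different `weekday` values in the same list.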

Other aspects/concerns

  • I am unsure whether missed checks (e.g., borgmatic fails to run on the day that the check would be scheduled to run) should just be queued up to run the next time that borgmatic runs, like they currently are, or if they should be allowed to run only on the dates specified, for those who find the checks disruptive. I think making that an option would probably be the best way to satisfy the most users, but personally, I would prefer that they only run on the dates specified, while specifying checks the current way would keep the current behavior.
  • I don't have a solid idea of what the format would be for the configuration. I do think it would probably be a list of objects (in YAML terms) but I don't know what the object keys would be, exactly. The reason for the list is that I expect that it would make it easier to specify combinations like "every 1st and 3rd Monday"–it'd be a list containing "the 1st Monday" and "the 3rd Monday"–but other than that, I don't have a specific design in mind. I would accept whatever format works best for implementing it without it being too much of a pain.
  • I think that, for simplicity's sake, you should only be able to choose one of frequency or this new feature when configuring when checks run. It would be confusing to combine "1 week" frequency and "every other Monday" scheduling, after all.

The motivations for this request

  1. The current approach makes the day of the week that my checks run irregular. It's more convenient for me to have checks run on weekends, when there is less of a need to do ad-hoc backups or other actions which would require the lock.
  2. I have multiple repositories, but for various reasons, like partially interrupted backups and some repositories being added later, the scheduled times for their checks are not aligned. This means that, rather than doing all my lengthier (data/archive/extract) checks on one predictable day, they happen unpredictably and will sometimes leave a job running on an undesirable day.
  3. I feel like this would be useful for other users, especially ones who manage databases or other services, so they can do checks during times of lower activity.

Thank you for considering.

Other notes / implementation ideas

No response

Owner

First of all, thanks for taking the time to file such a detailed ticket. It's especially helpful to see the motivations so I can really understand your use case.

My first thought is that because borgmatic isn't a daemon like cron, it can't really make guarantees around when it's run. And you do mention this above in your section about missed checks. But this "feature" of borgmatic makes me wonder if allowing the sort of precise scheduling you've described would be a little misleading for the user. It's suggesting to them that they can request checks on, say, the 1st and 3rd Monday of the month, but borgmatic can only do best effort scheduling towards that.

So I actually have a different proposal that I hope solves your core need while still leaning on the existing frequency mechanism. I feel like a simple frequency is much more "honest" to the user in that it telegraphs that the scheduling is only best effort. So in that spirit, what do you think of a configuration something like this?

checks:
    - name: repository
      frequency: 2 weeks
      days_of_week:
          - Saturday
          - Sunday

Semantically, this would mean that checks run every 2 weeks but only on Saturday or Sunday. So if the 2 week "timer" expired on a Thursday, for instance, the check wouldn't run until two days later on Saturday. I could also see supporting weekday or weekend as values in days_of_week.
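To make those semantics concrete, here is a minimal Python sketch of the decision as described: the frequency "timer" has to expire first, and then the current day has to be one of the allowed days. It is an illustration under assumptions, not borgmatic's implementation; `check_is_due` and its parameters are hypothetical names, and the weekday/weekend shorthand values are left out.

```python
import datetime

# Fixed English names, indexed by datetime.datetime.weekday() (Monday is 0).
WEEKDAYS = ('Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday')


def check_is_due(now, last_check, frequency, days_of_week=None):
    """Decide whether a check should run at time `now`.

    now: datetime.datetime of the current run
    last_check: datetime.datetime of the last completed check, or None if never run
    frequency: datetime.timedelta, e.g. two weeks
    days_of_week: optional collection of allowed day names; None means any day
    """
    # The frequency timer has not expired yet, so skip regardless of the day.
    if last_check is not None and now < last_check + frequency:
        return False

    # No day restriction configured: behave like plain frequency.
    if days_of_week is None:
        return True

    # Timer expired; run only if today is one of the allowed days.
    return WEEKDAYS[now.weekday()] in days_of_week


# Last check ran on Thursday 2023-10-26, so a 2 week timer expires on Thursday 2023-11-09.
last_check = datetime.datetime(2023, 10, 26, 3, 0)
two_weeks = datetime.timedelta(weeks=2)
weekend = ['Saturday', 'Sunday']

thursday_due = check_is_due(datetime.datetime(2023, 11, 9, 3, 0), last_check, two_weeks, weekend)
saturday_due = check_is_due(datetime.datetime(2023, 11, 11, 3, 0), last_check, two_weeks, weekend)
```

With these inputs the Thursday run is skipped and the check waits for the Saturday run two days later. Because the timer then restarts from that Saturday, later checks tend to stay on the allowed days.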

In practice, I think this would mean that similarly configured checks would tend to get aligned naturally. (Although maybe in practice, some would run on the 1st and 3rd Saturday and some on the 2nd and 4th.)

I have multiple repositories, but for various reasons, like partially interrupted backups and some repositories being added later, the scheduled times for their checks are not aligned. This means that, rather than doing all my lengthier (data/archive/extract) checks on one predictable day, they happen unpredictably and will sometimes leave a job running on an undesirable day.

Would the day of week restriction sufficiently solve that, even if it's not perfect? Or do you actually need them to run at particular points in the month (1st Thursday, etc.)? You might also consider cron/systemd for this need. I'm open to expanding borgmatic's scheduling functionality, but if you need really precise scheduling, a real scheduler may be a better bet.

Author

But this "feature" of borgmatic makes me wonder if allowing the sort of precise scheduling you've described would be a little misleading for the user.

I definitely tried to be considerate of that, but knew that I wouldn't be able to fully reconcile the tension here. I mean, it's kind of like that already (you might say to run some check every week, but then, due to missed borgmatic runs, it ends up taking more than a week for the check to occur). But of course, asking for specific dates does raise the expectation that things will occur on certain days. I'd yield to your experience dealing with questions from users when it comes to a design that reduces the number of questions you'd get.

So in that spirit, what do you think of a configuration something like this?

Your suggestion seems good to me aesthetically and I believe it conveys some implicit information about what to expect. It would suit my needs, and, like you said:

You might also consider cron/systemd for this need. I'm open to expanding borgmatic's scheduling functionality, but if you need really precise scheduling, a real scheduler may be a better bet.

For those who really need super precise scheduling, separating the checks from the backups into their own jobs orchestrated by whatever one's OS provides is, although less convenient to implement as a user, going to cover the gaps for just about anyone. So I'm fine with designs that don't try to solve everything, knowing that people can always fall back to these workarounds, in exchange for being more elegant within their own scope.

Or do you actually need them to run at particular points in the month (1st Thursday, etc.)?

Personally, the answer to this is mostly "no, I don't need them at particular points in the month." Rather, I anticipated that I might need to specifically avoid certain days of the month, but I didn't want to expand the scope of my request to include blacklisting, cuz that's a whole can of worms that can be worked around by just specifying days besides the ones to avoid. I worded my request with the potential needs of business users in mind, as they're more likely to have scheduled times when maintenance occurs that they can communicate with their customers. Basically, I didn't want to let my personal home use of borgmatic overly influence the design of the feature.

With that said,

Would the day of week restriction sufficiently solve that, even if it's not perfect?

Yes, for me it would. I personally would prefer to run these long checks on weekends, when not being able to access the data (because, when a check finishes running, the next backup will start and I need to not have those files open during that time) will not be as disruptive to me personally. I suspect other home users will prefer weekdays over weekends for similar reasons.

So, tl;dr, I like your proposed solution. Let's go with it.

Owner

Sounds good! I appreciate your flexibility here. I think we can always iterate and expand the feature in the future as more users weigh in. And if some of those hypothetical business users need blacklisting, then we can always consider it then!

witten added the "design finalized" label 2023-11-09 21:52:34 +00:00
Reference: borgmatic-collective/borgmatic#785