Suggestions for more robust borgmatic systemd service #205

Closed
opened 2019-08-09 06:39:25 +00:00 by adatum · 9 comments

Would it make sense to use systemd-inhibit in the borgmatic.service file to prevent the system from being suspended or shut down while a scheduled backup is underway?

E.g., modifying line 6 to something like:

ExecStart=systemd-inhibit --who="borgmatic" --why="Prevent interrupting scheduled backup" /root/.local/bin/borgmatic

Also, borgmatic.timer specifies Persistent=true, which should cause borgmatic to trigger on boot if the schedule was missed. Perhaps a delay should be incorporated so that borgmatic doesn't start backing up immediately while many programs are loading. Starting right away might be bad both for system responsiveness and because files are more likely to be written to at that time.

Owner

Both of these suggestions make sense to me. Thank you for filing them!

The one argument I could see against systemd-inhibit is that you may actually want the system to suspend while a backup is underway. For instance, let's say you're using a laptop and a backup is running, and you have somewhere to go, and you want to take your laptop (oops). You may not want to wait for the backup to finish before suspending. Given that Borg checkpoints every few minutes, an interrupted backup isn't so bad. Especially since the checkpoints are automatically recoverable on the next backup. And you could in theory lower the checkpoint interval if this was a common thing.

Anyway, this is certainly not a deal breaker, but I wanted to think through the use cases a bit. Do you have thoughts on this? Can you talk about your use case for inhibiting suspend or shutdown when Borg is running? Thanks!

Owner

As for the suggested delay with Persistent=true, systemd.timer's RandomizedDelaySec= may do the trick.
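A minimal sketch of how that could look in the timer, assuming a daily schedule and an illustrative 30-minute window (neither value comes from the sample file). Per the systemd.timer documentation, a catch-up run triggered by `Persistent=true` is still subject to this delay, so a missed backup should not necessarily start the instant the machine boots:

```ini
[Timer]
# Illustrative schedule; adjust to taste.
OnCalendar=daily
# Run at next boot if a scheduled run was missed while powered off.
Persistent=true
# Delay each trigger by a random amount between 0 and 30 minutes.
RandomizedDelaySec=30min
```

Because the delay is random, an unlucky boot can still start the backup almost immediately.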

Contributor

@witten: I get your point, maybe that deserves a service-level configuration option (e.g., in /etc/default/borgmatic)? For my own use case, I would prefer the change suggested by @adatum (especially as I use the after_backup hook to transfer data remotely).
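Short of a dedicated option, a systemd drop-in is one way to make this choice per host without editing the shipped unit. A sketch, assuming the unit is installed as borgmatic.service and borgmatic lives at the path used earlier in this thread; the drop-in file name is hypothetical (`systemctl edit borgmatic.service` creates an equivalent file):

```ini
# /etc/systemd/system/borgmatic.service.d/override.conf  (assumed location)
[Service]
# An empty assignment clears the ExecStart= inherited from the main unit.
ExecStart=
# Replacement command that wraps borgmatic in systemd-inhibit on this host only.
ExecStart=systemd-inhibit --who="borgmatic" --why="Prevent interrupting scheduled backup" /root/.local/bin/borgmatic
```

Hosts that should stay free to suspend simply omit the drop-in and keep the default command.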

On a related note (sorry if off topic), I also use a few additions compared to the sample service, some security hardening stuff:

CapabilityBoundingSet=CAP_SYS_RESOURCE CAP_SYS_ADMIN CAP_MKNOD CAP_DAC_READ_SEARCH CAP_SYS_CHROOT CAP_SETPCAP
NoNewPrivileges=yes
PrivateUsers=no
PrivateTmp=yes
PrivateDevices=yes
DevicePolicy=closed
ProtectSystem=strict
ProtectHome=read-only
ProtectControlGroups=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK
RestrictRealtime=yes
RestrictNamespaces=yes
MemoryDenyWriteExecute=yes
LockPersonality=true
SystemCallArchitectures=native
SystemCallFilter=@system-service
ReadWritePaths=-/root/.config -/root/.cache /mnt/backup

And some options to improve system responsiveness during backups:

IOSchedulingClass=idle
CPUSchedulingPolicy=idle
IOReadIOPSMax=/dev/nvme0n1 100

As you can see, some options are host-specific, but maybe it would be useful to add them commented out?

Author

@witten The use case for these suggestions in my case is a desktop that is on 24/7. Backups would be scheduled for, say, 5 AM daily, so it's unlikely that it would be rebooted, or even in use, at that time. On the off chance a reboot were necessary, delaying it a little probably wouldn't be much of an issue. Then again, your point about Borg's checkpoints alleviates the concern about interrupted backups, though interruptions would still be nice to avoid.


As for the laptop use case, your point is well taken. Is there a way to override the systemd-inhibit? I have additional concerns about scheduled backups on laptops:

  • Does it make sense to have scheduled backups for devices that may often be off, or used at unpredictable times?
  • Does it make sense to have backups run on boot if 1) schedules are likely to be missed, and 2) immediate responsiveness is usually desired in laptops?
  • For backups to local network shares, what if the share is not mounted or available? (e.g., the laptop is not at home)
  • For backups using SSH, what if there is no internet connection?

How gracefully are such scenarios/failures handled?

For the second point, perhaps your suggestion of RandomizedDelaySec= could work. It isn't quite what I had in mind, due to its random nature, but maybe it's good enough.

I am undecided whether my laptop should have scheduled backups or if I should run borgmatic manually. The tradeoffs in the convenience of automation versus the complexity of a reasonably robust solution are not yet clear to me. This discussion is useful to that end.


@nicoulaj Thanks for the list of additional service options. Maybe the reasons and use case could be documented?

Author

There is an interesting similar discussion about a systemd service and timer for Relax-and-Recover: https://github.com/rear/rear/issues/2139

The use cases for both ReaR and borgmatic have a lot of overlap, so the ideas proposed are likely to be relevant.

Author

For the past couple of weeks, I've been running the following daily (prune, create) and weekly (prune, create, and the lengthier check) systemd service/timer pairs. I have not tested edge cases, though it has been working fine under "normal" use.

Most of the configuration is taken from suggestions from the ReaR issue referenced above. A couple of differences are:

  • launching with systemd-inhibit to prevent interrupting backups
  • combining an X-minute sleep with setting the timer X minutes earlier, as a workaround to avoid starting a backup immediately upon system startup when Persistent=true triggers a missed run

Security hardening mentioned above is not included yet.

Here are the four files I have so far, all in /etc/systemd/system:

#### borgmatic-daily.service

[Unit]
Description=borgmatic daily backup
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
# lowest CPU priority
Nice=19
# lowest I/O priority among 'normal' processes
IOSchedulingClass=best-effort
IOSchedulingPriority=7
IOWeight=100
Restart=no
LogRateLimitIntervalSec=0
# delay start to prevent backups immediately upon system startup if Persistent=true is triggered
ExecStartPre=sleep 30m 
ExecStart=systemd-inhibit --who="borgmatic" --why="Prevent interrupting scheduled backup" /root/.local/bin/borgmatic prune create --syslog-verbosity 1

#### borgmatic-daily.timer

[Unit]
Description=Run borgmatic daily backup
ConditionACPower=true

[Timer]
# backup will run at time specified here + delay specified by ExecStartPre in .service
OnCalendar=Mon..Fri,Sun 04:30:00
Persistent=true

[Install]
WantedBy=timers.target

#### borgmatic-weekly.service

[Unit]
Description=borgmatic backup and weekly check
Wants=network-online.target
After=network-online.target

[Service]
Type=oneshot
# lowest CPU priority
Nice=19
# lowest I/O priority among 'normal' processes
IOSchedulingClass=best-effort
IOSchedulingPriority=7
IOWeight=100
Restart=no
LogRateLimitIntervalSec=0
# delay start to prevent backups immediately upon system startup if Persistent=true is triggered
ExecStartPre=sleep 30m 
ExecStart=systemd-inhibit --who="borgmatic" --why="Prevent interrupting scheduled backup" /root/.local/bin/borgmatic --syslog-verbosity 1

#### borgmatic-weekly.timer

[Unit]
Description=Run borgmatic backup and weekly check
ConditionACPower=true

[Timer]
# backup will run at time specified here + delay specified by ExecStartPre in .service
OnCalendar=Sat 02:30:00
Persistent=true

[Install]
WantedBy=timers.target
Owner

Thanks for your patience on this one! I've tried to incorporate most of these ideas into the sample systemd timer/service files.

> As for the laptop use case, your point is well taken. Is there a way to override the systemd-inhibit? I have additional concerns about scheduled backups on laptops: ...

For the laptop use cases, I think scheduled backups of some sort absolutely make sense, because otherwise I won't remember to back up my laptop! However, I withdraw any concerns about systemd-inhibit because I just tried it on my laptop: systemd-inhibit doesn't prevent the laptop from sleeping, which is my primary use case for "crap, I have to grab my laptop and go somewhere while a backup is running".
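For anyone who does want sleep blocked on a laptop, it may be worth checking which operations the lock actually covers. According to the systemd-inhibit manual, `--what=` defaults to `idle:sleep:shutdown`, which does not include lid-switch handling. Whether that explains the behavior seen here is an assumption and was not verified in this thread. A sketch that names the lock types explicitly:

```ini
[Service]
# --what= lists the operations to inhibit; handle-lid-switch is added here
# on the assumption that closing the lid is what puts the laptop to sleep.
ExecStart=systemd-inhibit --what=idle:sleep:shutdown:handle-lid-switch --who="borgmatic" --why="Prevent interrupting scheduled backup" /root/.local/bin/borgmatic
```

Running `systemd-inhibit --list` while a backup is in progress shows which locks are currently held.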

You are correct that there are more issues with backups on laptops (network connectivity, network shares), but that doesn't change the need to have regular backups run. It just raises the difficulty level!

Also, I omitted ConditionACPower=true, because I found that it prevented the timer from getting scheduled if I enabled/started it while my laptop was off AC power. Pretty unexpected behavior!

Please feel free to review the commit and let me know if I missed anything.

Note though that I left out the security hardening bits, as I don't have the systemd expertise to apply them in a generic way. For instance, I don't know if ProtectHome=read-only will prevent the Borg cache from getting updated. If you'd like to file a separate ticket (or a PR!) for us to work out the right general-purpose security hardening options, I'd be happy to get your help there!
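As a starting point for that discussion, here is a sketch built from the options posted earlier in this thread. It assumes Borg and borgmatic run as root and keep their cache and configuration state under /root/.cache and /root/.config, and it has not been tested against the sample service:

```ini
[Service]
# Make /home, /root and /run/user read-only for the service.
ProtectHome=read-only
# Re-allow writes to the directories Borg is assumed to use for its
# cache and state. A leading "-" ignores a path that does not exist.
ReadWritePaths=-/root/.config -/root/.cache
```

A repository on a local path (such as the /mnt/backup entry in the earlier list) would need its own ReadWritePaths= entry if ProtectSystem=strict is also enabled.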

Author

Maybe the ConditionACPower=true should be in the .service file(s), and not the .timer file(s).
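The reasoning, as a sketch: a condition on the timer unit is checked once, when the timer itself is started, so enabling it on battery leaves the timer inactive. On the service unit the same condition is checked each time the timer fires, so only that one run is skipped. The unit description below is illustrative:

```ini
# borgmatic.service (fragment)
[Unit]
Description=borgmatic backup
# Skip this run when on battery; the timer stays scheduled for the next one.
ConditionACPower=true
```

A run skipped by a failed condition is not treated as a service failure.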

Owner

Good call, that worked!

Reference: borgmatic-collective/borgmatic#205