systemd service fails with status 1/FAILURE #376

Closed
opened 2020-12-04 16:19:24 +00:00 by csteinforth · 9 comments

What I'm trying to do and why

I run borgmatic via systemd and the borgmatic.timer and borgmatic.service. Since yesterday (04-12-2020) the service does exit with status 1/FAILURE. Manually running borgmatic works.

Steps to reproduce (if a bug)

Here is one of my config files.

location:
    source_directories:
        - /home

    repositories:
        - /mnt/data/EOSHome

    exclude_patterns:
        - /home/*/.thumbnails
        - /home/*/.cache
        - /home/*/.dbus
        - /home/*/.gvfs
        - /home/*/.local/share/[Tt]rash
        
retention:
    keep_daily: 7
    keep_weekly: 4
    keep_monthly: -1

Here is the borgmatic.service

[Unit]
Description=borgmatic backup
Wants=network-online.target
After=network-online.target
ConditionACPower=true

[Service]
Type=oneshot

# Security settings for systemd running as root
# For more details about this settings check the systemd manuals
# https://www.freedesktop.org/software/systemd/man/systemd.exec.html
LockPersonality=true
# Certain borgmatic features like Healthchecks integration need MemoryDenyWriteExecute to be off.
# But you can try setting it to "yes" for improved security if you don't use those features.
MemoryDenyWriteExecute=no
NoNewPrivileges=yes
PrivateDevices=yes
PrivateTmp=yes
ProtectClock=yes
ProtectControlGroups=yes
ProtectHostname=yes
ProtectKernelLogs=yes
ProtectKernelModules=yes
ProtectKernelTunables=yes
RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK
RestrictNamespaces=yes
RestrictRealtime=yes
RestrictSUIDSGID=yes
SystemCallArchitectures=native
SystemCallFilter=@system-service
# Restrict write access
# Change to 'ProtectSystem=strict' and uncomment 'ProtectHome' to make the whole file
# system read-only be default and uncomment 'ReadWritePaths' for the required write access.
# Add local repositroy paths to the list of 'ReadWritePaths' like '-/mnt/my_backup_drive'.
ProtectSystem=strict
ProtectHome=read-only
ReadWritePaths=-/root/.config/borg -/root/.cache/borg -/root/.borgmatic --/mnt/data/EOSHome

CapabilityBoundingSet=CAP_DAC_READ_SEARCH CAP_NET_RAW

# Lower CPU and I/O priority.
Nice=19
CPUSchedulingPolicy=batch
IOSchedulingClass=best-effort
IOSchedulingPriority=7
IOWeight=100

Restart=no
# Prevent rate limiting of borgmatic log events. If you are using an older version of systemd that
# doesn't support this (pre-240 or so), you may have to remove this option.
LogRateLimitIntervalSec=0

# Delay start to prevent backups running during boot. Note that systemd-inhibit requires dbus and
# dbus-user-session to be installed.
ExecStartPre=sleep 1m
ExecStart=systemd-inhibit --who="borgmatic" --why="Prevent interrupting scheduled backup" /usr/bin/borgmatic --syslog-verbosity 2

and the borgmatic.timer

[Unit]
Description=Run borgmatic backup

[Timer]
OnCalendar=daily
Persistent=true

[Install]
WantedBy=timers.target

Actual behavior (if a bug)

Here is the output of journalctl -xu borgmatic

Dez 04 17:10:01 E14 systemd[1]: Starting borgmatic backup...
░░ Subject: A start job for unit borgmatic.service has begun execution
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit borgmatic.service has begun execution.
░░ 
░░ The job identifier is 2512.
Dez 04 17:11:02 E14 systemd-inhibit[4054]: /usr/bin/borgmatic terminated by signal SYS.
Dez 04 17:11:02 E14 systemd[1]: borgmatic.service: Main process exited, code=exited, status=1/FAILURE
░░ Subject: Unit process exited
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ An ExecStart= process belonging to unit borgmatic.service has exited.
░░ 
░░ The process' exit code is 'exited' and its exit status is 1.
Dez 04 17:11:02 E14 systemd[1]: borgmatic.service: Failed with result 'exit-code'.
░░ Subject: Unit failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ The unit borgmatic.service has entered the 'failed' state with result 'exit-code'.
Dez 04 17:11:02 E14 systemd[1]: Failed to start borgmatic backup.
░░ Subject: A start job for unit borgmatic.service has failed
░░ Defined-By: systemd
░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel
░░ 
░░ A start job for unit borgmatic.service has finished with a failure.
░░ 
░░ The job identifier is 2512 and the job result is failed.

Expected behavior (if a bug)

The service runs without error.

Other notes / implementation ideas

Environment

borgmatic version: 1.5.12

borgmatic installation method: Pacman package

Borg version: 1.1.14

Python version: 3.9.0

Database version (if applicable): ---

operating system and version: Endeavour OS, kernel version 5.9.11-arch2-1

#### What I'm trying to do and why I run borgmatic via systemd and the borgmatic.timer and borgmatic.service. Since yesterday (04-12-2020) the service does exit with status 1/FAILURE. Manually running borgmatic works. #### Steps to reproduce (if a bug) Here is one of my config files. ``` location: source_directories: - /home repositories: - /mnt/data/EOSHome exclude_patterns: - /home/*/.thumbnails - /home/*/.cache - /home/*/.dbus - /home/*/.gvfs - /home/*/.local/share/[Tt]rash retention: keep_daily: 7 keep_weekly: 4 keep_monthly: -1 ``` Here is the borgmatic.service ``` [Unit] Description=borgmatic backup Wants=network-online.target After=network-online.target ConditionACPower=true [Service] Type=oneshot # Security settings for systemd running as root # For more details about this settings check the systemd manuals # https://www.freedesktop.org/software/systemd/man/systemd.exec.html LockPersonality=true # Certain borgmatic features like Healthchecks integration need MemoryDenyWriteExecute to be off. # But you can try setting it to "yes" for improved security if you don't use those features. MemoryDenyWriteExecute=no NoNewPrivileges=yes PrivateDevices=yes PrivateTmp=yes ProtectClock=yes ProtectControlGroups=yes ProtectHostname=yes ProtectKernelLogs=yes ProtectKernelModules=yes ProtectKernelTunables=yes RestrictAddressFamilies=AF_UNIX AF_INET AF_INET6 AF_NETLINK RestrictNamespaces=yes RestrictRealtime=yes RestrictSUIDSGID=yes SystemCallArchitectures=native SystemCallFilter=@system-service # Restrict write access # Change to 'ProtectSystem=strict' and uncomment 'ProtectHome' to make the whole file # system read-only be default and uncomment 'ReadWritePaths' for the required write access. # Add local repositroy paths to the list of 'ReadWritePaths' like '-/mnt/my_backup_drive'. ProtectSystem=strict ProtectHome=read-only ReadWritePaths=-/root/.config/borg -/root/.cache/borg -/root/.borgmatic --/mnt/data/EOSHome CapabilityBoundingSet=CAP_DAC_READ_SEARCH CAP_NET_RAW # Lower CPU and I/O priority. Nice=19 CPUSchedulingPolicy=batch IOSchedulingClass=best-effort IOSchedulingPriority=7 IOWeight=100 Restart=no # Prevent rate limiting of borgmatic log events. If you are using an older version of systemd that # doesn't support this (pre-240 or so), you may have to remove this option. LogRateLimitIntervalSec=0 # Delay start to prevent backups running during boot. Note that systemd-inhibit requires dbus and # dbus-user-session to be installed. ExecStartPre=sleep 1m ExecStart=systemd-inhibit --who="borgmatic" --why="Prevent interrupting scheduled backup" /usr/bin/borgmatic --syslog-verbosity 2 ``` and the borgmatic.timer ``` [Unit] Description=Run borgmatic backup [Timer] OnCalendar=daily Persistent=true [Install] WantedBy=timers.target ``` #### Actual behavior (if a bug) Here is the output of journalctl -xu borgmatic ``` Dez 04 17:10:01 E14 systemd[1]: Starting borgmatic backup... ░░ Subject: A start job for unit borgmatic.service has begun execution ░░ Defined-By: systemd ░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel ░░ ░░ A start job for unit borgmatic.service has begun execution. ░░ ░░ The job identifier is 2512. Dez 04 17:11:02 E14 systemd-inhibit[4054]: /usr/bin/borgmatic terminated by signal SYS. Dez 04 17:11:02 E14 systemd[1]: borgmatic.service: Main process exited, code=exited, status=1/FAILURE ░░ Subject: Unit process exited ░░ Defined-By: systemd ░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel ░░ ░░ An ExecStart= process belonging to unit borgmatic.service has exited. ░░ ░░ The process' exit code is 'exited' and its exit status is 1. Dez 04 17:11:02 E14 systemd[1]: borgmatic.service: Failed with result 'exit-code'. ░░ Subject: Unit failed ░░ Defined-By: systemd ░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel ░░ ░░ The unit borgmatic.service has entered the 'failed' state with result 'exit-code'. Dez 04 17:11:02 E14 systemd[1]: Failed to start borgmatic backup. ░░ Subject: A start job for unit borgmatic.service has failed ░░ Defined-By: systemd ░░ Support: https://lists.freedesktop.org/mailman/listinfo/systemd-devel ░░ ░░ A start job for unit borgmatic.service has finished with a failure. ░░ ░░ The job identifier is 2512 and the job result is failed. ``` #### Expected behavior (if a bug) The service runs without error. #### Other notes / implementation ideas #### Environment **borgmatic version:** 1.5.12 **borgmatic installation method:** Pacman package **Borg version:** 1.1.14 **Python version:** 3.9.0 **Database version (if applicable):** --- **operating system and version:** Endeavour OS, kernel version 5.9.11-arch2-1
Author

I may have found the reason. I checked the updated sample service file and added

SystemCallErrorNumber=EPERM

to my service file and it worked again. What is the exact reason for this line?

I may have found the reason. I checked the updated sample service file and added ``` SystemCallErrorNumber=EPERM ``` to my service file and it worked again. What is the exact reason for this line?
Owner

Thank you for reporting this, and for including so many details!

Here's what I know so far. It looks like systemd version 247 recently made some sort of breaking change such that the example borgmatic systemd service file no longer works, resulting in the exact behavior you're seeing. systemd 246 worked great. systemd 247, not so much. (It'd be good to confirm that that's the version of systemd you're on, e.g. systemctl --version.)

I strongly suspect that the breaking change has to do with the set of system calls borgmatic is allowed to make, although I have not confirmed this. I don't have a local repro of the problem here—I'm on systemd 246 still.

The recent SystemCallErrorNumber=EPERM addition was an attempt to change how systemd deals with insufficient permissions when a system call is blocked due to permissions. Specifically, instead of just unceremoneously killing the borgmatic process (what you're seeing), systemd instead returns a permission error (EPERM) from the system call. Which gives borgmatic the opportunity to handle the error more gracefully.

Here's what I don't yet know: Which system call is failing, and how does borgmatic handle it gracefully? There is, IIRC, a single place in the code where a specific PermissionError is handled, and that's it. I suppose it's also possible that one of borgmatic's dependencies is handling the PermissionError internally.

So that's where we are. SystemCallErrorNumber=EPERM seems to "make it work", which isn't really satisfying because it's not clear how or why.

Thanks to dmc for diagnosing much of this on IRC.

Thank you for reporting this, and for including so many details! Here's what I know so far. It looks like systemd version 247 recently made some sort of breaking change such that the example borgmatic systemd service file no longer works, resulting in the exact behavior you're seeing. systemd 246 worked great. systemd 247, not so much. (It'd be good to confirm that that's the version of systemd you're on, e.g. `systemctl --version`.) I strongly suspect that the breaking change has to do with the set of system calls borgmatic is allowed to make, although I have not confirmed this. I don't have a local repro of the problem here—I'm on systemd 246 still. The recent `SystemCallErrorNumber=EPERM` addition was an attempt to change how systemd deals with insufficient permissions when a system call is blocked due to permissions. Specifically, instead of just unceremoneously killing the borgmatic process (what you're seeing), systemd instead returns a permission error (`EPERM`) from the system call. Which gives borgmatic the opportunity to handle the error more gracefully. Here's what I don't yet know: Which system call is failing, and how does borgmatic handle it gracefully? There is, IIRC, a single place in the code where a specific `PermissionError` is handled, and that's it. I suppose it's also possible that one of borgmatic's dependencies is handling the `PermissionError` internally. So that's where we are. `SystemCallErrorNumber=EPERM` seems to "make it work", which isn't really satisfying because it's not clear how or why. Thanks to `dmc` for diagnosing much of this on IRC.
witten added the
bug
label 2020-12-04 18:41:52 +00:00
Author

Here is my version of systemd:

[csteinforth@E14 ~]$ systemctl --version
systemd 247 (247.1-1-arch)
+PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid

And here is the relevant part of the output from grep -i upgraded /var/log/pacman.log:

[2020-12-02T19:45:27+0100] [ALPM] upgraded systemd-libs (246.6-1 -> 247.1-1)
[2020-12-02T19:45:27+0100] [ALPM] upgraded systemd (246.6-1 -> 247.1-1)

So yes, since the upgrade of systemd the service stopped working.

Here is my version of systemd: ``` [csteinforth@E14 ~]$ systemctl --version systemd 247 (247.1-1-arch) +PAM +AUDIT -SELINUX -IMA -APPARMOR +SMACK -SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ +LZ4 +ZSTD +SECCOMP +BLKID +ELFUTILS +KMOD +IDN2 -IDN +PCRE2 default-hierarchy=hybrid ``` And here is the relevant part of the output from `grep -i upgraded /var/log/pacman.log`: ``` [2020-12-02T19:45:27+0100] [ALPM] upgraded systemd-libs (246.6-1 -> 247.1-1) [2020-12-02T19:45:27+0100] [ALPM] upgraded systemd (246.6-1 -> 247.1-1) ``` So yes, since the upgrade of systemd the service stopped working.
Owner

Okay, at least it's consistent then. So my current recommendation is to keep SystemCallErrorNumber=EPERM in your systemd service configuration, and holler here if you experience any PermissionDenied or similar issues with borgmatic. I'll try to get access to a systemd 247 machine. (Or just wait.)

Okay, at least it's consistent then. So my current recommendation is to keep `SystemCallErrorNumber=EPERM` in your systemd service configuration, and holler here if you experience any `PermissionDenied` or similar issues with borgmatic. I'll try to get access to a systemd 247 machine. (Or just wait.)

Thanks a lot @csteinforth, I couldn't figure out why the service wasn't working, but your solution solved my problem.

Thanks a lot @csteinforth, I couldn't figure out why the service wasn't working, but your solution solved my problem.
Owner

I've confirmed/repro'd this on my own systemd 247 machine. Next step: See if the SystemCallErrorNumber=EPERM "fixes" it.

I've confirmed/repro'd this on my own systemd 247 machine. Next step: See if the `SystemCallErrorNumber=EPERM` "fixes" it.
Author

So far SystemCallErrorNumber=EPERM worked for me.

So far `SystemCallErrorNumber=EPERM` worked for me.

I have the same issue (posted an issue a few days ago about it). SystemCallErrorNumber=EPERM is in my config but it's still not working. No issues running it in terminal, it only fails when run with the systemd timer.

I have the same issue (posted an issue a few days ago about it). SystemCallErrorNumber=EPERM is in my config but it's still not working. No issues running it in terminal, it only fails when run with the systemd timer.
Owner

Calling this working after six months of use with SystemCallErrorNumber=EPERM. 😄

Calling this working after six months of use with `SystemCallErrorNumber=EPERM`. 😄
Sign in to join this conversation.
No Milestone
No Assignees
4 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: borgmatic-collective/borgmatic#376
No description provided.