UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 118: invalid start byte #489

Closed
opened 2022-01-10 20:08:15 +00:00 by Alexander-Shukaev · 3 comments

What I'm trying to do and why

Steps to reproduce (if a bug)

After backing up several thousand files, the remote timed out for whatever reason, which can happen and is fine. Some of the last files had non-Latin characters in their names. When borgmatic then prints the tail of the error trace, it captures a few A /... lines. Some of those lines also contain non-Latin characters and are output just fine, but then it suddenly crashes on one file name (see below).

Actual behavior (if a bug)

borgmatic[999048]: Traceback (most recent call last):
borgmatic[999048]:   File "/usr/bin/borgmatic", line 33, in <module>
borgmatic[999048]:     sys.exit(load_entry_point('borgmatic==1.5.21', 'console_scripts', 'borgmatic')())
borgmatic[999048]:   File "/usr/lib/python3.10/site-packages/borgmatic/commands/borgmatic.py", line 823, in main
borgmatic[999048]:     summary_logs = parse_logs + list(collect_configuration_run_summary_logs(configs, arguments))
borgmatic[999048]:   File "/usr/lib/python3.10/site-packages/borgmatic/commands/borgmatic.py", line 721, in collect_configuration_run_summary_logs
borgmatic[999048]:     results = list(run_configuration(config_filename, config, arguments))
borgmatic[999048]:   File "/usr/lib/python3.10/site-packages/borgmatic/commands/borgmatic.py", line 233, in run_configuration
borgmatic[999048]:     command.execute_hook(
borgmatic[999048]:   File "/usr/lib/python3.10/site-packages/borgmatic/hooks/command.py", line 65, in execute_hook
borgmatic[999048]:     execute.execute_command(
borgmatic[999048]:   File "/usr/lib/python3.10/site-packages/borgmatic/execute.py", line 217, in execute_command
borgmatic[999048]:     log_outputs(
borgmatic[999048]:   File "/usr/lib/python3.10/site-packages/borgmatic/execute.py", line 90, in log_outputs
borgmatic[999048]:     line = ready_buffer.readline().rstrip().decode()
borgmatic[999048]: UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb5 in position 118: invalid start byte
borgmatic[999047]: /usr/bin/borgmatic failed with exit status 1.
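
For context on the error message: byte 0xb5 has the bit pattern 10110101, which UTF-8 reserves for continuation bytes, so it can never begin a character. In Latin-1, the same byte encodes the micro sign (µ). The following is a minimal sketch that reproduces the same class of error, assuming a file name that was written in Latin-1; the file name is hypothetical, since the actual offending name is not given in this report.

```python
# Hypothetical file name; the real offending name from the report is unknown.
raw = "/data/5µm.txt".encode("latin-1")  # µ becomes the single byte 0xb5

try:
    raw.decode()  # bytes.decode() defaults to UTF-8
    message = ""
except UnicodeDecodeError as error:
    message = str(error)

print(message)
```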

Expected behavior (if a bug)

I'm not sure we can prevent this from happening universally, so as a first step I would simply suggest making this piece of code more robust. That is, an encoding error while outputting a single line should not crash the whole run. Either we catch it and skip to the next line or, even better (but more involved), replace the offending symbol(s) with some filler and continue.
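
The "replace with filler" idea could look roughly like the sketch below. This is not borgmatic's actual code or fix; `decode_line` is a hypothetical helper, and the sample bytes are made up. It relies only on the standard `errors="replace"` option of `bytes.decode()`, which substitutes U+FFFD for each undecodable byte instead of raising.

```python
import io


def decode_line(raw_line: bytes) -> str:
    """Decode one line of process output without raising on invalid UTF-8.

    Hypothetical helper: undecodable bytes become U+FFFD (the replacement
    character) so that one bad file name cannot abort the whole run.
    """
    return raw_line.rstrip().decode("utf-8", errors="replace")


# Made-up stand-in for the buffer read in log_outputs(): one valid UTF-8
# line and one line containing a stray 0xb5 byte.
ready_buffer = io.BytesIO(
    b"A /home/user/caf\xc3\xa9.txt\nA /home/user/5\xb5m.txt\n"
)

lines = [decode_line(raw_line) for raw_line in ready_buffer]
for line in lines:
    print(line)
```

Using `errors="backslashreplace"` instead would keep the offending byte visible as `\xb5` in the log, which may be more useful for diagnosis than the replacement character.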

Environment

borgmatic version: 1.5.21

Borg version: 1.1.17

Python version: 3.10.1

Owner

Thank you for taking the time to report this! I totally agree that borgmatic shouldn't blow up with a traceback on "invalid" input. A few questions for you:

  • Do you have an example of a filename that causes this sort of problem, so that I can try to reproduce it?
  • Can I get a look at a redacted copy of your borgmatic configuration? In particular, I'm interested in seeing your hook configuration (if any).
  • Also, is your terminal where you're running this using UTF-8? Running echo $LANG may help determine that.
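
For the last question, a small script can report the encoding that Python itself sees, alongside `$LANG`. This is only a diagnostic sketch using standard-library calls; it is not part of borgmatic.

```python
import locale
import os
import sys

# Value of $LANG as seen by the process (what `echo $LANG` would show).
lang = os.environ.get("LANG", "<unset>")

# Encoding Python derives from the locale settings, without changing them.
preferred = locale.getpreferredencoding(False)

# Encoding of standard output; may be missing if stdout is replaced.
stdout_encoding = getattr(sys.stdout, "encoding", None)

print(f"LANG={lang}")
print(f"preferred encoding={preferred}")
print(f"stdout encoding={stdout_encoding}")
```

Note that `bytes.decode()` called with no arguments, as in the traceback above, always uses UTF-8 regardless of these settings, so a non-UTF-8 locale would explain how the bytes got written but not change how that call decodes them.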
witten added the bug label 2022-01-10 21:12:58 +00:00
Owner

Can I get a look at your /etc/borgmatic.d/._backup1.yaml configuration file? Or is that the same as config.yaml above?

And would it be possible to see your borgmatic --verbosity 2 output? That might help pinpoint where the Unicode error occurs.

Thank you!

witten added the waiting for response label 2022-03-06 00:14:02 +00:00
Owner

I'm closing this now due to inactivity, but please feel free to re-open if you'd like to post follow-up information at any point. Thanks!

witten removed the waiting for response label 2022-04-28 20:44:14 +00:00
Reference: borgmatic-collective/borgmatic#489