support paths-from-command or similar type feature to further customize archive contents #882

Open
opened 2024-06-20 19:30:01 +00:00 by johnny2678 · 5 comments

What I'd like to do and why

I've got a folder with 30 files. I only want to add the most recent 7 to an archive.

Ideally, it's as simple as adding a find command to the source_directories yml. I actually tried:

source_directories:
    - find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*'

didn't think it would work, and it didn't, but worth a shot 🤣

This can be done natively in borg with the paths-from-command flag:
borg create --paths-from-command /mnt/disks/WD-diskid::jhtest-{now:%Y-%m-%dT%H:%M:%S} -- find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*'

but then I would miss out on all the borgmatic goodness, mainly the loki integration.

So next I tried the borgmatic borg command. I actually got this to work, but it doesn't look like any of the notification/monitoring hooks are triggered by this as I didn't see any new loki logs. Syntax I used:

docker exec -t borgmatic borgmatic -c /home/config_manual/test.yml borg create --paths-from-command /mnt/borg-repository::jhtest3 -- find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*'

so worst case, I can make this work, but seems like an opportunity to let the user further define source_directories contents. If there's already an easy way to do this, apologies for making you read this.

Thanks for this project! I've been pretty obsessed with it since finding it over the weekend.

Other notes / implementation ideas

No response

Owner

Thanks for filing this, and I'm glad to hear borgmatic is mostly working for you! I think you're correct that there's not a way to do this with borgmatic today. And yeah, the borgmatic borg action is kind of a backdoor for calling Borg pretty directly without going through most of borgmatic's machinery, so that explains why monitoring isn't triggered. But I could certainly envision a new option like source_directories_from_command or similar that supports your use case.

If I may ask though: What's your use case for only including the most recently modified files? Is it that the whole 30 files are just too large? Borg is pretty aggressive about deduplication, so even if you throw the whole directory of files at it, only modified files (and in fact only modified parts of files) will actually get stored in each archive after the first.

witten added the "new feature area" label 2024-06-20 20:55:27 +00:00
Author

> If I may ask though: What's your use case for only including the most recently modified files? Is it that the whole 30 files are just too large?

Completely fair question. My use case is backing up a folder of backups - if that makes sense.

Not the best use of borg/matic but I really like it for shipping local backups offsite (produced by apps, not borg/matic) and still being able to use the monitoring hooks borgmatic exposes. I don't mind having a lot of backups stored locally, but I was trying to trim down my remote backup size.

I had played with the idea of a before_backup/before_action that copies the desired files to a tmp folder and an after_backup/after_action that removes the tmp files, but that seems like the long way around.

edit: example: my Hubitat & unifi devices produce nightly tar.gz backup files. I keep the last 90 backup files, but I don't need that many remotely. I can get by with good backup coverage using --keep-daily/weekly/monthly. So I need a way (like --paths-from-command) to filter those 90 backups down to 10-20.

Anyways, cheers. 🍻 I can get by for now but happy to discuss more if you like.
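
The staging idea mentioned above (copy the wanted files into a temporary folder before the backup, remove them afterwards) could be sketched roughly as follows. This is only an illustration, not something from borgmatic itself: the function name `stage_recent_backups` and the staging path are hypothetical, and the `find` filter is the one used earlier in this thread.

```shell
#!/bin/sh
# Hypothetical staging helper (not part of borgmatic): copy only recently
# modified backup files into a scratch directory, so that the scratch
# directory can be listed in source_directories instead of the full folder.
stage_recent_backups() {
    src_dir=$1        # folder holding all local backups
    stage_dir=$2      # scratch folder to back up (assumed path, outside src_dir)
    days=${3:-7}      # keep files modified within this many days

    # Refuse to run with an empty staging path, since it is removed below.
    [ -n "$stage_dir" ] || return 1

    # Start from an empty staging directory on every run.
    rm -rf -- "$stage_dir"
    mkdir -p -- "$stage_dir"

    # Same filter as the find command in this thread. cp -p preserves
    # modification times. Files with the same name in different
    # subdirectories would overwrite each other in the flat staging folder.
    find "$src_dir" -type f -mtime "-$days" -name 'backup*' \
        -exec cp -p -- {} "$stage_dir"/ \;
}
```

A before_backup hook could then call something like `stage_recent_backups /mnt/user/AppBackup /mnt/user/AppBackupStaging 7`, with source_directories pointing at the staging folder and an after_backup hook removing it again. The cost is a temporary second copy of the selected files on local disk.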

Owner

Got it. Thanks for the explanation. That use case makes sense to me. I think a borgmatic option to support Borg's --paths-from-command would probably make the most sense here.

But in the meantime, I did think of a potential work-around! Check out this already-existing option:

# Additional options to pass directly to particular Borg commands,
# handy for Borg options that borgmatic does not yet support natively.
# Note that borgmatic does not perform any validation on these
# options. Running borgmatic with "--verbosity 2" shows the exact Borg
# command-line invocation.
extra_borg_options:
    ...
    # Extra command-line options to pass to "borg create".
    create: --extra-option

So you could specify your --paths-from-command there, and hopefully the resulting command-line passed to Borg would work for you.

Author

> So you could specify your --paths-from-command there, and hopefully the resulting command-line passed to Borg would work for you.

The challenge here is the -- separator used by --paths-from-command.

I tested it. Here's how I modified my yml:

extra_borg_options:
    # Extra command-line options to pass to "borg init".
    # init: --extra-option

    # Extra command-line options to pass to "borg create".
    create: --paths-from-command -- find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*'

and here's the resulting error when I run docker exec -it borgmatic borgmatic create -v 2 --stats --list -c /home/config_manual/test.yml

BORG_PASSPHRASE=*** BORG_EXIT_CODES=*** borg create --patterns-from /tmp/tmpi0042m4x --exclude-caches --exclude-if-present .nobackup --exclude-if-present .NOBACKUP --exclude-nodump --files-cache mtime,size --list --filter AMEx- --paths-from-command -- find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*' --stats --debug --show-rc /mnt/borg-repository::{hostname}-{app-name}-{now}
usage: borg create [-h] [--critical] [--error] [--warning] [--info] [--debug]
                   [--debug-topic TOPIC] [-p] [--iec] [--log-json]
                   [--lock-wait SECONDS] [--bypass-lock] [--show-version]
                   [--show-rc] [--umask M] [--remote-path PATH]
                   [--remote-ratelimit RATE] [--upload-ratelimit RATE]
                   [--remote-buffer UPLOAD_BUFFER]
                   [--upload-buffer UPLOAD_BUFFER] [--consider-part-files]
                   [--debug-profile FILE] [--rsh RSH] [-n] [-s] [--list]
                   [--filter STATUSCHARS] [--json] [--no-cache-sync]
                   [--stdin-name NAME] [--stdin-user USER]
                   [--stdin-group GROUP] [--stdin-mode M]
                   [--content-from-command] [--paths-from-stdin]
                   [--paths-from-command] [--paths-delimiter DELIM]
                   [-e PATTERN] [--exclude-from EXCLUDEFILE]
                   [--pattern PATTERN] [--patterns-from PATTERNFILE]
                   [--exclude-caches] [--exclude-if-present NAME]
                   [--keep-exclude-tags] [--exclude-nodump] [-x]
                   [--numeric-owner] [--numeric-ids] [--noatime] [--atime]
                   [--noctime] [--nobirthtime] [--nobsdflags] [--noflags]
                   [--noacls] [--noxattrs] [--sparse] [--files-cache MODE]
                   [--read-special] [--comment COMMENT]
                   [--timestamp TIMESTAMP] [-c SECONDS]
                   [--chunker-params PARAMS] [-C COMPRESSION]
                   ARCHIVE [PATH ...]
borg create: error: argument ARCHIVE: "find": No archive specified
beast-borg-local: Error running actions for repository
Command 'borg create --patterns-from /tmp/tmpi0042m4x --exclude-caches --exclude-if-present .nobackup --exclude-if-present .NOBACKUP --exclude-nodump --files-cache mtime,size --list --filter AMEx- --paths-from-command -- find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*' --stats --debug --show-rc /mnt/borg-repository::{hostname}-{app-name}-{now}' returned non-zero exit status 2.

I believe the issue is that, according to the Borg docs (https://borgbackup.readthedocs.io/en/stable/usage/create.html), the --paths-from-command flag needs to come after borg create but before the repo::archive.

extra_borg_options does add the --paths-from-command after borg create, but then it also adds the -- find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*' before the archive, which Borg interprets as the ARCHIVE, causing it to error out.

I think that extra_borg_options is the right place for --paths-from-command but we would need another config var that appends to the end of the borg create command for the -- find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*'.

something like, -args_to_append_to_borg_command 🤷‍♂️😄

Did I get that right?

Owner

Ah, yeah, that definitely doesn't work. I even tried putting only --paths-from-command in create: and then moving the find ... command into source_directories: like you originally tried. But that doesn't work either because the resulting Borg command: 1. Has extra source paths injected by borgmatic and: 2. Doesn't have -- where it needs to be.

So I guess this feature really needs to be added properly for this to work.
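
To make the ordering problem concrete, here is a small sketch of the argument order that worked in the manual invocation at the top of this thread: options first, then the archive, then `--`, then the command whose output lists the paths. The helper name `build_create_args` is hypothetical and only prints the arguments, one per line; it does not run Borg.

```shell
#!/bin/sh
# Hypothetical helper: print, one argument per line, the order used by the
# manual "borg create --paths-from-command" call shown earlier in the thread.
# The first parameter is the archive; the rest is the path-listing command.
build_create_args() {
    archive=$1
    shift
    # "--" and the command come last, after the archive.
    printf '%s\n' borg create --paths-from-command "$archive" -- "$@"
}
```

For example, `build_create_args /mnt/borg-repository::jhtest3 find /mnt/user/AppBackup/ -type f -mtime -7 -name 'backup*'` prints the archive before the `--` line, whereas the extra_borg_options attempt placed `-- find ...` ahead of the archive.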

witten added the "design finalized" label 2024-06-25 02:02:01 +00:00
Reference: borgmatic-collective/borgmatic#882