Create named archive per expanded source file #142

Closed
opened 2019-02-22 06:54:48 +00:00 by FreitagDavid · 7 comments

So one feature I'm currently missing is the ability to use folder names in the name of an archive. Basically what I would want is two different features.

  1. Some sort of boolean option for whether borgmatic should be processing the single directory put in or whether it should be iterating through the sub-directories to be backed up.

  2. Store the sub-directory and parent of the item currently being worked on as a variable somewhere that can be used in the naming of the backup. For instance:
    {hostname}-{currentdirectoryname}-{parent}-{now}

My use case for this is backing up snapper snapshots into individual backups. For the parent you could feasibly just have a sub-config for the backup directory for specifying a tag to go on it. For my use case this would basically be the name of the subvolume I am currently backing up e.g. 'home', 'root'.

Owner

When you mention directories here, are you talking about the current source_directories entry that's being backed up? Or something else?

Today, when you specify a series of source_directories to borgmatic, they are all passed to Borg in a single command invocation. There is some pre-processing that borgmatic does to expand tildes and globs, but that's it. So this ask would probably require kicking off a series of separate Borg invocations, one per discovered source directory.
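To make the difference concrete, here is a minimal sketch of the two approaches, not borgmatic's actual code. The function names are hypothetical, and keeping a pattern that matches nothing as a literal path is an assumption. The only real interface relied on is Borg's `borg create REPO::ARCHIVE PATH...` command shape.

```python
import glob
import os


def expand_source_directories(patterns):
    """Expand tildes and globs; a pattern with no matches is kept as-is (assumption)."""
    expanded = []
    for pattern in patterns:
        path = os.path.expanduser(pattern)
        matches = sorted(glob.glob(path))
        expanded.extend(matches or [path])
    return expanded


def single_invocation(repository, archive_name, paths):
    """One `borg create` command that receives every expanded path."""
    return ["borg", "create", f"{repository}::{archive_name}", *paths]


def per_path_invocations(repository, name_for_path, paths):
    """Hypothetical alternative: one `borg create` command per expanded path."""
    return [
        ["borg", "create", f"{repository}::{name_for_path(path)}", path]
        for path in paths
    ]
```

Both helpers only build argument lists; nothing is executed, so the sketch can be tried without a Borg repository.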

Could you say more about your use case? Why the need to turn individual paths into separate Borg archives?

Also, for point 1, why the need for the boolean option? Today, borgmatic unconditionally expands globs in source_directories if they are present.

Finally, for the parent, would separate borgmatic configuration files work? One for "home", one for "root", etc? That way you could set a separate archive name in each.

Author

Separate configurations could work. I just tend to prefer to keep a single config, but I can live with that. Kicking off separate instances was basically what I was thinking. The main reason is to reduce the time it takes to back up the entire folder when each snapshot should be a single self-contained entity. It would also make restoring much easier, since I could restore a single snapshot by restoring one backup rather than having to mount a backup or restore a folder from within it.

I would just make a separate config for each subfolder, but with the way snapper (and, from what I can tell, other snapshotting software) names its folders, that wouldn't really be possible: you can't predict how many folders there will be or what their names will be. It seems like this could be achieved with a for loop that waits for each instance to finish before kicking off another. The boolean was to specify whether you want the backup to be the source directory or its sub-directories.

Maybe I am just thinking about this wrong, but that seems like the best way to deal with rolling snapshots and borg. If I'm not mistaken, if it were set up this way I would be able to set borg's and snapper's pruning to be exactly the same, and then my backups would basically mirror the structure of my snapshot folder. I understand if you think this is outside the scope of this script; I just thought it would be quite useful for anyone wanting to run borg with automated snapshotting. Hopefully I clarified my request. It's been a long day, so sorry if this isn't totally clear.

For example:

directory being backed up /.snapshots/381

borg repo /somebackuprepo::myhost-381-root

Edit: Also, I may be mistaken, but I believe that consistency checks get slower as your backup gets bigger.
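The for loop described in this comment could look roughly like the sketch below. It is an illustration under assumptions, not an existing borgmatic feature: `archive_name`, `back_up_snapshots`, and the `label` argument (standing in for the subvolume tag such as 'root') are hypothetical names, and the command runner is passed in so the loop can be exercised without Borg installed.

```python
import os


def archive_name(hostname, snapshot_path, label):
    """Build a name such as 'myhost-381-root' from '/.snapshots/381'."""
    number = os.path.basename(os.path.normpath(snapshot_path))
    return f"{hostname}-{number}-{label}"


def back_up_snapshots(repository, hostname, label, snapshot_paths, run):
    """Create one archive per snapshot directory, strictly one after another.

    `run` is any callable that takes an argument list and blocks until the
    command has finished, so the next archive only starts after the previous one.
    """
    for path in snapshot_paths:
        target = f"{repository}::{archive_name(hostname, path, label)}"
        run(["borg", "create", target, path])
```

Passing `subprocess.check_call` as `run` would execute the commands for real, since it waits for each process to exit and raises an error on a non-zero exit status.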

Owner

Okay, what you described as an implementation idea makes sense to me. I might suggest the tweak though of eliminating the bool and just introspecting the provided archive_name_format—if it contains {currentdirectoryname}, then assume the "one archive per source path" for-loop mode as you describe.
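As a rough sketch of that introspection, assuming the format string follows Python-style `{placeholder}` syntax: the standard library's `string.Formatter().parse` can list the placeholders without substituting anything. The `{currentdirectoryname}` placeholder is only the one proposed in this ticket, and both function names are hypothetical.

```python
from string import Formatter

# Placeholder proposed in this ticket; not an existing borgmatic or Borg placeholder.
PER_DIRECTORY_PLACEHOLDER = "currentdirectoryname"


def placeholders(archive_name_format):
    """Return the set of placeholder names used in a format string."""
    return {
        field
        for _, field, _, _ in Formatter().parse(archive_name_format)
        if field
    }


def wants_archive_per_directory(archive_name_format):
    """True if the format asks for the 'one archive per source path' mode."""
    return PER_DIRECTORY_PLACEHOLDER in placeholders(archive_name_format)
```

Parsing the placeholders rather than searching for a substring avoids false positives when the name only appears as literal text in the format.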

However, this does sound a little bit like an optimization, and I wonder if it's premature. Have you already tried using borgmatic as-is to back up all of /.snapshots/ into a single archive, and then consistency check that? You are correct that consistency checks are slower on larger archives, but maybe it's fast enough for your needs? Also see https://torsion.org/borgmatic/docs/how-to/deal-with-very-large-backups/ for some more tips and tricks.

For what it's worth, I know of at least one use of borgmatic on rolling database snapshots, and the archive is on the level of all of the snapshots rather than one archive per source path. And that seems to work pretty well.

Either way, please let me know!

Author

It's not just the speed but also having a single archive per snapshot; there are multiple reasons why I would want this. Really, the speed was more of an afterthought than anything.

Owner

Got it. Thanks for the clarification. So the main rationale is ease of restore, and synchronizing the retention policies between borgmatic and Snapper?

witten changed title from [Feature] More robust archive naming. to Create named archive per expanded source file 2019-02-24 07:22:07 +00:00
Author

Yep, sorry for the late response. That is the primary use case. Thanks for the consideration. I might even try my hand at it if I get some free time this weekend, though I doubt it, since I've got a huge project at work right now.

Owner

I apologize for the lack of traction on this ticket. My current thinking is that [formal Snapper integration](https://github.com/borgmatic-collective/borgmatic/pull/51) is probably the better way to go at this point, so I'll be closing this ticket in favor of that. However, if you're still using borgmatic and have any other requirements or feedback at this point, please feel free to comment and/or file a new ticket. Thank you!
Reference: borgmatic-collective/borgmatic#142