Q: How to delete specific backup? #864

Closed
opened 2024-05-12 15:46:21 +00:00 by Forza-tng · 5 comments

What I'm trying to do and why

Hi, I have two questions:

  1. Is it possible to delete specific archives using borgmatic?
  2. If a backup to a remote ssh is interrupted (lost connection, ctrl-C, etc...), how can I determine what archive is not a full backup?

Background:

I am using btrfs snapshots as source for borgmatic. The path is always /mnt/datavol/<snapshot>. Before I run borgmatic the snapshot is recreated. The snapshot has approx 1TB of data, so it could happen that the connection breaks.

In the situation a backup fails, I'd like to remove that entry and then redo the backup.

Steps to reproduce

No response

Actual behavior

No response

Expected behavior

No response

Other notes / implementation ideas

No response

borgmatic version

1.8.9

borgmatic installation method

Gentoo ebuild

Borg version

1.2.8

Python version

3.11.9

Database version (if applicable)

No response

Operating system and version

Gentoo Linux amd64

Owner

> Is it possible to delete specific archives using borgmatic?

It's possible, but borgmatic doesn't have a ["native" delete action](https://projects.torsion.org/borgmatic-collective/borgmatic/issues/298) yet. The way to do it until then is by calling through to Borg. Here's an example:

borgmatic borg delete ::host-2024-04-28T15:28:51.881598

Where `host-2024-04-28T15:28:51.881598` is the name of the archive you want to delete. See the [`borg` action documentation](https://torsion.org/borgmatic/docs/how-to/run-arbitrary-borg-commands/) for more information.

> If a backup to a remote ssh is interrupted (lost connection, ctrl-C, etc...), how can I determine what archive is not a full backup?

Borg (and borgmatic) hide incomplete archives by default. So everything in the `borgmatic list` output is a full backup. If you want to see incomplete backups as well, you can use `borg list --consider-checkpoints`.
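
To pick the incomplete archives out of that listing, a small filter can help. This is only a sketch: it assumes Borg 1.x checkpoint naming (`<archive>.checkpoint` or `<archive>.checkpoint.N`), and `filter_checkpoints` is a hypothetical helper name, not part of Borg or borgmatic.

```shell
# Read archive names on stdin and print only the checkpoint (incomplete) ones.
# Assumes Borg 1.x naming: "<archive>.checkpoint" or "<archive>.checkpoint.N".
filter_checkpoints() {
    # grep exits non-zero when nothing matches; "|| true" keeps that from
    # looking like an error when there are no checkpoint archives.
    grep -E '\.checkpoint(\.[0-9]+)?$' || true
}

# Assumed usage (left commented out because it needs a real repository;
# /path/to/repo is a placeholder):
# borg list --short --consider-checkpoints /path/to/repo | filter_checkpoints
```

If the filter prints nothing, the repository has no checkpoint archives under that naming assumption.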

> In the situation a backup fails, I'd like to remove that entry and then redo the backup.

You should be able to do that with `borg delete` or `borgmatic borg delete`, but if each btrfs snapshot is pretty similar to the previous one, you may find that there's actually not much space savings in manually deleting checkpoint archives, and it may actually end up slowing down backups because you have to back up that data again instead of reusing it from the incomplete archive. Borg is pretty great at de-duplication.
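
If you do decide to delete several archives by hand, one cautious approach is to generate the delete commands first and review them before running anything. The sketch below assumes archive names without whitespace; `print_delete_commands` is a hypothetical helper, and it only prints commands in the `borgmatic borg delete ::NAME` form shown earlier in this thread.

```shell
# Read archive names on stdin and print one delete command per name.
# Nothing is executed here; the output is meant to be reviewed first.
print_delete_commands() {
    while IFS= read -r archive; do
        # Skip blank lines so they do not turn into "delete ::" with no name.
        [ -n "$archive" ] || continue
        printf 'borgmatic borg delete ::%s\n' "$archive"
    done
}

# Assumed usage: feed it a hand-checked list of archive names, e.g.
# print_delete_commands < archives-to-delete.txt
```

Printing instead of executing is a deliberate choice here, since a deleted archive cannot be brought back.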

witten added the
question / support
label 2024-05-13 16:52:32 +00:00
Author

Thank you for the explanation. I will do some interrupted tests and see how it works out.

As for btrfs, my normal way of doing snapshots and backups is to keep them labelled like this:

❯ ll /mnt/rootvol/snapshots/
total 16
drwxr-xr-x    1 root     root          1092 May 13 17:01 ./
drwxr-xr-x    1 root     root            30 Apr 18 14:44 ../
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240505T1101/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240506T0001/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240507T0001/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240508T0001/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240509T0001/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240510T0001/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240511T0001/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240512T0001/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240513T0001/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240513T1301/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240513T1401/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240513T1501/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240513T1601/
drwxr-xr-x    1 root     root           194 May  5 10:19 root.20240513T1701/
drwxr-xr-x    1 root     root           368 May  5 10:45 

It seemed to take longer for Borg to back up a new snapshot every time, so I decided to simply create a 'plain' snapshot as /mnt/rootvol/borgmatic/root before invoking borgmatic, so that the path to all files remains the same.
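
Roughly, the fixed-path approach looks like the sketch below. The source subvolume `/mnt/rootvol/root` is a hypothetical placeholder, the read-only `-r` flag is just one possible choice, and `refresh_snapshot_and_backup` and `RUN` are illustrative names only.

```shell
# Hypothetical source subvolume; only the target path comes from this thread.
SOURCE_SUBVOL="${SOURCE_SUBVOL:-/mnt/rootvol/root}"
TARGET_SNAPSHOT="${TARGET_SNAPSHOT:-/mnt/rootvol/borgmatic/root}"

# Recreate the snapshot at a fixed path, then run borgmatic on it.
# Set RUN=echo to print the commands instead of executing them.
refresh_snapshot_and_backup() {
    # Remove the previous fixed-path snapshot if one exists.
    if [ -d "$TARGET_SNAPSHOT" ]; then
        ${RUN:-} btrfs subvolume delete "$TARGET_SNAPSHOT" || return 1
    fi
    # Take a fresh read-only snapshot at the same path, then back it up.
    ${RUN:-} btrfs subvolume snapshot -r "$SOURCE_SUBVOL" "$TARGET_SNAPSHOT" &&
        ${RUN:-} borgmatic create
}

# Assumed usage, as a dry run:
# RUN=echo; refresh_snapshot_and_backup
```

Because the snapshot always lives at the same path, every file keeps the same path from one run to the next.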

> Borg (and borgmatic) hide incomplete archives by default. So everything in the borgmatic list output is a full backup. If you want to see incomplete backups as well, you can use borg list --consider-checkpoints.

>> In the situation a backup fails, I'd like to remove that entry and then redo the backup.

> You should be able to do that with borg delete or borgmatic borg delete, but if each btrfs snapshot is pretty similar to the previous one, you may find that there's actually not much space savings in manually deleting checkpoint archives, and it may actually end up slowing down backups because you have to back up that data again instead of reusing it from the incomplete archive. Borg is pretty great at de-duplication.

Now, this is what confuses me a little. Imagine I change some files throughout the day. With my snapshot structure it is possible to recover changes as they were at a specific hour. With Borg, if I redo the backup from a new (current) snapshot, I would lose those changes.

Ideally, I'd like to preserve the existing snapshot structure with Borg. If an hourly snapshot failed to upload, it would then be possible to re-upload it from one of the existing snapshots.

witten added the
waiting for response
label 2024-06-24 19:29:49 +00:00
witten removed the
waiting for response
label 2024-06-24 22:11:02 +00:00
Owner

> It seemed to take longer for Borg to back up a new snapshot every time, so I decided to simply create a 'plain' snapshot as /mnt/rootvol/borgmatic/root before invoking borgmatic, so that the path to all files remains the same.

That makes sense to me: scanning and comparing n snapshots against the existing repository contents should take a lot longer than doing the same for a single snapshot, simply because there is more data to process.

> Now, this is what confuses me a little. Imagine I change some files throughout the day. With my snapshot structure it is possible to recover changes as they were at a specific hour. With Borg, if I redo the backup from a new (current) snapshot, I would lose those changes.

Not exactly. If I'm understanding your setup correctly, then any given Borg backup archive would be a copy of the current snapshot at the time Borg ran. So if you run Borg hourly on whatever snapshot is the current one for that hour, then you'll have an individual Borg archive for each hour of the day going as far back as you can afford to retain them. And therefore you could recover changes from a specific hour.

> Ideally, I'd like to preserve the existing snapshot structure with Borg.

What's the performance difference between pointing Borg at only the current snapshot versus at all snapshots? Maybe it's worth the performance hit to do the latter. Especially if the alternative is running Borg hourly, which has its own performance implications.

I will say though that having the different "versions" of your data across separate Borg archives is more how Borg is designed to be used, rather than having all of them in the same archive. But I think Borg is flexible enough to be used either way.

> If an hourly snapshot failed to upload, it would then be possible to re-upload it from one of the existing snapshots.

My understanding is that's effectively how Borg works under the hood: if archive creation fails, then that data isn't lost (unless you go and `delete` it). Instead, that data is leveraged for any subsequent retry so that Borg doesn't have to start the upload from scratch. And then at that point you're not "wasting" any real space on the abortive archive because its data has been reused for the subsequent completed archive.
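
If you do want to back up one of the older timestamped snapshots directly, one option is to derive the archive time from the snapshot name so the archive reflects when the snapshot was taken. This is a sketch under assumptions: it relies on the `root.YYYYMMDDTHHMM` naming from the listing in this thread, `snapshot_timestamp` is a hypothetical helper, and the commented `borg create --timestamp` call (which Borg documents as taking a UTC `yyyy-mm-ddThh:mm:ss` value) is untested against your setup.

```shell
# Convert a snapshot directory name such as "root.20240513T1301" into the
# "yyyy-mm-ddThh:mm:ss" form. Prints nothing if the name does not match.
snapshot_timestamp() {
    # "${1##*.}" keeps only the part after the last dot, e.g. 20240513T1301.
    printf '%s\n' "${1##*.}" |
        sed -n -E 's/^([0-9]{4})([0-9]{2})([0-9]{2})T([0-9]{2})([0-9]{2})$/\1-\2-\3T\4:\5:00/p'
}

# Assumed usage (commented out; needs BORG_REPO set and a real snapshot):
# borg create --timestamp "$(snapshot_timestamp root.20240513T1301)" \
#     ::root-20240513T1301 /mnt/rootvol/snapshots/root.20240513T1301
```

Note that the snapshot names are presumably in local time, while the timestamp is read as UTC, so the two may need adjusting.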

witten added the
waiting for response
label 2024-07-04 06:23:32 +00:00
Owner

> > Is it possible to delete specific archives using borgmatic?

> It's possible, but borgmatic doesn't have a "native" delete action yet.

This has changed! As of #298 (implemented in main but not yet released as of this comment), borgmatic now has a native `delete` action.

Owner

I'm closing this due to inactivity, but please feel free to file a new ticket if you still have issues.

witten removed the
waiting for response
label 2024-10-09 16:25:51 +00:00
Reference: borgmatic-collective/borgmatic#864