#80 Support libvirt backup

Open
opened 6 months ago by bfabio · 20 comments
bfabio commented 6 months ago

The only feature borgmatic is missing for our workflow is a way to back up libvirt’s virtual disks from running virtual machines (we use LVM). borg already has all we need with the --read-special argument; it would just be a matter of creating a read-only snapshot and then removing it when we are done.

We could have something like this in the config file:

source_logical_volumes:
    # By path
    - /dev/vg1/mercury.planets.org-disk

    # By name
    - venus.planets.org-disk

    # By partition within the volume, automatically using kpartx? (maybe that's too much)
    - earth.planets.org-disk@2

Or perhaps be fancy and back up whatever disks a VM is configured with:

source_libvirt:
    # Backup all of the 'mars.planets.org' disks
    - mars.planets.org

@witten I’m willing to write a patch if you like the feature.

bfabio commented 6 months ago
Poster

Just noticed #5.

I don’t think hooks are ergonomic for this use case:

  • Duplication. You would have to put the config for your volume in three different places (source_directories, before_backup and after_backup)
  • The chance of errors increases. If you forget to freeze a snapshot in before_backup, everything seems to work until you try to restore and find out your backup is possibly corrupt.
  • No way to know what you’re backing up (AFAIK). I can’t seem to find the documentation for user hooks, but it looks like hooks are run once per repository. It would actually be better to freeze and destroy the snapshot on a per-source basis, because we want to get rid of it as soon as possible: a snapshot can fill up and then is no longer valid.
  • Will the backup run if the script in before_backup fails? If it does, it’s scenario two all over again without even realizing it.
witten commented 6 months ago
Owner

I’m not that familiar with the libvirt backup use case. Based on what you’ve said so far, I’m gathering that it’s maybe something along these lines:

  1. Create a snapshot.
  2. Invoke borg --read-special ..., providing the path to the snapshot.
  3. Destroy the snapshot.

Is that about right?

As for your other points:

  • Duplication: Is this because the path or name for your snapshot needs to be mentioned in all three places (source_directories, before_backup and after_backup)? Or is there additional config you’d need to duplicate?
  • Hooks are actually run once per config file, before all repositories are backed up, and after all repositories are backed up. I’ve filed #81 to better document hooks.
  • I believe that the backup script will not run if the before_backup hook fails. It’ll instead skip backups, run the on_error hook (if any), and exit with an error.

Brainstorming on this, I can think of a couple of classes of solutions:

  1. Make use of the existing hooks, and deal with the duplication.
  2. Make use of the existing hooks, and somehow eliminate the duplication via parameterization of some sort. This might be difficult, given that you probably don’t want to do the snapshot create/destroy on all sources.
  3. Add a feature to support before/after hooks on individual sources, perhaps with the name of the sources parameterized.
  4. Add actual support for snapshots, something like your example above.

Let me know your thoughts.

witten commented 6 months ago
Owner

One more type of solution:

5. Add support for snapshots, but instead of having a separate source_logical_volumes or source_libvirt as per your example, just allow listing of logical volumes (by path?) directly in source_directories. If borgmatic detects that one such entry is actually a logical volume, it will automatically do the snapshot create/destroy for it!

Important question though for many of these options: Does the create/destroy need to be in any way configurable? Or is it pretty standard? What does the command invocation look like for each of create and destroy?

bfabio commented 6 months ago
Poster

> 1. Create a snapshot.
> 2. Invoke borg --read-special …, providing the path to the snapshot.
> 3. Destroy the snapshot.
>
> Is that about right?

That’s exactly right.

You create a snapshot with: lvcreate --size SIZE --snapshot --name SNAPSHOT_NAME /dev/mapper/mercury.planets.org--disk

and remove it with lvremove /dev/mapper/SNAPSHOT_NAME
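
The three steps above (create the snapshot, run borg with --read-special against it, remove the snapshot) can be sketched as a small Python helper that only builds the command lines. This is a minimal sketch, not borgmatic code: the function name, the repository path, and the archive name are hypothetical, and the snapshot device path is passed in explicitly rather than derived. The `lvcreate` and `lvremove` invocations mirror the ones quoted above; `borg create --read-special REPOSITORY::ARCHIVE PATH` is the standard borg form.

```python
import shlex


def snapshot_backup_commands(origin, snapshot_name, snapshot_path, size, repository, archive):
    """Return the three commands, in execution order, as argument lists.

    All parameters are supplied by the caller; nothing is detected
    automatically. `snapshot_path` is the device node of the snapshot.
    """
    return [
        # 1. Create a snapshot of the origin volume.
        ["lvcreate", "--size", size, "--snapshot", "--name", snapshot_name, origin],
        # 2. Back up the snapshot's contents (not the device node itself).
        ["borg", "create", "--read-special", f"{repository}::{archive}", snapshot_path],
        # 3. Remove the snapshot once the backup has finished.
        ["lvremove", snapshot_path],
    ]


commands = snapshot_backup_commands(
    origin="/dev/mapper/mercury.planets.org--disk",
    snapshot_name="SNAPSHOT_NAME",
    snapshot_path="/dev/mapper/SNAPSHOT_NAME",
    size="SIZE",
    repository="/path/to/repo",  # hypothetical repository
    archive="mercury-disk",  # hypothetical archive name
)
for command in commands:
    print(shlex.join(command))
```

A real implementation would also need to run step 3 even when step 2 fails (for example in a `try`/`finally`), so that a failed backup does not leave a snapshot behind to fill up.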

> • Duplication: Is this because the path or name for your snapshot needs to be mentioned in all three places (source_directories, before_backup and after_backup)? Or is there additional config you’d need to duplicate?

No, there’s no additional config, my concern was just for those three places we’d have to maintain.

> 3. Add a feature to support before/after hooks on individual sources, perhaps with the name of the sources parameterized.

That would be nice and useful for other scenarios as well. Something like

 before_backup: echo $source $repository
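
To make the `$source` / `$repository` idea concrete, here is a minimal Python sketch of how such a hook string could be expanded once per source and repository. It assumes the placeholder names from the example above and uses `string.Template` from the standard library; the function names and the repository path are hypothetical and do not describe an existing borgmatic feature.

```python
import shlex
from string import Template


def expand_hook(command, source, repository):
    """Substitute $source and $repository, shell-quoting both values."""
    return Template(command).substitute(
        source=shlex.quote(source),
        repository=shlex.quote(repository),
    )


def per_source_hooks(command, sources, repositories):
    """Expand one hook command for every (repository, source) pair."""
    return [
        expand_hook(command, source, repository)
        for repository in repositories
        for source in sources
    ]


hooks = per_source_hooks(
    "echo $source $repository",
    sources=["/dev/vg1/mercury.planets.org-disk", "/dev/vg1/venus.planets.org-disk"],
    repositories=["/path/to/repo"],  # hypothetical repository
)
for hook in hooks:
    print(hook)
```

Quoting the substituted values keeps a source path containing spaces from being split into several shell arguments.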

> 5. Add support for snapshots, but instead of having a separate source_logical_volumes or source_libvirt as per your example, just allow listing of logical volumes (by path?) directly in source_directories. If borgmatic detects that one such entry is actually a logical volume, it will automatically do the snapshot create/destroy for it!

My first reaction is “cool”, my second reaction is “it may be tricky to detect a logical volume from its path” (we’d need to check against its major and minor number, I guess) and “what happens if this heuristic fails”. I suppose it will silently fall back to a normal backup without freezing the volume.
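
For illustration, the major-number check mentioned above might look roughly like the following Python sketch on Linux. It assumes that the device-mapper major number is read from `/proc/devices`; the function names are hypothetical. Note the limitation that makes this a heuristic: LVM logical volumes are device-mapper devices, but so are other things (dm-crypt mappings, for instance), so a match does not prove the path is a logical volume.

```python
import os
import stat


def device_mapper_major(proc_devices_text):
    """Return the block major number registered for device-mapper, or None."""
    in_block_section = False
    for line in proc_devices_text.splitlines():
        line = line.strip()
        if line == "Block devices:":
            in_block_section = True
        elif line == "Character devices:":
            in_block_section = False
        elif in_block_section and line:
            major, _, name = line.partition(" ")
            if name.strip() == "device-mapper":
                return int(major)
    return None


def looks_like_dm_device(path, dm_major):
    """True if path is a block device whose major number equals dm_major."""
    try:
        status = os.stat(path)
    except OSError:
        return False
    return stat.S_ISBLK(status.st_mode) and os.major(status.st_rdev) == dm_major


# Sample /proc/devices content; real major numbers vary between systems.
SAMPLE = """Character devices:
  1 mem
 10 misc

Block devices:
  7 loop
  8 sd
253 device-mapper
"""
print(device_mapper_major(SAMPLE))
```

When the check returns False for a path that really is a logical volume, the caller has to decide between failing loudly and silently backing up without a snapshot, which is exactly the concern raised above.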

> Important question though for many of these options: Does the create/destroy need to be in any way configurable? Or is it pretty standard? What’s the command invocation look like for each of create and destroy?

The only thing that we can configure is the size of the snapshot device, which is basically the volume of changed data on the original volume the snapshot can handle before being invalidated. I think we can safely calculate that from the volume size for the most common case, or maybe use a reasonable percentage of the available size.
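
One way to read "calculate that from the volume size" is sketched below in Python. Everything here is an assumption made for illustration: the 10% default, the cap at the volume group's free space, and rounding down to a 4 MiB extent (LVM's default physical extent size). The function name is hypothetical.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def snapshot_size_bytes(origin_bytes, free_bytes, percent=10, extent_bytes=4 * MIB):
    """Pick a snapshot size as a percentage of the origin volume size.

    The result never exceeds the free space in the volume group and is
    rounded down to a whole number of extents. A result of 0 means there
    is no room for a snapshot at all.
    """
    wanted = origin_bytes * percent // 100
    capped = min(wanted, free_bytes)
    return (capped // extent_bytes) * extent_bytes


# 100 GiB volume with plenty of free space: 10% of the origin.
print(snapshot_size_bytes(100 * GIB, 50 * GIB))
# Same volume with only 1 GiB free: capped at the free space.
print(snapshot_size_bytes(100 * GIB, 1 * GIB))
```

The right percentage depends on how much data changes on the origin volume while the backup runs, so a fixed default is only a starting point.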

witten commented 6 months ago
Owner

Okay, having had a chance to poke around with this a bit, I’m inclined to agree that “it may be tricky to detect a logical volume from its path”. Therefore your original suggestion of source_logical_volumes: sounds good to me. If you’re still interested in implementing this, please feel free to take a crack at it. Otherwise, let me know. And thanks for walking me through this!

bfabio commented 6 months ago
Poster

Great, I’ll come up with a patch.

On second thought, this has become “Support LVM volumes backup”. Do we want to support qcow2 and raw like backup-vm (https://github.com/milkey-mouse/backup-vm) does?

witten commented 6 months ago
Owner

Does the lvcreate command differ based on the VM disk image format? If so, how? Or would you use a different command in each case?

bfabio commented 6 months ago
Poster

Sorry about the confusion. lvcreate just creates LVM volumes (and their snapshots), which are block devices that may or may not be used as backend storage for virtual machines.

Virtualization platforms also support file-based disk image formats such as qcow2 and raw.

witten commented 6 months ago
Owner

Gotcha. Those seem like a reasonable thing to support, either as part of this or separately. Is your thinking that you would support generic libvirt backups, and thereby target all the various formats?

bfabio commented 6 months ago
Poster

I was thinking about implementing just the LVM part for now, as it’s what I need for our workflow.

witten commented 6 months ago
Owner

Sounds good!


I also wanted --read-special support for a similar purpose: to back up inactive disk partitions as well as LVM snapshot volumes. So I forked borgmatic and added a read_special option to the location part of the YAML schema. The rest I do with hooks.

I can make it available on GitHub if you think it’s worth merging. It’s relatively straightforward to do, though.

The default config is off, of course, and the template comments warn about its use, although I feel the warning might need to be stronger, and reference the borg docs.

witten commented 5 months ago
Owner

Sure, I’d be happy to take a look at the fork and potentially merge it! It may not satisfy all of @bfabio’s asks in this ticket, but it could be a start.


Added a pull request on GitHub https://github.com/witten/borgmatic/pull/25 for your consideration.

I’m also happy to discuss my before and after backup hooks if @bfabio would like, although I don’t consider them particularly fit for general use, as they’re just some shell-fu that make certain naming convention assumptions, and don’t deal with sync issues fully.

witten commented 5 months ago
Owner

Thanks, merged!

witten commented 5 months ago
Owner

Also, let me know if you’d like an entry in AUTHORS (or submit a PR yourself for that).


My name’s in the commit log. That’s enough for me ;)

witten commented 5 months ago
Owner

Sounds good. :)

witten commented 4 months ago
Owner

The --read-special support has just been released as part of borgmatic 1.2.3.

witten added the design finalized label 4 months ago
bfabio commented 1 month ago
Poster

@stevekerrison Thanks for your work. Do you think we can solve this with hooks?
