filesystem snapshots hooks #231

Closed
opened 2019-10-23 23:35:50 +00:00 by anarcat · 5 comments

Hi!

First, I must congratulate you on this great project. It (seems to) make borg so much easier to use, which has been a constant frustration to me. I still have to convert my pile of shell scripts to borgmatic, but after re-reading the docs for the Nth time, I can't believe I haven't switched already.

That said, there are a number of things I'd like to see in borgmatic that I have implemented in a similar program called bup-cron (https://github.com/anarcat/bup-cron), designed for bup (https://bup.github.io/) with a purpose similar to borgmatic's.

This is my first feature request. :)

What I'm trying to do and why

I would like borgmatic to automatically schedule filesystem-level snapshots. Any backup takes a while, and it's important to get a consistent view of the system when doing a backup. The only proper way to do this is through filesystem snapshots.

A common error I get in borg backups is files that disappear during the backup, which borg marks as an error (status code 1, which is a warning, go figure). Snapshots would solve this completely.

Other notes / implementation ideas

Of course, there are many ways of doing those, depending on the backend: BTRFS, ZFS, LVM, oh my! But you have to start somewhere and some of the work could be abstracted away in a library or a tool like snapper.

The code responsible for snapshots in bup-cron (https://github.com/anarcat/bup-cron/blob/master/bup_cron/__init__.py#L296) now kind of looks like a nasty affair to me, but it could also be reused. It supports Linux's LVM snapshots and Windows' VSS (not tested by me).

Owner

I'm glad to hear borgmatic sounds like it may make Borg easier to use. ;) Goal number one, done! Now, on to goal number two..

I totally buy what you're selling in terms of the need for consistent filesystem snapshots. However, I don't have a ton of experience here. Have you seen ticket witten/borgmatic#80? Do you think your ask would be covered by that ticket, or is that too narrow because it's LVM only?

Also, do you happen to use LVM for filesystem snapshots, or something else? Could you outline the general series of steps that your scripts and/or bup-cron run through when making filesystem snapshots + backups?

Thanks!

Author

> I totally buy what you’re selling in terms of the need for consistent filesystem snapshots. However, I don’t have a ton of experience here. Have you seen ticket #80? Do you think your ask would be covered by that ticket, or is that too narrow because it’s LVM only?

I have seen #80, and suspected it was related, but I must admit I didn't read it. :) Now I see it does talk a lot about LVM, but maybe it's best if the two are kept separate... I don't mind either way.

I would suggest you keep the Snapshot implementation filesystem-neutral, because it's a moving landscape right now. Traditional LVM has been there for a long time now, but there's also thin LVM, btrfs and ZFS in the game, and that's only in Linux. I didn't implement VSS support in bup-cron, someone else did, after needing it, which I found pretty surprising. :)

> do you happen to use LVM for filesystem snapshots, or something else?

I'm not sure I understand the first question. Assuming it's whether I use LVM (or something else) for snapshots (as opposed to "do you use LVM for something else"), then the answer is yes: I generally use LVM for snapshotting. :)

> Could you outline the general series of steps that your scripts and/or bup-cron run through when making filesystem snapshots + backups?

The code I linked to isn't great, but it should be fairly readable. It goes something like this, step by step:

  1. find the mountpoint the requested directory is associated with (e.g. "is /home a mountpoint or is it on /?" or "is /var separate from /home?") - that's done with a loop crawling up the filesystem hierarchy while checking os.path.ismount
  2. find the device (and therefore logical volume) associated with the given mountpoint - that one is a hack, it parses the output of mount
  3. find the volume group associated with the logical volume - also a hack, it parses the output of lvs $DEVICE, but there are nice formatting options in LVM that should be used instead to make this more reliable
  4. create a readonly snapshot of the specified size - basically just calling lvcreate --size $SIZE --snapshot --permission r --name $LV_NAME-snap $LV_NAME
  5. create a mountpoint for the snapshot - basically makedirs on a predetermined directory in /media; I would now use a temporary directory from the tempfile module instead
  6. mount the snapshot, readonly - just calling mount
  7. do the actual backup of the mountpoint - just shelling out to borg, as you say :)
  8. cleanup: unmount and remove the mountpoint, remove the snapshot
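
Steps 1, 3 and 4 above could be sketched roughly as follows. This is only an illustration, not the actual bup-cron code: the function names are made up for this example, the `lvs` invocation mentioned in the docstring (`--noheadings -o vg_name,lv_name`) is one assumed way of using LVM's formatting options, and nothing here actually runs LVM commands.

```python
import os


def find_mountpoint(path):
    """Step 1: crawl up the hierarchy until os.path.ismount says yes."""
    path = os.path.realpath(path)
    while not os.path.ismount(path):
        path = os.path.dirname(path)
    return path


def parse_lvs_line(output):
    """Step 3: split one line of output assumed to come from
    `lvs --noheadings -o vg_name,lv_name DEVICE` into (vg, lv)."""
    vg_name, lv_name = output.split()
    return vg_name, lv_name


def snapshot_command(vg_name, lv_name, size):
    """Step 4: build (but do not run) the lvcreate argument list
    for a read-only snapshot named LV-snap."""
    return [
        "lvcreate", "--size", size, "--snapshot", "--permission", "r",
        "--name", lv_name + "-snap", vg_name + "/" + lv_name,
    ]
```

The resulting list could be handed to `subprocess.run` by the caller, which keeps the privileged part separate from the logic that is easy to test.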

I made an abstract Snapshot class that has somewhat of an API for this, but it's not the best. It does very little. It acts as a context manager: on __enter__ it creates the snapshot and mounts it, and on __exit__ it removes all traces of it. That should probably be improved upon, as it would clean up the code significantly. Looking at the common code between the two Snapshot implementations in bup-cron would help in designing that, I suppose.

That class might look like this:

class Snapshot:
  def _notimplemented(self):
    # placeholder that every backend must override
    raise NotImplementedError()
  # create the actual snapshot
  create = _notimplemented
  # destroy the snapshot
  destroy = _notimplemented
  # mount it
  mount = _notimplemented
  # unmount the snapshot
  umount = _notimplemented

It seems that find_device and find_mountpoint might also be commonly needed interfaces. Right now, the class operations are all driven by (and therefore hidden inside) the context-manager, but it might make sense to expose those as the Snapshot interface and let the caller drive and handle problems correctly.
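
A minimal sketch of what exposing those operations while keeping the context manager could look like, assuming the four method names from the class above. `RecordingSnapshot` is a hypothetical stand-in backend that only records calls; a real LVM, btrfs or ZFS backend would shell out to the relevant tools instead.

```python
import abc


class Snapshot(abc.ABC):
    """Backend-neutral interface; callers may drive the steps themselves."""

    @abc.abstractmethod
    def create(self):
        """Create the actual snapshot."""

    @abc.abstractmethod
    def destroy(self):
        """Destroy the snapshot."""

    @abc.abstractmethod
    def mount(self):
        """Mount the snapshot."""

    @abc.abstractmethod
    def umount(self):
        """Unmount the snapshot."""

    def __enter__(self):
        self.create()
        try:
            self.mount()
        except Exception:
            # do not leave a dangling snapshot if mounting fails
            self.destroy()
            raise
        return self

    def __exit__(self, exc_type, exc, tb):
        try:
            self.umount()
        finally:
            self.destroy()
        return False  # never swallow errors from the backup itself


class RecordingSnapshot(Snapshot):
    """Hypothetical backend used only to show the call order."""

    def __init__(self):
        self.calls = []

    def create(self):
        self.calls.append("create")

    def destroy(self):
        self.calls.append("destroy")

    def mount(self):
        self.calls.append("mount")

    def umount(self):
        self.calls.append("umount")
```

With this shape, `with RecordingSnapshot() as snap:` wraps the backup step, and a caller that needs finer error handling can still call `create`, `mount`, `umount` and `destroy` directly.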

I don't think I would have the time to work on this myself unfortunately, but I hope the above would serve as a good guide!

Owner

> I’m not sure I understand the first question. Assuming it’s whether I use LVM (or something else) for snapshots (as opposed to “do you use LVM for something else”), then the answer is yes: I generally use LVM for snapshotting. :)

Yes, that's what I meant. :)

And thanks for the detailed walk-through. That's super helpful. I'll likely combine this ticket with the other one somehow.. Even if I go with an abstract/generalized approach to filesystem snapshots, I'll likely still need to start with support for one of them on day 1. (Unless leveraging something like Snapper, as you mention.)

Probably the best way for me to get up-to-speed on some of this tooling is to start playing with it myself. Then I'll get a better idea of next steps. In the meantime, if you have any other thoughts, feel free to add 'em to this ticket. I may have more questions as well.

Owner

Merged into witten/borgmatic#80, mostly because more folks are subscribed to that ticket. Please continue any discussion there!

Author

> Probably the best way for me to get up-to-speed on some of this tooling is to start playing with it myself. Then I’ll get a better idea of next steps. In the meantime, if you have any other thoughts, feel free to add ‘em to this ticket. I may have more questions as well.

feel free to ask! i'll monitor that other ticket as well.

Reference: borgmatic-collective/borgmatic#231