WIP: Add container volume backups with podman cli #895
Loading…
x
Reference in New Issue
Block a user
No description provided.
Delete Branch "IBims1NicerTobi/borgmatic:main"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
So as of right now I have no Idea how exactly the restore function would have to look like and I would really appreciate your input. The restore command should look like this:
borg extract [ARCHIVE] [PATH TO DUMP]
. I know that borgmatic will probably provide me the archive and the path in the function should be built from the volume name but I'm not sure how to put it all together. I also think it would make sense if the restore had two modes for now: By default the restore should create the volume and fail if the volume exists already, if the user provides something line --force we delete the volume and overwrite it with the contents of the archive.borgmatic actually does much more than that for the existing database hooks, which I suggest modeling your hook after. Take a look at, for instance,
borgmatic/hooks/mariadb.py:restore_data_source_dump()
. That function is handed anextract_process
value that represents theborg extract
command that borgmatic has already started running to extract the correct path from the selected archive. Then you'll notice that at the very end of the function,extract_process.stdout
is passed as theinput_file
to the MariaDB restore command. In other words, theborg extract
stdout gets streamed directly to the stdin of the MariaDB restore.So you could do the same thing in a Podman hook—the stdout of the
extract_process
that borgmatic hands you could get passed as input to yourpodman volume import
command.That makes sense to me. (I might suggest calling it
--overwrite
instead of--force
though just because that's a little more explicit.)Just a few comments on the schema since this is still WIP..
@ -1873,3 +1873,21 @@ properties:
can send the logs to a self-hosted instance or create an account at
https://grafana.com/auth/sign-up/create-user. See borgmatic
monitoring documentation for details.
podman_databases:
This could just be
podman:
since there's nothing database-specific about this.Fixed
@ -1876,0 +1878,4 @@
required: ['volumes']
additionalProperties: false
properties:
volumes:
Is the rationale with having the
volumes
option nested under here that you might additional Podman-specific options in the future?containers
, etc?Yeah other than volumes I would also like to support labels.
Ok we maybe need some docs saying that the hook dump and restore layout of the schema is related to the restore code. The code in actions/restore.py checks for the database name on a specified point in the schema and assumes it is there (which is the case for all the database hooks but does not really make sense in my case since there might be a lot of volumes and I would prefer to not model my schema the way the database hooks have so far).
Can we maybe add a function to every database hook to the name of the configured databases instead of assuming their position in the schema?
@witten Any opinion on this?
Also it looks like I cannot just call the hook
podman
because the_databases
part of the path is hardcoded inborgmatic/actions/restore.py
in line166
.Yes, adding new functions to the internal data source "API" is of course possible. However it might not be sufficient to just add a function to resolve data source names, because it looks like there's also code in
actions/restore.py
that assumes the hook configuration itself is just a list. So one way to address both problems might be to change the code such that only thevolumes:
portion of your configuration gets passed to these variousrestore.py
functions. That way: 1. It'd still be dealing in a plain list (of mappings) like the other data sources, and 2. Each element of the list would have aname
like the other data sources.EDIT: Although I see that in your existing schema, the volumes names are just a list of strings, so that still doesn't directly line up with the
name
-style schema. But you could change to a format like the following, which would line up, at least thevolume:
level:That'd also have the benefit of allowing future options on a per-volume basis. Although maybe that's just unnecessary future-proofing/rationalization.
I'm open to other approaches of course; that's just one approach that occurred to me. I haven't worked through all the details of how to make it happen.
Excellent point. But that code might still work for the Podman hook if another glob path was added, e.g.:
Yeah but tbh that just feels hacky. I really dislike the use of magical names here and I think it would be much clearer if every hook just provided the name it used for the database dump directory directly instead of changing it for every new hook that comes along. I would prefer it if the hook api was just better documented and integrated less into the rest of the tool. Imho writing a hook should not require reading all of borgmatics source code.
I would seriously propose writing a clear api for the database dumps since I would assume there a a lot more coming soon. If you were up for that change I would propose writing something like an abstract base class for the database hooks and documenting it very well.
I agree that a more well-defined API would be preferable to the status quo. No API though is going to be perfectly prescient at designing for all future needs.. I think the trick is just getting started with an initial API that can meet the needs of current database hooks and your current Podman hook.
Something like a well-documented module with a bunch of empty functions could maybe serve that need, as borgmatic's code base doesn't really use classes (except where necessary to interface with the Python standard library / third-party libraries). I'd be happy to tackle that unless you wanted to take a stab at it as part of this Podman work.
Yeah I wouldn't mind working that out too. Kinda seems like wasted effort to try and hack in support for this one hook only to change the API afterwards.
Sounds good. Let me know if I can help with the API design—or if you just get sick of it and want to hand it over at any point!
Ok I have run into an issue I don't really know how to fix: If I create an empty volume using podman and dump it into a fifo podman creates an empty backup and following that instantly exits.
borg
does however try to read from the fifo and gets stuck since there is nothing to read. Can we handle that somehow?Could you provide some more details about how you're dumping the volume? For instance, if you're exporting it with
podman volume export
, I would expect that to produce a tarball. And any tarball has a non-zero size even if its contents are empty.Yeah no I really can't. I'm unfortunately really busy right now but I think the issue must be with my dev env since the podman volume export command does indeed not create an empty archive out of an empty volume.
Okay, I'm happy to help once you get some time to look at this. But if you aren't going to have time for the foreseeable, I totally understand.. Just let me know.
I'll close this for now due to inactivity (and branch conflicts), but I'd be happy to revisit in the future. Thanks!
Pull request closed