best practice on dumping databases? #551
Reference: borgmatic-collective/borgmatic#551
When dumping databases, it is not clear to me who is responsible for excluding the databases' binary files from the backup. For example, under Debian/Ubuntu:
postgresql
/var/lib/postgresql/<version>/main
maria/mysql
/var/lib/mysql/
If /var is added to the backup sources, directly or indirectly, both of the data directories named above will also be backed up, even if the config.yaml is configured to dump the databases too.
My expectation would be that borgmatic skips the binary data directories.
But I am not clear on the best practices with borgmatic.
Could you suggest some?
If dumping with mysqldump, should I exclude
/var/lib/mysql/
from the backup?
If dumping with pg_dumpall, should I exclude
/var/lib/postgresql/<version>/main
from the backup?
If yes in both cases, why does borgmatic not do it for me?
A dream for an admin would be:
If backing up the data directories of databases, borgmatic should stop the databases directly before copying the files and restart them directly after,
or, if a dump is configured, the data directories should be skipped.
While I understand the need, I think selecting backup directories on the user's behalf goes a little beyond the scope of borgmatic. I wouldn't want borgmatic to do something automatically and get it wrong, making folks lose their files. Have you considered excluding
/var
in its entirety? Unless you have something in there like a mail server spool, there might not actually be anything within /var
you need backed up, especially since you're already dumping databases with borgmatic. Might be worth poking around in there to see what you actually need. An OS reinstall after a catastrophic drive loss would recreate many of the structures in /var
(if not their contents).
Excluding /var is IMHO even worse: if you read the distributions' docs, you have to include it in the backups.
So for me that is not an option, and it is also not the scope of my issue.
My issue is that I have databases hundreds of GB or even TB in size, so dumping already takes ages.
The same issue can arise on restore, since the restore then does double, senseless work.
You may say: yes, but you can exclude manually, and that is also right. But if you consider having tens or hundreds of servers, you have to do it for each server individually.
I fully agree with your concern about borgmatic getting something wrong automatically; doing it automatically is only fine if it is done safely.
If that safety is missing, I will underline and confirm your sentence right away.
Getting back to my issue:
Right now I have to manually check all installations with borgmatic to exclude the databases' data directories.
Add exclusions automatically
If I were to build it myself in bash, I would do something like this (works with Ubuntu 20.04):
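The original script was not captured in this thread. A minimal sketch of the idea, assuming the default Debian/Ubuntu package layout (the config path and the grep of the MySQL config are illustrative, not taken from the original), might look like:

```shell
#!/bin/sh
# Sketch: print database data directories so they can be fed to borgmatic
# as exclude patterns. Assumes default Debian/Ubuntu locations.

# Pull the "datadir" value out of MySQL-style config text (reads stdin).
extract_datadir() {
  grep '^datadir' | sed 's/^datadir[[:space:]]*=[[:space:]]*//' | head -n1
}

# MariaDB/MySQL: parse the packaged config, falling back to the default.
mysql_dir=$(cat /etc/mysql/mysql.conf.d/mysqld.cnf 2>/dev/null | extract_datadir)
echo "${mysql_dir:-/var/lib/mysql}"

# PostgreSQL: one data directory per installed cluster version.
for dir in /var/lib/postgresql/*/main; do
  if [ -d "$dir" ]; then
    echo "$dir"
  fi
done
```

The printed paths could then be appended to the excludes in each server's borgmatic configuration.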
Alternative: stop and start the databases
If I had an option to add a hook BEFORE and AFTER dumping a database, I would use it to stop and start the database accordingly. Then I could use the binary files for backups, which would also make the process okay for me.
This solution would also be okay for my case, since I would then have a consistent database backup (even if binary only).
More to consider if using database dump tools
Assuming two cases:
Case 1: backup of specific databases.
In this case we could discuss whether excluding makes sense.
IMHO I would exclude all data directories from the backup, since the user is already skipping the other databases.
Case 2: backup of all databases.
In this case it is fully clear that the user wants to ensure all databases are backed up. So here we can blindly exclude the databases' data directories, as shown in the bash script above.
I hope I could explain it better now.
This requirement comes from a bigger installation on which I am trying borgmatic right now. If you have only a handful of servers, you will not face this need, since the setup workload is pretty low.
Thanks for the detailed explanation. Would you be able to convert your script's grep pattern into Borg's pattern syntax and specify it as an exclude in borgmatic's configuration? That might be slightly easier than using a bash script.
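As a sketch of what that might look like (the paths are the ones from this thread; whether you need a wildcard for the PostgreSQL version depends on your layout, and the `location:` section name assumes an older borgmatic config format):

```yaml
location:
    source_directories:
        - /var
    exclude_patterns:
        # Database data directories; their contents are covered by the dumps.
        - /var/lib/mysql
        - /var/lib/postgresql/*/main
```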
As for the before/after database hooks: there are existing before/after backup hooks that could serve the same purpose. Since the database needs to be down for the duration of the backup, you could take the database offline in
before_backup
and bring it back online in after_backup.
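For instance (a sketch only; the systemd service names are assumptions for a Debian/Ubuntu system, and note this means database downtime for the whole backup):

```yaml
hooks:
    before_backup:
        - systemctl stop postgresql mariadb
    after_backup:
        - systemctl start postgresql mariadb
```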
I am not sure if I got your query properly.
Are you asking me to add an extra Borg parameter to my config,
or are you asking me for a
contribution to implement the requirement in borgmatic?
Can you clarify this point?
If I got your recommendation here, are you suggesting the binary backup path? What are your experiences with migrating binary backups to new servers on restore, especially with PostgreSQL and MariaDB?
The former. I'm wondering if a "simple" exclude pattern within your borgmatic configuration would solve your use case. And then you wouldn't need a separate bash script to produce exclusions (or require the equivalent within borgmatic).
Nope, not going so far as making a recommendation. I'm just giving you an option since you brought up hooks. Personally, I would probably lean towards the standard dump and restore approach rather than binary backups, just because I'm not a big fan of database downtime and it seems like a more "official" way to do backups. But you know your systems and requirements better than I do!
Closing this for now, but feel free to follow up. We can always reopen it. Thanks!