How to manage ~40 Linux VMs with borgmatic #742

Closed
opened 2023-08-14 11:14:17 +00:00 by mhazan01 · 9 comments

What I'm trying to do and why

I've set up a backup server with borgbackup and borgmatic to manage the configs, with one configuration file per VM and one repository per VM. I'm stuck at the next step: borgmatic seems to deal with "local" backups only? My idea is to trigger the backups from the backup server and use borgmatic's "sequential" execution to run through all the config files, so that no backup tasks overlap.

Is that a bad idea? Would restore even work in that scenario? Should I give up on the "pull" idea? What would be a better alternative?

Thanks!


Contributor

Borgmatic uses a push model: normally you run borgmatic on the server you want to back up, and it pushes the backup to the backup server. To start the backups sequentially, you would have to be able to log in to every machine you want to back up and run borgmatic there. Personally, I just use a systemd timer (or alternatively a cron entry) to run borgmatic on each client I want to back up. A pull model is not really supported by borgmatic unless you give a single server access to all the servers you want to back up.
I would also recommend a single repo for all the backups, as deduplication will save you a lot of space if you back up overlapping data.
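
For illustration, "log in to every machine and run borgmatic" could be sketched roughly like this. It's only a sketch, not a borgmatic feature: it assumes the backup server has key-based SSH access to each VM, that `borgmatic` is on the remote PATH with its config already in place, and the host names and function names (`run_command`, `backup_sequentially`) are made up.

```python
import subprocess
from typing import Callable, Iterable, List, Sequence

# Hypothetical host names; replace with the VMs reachable over SSH.
HOSTS = ["vm01.example.org", "vm02.example.org"]


def run_command(command: Sequence[str]) -> int:
    """Run a command locally and return its exit code."""
    return subprocess.run(command, check=False).returncode


def backup_sequentially(
    hosts: Iterable[str],
    runner: Callable[[Sequence[str]], int] = run_command,
) -> List[str]:
    """Run borgmatic on each host in turn and return the hosts that failed."""
    failed = []
    for host in hosts:
        # The call blocks until the remote borgmatic run exits,
        # so two backups never run at the same time.
        exit_code = runner(["ssh", host, "borgmatic"])
        if exit_code != 0:
            failed.append(host)
    return failed
```

Calling `backup_sequentially(HOSTS)` from a single cron entry or systemd timer on the backup server would then walk through the VMs one by one. The data still flows in push mode; only the trigger is centralized.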

Owner

@IBims1NicerTobi is correct that borgmatic generally assumes a push model rather than pull. However, Borg does have some documentation on doing pull backups even if borgmatic doesn't have any built-in support for it: https://borgbackup.readthedocs.io/en/stable/deployment/pull-backup.html#pull-backup

My first question for you, though, would be: What's your motivation for doing pull backups? Is it just to avoid the locking/contention of multiple servers all trying to back up at the same time? There are ways around that even with the push model. For instance, borgmatic includes lock_wait, retries, and retry_wait options that may help with multiple servers connecting at once.

Also, be careful if using a single repository for multiple servers: https://borgbackup.readthedocs.io/en/stable/faq.html#can-i-backup-from-multiple-servers-into-a-single-repository ... One option would be to use a separate Borg repository per source server, which would solve the lock contention issues and allow the backups to proceed simultaneously (assuming sufficient server resources).
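
As a rough sketch of how those options might be set per client, here is a small helper that renders a borgmatic config for one VM. Everything specific is an assumption: the flat top-level option layout (older borgmatic versions nest options under sections), the repository URLs, the source directories, and the numeric values. Only the option names `lock_wait`, `retries`, and `retry_wait` come from the discussion above.

```python
def render_config(repo_path: str) -> str:
    """Return illustrative borgmatic config text for one client.

    Assumes the flat (section-less) config layout; adjust for your version.
    """
    lines = [
        "source_directories:",
        "    - /etc",
        "    - /home",
        "repositories:",
        f"    - path: {repo_path}",
        # Wait for the repository lock instead of failing right away
        # (mostly relevant when several clients share one repository).
        "lock_wait: 600",
        # Retry a failed backup a few times, pausing between attempts.
        "retries: 3",
        "retry_wait: 60",
    ]
    return "\n".join(lines) + "\n"


def repo_for_vm(vm_name: str, backup_host: str = "backup.example.org") -> str:
    """Build a hypothetical per-VM repository URL on the backup server."""
    return f"ssh://borg@{backup_host}/./repos/{vm_name}"
```

For example, `render_config(repo_for_vm("vm01"))` gives each VM its own repository, while passing the same path for every VM gives the shared-repository variant where `lock_wait` does the serializing.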

Anyway, let me know your thoughts!

witten added the question / support label 2023-08-14 16:26:44 +00:00
Author

Hello, thanks for the replies.

I read about pull mode in the Borg documentation; I was under the impression borgmatic had it somehow. Indeed, the motivation is to avoid having the 40 VM backup processes overlap. I've seen people discourage the use of a single repo for many servers, and I didn't understand that the lock_wait etc. options could work on separate repos. It might be the solution. I've seen people say they use Borg to back up hundreds of servers, so there must be a better scheduler than "put it in a cron"; that quickly becomes impossible to manage. If I get it right, I should copy the borgmatic config file to each VM and have each VM run its own cron job that calls borgmatic? No remote execution. I'll need to properly set up the repository with SSH access.

Thanks for your help!


The solution I've used is taking snapshots of each VM, and exporting/copying the snapshots to a storage array with NFS or SMB. When the snapshots are all done, you can then run your Borg backup in normal push mode using the data on the storage array.

Owner

> I didn't understand that the lock_wait etc. options could work on separate repos

It doesn't! I was referring to lock_wait with regard to the original single repo idea.

> I've seen people say they use Borg to back up hundreds of servers, so there must be a better scheduler than "put it in a cron"; that quickly becomes impossible to manage.

Is the difficulty in managing all of those cron configurations? I assume that anyone with hundreds of servers is using some sort of configuration management (Ansible, etc.) to install cron jobs or systemd services fleet-wide.

> If I get it right, I should copy the borgmatic config file to each VM and have each VM run its own cron job that calls borgmatic? No remote execution. I'll need to properly set up the repository with SSH access.

Yes, that would be the traditional configuration, and probably the one I'd start with. If you use separate repositories per source VM, you shouldn't have any backup server contention issues (given sufficient resources). Also, depending on your cron daemon (or even systemd), there are features to add a random delay before running each job. That may lessen the stampeding herd effect on your backup server.

Author

Hello,

Thanks for the clarification. Yes, the issue isn't deploying the configuration; it's mostly ensuring that there aren't multiple backup processes running at the same time and overusing resources (disk I/O, bandwidth, CPU, etc.). Doing that via cron and having to estimate a start time for each batch is bad. We have been using Amanda backup; while it has really good and strong features (such as automatically "averaging" the backup size across all clients), it's old, and the tape mentality no longer makes sense when using disk storage.

Owner

Understood. borgmatic unfortunately doesn't have any built-in features along those lines, so for now you can:

  • Use a single repository with lock_wait, which would guarantee only a single job at a time.
  • Use multiple repositories with random delays on the client side, which would probabilistically reduce the backup server load. Note that both cron and systemd support running jobs with random delays, without the user having to explicitly stagger backup times.
  • There may also be a way to re-nice the server-side Borg process so as to reduce the impact of multiple backup jobs on the server.
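
To make the random-delay option concrete, here is a minimal client-side sketch. It is not something borgmatic provides: the function names are invented, and in practice the scheduler's own setting (for example `RandomizedDelaySec=` in a systemd timer) does the same job without any wrapper.

```python
import random
import time
from typing import Callable, Optional


def pick_delay(max_delay_seconds: float, rng: Optional[random.Random] = None) -> float:
    """Pick a uniformly random delay between 0 and max_delay_seconds."""
    rng = rng or random.Random()
    return rng.uniform(0.0, max_delay_seconds)


def run_after_random_delay(
    action: Callable[[], int],
    max_delay_seconds: float,
    sleep: Callable[[float], None] = time.sleep,
    rng: Optional[random.Random] = None,
) -> int:
    """Sleep for a random delay, then run the action and return its result."""
    # Each client picks its own delay, so 40 clients started by the same
    # cron schedule spread their start times over the whole window.
    sleep(pick_delay(max_delay_seconds, rng))
    return action()
```

A client's cron job could then call something like `run_after_random_delay(lambda: subprocess.run(["borgmatic"]).returncode, 3600)` to start its backup at a random point within an hour. This only spreads the load statistically; it does not guarantee that jobs never overlap.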

Short of that, I'm open to feature suggestions and/or designs for how a borgmatic feature enhancement could support this use case more fully, whether formal "pull mode" support or some sort of explicit coordination between multiple "push" clients.

Author

Thanks again for the help :) I'll try to fiddle around with the things you suggested and see how it goes.

As for feature suggestions, and I'm not sure it's borgmatic's job to do this, I would like to see some kind of centralized coordinator with the ability to trigger backups, either using "pull mode" or by remotely starting push jobs on a "smart" schedule.
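
To give an idea of what I mean by a coordinator, here is a very rough sketch that remotely starts push jobs while capping how many run at once. None of this exists in borgmatic; the names (`start_push_job`, `coordinate`), the plain `ssh <host> borgmatic` invocation, and the fixed concurrency cap standing in for a "smart" schedule are all assumptions.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def start_push_job(host: str) -> int:
    """Start borgmatic on a remote host over SSH and return its exit code."""
    return subprocess.run(["ssh", host, "borgmatic"], check=False).returncode


def coordinate(
    hosts: Iterable[str],
    max_parallel: int = 2,
    job: Callable[[str], int] = start_push_job,
) -> Dict[str, int]:
    """Run one job per host, never more than max_parallel at a time."""
    host_list = list(hosts)
    # The pool size is the concurrency cap: a new job only starts
    # when one of the running jobs has finished.
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        exit_codes = list(pool.map(job, host_list))
    return dict(zip(host_list, exit_codes))
```

With `max_parallel=1` this degenerates into strictly sequential backups; a higher value trades a bit of overlap for a shorter backup window.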

Owner

Sounds good! Feel free to update the ticket with your progress or findings.
