Using environment variables for databases hooks #382

Closed
opened 2020-12-09 19:55:31 +00:00 by mmartinortiz · 14 comments

What I'm trying to do and why

I would like to use borgmatic to make backups of the databases of my current services. Currently, I have (for example) a docker compose with a wordpress and a mysql service. I want to use borgmatic to automatically backup the mysql database.

I do not want to expose the database port on the host, so I want to include borgmatic as an additional service within the docker compose.

All the sensitive data (users and passwords) are wihin a .env file that is read by the docker compose file.

Currently, I did not find a way to reuse this .env file for the borgmatic-config.yaml file.

So far, I tried:

  • Create a .env files with the database user and password at the same level of borgmatic-config.yaml
  • Having borgmatic-config.yaml in the same directory that the .env file used by docker-compose.yaml
  • Passing the variables as environment to borgmatic variables within the docker compose file

Other notes / implementation ideas

Database tables, users and passwords could be passed to borgmatic as environment variables

Environment

  • borgmatic version: 1.5.12
  • borgmatic installation method: Docker container
  • Borg version: borg 1.1.14
  • Python version: 3.8.5
  • Database version (if applicable): mysql Ver 15.1 Distrib 10.5.8-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2
    operating system and version: Ubuntu 20.10

Related: #370

#### What I'm trying to do and why I would like to use borgmatic to make backups of the databases of my current services. Currently, I have (for example) a docker compose with a wordpress and a mysql service. I want to use borgmatic to automatically backup the mysql database. I do not want to expose the database port on the host, so I want to include borgmatic as an additional service within the docker compose. All the sensitive data (users and passwords) are wihin a `.env` file that is read by the docker compose file. Currently, I did not find a way to reuse this `.env` file for the `borgmatic-config.yaml` file. So far, I tried: - Create a `.env` files with the database user and password at the same level of `borgmatic-config.yaml` - Having `borgmatic-config.yaml` in the same directory that the `.env` file used by `docker-compose.yaml` - Passing the variables as environment to borgmatic variables within the docker compose file #### Other notes / implementation ideas Database tables, users and passwords could be passed to borgmatic as environment variables #### Environment - **borgmatic version:** 1.5.12 - **borgmatic installation method:** Docker container - **Borg version:** borg 1.1.14 - **Python version:** 3.8.5 - **Database version (if applicable):** mysql Ver 15.1 Distrib 10.5.8-MariaDB, for debian-linux-gnu (x86_64) using readline 5.2 **operating system and version:** Ubuntu 20.10 Related: #370
Author

Checking the sourcecode, I saw here that the variable MYSQL_PWD is assigned based on the database dictionary.

extra_environment = {'MYSQL_PWD': database['password']} if 'password' in database else None

Using that variable within the docker-compose.yaml makes the trick:

docker-compose.yaml

---
services
  borgmatic:
    image: b3vis/borgmatic
    container_name: borgmatic
    volumes:
      ...
    environment:
      - TZ=${TZ}
      - MYSQL_PWD=${WORDPRESS_DB_PASSWORD}

If we could do the same with an environment variable called, for example, MYSQL_USER we could reuse the sensitive data from within the docker-compose.yaml file.

I'll make a pull request about this with the suggested change to illustrate the changes that I suggest.

Checking the sourcecode, I saw [here](https://projects.torsion.org/witten/borgmatic/src/branch/master/borgmatic/hooks/mysql.py#L75) that the variable `MYSQL_PWD` is assigned based on the database dictionary. ```python extra_environment = {'MYSQL_PWD': database['password']} if 'password' in database else None ``` Using that variable within the `docker-compose.yaml` makes the trick: **docker-compose.yaml** ```yaml --- services borgmatic: image: b3vis/borgmatic container_name: borgmatic volumes: ... environment: - TZ=${TZ} - MYSQL_PWD=${WORDPRESS_DB_PASSWORD} ``` If we could do the same with an environment variable called, for example, `MYSQL_USER` we could reuse the sensitive data from within the `docker-compose.yaml` file. I'll make a pull request about this with the suggested change to illustrate the changes that I suggest.
Author

I do not have the rights to write into the borgmatic repo for pushing a branch with the suggested changes. Here is the code that would allow to use environment variables from the Dockerfile:

borgmatic/hooks/mysql.py:77

if 'MYSQL_USER' in os.environ:
	database['username'] = os.getenv('MYSQL_USER')

borgmatic/hooks/mysql.py:156 (After inserting previous one)

if 'MYSQL_USER' in os.environ:
	database['username'] = os.getenv('MYSQL_USER')

I would need some help for writting a good test for this. I would use a PyTest fixture for trying different configurations (both Dockerfile and borgmatic-config.yaml) but I saw that they are setup in a different way, and I'm not sure about the best way to proceed without big changes.

I do not have the rights to write into the borgmatic repo for pushing a branch with the suggested changes. Here is the code that would allow to use environment variables from the Dockerfile: **borgmatic/hooks/mysql.py:77** ```python if 'MYSQL_USER' in os.environ: database['username'] = os.getenv('MYSQL_USER') ``` **borgmatic/hooks/mysql.py:156** (After inserting previous one) ```python if 'MYSQL_USER' in os.environ: database['username'] = os.getenv('MYSQL_USER') ``` I would need some help for writting a good test for this. I would use a PyTest fixture for trying different configurations (both Dockerfile and borgmatic-config.yaml) but I saw that they are setup in a different way, and I'm not sure about the best way to proceed without big changes.
Owner

Sorry for the delay here. This seems like a nice addition, with the caveat that all borgmatic databases would have to have the same credentials if you want to pass them in via MYSQL_USER and MYSQL_PWD. As long as that works for your use case and/or you only have a single database to backup, then this would be a good solution.

As for testing, I could see two different ways to proceed:

  • Write a new unit test for each of dump_databases() and restore_database_dump() that tests that when MYSQL_USER is passed in, the appropriate command is issued with --user. This would be in tests/unit/hooks/test_mysql.py, and you can look at test_dump_databases_runs_mysqldump_with_username_and_password() for inspiration. You would run these tests just with tox.
  • Or, write an end-to-end database test that actually runs against a real MySQL instance. This would be in tests/end-to-end/test_database.py, but may be overkill for this type of change. You would run these tests with tox -e end-to-end.

In terms of making a pull request, I think the traditional way is to make a fork of the repository, and then you should be able to submit a PR with any changes you make there (either in your master or in a branch).

Please let me know if that makes sense. Thanks!

Sorry for the delay here. This seems like a nice addition, with the caveat that all borgmatic databases would have to have the same credentials if you want to pass them in via `MYSQL_USER` and `MYSQL_PWD`. As long as that works for your use case and/or you only have a single database to backup, then this would be a good solution. As for testing, I could see two different ways to proceed: * Write a new unit test for each of `dump_databases()` and `restore_database_dump()` that tests that when `MYSQL_USER` is passed in, the appropriate command is issued with `--user`. This would be in `tests/unit/hooks/test_mysql.py`, and you can look at `test_dump_databases_runs_mysqldump_with_username_and_password()` for inspiration. You would run these tests just with `tox`. * Or, write an end-to-end database test that actually runs against a real MySQL instance. This would be in `tests/end-to-end/test_database.py`, but may be overkill for this type of change. You would run these tests with `tox -e end-to-end`. In terms of making a pull request, I think the traditional way is to make a fork of the repository, and then you should be able to submit a PR with any changes you make there (either in your master or in a branch). Please let me know if that makes sense. Thanks!

I'd like to do the exact same thing as described above: reuse the exising database credentials in the environment and pass them to borgmatic. Though in my case I would need to support multiple different combinations of user and password for different databases.

Any idea how to solve this?

I think the best option from the user's perspective would be to pass arbitrary (or limited to some prefix like DB_USER_* and DB_PWD_*) environment variables to config.yaml so I could just use them inside the config. Is this possible?

I'd like to do the exact same thing as described above: reuse the exising database credentials in the environment and pass them to borgmatic. Though in my case I would need to support multiple different combinations of user and password for different databases. Any idea how to solve this? I think the best option from the user's perspective would be to pass arbitrary (or limited to some prefix like `DB_USER_*` and `DB_PWD_*`) environment variables to `config.yaml` so I could just use them inside the config. Is this possible?

Or maybe letting borgmatic check for the env variables MYSQL_PWD_<dbname> and MYSQL_USER_<dbname> automatically would be a solution?

Or maybe letting borgmatic check for the env variables `MYSQL_PWD_<dbname>` and `MYSQL_USER_<dbname>` automatically would be a solution?

Or maybe letting borgmatic check for the env variables MYSQL_PWD_ and MYSQL_USER_ automatically would be a solution?

On second thought, the name of the db is not unique and there is also no other unique property that could be used to differentiate the different db configs.

> Or maybe letting borgmatic check for the env variables MYSQL_PWD_<dbname> and MYSQL_USER_<dbname> automatically would be a solution? On second thought, the name of the db is not unique and there is also no other unique property that could be used to differentiate the different db configs.
Owner

On second thought, the name of the db is not unique and there is also no other unique property that could be used to differentiate the different db configs.

I'm curious how you have this working! Here's a quote from the borgmatic database docs under limitations: "borgmatic does not currently support backing up or restoring multiple databases that share the exact same name on different hosts."

If that limitation were kept in place, then I could envision making the database name part of the environment variable name as you suggest. E.g., for a database called invoices, the corresponding environment variable could be called MYSQL_PWD_INVOICES. But that sounds like it may not work for your use case if you've got multiple databases with the same name.

> On second thought, the name of the db is not unique and there is also no other unique property that could be used to differentiate the different db configs. I'm curious how you have this working! Here's a quote from the [borgmatic database docs](https://torsion.org/borgmatic/docs/how-to/backup-your-databases/) under limitations: "borgmatic does not currently support backing up or restoring multiple databases that share the exact same name on different hosts." If that limitation were kept in place, then I could envision making the database name part of the environment variable name as you suggest. E.g., for a database called `invoices`, the corresponding environment variable could be called `MYSQL_PWD_INVOICES`. But that sounds like it may not work for your use case if you've got multiple databases with the same name.

I'm curious how you have this working! Here's a quote from the borgmatic database docs under limitations: "borgmatic does not currently support backing up or restoring multiple databases that share the exact same name on different hosts."

Actually I don't have databases with the same name and I didn't know of that limitation. I just assumed there could me multiple databases with the same name.

If that limitation were kept in place, then I could envision making the database name part of the environment variable name as you suggest. E.g., for a database called invoices, the corresponding environment variable could be called MYSQL_PWD_INVOICES. But that sounds like it may not work for your use case if you've got multiple databases with the same name.

Indeed. Still, building that feature on that borgmatic limitation doesn't sound like the best idea, but maybe it's good enough for now. If the limitation will ever be removed this will break.

Would it be possible to add an arbitrary key as additional property in the database config and use this in the environment variable?

> I'm curious how you have this working! Here's a quote from the borgmatic database docs under limitations: "borgmatic does not currently support backing up or restoring multiple databases that share the exact same name on different hosts." Actually I don't have databases with the same name and I didn't know of that limitation. I just assumed there could me multiple databases with the same name. > If that limitation were kept in place, then I could envision making the database name part of the environment variable name as you suggest. E.g., for a database called invoices, the corresponding environment variable could be called MYSQL_PWD_INVOICES. But that sounds like it may not work for your use case if you've got multiple databases with the same name. Indeed. Still, building that feature on that borgmatic limitation doesn't sound like the best idea, but maybe it's good enough for now. If the limitation will ever be removed this will break. Would it be possible to add an arbitrary key as additional property in the database config and use this in the environment variable?
Owner

Just for some background: That limitation is there because the database name is used as part of the filename constructed when dumping, restoring, and backing up a database. So that filename is contained in the created Borg archive, and that filename is looked up when restoring a backup.

Your idea of supporting an additional (optional) arbitrary identifier could help here though! If not specified, then borgmatic could continue using the database name to construct dump filenames. But if an identifier is specified, then borgmatic could use that instead when constructing the filename. The only potential issue I see is that, to maintain backwards compatibility, a user could not add an identifier to an existing database configuration that already has backups they want to be restorable.

So this is all a longwinded way to say that your idea sounds like it'll work!

Just for some background: That limitation is there because the database name is used as part of the filename constructed when dumping, restoring, and backing up a database. So that filename is contained in the created Borg archive, and that filename is looked up when restoring a backup. Your idea of supporting an additional (optional) arbitrary identifier could help here though! If not specified, then borgmatic could continue using the database name to construct dump filenames. But if an identifier is specified, then borgmatic could use that instead when constructing the filename. The only potential issue I see is that, to maintain backwards compatibility, a user could not add an identifier to an existing database configuration that already has backups they want to be restorable. So this is all a longwinded way to say that your idea sounds like it'll work!

I had a look at the code to see how the two things (using optional identifier for database configs and check environment for db user/password) could be implemented. It doesn't look too complicated and one part might look something like this:

--- a/borgmatic/hooks/mysql.py
+++ b/borgmatic/hooks/mysql.py
@@ -69,9 +69,20 @@ def dump_databases(databases, log_prefix, location_config, dry_run):
 
     for database in databases:
         requested_name = database['name']
+        if 'id' in database:
+            requested_name = database['id']
+
         dump_filename = dump.make_database_dump_filename(
             make_dump_path(location_config), requested_name, database.get('hostname')
         )

But I'm struggling to understand the unit testing. I had a look at test_dump_databases_runs_mysqldump_for_each_database() and simply changed the following and the test still succeeds.

--- a/tests/unit/hooks/test_mysql.py
+++ b/tests/unit/hooks/test_mysql.py
@@ -36,7 +36,7 @@ def test_database_names_to_dump_queries_mysql_for_database_names():
 
 
 def test_dump_databases_runs_mysqldump_for_each_database():
-    databases = [{'name': 'foo'}, {'name': 'bar'}]
+    databases = [{'name': 'foo'}, {'name': 'baz'}]
     processes = [flexmock(), flexmock()]
     flexmock(module).should_receive('make_dump_path').and_return('')
     flexmock(module.dump).should_receive('make_database_dump_filename').and_return(

Shouldn't the test fail in this case, as the dump generated by borgmatic doesn't match the reference anymore? Could you enlighten me? What am I missing?

I had a look at the code to see how the two things (using optional identifier for database configs and check environment for db user/password) could be implemented. It doesn't look too complicated and one part might look something like this: ```diff --- a/borgmatic/hooks/mysql.py +++ b/borgmatic/hooks/mysql.py @@ -69,9 +69,20 @@ def dump_databases(databases, log_prefix, location_config, dry_run): for database in databases: requested_name = database['name'] + if 'id' in database: + requested_name = database['id'] + dump_filename = dump.make_database_dump_filename( make_dump_path(location_config), requested_name, database.get('hostname') ) ``` But I'm struggling to understand the unit testing. I had a look at `test_dump_databases_runs_mysqldump_for_each_database()` and simply changed the following and the test still succeeds. ```diff --- a/tests/unit/hooks/test_mysql.py +++ b/tests/unit/hooks/test_mysql.py @@ -36,7 +36,7 @@ def test_database_names_to_dump_queries_mysql_for_database_names(): def test_dump_databases_runs_mysqldump_for_each_database(): - databases = [{'name': 'foo'}, {'name': 'bar'}] + databases = [{'name': 'foo'}, {'name': 'baz'}] processes = [flexmock(), flexmock()] flexmock(module).should_receive('make_dump_path').and_return('') flexmock(module.dump).should_receive('make_database_dump_filename').and_return( ``` Shouldn't the test fail in this case, as the dump generated by borgmatic doesn't match the reference anymore? Could you enlighten me? What am I missing?
Owner

I think what's going on in this case is that there are enough other functions mocked out—and returning the "correct" value of bar regardless of the databases value—that even with an "incorrect" database name in databases, the tests still pass. But if you try changing bar to baz in one of those intermediate mocks instead, you will see that tests do indeed fail.

A more "correct" test might be to put the bar or baz value into a constant and then refer to that everywhere it's used. You know, for consistency and to sidestep the sort of testing issue you've found here. But personally, I think that's probably more hassle than it's worth.

Hope that makes sense!

I think what's going on in this case is that there are enough other functions mocked out—and returning the "correct" value of `bar` regardless of the `databases` value—that even with an "incorrect" database name in `databases`, the tests still pass. But if you try changing `bar` to `baz` in one of those intermediate mocks instead, you will see that tests do indeed fail. A more "correct" test might be to put the `bar` or `baz` value into a constant and then refer to that everywhere it's used. You know, for consistency and to sidestep the sort of testing issue you've found here. But personally, I think that's probably more hassle than it's worth. Hope that makes sense!
Owner

Some relevant discussion with another proposed solution on #546.

Some relevant discussion with another proposed solution on #546.
Owner

I think with #546 implemented now, the general ask of this ticket (using environment variables to set borgmatic database passwords) is supported in borgmatic master. It'll be part of the next release. See the documentation for more information: https://torsion.org/borgmatic/docs/how-to/provide-your-passwords/

If there are any details still to do, please feel free to follow up here or file a separate ticket.

I think with #546 implemented now, the general ask of this ticket (using environment variables to set borgmatic database passwords) is supported in borgmatic master. It'll be part of the next release. See the documentation for more information: https://torsion.org/borgmatic/docs/how-to/provide-your-passwords/ If there are any details still to do, please feel free to follow up here or file a separate ticket.
Owner

Released in borgmatic 1.6.4!

Released in borgmatic 1.6.4!
Sign in to join this conversation.
No Milestone
No Assignees
3 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: borgmatic-collective/borgmatic#382
No description provided.