Request for shallow merge. #672

Closed
opened 2023-04-10 21:04:43 +00:00 by alsoeric · 10 comments

What I'm trying to do and why

In addition to deep merging, I would like a shallow merge option, because I want to replace entire sections like location rather than just add to them.

Steps to reproduce (if a bug)

Subset of master template

location:
    # List of source directories to backup. Globs and tildes are
    # expanded. Do not backslash spaces in path names.
    source_directories:
        - /home
        - /etc
        - /var/log/syslog*
        - /home/user/path with spaces
    repositories:
        - path: ssh://user@backupserver/./sourcehostname.borg
          label: backupserver
        - path: /mnt/backup
          label: local

changes to template:

<<: !include /etc/borgmatic/pw-common.yaml                                               
location:                                                                                 
  source_directories:                                                                     
    - /nfs/home                                                                           
    - /etc
  repositories:
    - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
      label: backupserver

Actual behavior (if a bug)

I think this is what I get when the two files are deep merged. I really can't tell, but the configuration validation command really should have an option to dump out the end result so you can see whether you are getting what you think you are getting.

location:
    # List of source directories to backup. Globs and tildes are
    # expanded. Do not backslash spaces in path names.
    source_directories:
        - /home
        - /etc
        - /var/log/syslog*
        - /home/user/path with spaces
        - /nfs/home
        - /etc        
    repositories:
        - path: ssh://user@backupserver/./sourcehostname.borg
          label: backupserver
        - path: /mnt/backup
          label: local
        - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
          label: backupserver
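
For readers following along, here is a minimal Python sketch of the deep-merge behavior guessed at above. It is not borgmatic's code: the `deep_merge` function, and the assumption that lists are concatenated with the template's items first, are illustrative only and simply mirror the merged result shown in this comment.

```python
import copy


def deep_merge(base, override):
    """Recursively merge override into base (illustrative, not borgmatic's code).

    Dicts are merged key by key, lists are concatenated, and any other
    value from override wins.
    """
    if isinstance(base, dict) and isinstance(override, dict):
        merged = copy.deepcopy(base)
        for key, value in override.items():
            if key in base:
                merged[key] = deep_merge(base[key], value)
            else:
                merged[key] = copy.deepcopy(value)
        return merged
    if isinstance(base, list) and isinstance(override, list):
        return copy.deepcopy(base) + copy.deepcopy(override)
    return copy.deepcopy(override)


template = {
    "location": {
        "source_directories": [
            "/home",
            "/etc",
            "/var/log/syslog*",
            "/home/user/path with spaces",
        ],
        "repositories": [
            {"path": "ssh://user@backupserver/./sourcehostname.borg", "label": "backupserver"},
            {"path": "/mnt/backup", "label": "local"},
        ],
    }
}
changes = {
    "location": {
        "source_directories": ["/nfs/home", "/etc"],
        "repositories": [
            {"path": "jdktn@deadbeaf.rsync.net:deadbeaf-home", "label": "backupserver"},
        ],
    }
}

merged = deep_merge(template, changes)
```

Under these assumptions, `/etc` ends up listed twice and all three repositories are kept, which is the outcome this issue wants a way to avoid.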

Expected behavior (if a bug)

location:
    # List of source directories to backup. Globs and tildes are
    # expanded. Do not backslash spaces in path names.
    source_directories:
        - /nfs/home
        - /etc        
    repositories:
        - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
          label: backupserver
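
The expected result above corresponds to a shallow merge, which can be sketched in Python as follows. This is a sketch under stated assumptions, not borgmatic's implementation: `shallow_merge` is a hypothetical helper, and the `consistency` section in the template is borrowed from later in this thread purely to show that untouched sections survive.

```python
def shallow_merge(base, override):
    """Replace whole top-level sections instead of merging inside them.

    Illustrative only: any section named in override replaces the
    matching section in base; sections missing from override are kept.
    """
    return {**base, **override}


template = {
    "location": {
        "source_directories": ["/home", "/etc", "/var/log/syslog*"],
        "repositories": [{"path": "/mnt/backup", "label": "local"}],
    },
    # Assumed extra section, included only to show it is left alone.
    "consistency": {"checks": ["repository", "archives"]},
}
changes = {
    "location": {
        "source_directories": ["/nfs/home", "/etc"],
        "repositories": [
            {"path": "jdktn@deadbeaf.rsync.net:deadbeaf-home", "label": "backupserver"},
        ],
    }
}

merged = shallow_merge(template, changes)
```

Here the whole `location` section comes from the changes file, while `consistency` is inherited from the template unchanged.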
          

Other notes / implementation ideas

Selecting between deep and shallow merge is a challenge. It's probably too late to put in a deep merge versus shallow merge differentiator but one thought is to use the = to indicate replacement of this section and a + to indicate addition. This would naturally extend to adding a - to say this is what I want you to remove.

starting with the same template as above:

<<: !include /etc/borgmatic/pw-common.yaml                                               
location:                                                                                 
  =source_directories:
    - /nfs/home                                                                           
    - /etc
  +repositories:
    - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
      label: backupserver
  -repositories:
    - path: ssh://user@backupserver/./sourcehostname.borg
      label: backupserver

which should yield:

location:
    # List of source directories to backup. Globs and tildes are
    # expanded. Do not backslash spaces in path names.
    source_directories:
        - /nfs/home
        - /etc        
    repositories:
        - path: /mnt/backup
          label: local
        - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
          label: backupserver

The YAML parser will probably complain about the leading =, + and - characters, but I'm sure there are other quick notations one could use to express how to replace, add to, and subtract from the existing template.
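
To make the intended semantics of the proposed prefixes concrete, here is a small Python sketch operating on already-parsed data. Everything in it is hypothetical: `apply_prefixed` is not a borgmatic function, the prefixes are only the notation floated in this comment, and operations are assumed to be applied in the order they are written.

```python
def apply_prefixed(base, changes):
    """Apply '=', '+' and '-' prefixed keys to a copy of one section.

    Hypothetical semantics for the notation proposed in this issue:
      '=' replaces the list, '+' appends new items, '-' removes items.
    """
    result = {key: list(value) for key, value in base.items()}
    for raw_key, items in changes.items():
        op, key = raw_key[0], raw_key[1:]
        current = result.get(key, [])
        if op == "=":
            result[key] = list(items)
        elif op == "+":
            result[key] = current + [item for item in items if item not in current]
        elif op == "-":
            result[key] = [item for item in current if item not in items]
        else:
            raise ValueError(f"unknown prefix in {raw_key!r}")
    return result


base_location = {
    "source_directories": ["/home", "/etc", "/var/log/syslog*", "/home/user/path with spaces"],
    "repositories": [
        {"path": "ssh://user@backupserver/./sourcehostname.borg", "label": "backupserver"},
        {"path": "/mnt/backup", "label": "local"},
    ],
}
location_changes = {
    "=source_directories": ["/nfs/home", "/etc"],
    "+repositories": [
        {"path": "jdktn@deadbeaf.rsync.net:deadbeaf-home", "label": "backupserver"},
    ],
    "-repositories": [
        {"path": "ssh://user@backupserver/./sourcehostname.borg", "label": "backupserver"},
    ],
}

location = apply_prefixed(base_location, location_changes)
```

With the inputs above, the result matches the "which should yield" listing: the replaced source directories, plus the local and rsync.net repositories.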

Environment

borgmatic version: [1.7.11]

Use sudo borgmatic --version or sudo pip show borgmatic | grep ^Version

borgmatic installation method: [pip3 into virtual environment]

Borg version: [1.2.4]

Use sudo borg --version

Python version: [3.10.6]

Use python3 --version

Database version (if applicable): [version here]

Use psql --version or mysql --version on client and server.

operating system and version: [OS here]

Owner

Thanks for taking the time to file this one. Yeah, the tricky part here is the configuration syntax to support it. One idea is just limiting what actually gets included. Made-up example:

<<: !include /etc/borgmatic/pw-common.yaml[consistency, retention]                     

Or maybe even:

<<: !include /etc/borgmatic/pw-common.yaml[omit=location]                     

What was your thought BTW on the work-around as posted on Reddit (https://www.reddit.com/r/BorgBackup/comments/12e8n0u/borgmatic_one_template_many_fragments_take_2/jfbyskc/)? It's potentially less convenient than "real" shallow merging because you have to individually include each section you want to merge. But it does have the distinct benefit of: 1. Working now, and 2. Allowing you to individually indicate which sections you want to merge and which ones you want to replace.

Is it just that shallow merging would be more convenient?

Author

The solution presented on Reddit gets overwhelming relatively quickly. For example, I have one environment in which I am trying to back up 10 different sets of files. Each set is terabytes in size, so I want to back up each of them individually. With the Reddit approach, I would need a set of imported files for every data set I'm trying to back up. I may be able to cut down the number of files somewhat if I'm clever about constant substitution, but still, it rapidly becomes easier to just have 10 full template files and manually synchronize the common elements.

With the proposed change to merging, I would only have to have 11 configuration files, 10 of which are very small. Deleting all the unnecessary key-value pairs from the base template might be a way to shrink the complexity of having 10 templates as well.

With shallow merging as proposed, it would not just be more convenient, it would make the changes much clearer.

Your example include references, which omit or propagate different sections, could be useful, but not as useful as the fine-grained changes I proposed. I haven't dug deep into it, but it looks like I would still have to cut and paste different chunks for common sections in my current use cases.

There may be some other ways to slice this but right now I need to get the job done so I'm going to take the brute force and bloody ignorance approach and replicate configuration files.

How much of a violation of your conceptual model would it be to add a key-value pair that specifies whether the next section is replaced or merged? Something like:

location:  
  source_directories:
    - MERGE
    - /home 
    - /etc                                                                                                            
  repositories:
    - REPLACE
    - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
      label: backupserver
      
consistency:
  REPLACE: true
  checks:
    - repository
    - archives
  check_last: 3
  prefix: home

The downside with this notation is that I don't believe YAML allows for a single key without value in the hierarchy. It would mean having to change notation for merge versus replace depending on where the keyword happens.

Owner

Thanks for the explanation about your use case.

What about something like this, which might be more "supported" (https://yaml.org/spec/1.2.2/#example-various-explicit-tags) by YAML:

location:  
  source_directories:
    - /home 
    - /etc                                                                                                            
  repositories: !replace
    - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
      label: backupserver
      
consistency: !replace
  checks:
    - repository
    - archives
  check_last: 3
  prefix: home

I'm assuming a merge default, which is why I only tagged !replace.

Author

looks good. to confirm:

location:  
  source_directories:
    - /home 
    - /etc                                                                                                            
  repositories:
    - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
      label: backupserver
      
consistency:
  checks:
    - repository
    - archives
  check_last: 3
  prefix: home

and

location:
  source_directories: !replace
    - /xyzzy
    
consistency:
  checks:
    - repository  !remove
  prefix: plugh

would yield:

location:  
  source_directories:
    - /xyzzy                                                                                                           
  repositories:
    - path: jdktn@deadbeaf.rsync.net:deadbeaf-home
      label: backupserver
      
consistency:
  checks:
    - archives
  check_last: 3
  prefix: plugh
  

I added !remove so the core template values could propagate outward. Using !replace would require copying what is in the core template into the backup template, and a core template change that should be pushed to the backup config could then be missed.

Owner

I think that would work, although to be valid YAML and have the !remove parsed as a tag, it would have to go first before a scalar value:

location:
  source_directories: !replace
    - /xyzzy
    
consistency:
  checks:
    - !remove repository
  prefix: plugh

I've already got a prototype of !replace working. I'll see what it would take to support !remove when I get a chance.

witten added the design finalized label 2023-04-11 21:29:36 +00:00
Owner

FYI I decided to change the name of !replace to !retain due to ambiguity concerns. !replace could be interpreted as "Replace this YAML node with whatever gets merged in from the included configuration file," which is the opposite of what it means. Whereas !retain says to me: "Keep this YAML node as-is regardless of what gets merged in."
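
The "keep this node as-is" reading of !retain can be sketched in Python on already-parsed data. This is a sketch under stated assumptions rather than borgmatic's implementation: the `Retain` wrapper class is a hypothetical stand-in for a node carrying the tag, `merge` is an illustrative helper, and untagged lists are assumed to be concatenated with the included file's items first.

```python
class Retain:
    """Hypothetical marker standing in for a YAML node tagged !retain."""

    def __init__(self, value):
        self.value = value


def merge(included, local):
    """Deep merge local over included, except where local is marked Retain."""
    if isinstance(local, Retain):
        # Keep the local node as-is and ignore whatever was included.
        return merge(None, local.value)
    if isinstance(local, dict):
        merged = dict(included) if isinstance(included, dict) else {}
        for key, value in local.items():
            merged[key] = merge(merged.get(key), value)
        return merged
    if isinstance(local, list) and isinstance(included, list):
        return included + local
    return local


included = {
    "location": {
        "source_directories": ["/home", "/etc"],
        "repositories": [{"path": "/mnt/backup", "label": "local"}],
    }
}
local = {
    "location": {
        "source_directories": ["/nfs/home"],
        "repositories": Retain(
            [{"path": "jdktn@deadbeaf.rsync.net:deadbeaf-home", "label": "backupserver"}]
        ),
    }
}

config = merge(included, local)
```

In this sketch the untagged `source_directories` list is merged as before, while the tagged `repositories` list contains only the locally defined repository.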

Author

> FYI I decided to change the name of !replace to !retain due to ambiguity concerns. !replace could be interpreted as "Replace this YAML node with whatever gets merged in from the included configuration file," which is the opposite of what it means. Whereas !retain says to me: "Keep this YAML node as-is regardless of what gets merged in."

good plan. let me know when I can test.

Owner

I've implemented !retain support in master, and it will be part of the next release. Documentation will show up here shortly: https://torsion.org/borgmatic/docs/how-to/make-per-application-backups/#shallow-merge

If you have the interest/ability to test this in master, I'd welcome any feedback on the feature or the docs. Otherwise, you can wait until the next release and we can "do it live."

!remove has not been implemented yet.

Owner

!remove has been implemented now, although I've called it !omit because apparently I can't help renaming things (and also I didn't want any confusion with !retain). Currently, it only works on scalar list items. The documentation is here: https://torsion.org/borgmatic/docs/how-to/make-per-application-backups/#list-merge

This will be part of the next release, which is coming soon. Please let me know if you have any feedback on either !omit or !retain when you try them.
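
As a rough illustration of !omit on scalar list items, here is a Python sketch on already-parsed data. It is not borgmatic's code: the `Omit` wrapper class is a hypothetical stand-in for a tagged list item, `merge_list` is an illustrative helper, and the ordering of the result (included items first) is an assumption.

```python
class Omit:
    """Hypothetical marker standing in for a scalar list item tagged !omit."""

    def __init__(self, value):
        self.value = value


def merge_list(included, local):
    """Merge two lists, dropping included items that local marks as omitted."""
    omitted = [item.value for item in local if isinstance(item, Omit)]
    kept = [item for item in included if item not in omitted]
    added = [item for item in local if not isinstance(item, Omit)]
    return kept + added


# Values taken from the consistency example earlier in this thread.
included_checks = ["repository", "archives"]
local_checks = [Omit("repository")]

checks = merge_list(included_checks, local_checks)
```

With these inputs only `archives` remains, matching the result the author asked to confirm earlier in the thread.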

Owner

Just released as part of borgmatic 1.7.12.

Reference: borgmatic-collective/borgmatic#672