Git management technique when multiple customers need multiple customizations
First of all, with 1k customizations, it sounds to me like something's wrong. With 1k different versions of the code, how many development teams would you need to really support them all in a timely fashion?
I would first consider if those 'customizations':
- really have to be done in-code?
- really have to be in the same repo?
- really have to be managed by developers?
- really have to be managed by you?
Any "no" to any of those turns your problem into a simpler one. Furthermore, they often go together. For example, a "no" to "in-code" means the customizations live mostly in configuration or in data in the database, and that almost always implies a "no" to #3 (managed by developers) as well.
Your goal, as the supporter/maintainer/planner of the development, is the last point. If the customizations can be moved out of the code into config, or into separate modules/plugins in different repos, or if they can be offloaded to another team (customer support team?), then your base problem dissolves.
Now, considering that for some reason you can't do it, then.. a mutation of a strict Git-Flow is your friend. Feature branches everywhere. Generous use of integration and release branches. Really strict flow, and not tutorialish Git-Flow, but rather an 'extended' one I would say..
Git-Flow at its core is heavily based on feature branches, but beyond that, it is a single-integration, single-release workflow.
Before reading further, you should know well how Git-Flow works. If not, go read up on it and continue only when you feel you know it well.
In 'tutorialish' Git-Flow, every developer configures:
- name of target core master branch: master
- name of target next-release integration branch: develop
- naming scheme for feature branches: feature/*
- naming scheme for release branches: release/*
- ..
For "1K customers", there's a problem with this setup. You can have tons of feature branches: for core improvements, for new features, for various customizations, etc. However, once you 'finish' them, they go to develop, the common develop, and you're busted. Everyone else would get them. A 'finished' customization branch for customer #1 would eventually flow through develop into release and eventually into master, and next month or year all the other customers would get it too. You don't want that.
This proves that there can be no single develop, no single master.
Let's take another approach.
In 'extended' Git-Flow (Git-Multi-Flow, sounds catchy, anyone?), every developer working for customer XYZ configures, for example:
- name of target core master branch: master-XYZ
- name of target next-release integration branch: develop-XYZ
- naming scheme for feature branches: feature/*
- naming scheme for release branches: release/*
- ..
I've planned for them to use DIFFERENT names for the master/develop branches. That means that for each customer, I have a separate next-release integration branch and a separate current-release branch.
What does that give you? Any developer who works on a feature (be it a patch, a customization, whatever) just uses git flow. But if they work in the context of customer XYZ, they will 'finish' the feature and it will be integrated on the dedicated develop-XYZ branch. When released, it will get merged into the dedicated master-XYZ branch. No possibility of "leaking" features or customizations.
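To make the routing concrete, here is a toy Python model of where a finished feature ends up. It does not call Git at all: branches are just sets of feature names, and the helper names (finish_feature, release) and the feature names are made up for illustration.

```python
from collections import defaultdict

# Toy model: each branch is just the set of feature names merged into it.
branches = defaultdict(set)


def finish_feature(feature, customer=None):
    """Merge a finished feature into the integration branch it targets.

    customer=None models the shared core 'develop'; otherwise the
    per-customer 'develop-<customer>' branch is used.
    """
    target = "develop" if customer is None else f"develop-{customer}"
    branches[target].add(feature)
    return target


def release(customer=None):
    """Merge the integration branch into the matching master branch."""
    suffix = "" if customer is None else f"-{customer}"
    branches[f"master{suffix}"] |= branches[f"develop{suffix}"]
    return f"master{suffix}"


finish_feature("feature/custom-logo", customer="XYZ")  # customization for XYZ
finish_feature("feature/core-speedup")                 # shared core work
release(customer="XYZ")
release()

print(sorted(branches["master-XYZ"]))  # only the XYZ customization
print(sorted(branches["master"]))      # only the core change
```

In this model the XYZ customization never reaches master or any other customer's branches, which is the whole point of the separate develop-XYZ/master-XYZ pair.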
However, 'feature' branches are still normal. If at some point you want feature #100 (originally for customer ABC) and bugfix #321 (originally for customer DEF), you can still merge them for another customer like XYZ, provided that the relevant develop-?/master-? branches have not diverged too much (but that's another story).
If it sounds nice, well, good, but it's not that nice. With a decent number of customers and a separate develop/master for each of them, you will quickly notice that:
- if customers' customizations are hard, their develop/master branches will diverge often, not only in customized areas, but in the core code as well. That will impede future feature sharing between them
- developers will get grumpy about having to change the git-flow configuration constantly (switch to master-FOO, switch to master-BAR, master-XYZ), or about having to keep several repo copies configured for different customers
- having separate develop/master is almost like having separate repos, but change notifications will spam everyone
- (..)
That's for starters. Some of them can be mitigated. The second point can be solved by noticing that the git-flow setup is just a few lines in the .git/config file, so you don't need to re-configure git-flow or keep several copies of the repo. Just edit the file. Or keep copies of the config and just swap that file. Or use some $ENV var to indicate the current customer. There are dozens of other ways to solve it.
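One way to do the flip is to generate the handful of git config calls from the customer name. The sketch below assumes the gitflow.branch.* and gitflow.prefix.* keys that git flow init writes into .git/config (check your own file, since git-flow variants may differ), and a made-up GITFLOW_CUSTOMER environment variable. It only builds and prints the commands; it does not run them.

```python
import os


def gitflow_settings(customer=None):
    """Return the git-flow config keys for one customer (None = shared core)."""
    suffix = f"-{customer}" if customer else ""
    return {
        "gitflow.branch.master": f"master{suffix}",
        "gitflow.branch.develop": f"develop{suffix}",
        # Feature and release prefixes stay the same for every customer.
        "gitflow.prefix.feature": "feature/",
        "gitflow.prefix.release": "release/",
    }


def git_config_commands(customer=None):
    """Build the `git config` argument lists that switch the local repo."""
    return [
        ["git", "config", key, value]
        for key, value in gitflow_settings(customer).items()
    ]


# GITFLOW_CUSTOMER is a hypothetical variable name, not a git-flow feature.
current = os.environ.get("GITFLOW_CUSTOMER") or None
for cmd in git_config_commands(current):
    print(" ".join(cmd))
```

To actually apply the switch, each argument list could be passed to subprocess.run(cmd, check=True) from inside the repository.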
Next thing, features. I deliberately didn't consider naming feature branches feature/XYZ/ but left it as a simple feature/. Features don't have to be bound to a customer. You can share them (etc.) later.
Next thing, releases. I deliberately didn't consider naming release branches release/XYZ/. You will probably never share them, though you may. However, I suppose you already have a better naming scheme for release branches than just prefixing them with the customer name, as release/XYZ would do. You need a version number or date in there as well. Maybe some feature-set code name too. I don't know. Invent something here.
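As one possible way to "invent something", here is a small Python helper that composes a release branch name from customer, version, date and an optional code name. The scheme itself (release/CUSTOMER-VERSION-DATE[-CODENAME]) is purely an assumption for illustration, and the check covers only a subset of the characters Git rejects in ref names.

```python
import re
from datetime import date

# A subset of what Git rejects in ref names: whitespace, ~ ^ : ? * [ \ and "..".
_FORBIDDEN = re.compile(r"[\s~^:?*\[\\]|\.\.")


def release_branch_name(customer, version, day, codename=None):
    """Compose a release branch name under the usual release/ prefix."""
    parts = [customer, version, day.isoformat()]
    if codename:
        parts.append(codename)
    name = "release/" + "-".join(parts)
    if _FORBIDDEN.search(name):
        raise ValueError(f"not a usable branch name: {name!r}")
    return name


print(release_branch_name("XYZ", "2.4.0", date(2024, 5, 1)))
print(release_branch_name("ABC", "1.0.3", date(2024, 6, 15), codename="summer"))
```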
Next thing, core master and develop. For customers, devs are working on develop-XYZ and master-XYZ. But you can still have the core master and develop for working on improvements to the shared core code. No changes there.
Next thing, I said master-XYZ, develop-XYZ. But it does not have to be like that. You know about feature/... So why not masters/XYZ & develops/XYZ? Sure, can do. You can even use XYZ/master and XYZ/develop and XYZ/feature/, but then, why not make separate repos like on GitHub, where you can Fork/Merge/PR one off another?
Git-Multi-Flow is an extension of vanilla Git-Flow, and with the latter naming scheme (XYZ/master etc.) you end up with something like multiple logical repos in a single repo. Kind of multi-tenant Git-Flow..
So.. It's possible. Not that hard to set up. But still, 1000+ sets of master/develop/etc. branches, handled by the same group of teams, will be a pain. There will be mistakes. Devs will mix things up from time to time, they're just people.
In my experience, when "lots of customizations" happen, it's almost always just URLs, ports, passwords, images, styles, text, maybe sometimes limited layout changes. Maybe one extra module here or there. You should be able to handle it all in the core code, and turn it on/off/configure it via configuration. This way the number of long-lived 'customization branches' (aka master-XYZ) will be greatly reduced, at the cost of more detailed configuration for each customer deployment.
That configuration should also be tracked and versioned (i.e., again in Git). I hope you do that already. But it should sit in a different storage than the code. It's a different thing, managed by the support or deployment team. Plan for it ahead of time.
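A minimal Python sketch of that idea: the core ships defaults, and each customer deployment supplies only its overrides. The setting names (base_url, port, theme, extra_modules) and the JSON format are assumptions for illustration, not something any particular product uses.

```python
import json

# Defaults ship with the core code; every customization is a switch or a value.
CORE_DEFAULTS = {
    "base_url": "https://example.invalid",
    "port": 8080,
    "theme": "default",
    "extra_modules": [],
}


def load_customer_config(overrides_json):
    """Merge one customer's overrides (a JSON document) over the core defaults."""
    overrides = json.loads(overrides_json)
    unknown = set(overrides) - set(CORE_DEFAULTS)
    if unknown:
        # Fail loudly on typos instead of silently ignoring a setting.
        raise KeyError(f"unknown settings: {sorted(unknown)}")
    return {**CORE_DEFAULTS, **overrides}


# In practice this text would come from the separately versioned config storage.
xyz = load_customer_config(
    '{"port": 9443, "theme": "xyz-blue", "extra_modules": ["reports"]}'
)
print(xyz["port"], xyz["theme"], xyz["base_url"])
```

Every customer then runs the same code from the same branch; only the small overrides document differs per deployment.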
Use multiple repositories for that. Of course, this requires the application to be built in a way that supports it. So create one core application with the stuff that is common across all customers. For each customer, create a specific module with the customizations. Of course, these go into separate repositories. If a client requests a feature that requires customer-specific changes to a core module, refactor your application so that the customization can be done outside the core module. If time is an issue, use a client-specific branch and do cherry-picking or merging. This is OK as long as it is for a short period of time. We successfully use this approach at one of my clients, who has exactly the same challenge.
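A rough Python sketch of such a split, squeezed into one file for readability. The CoreApp class, its invoice hook, and the XYZ discount are all hypothetical; the point is only that the core exposes an extension point and the customer module, which would live in its own repository, plugs into it.

```python
class CoreApp:
    """Shared core: knows nothing about any particular customer."""

    def __init__(self):
        self._invoice_hooks = []

    def register_invoice_hook(self, hook):
        """Extension point that customer modules may use."""
        self._invoice_hooks.append(hook)

    def build_invoice(self, amount):
        invoice = {"amount": amount, "lines": []}
        for hook in self._invoice_hooks:
            hook(invoice)  # each hook may adjust the invoice in place
        return invoice


# --- would live in the customer's own repository, e.g. a package for XYZ ---
def xyz_setup(app):
    def add_discount(invoice):
        invoice["lines"].append("XYZ loyalty discount")
        invoice["amount"] = round(invoice["amount"] * 0.9, 2)

    app.register_invoice_hook(add_discount)


app = CoreApp()
xyz_setup(app)  # a deployment for XYZ installs and loads its module
print(app.build_invoice(100.0))
```

If a requested customization cannot be expressed through an existing hook, that is the signal to refactor the core and add a new extension point rather than to edit the core for one customer.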
Short answer: Don't use the git repo to model this demand.
Instead, unfortunately, I suggest making everything that differs from one customer to another configurable. That way you can deploy the same version to every customer and have only one release channel.
We have a similar situation in our company and this is the conclusion I arrived at over time. The downside is that these configurations can get messy, make the code complicated and your system less deterministic / difficult to test. The problem is that it needs to be done right, regarding the coding and the modeling. I think the most important thing is to have significant experience with a product or domain, so one can tell which settings should be modeled how, or even push back if a client wants something weird.
Here's my tier list of configurability (from best to worst):
1. Using persistence-layer (DB) modeling of the configuration
   - Allows user self-service via UI/ConfigTools, but takes a lot of effort to implement
2. Using dynamically combinable modules / packages / services
   - Requires complex architecture and deployments, often very/too complicated
3. Using config files
   - May turn into a nightmare with 1k customers: simple, but doesn't scale so well (e.g. when adding a new config setting)
4. Hard-coded (e.g. dependency injection)
   - Bad: requires a branch/repo per customer, not really practicable
Fundamentally, there's a common core for every client, and there's a delta from that for each client. The question is where you keep and structure this delta: how do you save, display and model it? That's why #1 is the best: the config is saved in the deployed instance itself and is migrated up with every version. It's a pure runtime thing (which can be backed up for safety, though). #2 and #3 are kind of in the same league; both are not ideal and can get messy when you have more than, say, 20 settings/differences.
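To illustrate what "saved in the instance and migrated up with every version" could look like, here is a small Python sketch using the standard sqlite3 module. The table layout and the setting names (theme, invoice_footer) are assumptions; an in-memory database stands in for the deployed instance's real one.

```python
import sqlite3


def migrate(conn, new_defaults):
    """Create the settings table if needed and add any settings introduced
    by the new version, without touching values the customer already set."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS settings "
        "(key TEXT PRIMARY KEY, value TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT OR IGNORE INTO settings (key, value) VALUES (?, ?)",
        list(new_defaults.items()),
    )
    conn.commit()


def get_setting(conn, key):
    row = conn.execute(
        "SELECT value FROM settings WHERE key = ?", (key,)
    ).fetchone()
    if row is None:
        raise KeyError(key)
    return row[0]


conn = sqlite3.connect(":memory:")   # stands in for the instance's database
migrate(conn, {"theme": "default"})  # version 1 of the product
# The customer changes a setting, e.g. through a config UI.
conn.execute("UPDATE settings SET value = 'xyz-blue' WHERE key = 'theme'")
# Version 2 adds a new setting; the existing choice must survive the upgrade.
migrate(conn, {"theme": "default", "invoice_footer": ""})
print(get_setting(conn, "theme"), repr(get_setting(conn, "invoice_footer")))
```

The same code version is deployed everywhere; the per-customer delta lives entirely in the rows of that table.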