Is having relationships between separate databases bad practice?
The only issue is that there isn't really a convenient way to enforce these relationships. You can use a boatload of triggers, or store the master table in each instance database and simply expose all of them as metadata to the master, using a view perhaps.
The biggest downside, I think, has to do with keeping the data accurate and synchronized, in at least three scenarios off the top of my head (all of which can still be an issue even if you use triggers):
- improperly coordinated cross-database transactions - for example one transaction adds a new row to master, hands off the identity to another transaction, and then the first transaction rolls back.
- recovery from a single database failure - a database crashes, you have to restore to a point in time, that may have been before other successful transactions in other databases had finished. This could mean if an instance database has to revert to an earlier time, it is missing rows that master expects; if master has to revert, it's possible that all of the instance databases will have values that don't exist in master.
- backup consistency in general - if you need disaster recovery, you're going to have a hard time keeping all your full and log backups transactionally consistent to a point in time. File system snapshots don't help with this, and even Availability Groups don't guarantee cross-database transactional consistency for the databases in a single Availability Group. If you have a disaster and need to restore your databases to a new system, there's no way to be sure they will be transactionally consistent.
All that said, I have always been and always will be a fan of separating tenants out into their own databases, and I acknowledge there is risk associated with the central database - but it is necessary.
(As an aside, you should use names for your databases other than master / instance. master
in SQL Server definitely has an explicit meaning, and even reading my own words above they could be ambiguous. Same for instance
. In a previous life I ran a multi-tenant system and we called the central database Control
and the tenant databases, well, Tenants
.)
If you are following an architecture like Microservices, it's pretty much a must to be able to correlate data across multiple autonomous services, each with its own data store. It will be understand by each service that business processes that would traditionally be performed in a single database transaction now may be potentially longer-running.