April 20, 2010 at 5:59 am
First some brief background. We have a global SaaS solution for the insurance industry. Our product has a dynamic UI that is database driven. That is, there is a set of “metadata” tables that are dedicated to specifying the data model and UI for a particular client’s insurance product. We maintain an identical code base at each of our global colos, including both production and QA server environments. In contrast to the code, however, the aforementioned metadata will vary in each environment depending on the clients and insurance products that pertain to that environment.
Metadata table PKs are independently generated in each environment as follows. When a metadata PK must be generated, it is retrieved from a small table called NextKey. The NextKey table has columns something like:
--TableNameID (PK)
--NextKey (int)
Thus, whenever a record is inserted into a metadata table, the PK for that INSERT is retrieved from the NextKey column for that TableNameID and then NextKey is incremented (this mechanism is encapsulated in some low-level ORM function).
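To make this concrete, here is a simplified sketch of the idea (not our actual code; any names beyond NextKey and TableNameID are made up):

CREATE TABLE dbo.NextKey
(
    TableNameID varchar(128) NOT NULL PRIMARY KEY,
    NextKey     int          NOT NULL
);
GO

CREATE PROCEDURE dbo.GetNextKey
    @TableNameID varchar(128),
    @Key         int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    -- Read the current value and increment it in a single atomic UPDATE;
    -- @Key receives the pre-increment value, which becomes the new row's PK.
    UPDATE dbo.NextKey
       SET @Key    = NextKey,
           NextKey = NextKey + 1
     WHERE TableNameID = @TableNameID;
END;
GO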
Given this context, you can see that a serious issue arises when we wish to copy (ETL) metadata between environments. Since the PKs are independently generated, they can easily collide on INSERTs when records are moved from one environment to another. We also need to support UPDATE scenarios. For example, let’s say new metadata is created in QA and then copied to production (all INSERTs). Next, metadata modifications occur in QA and need to update the corresponding metadata in production. We need to match the source records against all of the corresponding target records so that UPDATE operations can occur.
I initially considered an ETL-time approach wherein the mapping of PKs (and FKs) from one environment to another would be tracked using special, centralized tables designed for that purpose. The advantage of that approach is that it would only have an effect at ETL time, which is relatively infrequent.
Another, more elegant approach is to generate metadata PKs from a centralized Key Server (KS). This would essentially take the NextKey service and move it to a centralized server that all of the colos would call whenever they needed a metadata PK. While there are several issues with this approach (discussed below), it does have the enormous benefit of completely eliminating the ETL issue. Metadata records can be moved between environments without any concerns about PKs and FKs. If a PK doesn’t exist in the target environment, an INSERT occurs using that PK. If a PK does exist in the target environment, an UPDATE occurs using that PK. Simple and infallible.
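To illustrate what the ETL step per metadata table could then look like (names here are invented, and this assumes SQL Server 2008 so that MERGE is available; separate UPDATE/INSERT statements would do the same job on 2005):

-- Illustrative only: ProductMetadata stands in for any metadata table, and
-- #SourceMetadata is the data staged from the source environment.
MERGE dbo.ProductMetadata AS target
USING #SourceMetadata AS source
   ON target.ProductMetadataID = source.ProductMetadataID  -- PKs are globally unique
WHEN MATCHED THEN
    UPDATE SET target.Name        = source.Name,
               target.Description = source.Description
WHEN NOT MATCHED BY TARGET THEN
    INSERT (ProductMetadataID, Name, Description)
    VALUES (source.ProductMetadataID, source.Name, source.Description);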
Here are the issues and potential resolutions:
•Single point of failure (availability)—It would obviously be catastrophic if the KS went down and none of the colos could obtain metadata PKs.
Potential solution: We will need to have a very robust failover mechanism. Key generation would need to be wrapped in a transaction that included the updating of the failover KS. That is, under no circumstances can the failover KS be allowed to generate PKs that duplicate existing PKs.
•Performance and scalability—Currently, the KS will be called from about 12 environments globally and this number could easily double over the next couple years.
Potential solution: Create a “cached pool” mechanism that would overcome any performance issues by retrieving PKs in batches. Thus, the local server would retrieve a pool of, say, 1,000 PKs at a time for each metadata table (perhaps it would be sufficient to simply return a lower and upper bound for the PKs and assume that local PKs will be generated sequentially between those values). A service running locally would then serve up PKs from this pool. The pool would be a FIFO, so that, in the background, new PKs would be retrieved from the centralized KS and added to the end of the FIFO.
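For example, the KS might expose something like the following (purely illustrative, and assuming the same NextKey table sketched above): it reserves a contiguous range of @PoolSize keys in one atomic statement and returns the bounds, so concurrent callers always get disjoint ranges.

CREATE PROCEDURE dbo.GetNextKeyPool
    @TableNameID varchar(128),
    @PoolSize    int,
    @LowKey      int OUTPUT,
    @HighKey     int OUTPUT
AS
BEGIN
    SET NOCOUNT ON;
    -- Reserve @PoolSize keys in a single atomic UPDATE; @LowKey is the first
    -- key of the reserved range and @HighKey the last.
    UPDATE dbo.NextKey
       SET @LowKey = NextKey,
           NextKey = NextKey + @PoolSize
     WHERE TableNameID = @TableNameID;

    SET @HighKey = @LowKey + @PoolSize - 1;
END;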
One of our DBAs suggested that we could leverage the SQL Server Linked Server mechanism instead of wrapping the NextKey mechanism in a Web Service. For example, we could write sprocs such as GetNextKeyPool(TableNameID) that would live on the KS. The KS would be set up as a linked server and the sproc could be called directly from the individual colo databases. I haven’t thought through the deployment implications of these choices. The linked server approach would seem to require a VPN between the KS and every colo, whereas the Web Service approach would be simpler from a deployment perspective since it would only require HTTPS. Performance would be far less of an issue if we used the cached pool mechanism described above.
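For what it's worth, the linked server variant from a colo database might look roughly like this (server, database, and table names are placeholders):

-- One-time setup at each colo: register the central key server as a linked server
EXEC sp_addlinkedserver
     @server     = N'KEYSERVER',
     @srvproduct = N'',
     @provider   = N'SQLNCLI',
     @datasrc    = N'keyserver.example.com';

-- Reserving a pool of keys is then a four-part name call
DECLARE @low int, @high int;
EXEC KEYSERVER.KeyStore.dbo.GetNextKeyPool
     @TableNameID = 'SomeMetadataTable',
     @PoolSize    = 1000,
     @LowKey      = @low  OUTPUT,
     @HighKey     = @high OUTPUT;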
I would so appreciate your feedback on this interesting challenge. What do you think of the overall approach? Do you know of any precedents? What about linked server versus Web Service?
Rob Oaks
April 20, 2010 at 6:34 am
I'm doing quite the same thing with a linked server, but the two databases reside on the same SQL cluster and there's no VPN complexity involved.
I like the Web Service solution; it looks more robust over a WAN.
I don't know if it could fit your needs, but you could also take a look at merge replication; it could be a time saver, without re-inventing the wheel.
-- Gianluca Sartori
April 22, 2010 at 2:16 pm
Thanks Gianluca.
In referring to merge replication, I'm not sure whether you're talking about replicating the actual metadata tables or just the centrally generated PK table (for failover in case the primary KS fails). Replication would definitely not work in the former case (since we don't want all metadata to be identical at all times--we want to migrate metadata between environments as necessary). In the latter case, I don't think replication would work either, because if the failover KS is even slightly behind the primary server with respect to the next key, it would generate duplicate keys.
If you really implemented something along the lines I describe, I wonder if you could summarize your experience, or perhaps we could chat. This is a very strategic decision for us and I don't want to commit to this approach unless I am pretty certain it's a good idea.
April 23, 2010 at 2:22 am
What I have set up is very simple. Some applications share a key generator, in order to ensure key uniqueness.
The keys are generated by a SQL Server 2005 stored procedure that returns an ever-increasing numeric value, based on a string key code. Keys are stored in a simple table that holds key codes and numeric values.
To avoid long-lasting locks on the key table during transactions, the stored procedure is not called directly, but via a CLR assembly that creates a new connection, not enlisted in the calling transaction. The same could have been achieved with a web service, as you mentioned, but I decided not to go that way because all the databases accessing that key store reside on the same LAN.
This could be implemented because gaps in the keys are not a problem for us. If your keys can't tolerate gaps, forget about it.
I can't say if I would recommend this solution in your scenario, because I'm not there and I can't see the whole picture. This worked for me, but I have to admit that my architecture seems to be much simpler than yours.
Good luck with your project!
-- Gianluca Sartori
April 23, 2010 at 3:29 am
Having the system widely distributed adds some special considerations, but essentially the problem is the same as that faced by any locally-distributed system (e.g. a web farm) where records must be generated with unique identifiers.
The problem essentially has three solutions:
1. Allocate keys from a central resource.
This is usually implemented using a Sequence Table containing range identifiers and an associated next-key-value, as you mention. Pooling is essential for good performance - keys should almost never be generated singly; rather, a range should be reserved for use. Background allocations (again, never singly) can be used to ensure that the local pool does not run out of keys.
As Gianluca mentions, any asynchronous solution can leave gaps in the overall allocation range - but that probably is not important here.
The other important consideration is concurrency. The two primary ways to avoid problems are (a) to use a separate connection via a CLR component (again, as Gianluca described); or (b) to use a linked server with the 'remote proc transaction promotion' option set to FALSE using sp_serveroption. The latter option is only available in SQL Server 2008.
2. Pre-allocate a range to each site large enough to cater for all future requirements, and for all future sites. This may or may not be practical depending on the exact circumstances and key lengths.
3. Use GUIDs as keys. Values generated using the NEWSEQUENTIALID function (in a DEFAULT constraint only) overcome most of the problems usually associated with GUIDs.
My instinct would be to go with the GUID solution here.
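For illustration only (names invented), a sequential GUID key needs nothing more than a DEFAULT constraint, and each environment can then generate keys locally that are unique everywhere:

CREATE TABLE dbo.ProductMetadata
(
    ProductMetadataID uniqueidentifier NOT NULL
        CONSTRAINT DF_ProductMetadata_ID DEFAULT NEWSEQUENTIALID()
        CONSTRAINT PK_ProductMetadata PRIMARY KEY CLUSTERED,
    Name        varchar(100) NOT NULL,
    Description varchar(400) NULL
);

Unlike NEWID(), values from NEWSEQUENTIALID() are generated in increasing order on a given server, so the clustered index avoids the page splits and fragmentation normally associated with GUID keys.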