Identifiers: Reuse or Remap?

  • Hi SSC,

    I'm architecting some stuff which crosses service boundaries, and second guessing myself about how best to handle the identifiers of my various tables/entities.

    For example, we have a user management system, and a user is uniquely identified by [cust_num] varchar(16), which is a concatenation of the organization that user is associated with ([marketer] char(2)) and a string (usually a random hash). So for example

    • a marketer might be "XM"
    • a cust_num might be "XMXEDNI" (marketer + cust_num)

    Lots of systems rely on this system, and so in the past we've hung business logic on entities like [Marketer] and [cust_num], sometimes to our own detriment. There are a number of variants on these string identifiers in other systems, but they all have similar undesirable traits.

    I'm considering  creating my own identifiers (probably, literally, an IDENTITY) and use that in my system, and only reference the other identifier in a mapping capacity. Some of my reasons for wanting to do this are

    • Many identifiers are not atomic (i.e. they are concatenations of data like <client>_<username>
    • Since many of these identifiers have some sort of business meaning, they occasionally need to be updated, and most have no means of propagating such changes to other systems which use them.
    • While some cases make a lot of sense, other times I find myself having to bootstrap a lot of assumptions onto the identifier.

    • For example, we have a Marketer > Tier > Type >hierarchy defined, used fairly ubiquitously around the company.
    • My new app, say, wants to classify vendor data, so I could in theory map my concepts onto this like 

    • Vendor = Marketer
    • DataProduct = Tier
    • DataSet = Type
  • There's an argument to be made that this mapping makes sense, and that it makes integration with existing systems more straight forward, but it also means that nobody can tell what a "Marketer" actually is any more
  • Plus my gut says this is  the right approach. However I'm also aware that

    • In some cases, this decoupling abstraction won't even come into play
    • There is additional work now to translate between identifiers
    • This feels like it has the potential to become a twisted hellscape of mapping tables
    • I could be overestimating the drawbacks to bootstrapping my logic to existing identifiers
    • I could be underestimating the benefit of the decoupling, or effort to maintain such mapping constructs

    I know there's probably not a single correct answer here,
    but if anyone has worked with issues like this before and would like to share their stories and/or opinions, I would love to get other perspectives.

    Executive Junior Cowboy Developer, Esq.[/url]

  • Hey Xedni, 

    I'm having a tough time following you on this.  Maybe because it's early and I've only had 1 cup of coffee 😉  I'll share something below which I think may be related to what you are asking, but I'm not sure.

    We have people in 2 different systems (1 system for recruiting and 1 for managing staff). - Actually, we have people in more than 2 systems, but I'm keeping this reply simple. 

    People can be found in 1 or both systems.  In the warehouse, I architect it like this:


    Next, I add abstraction through views.  The views allow me to do just about anything I need while maintaining maximum flexibility.  With the views, I can union candidates and staff and show which source system they come from. Another thing I can do is match candidates and staff based on email to show people who are both candidates and staff.

    Hope this helps a little.

  • Viewing 2 posts - 1 through 1 (of 1 total)

    You must be logged in to reply to this topic. Login to reply