Data Modeling in a New World

  • Comments posted to this topic are about the item Data Modeling in a New World

  • You all should be very thankful you are not sitting in meetings with ME discussing this issue.

    I'm currently working with data from an application that spans some thirty years, and one of the data elements is 'CATEGORY'. The application vendor has always included a whole list of categories as defaults, which unfortunately have varied from version to version. But even more worrisome is that the user is free to create their own in addition to those provided. You should see the problems this has created. Even without the 'user created' ones, the variation over the years simply from the vendor is astounding.

    When everyone is free to create their own category, the data becomes meaningless.   Pretty soon, the issue becomes 'Well, what day is it?'

    Just thought of another good example.  One of the common data elements in digital music is GENRE.  Try updating your files from the online sources of tag data.  It pretty much destroys any usefulness of that element.

    • This reply was modified 1 year, 6 months ago by  skeleton567.

    Rick

    One of the best days of my IT career was the day I told my boss if the problem was so simple he should go fix it himself.

  • This is even more challenging in medical. If you are a blood lab and run a panel of tests, some of them greatly depend on the birth sex of the individual. A particular hormone level that is fine for a female would be way out of range for a male. And this has nothing to do with gender identity but your sex at birth. A lab system I work on is tri-state: Male, Female, Unknown. Unknown is used when the ordering physician did not provide the patient's sex. This gets flagged and tests cannot be released until this is fixed, because many test values depend on the sex of the patient. I'm sure this is going to need more work by the medical community, because what are the proper test ranges for someone whose birth sex is male but who is transitioning and taking female hormone treatments? Or whose birth sex is female and who is taking male hormone treatments? I don't know that there is much data out there, and I think the medical community is going to need to deal with this sooner rather than later.
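    The tri-state rule described above could be sketched roughly as follows. This is only an illustration, not the lab system's actual code: the names `Sex`, `Result`, `evaluate`, and the test code `ANALYTE_X` are hypothetical, and the reference ranges are made-up placeholders with no clinical meaning.

```python
from dataclasses import dataclass
from enum import Enum


class Sex(Enum):
    MALE = "M"
    FEMALE = "F"
    UNKNOWN = "U"  # ordering physician did not provide a value


# Hypothetical, NON-CLINICAL placeholder ranges keyed by (test code, sex).
REFERENCE_RANGES = {
    ("ANALYTE_X", Sex.MALE): (10.0, 20.0),
    ("ANALYTE_X", Sex.FEMALE): (30.0, 60.0),
}


@dataclass
class Result:
    test_code: str
    value: float
    patient_sex: Sex


def evaluate(result: Result) -> str:
    """Return 'HOLD', 'IN_RANGE', or 'OUT_OF_RANGE' for one result."""
    # A result with unknown sex is flagged and never released.
    if result.patient_sex is Sex.UNKNOWN:
        return "HOLD"
    low, high = REFERENCE_RANGES[(result.test_code, result.patient_sex)]
    return "IN_RANGE" if low <= result.value <= high else "OUT_OF_RANGE"
```

    With these placeholder ranges, the same value of 40.0 is in range for one sex and out of range for the other, and a result with `Sex.UNKNOWN` is held regardless of its value.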

  • A bit of a sidebar... I cannot agree with changing the ISO spec.  People who make the change all want to be identified by the gender they've chosen and the ISO spec has that covered.  Most also don't want to be identified in any way, shape, or form as someone who has made the change.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Tom Uellner wrote:

    This is even more challenging in medical. If you are a blood lab and run a panel of tests, some of them greatly depend on the birth sex of the individual. A particular hormone level that is fine for a female would be way out of range for a male. And this has nothing to do with gender identity but your sex at birth. A lab system I work on is tri-state: Male, Female, Unknown. Unknown is used when the ordering physician did not provide the patient's sex. This gets flagged and tests cannot be released until this is fixed, because many test values depend on the sex of the patient. I'm sure this is going to need more work by the medical community, because what are the proper test ranges for someone whose birth sex is male but who is transitioning and taking female hormone treatments? Or whose birth sex is female and who is taking male hormone treatments? I don't know that there is much data out there, and I think the medical community is going to need to deal with this sooner rather than later.

     

    Maybe a solution for this would be that legally there be a Category called Birth Gender with  a third option called 'Unspecified' and that selecting this option legally releases caregivers from responsibility for performing such testing, drawing related conclusions, and providing related care.  You don't pay, you don't play. I'm sure that would go all the way to the Supreme Court.  But sometimes we just have to take a stand.  And it would avoid accumulation of many varied categories as time progresses.

    Rick

  • skeleton567 wrote:

    Maybe a solution for this would be that legally there be a Category called Birth Gender with  a third option called 'Unspecified' and that selecting this option legally releases caregivers from responsibility for performing such testing, drawing related conclusions, and providing related care.  You don't pay, you don't play. I'm sure that would go all the way to the Supreme Court.  But sometimes we just have to take a stand.  And it would avoid accumulation of many varied categories as time progresses.

    Not sure but I think that could be a problem for some people that have deep hormone therapy to help them with their change.

    --Jeff Moden

  • I'd suggest it would probably be better to store genetic phenotype rather than birth sex for medical purposes, i.e. XX and XY, which would correspond to birth sexes of female and male, but also allow the rare XXX, XXY, XYY and XXYY. Wouldn't generally need to cater for YY or YYY as they are non-viable.

  • crmitchell wrote:

    I'd suggest it would probably be better to store genetic phenotype rather than birth sex for medical purposes, i.e. XX and XY, which would correspond to birth sexes of female and male, but also allow the rare XXX, XXY, XYY and XXYY. Wouldn't generally need to cater for YY or YYY as they are non-viable.

    CR, therein lies one of the large hazards in data design. We've suddenly gone from a two-value data element to an eight-value element. I would argue that for most of the world outside the medical realm this would be overkill, and in the long run it creates more complexity than the vast majority of data systems need or should have to deal with forever. And even within the medical world, the huge majority of data needs can be met without this granularity.

    So, my possibly oversimplified solution would be that we maintain and ENFORCE the simple category of birth gender of M/F, and not burden the rest of the system logic with what should be a separate data element of genetic phenotype. From my experience, it has always been dangerous to mix multiple types of data into a single element.

    I just found this comment in an online article referencing this thought:

    "Wilhelm Johannsen proposed the genotype-phenotype distinction in 1911 to make clear the difference between an organism's heredity and what that heredity produces.[2][3] The distinction resembles that proposed by August Weismann (1834-1914), who distinguished between germ plasm (heredity) and somatic cells (the body)."

    Rick

  • And that raises another of the basic tenets of data design - we should hold only that which is required and that which we do hold should be correct - and this aspect is now legally enforceable in many jurisdictions.

    The same areas where you argue we don't need to know about the additional complexity this involves also don't need to know about birth gender in the first place. But where it is required, it needs to allow for all cases. Expanding on the earlier example, if prescribing medication depends on the person's genetic condition, then the system needs to know what that condition actually is, not just an approximation; otherwise someone born androgynous or hermaphrodite or with other genetic abnormalities may receive the incorrect dose.

    With good database design we would normalise gender/sex which would mean we could easily add additional values to properly handle these data elements without adding complexity - I can join to a table holding 20 values just as easily as to one holding 2.
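    A minimal sketch of that point, using Python's built-in `sqlite3` module so it is self-contained. The table and column names (`sex_code`, `patient`) and the sample rows are assumptions made up for illustration, not taken from any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled
conn.executescript("""
    CREATE TABLE sex_code (
        code        TEXT PRIMARY KEY,
        description TEXT NOT NULL
    );
    CREATE TABLE patient (
        patient_id INTEGER PRIMARY KEY,
        name       TEXT NOT NULL,
        sex_code   TEXT NOT NULL REFERENCES sex_code(code)
    );
""")

# The join is written once and never mentions how many codes exist.
QUERY = """
    SELECT p.name, s.description
    FROM patient AS p
    JOIN sex_code AS s ON s.code = p.sex_code
    ORDER BY p.patient_id
"""

conn.executemany("INSERT INTO sex_code VALUES (?, ?)",
                 [("M", "Male"), ("F", "Female")])
conn.executemany("INSERT INTO patient (name, sex_code) VALUES (?, ?)",
                 [("Pat A", "F"), ("Pat B", "M")])
rows_before = conn.execute(QUERY).fetchall()

# Adding a value is one row in the lookup table; the query is unchanged.
conn.execute("INSERT INTO sex_code VALUES ('U', 'Unknown')")
conn.execute("INSERT INTO patient (name, sex_code) VALUES ('Pat C', 'U')")
rows_after = conn.execute(QUERY).fetchall()

# A code that is not in the lookup table is rejected by the foreign key.
try:
    conn.execute("INSERT INTO patient (name, sex_code) VALUES ('Pat D', 'Z')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```

    The foreign key is also what stops the free-for-all described earlier in the thread: a new value has to be added to the lookup table deliberately before any row can use it.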

    Of course, when many legacy systems were developed it wasn't conceived that people would perceive themselves as anything but male or female, and the number of people in the actual population who were genetically otherwise was considered too low to be worth considering in the design of the system. Indeed, many would still make that argument today, but designing new systems to address this properly doesn't add significant complexity. Mostly this data would be used only for disaggregation of results, and where it does impact the logic of the systems, that additional complexity should already be present. Where there will be additional complexity is where a legacy system was designed to minimise storage requirements and as a consequence stores the data as a binary datatype, both in the database and in the application code.

  • I think trying to model this in one way is a mistake. Medical system modeling is much different than what most of us do, which is marketing modeling.

    The latter ought to capture something wider and more variant, after all, demographics should be inclusive of what we observe, not what we want. The former ought to be accurate, and perhaps needs a birth/original and a current/identifying or even some type of understanding of current biological makeup.
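    One way to picture the two shapes is as separate record types. This is a rough sketch only; the class and field names are hypothetical, and a real medical model would be driven by clinical requirements rather than by this outline.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MarketingProfile:
    """Demographic record: stores what the person reports, as reported."""
    customer_id: int
    self_described_gender: Optional[str]  # free text; None if not given


@dataclass(frozen=True)
class MedicalProfile:
    """Clinical record: separate elements, each with a single meaning."""
    patient_id: int
    sex_at_birth: str                       # hypothetical codes: "M", "F", "U"
    current_gender_identity: Optional[str]  # None if not recorded
    on_hormone_therapy: Optional[bool]      # None if not recorded


customer = MarketingProfile(customer_id=1, self_described_gender="non-binary")
patient = MedicalProfile(
    patient_id=1,
    sex_at_birth="M",
    current_gender_identity="F",
    on_hormone_therapy=True,
)
```

    Keeping birth and current values in distinct fields avoids overloading a single element with more than one kind of data.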

  • OK, some of you may not like my reasoning, and that's OK with me, but here goes:

    Quoting from the KJV as my ultimate authority:

    "So God created man in his own image, in the image of God created he him; male and female created he them."

    I would posit that aberrations from these two categories, physical or mental, natural or surgical, are therefore subcategories of same. Granted there can be and is varying presence or absence of physical characteristics of both. Attempting to have a major category for every possible variation is where you really get into the weeds. Are you going to uniquely categorize everything from one testicle, two testicles, one breast, two breasts, regular vs. inverted nipples? Then what do you do if an individual, due to multiple characteristics, fits into more than one of the above? How about circumcised, uncircumcised, castrated, sterilized... Your option is pretty much limited to a new category. Creating arbitrary categories can only lead to the creation of MORE arbitrary categories. And the more there are, the less useful they become.

     

    Rick
