Database design issue

• This is going to sound stupid at first glance, but I cannot come up with a good design, so I would like to get some outside input on the problem and arrive at a reasonable solution. We are designing a new medical billing system. Each billing record COULD use any of about 650 different fields of all data types (dates, int, string, etc.) that are needed by various insurance providers. While any one record will use, on average, about 45 fields, they may not be the same 45 fields from one insurer to the next. The current system uses a single table with a column for each field. Not a good, scalable solution, IMHO. One designer wants to use a table with a variant field, thus generating 45 rows for each claim. An even worse design, IMHO. The only other idea I can come up with is to break the data up into logically grouped tables of 40-50 fields each, link them with a common key, and then use a view to access the data. Also a lousy idea.

    I am hoping that someone has looked at something like this before and found a workable solution. The app will use the .Net 3.5 framework and SQL2005 for the db.

    Please share any ideas you may have.

    Stan

    WABALUBADUBDUB

  • First, let me address the suggestion of using a SQL Variant data type: DON'T. This looks terribly convenient at first glance, but it's ripe for disaster in real-world use.

    You mentioned that there are 650 or so different possibilities of data elements. Rather than try to break those apart, is it possible to group some of them together logically? For example, you could consider a visit or encounter to be the logical grouping of data. A visit/encounter could have a DATETIME admit date, a DATETIME discharge date, a VARCHAR patient type, a DECIMAL account balance, among others. These could easily be grouped in the same table together (in separate columns, of course). You'll still have a number of tables to create, but with the right grouping, you could cut down the number of tables to 5 to 10 percent of that.
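As a rough illustration of that grouping idea (all table and column names here are hypothetical, not from the original post), the visit/encounter example might be sketched like this:

```sql
-- Hypothetical sketch: logically related billing fields grouped into
-- one table. Names and types are illustrative, not an actual layout.
CREATE TABLE dbo.Encounter (
    EncounterID    INT IDENTITY(1,1) PRIMARY KEY,
    ClaimID        INT           NOT NULL,
    AdmitDate      DATETIME      NULL,
    DischargeDate  DATETIME      NULL,
    PatientType    VARCHAR(20)   NULL,
    AccountBalance DECIMAL(12,2) NULL
);
```

With 650 fields grouped this way, each table carries one coherent cluster of columns rather than one column per possible field.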

    Another option would be to use code tables, which would group logically dissimilar data of the same data type into a single table. Using this method, you'd have a column to describe what that row represents. This method doesn't scale as well as what I mentioned above, since you could be joining your code tables dozens or even hundreds of times in large queries. Code tables tend to be IO hotspots if you have a busy system.
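A minimal sketch of the code-table approach (hypothetical names; this is one common shape for it, not necessarily how any given system does it):

```sql
-- Hypothetical code-table sketch: dissimilar attributes of the same
-- data type collapsed into one table, with a column naming each row.
CREATE TABLE dbo.ClaimAttribute (
    ClaimID        INT          NOT NULL,
    AttributeName  VARCHAR(50)  NOT NULL,  -- e.g. 'DiagnosisCode'
    AttributeValue VARCHAR(100) NULL,
    CONSTRAINT PK_ClaimAttribute PRIMARY KEY (ClaimID, AttributeName)
);

-- Each attribute you want back becomes another join to this table,
-- which is where the scaling and IO-hotspot concerns come from:
-- SELECT a1.AttributeValue AS DiagnosisCode,
--        a2.AttributeValue AS ProcedureCode
-- FROM dbo.Claim c
-- JOIN dbo.ClaimAttribute a1
--   ON a1.ClaimID = c.ClaimID AND a1.AttributeName = 'DiagnosisCode'
-- JOIN dbo.ClaimAttribute a2
--   ON a2.ClaimID = c.ClaimID AND a2.AttributeName = 'ProcedureCode';
```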

    Hope this helps, feel free to post back more details for more specific advice.

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

  • One idea worth serious consideration is whether or not you are willing to upgrade to SQL 2008, now that SP1 is available. Sparse columns sound like the answer to your prayer. Read up on them before committing to a design that you might regret in six months.
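For readers unfamiliar with the feature, a minimal sparse-column sketch (illustrative names only) looks like this in SQL Server 2008:

```sql
-- Hypothetical SQL Server 2008 sparse-column sketch. SPARSE columns
-- take no storage when NULL, which suits a claim table where any one
-- row uses only ~45 of ~650 possible fields.
CREATE TABLE dbo.Claim (
    ClaimID   INT IDENTITY(1,1) PRIMARY KEY,
    InsurerID INT NOT NULL,
    Field001  VARCHAR(50) SPARSE NULL,
    Field002  DATETIME    SPARSE NULL,
    Field003  INT         SPARSE NULL,
    -- ...one column per possible field, most of them SPARSE...
    AllFields XML COLUMN_SET FOR ALL_SPARSE_COLUMNS  -- optional column set
);
```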

    A second option is to set up one table per vendor, with just the columns used for that vendor. A nest of snakes in many respects.

    Make all the common columns normal SQL columns and dump the rest into XML? I believe I would take this approach before taking the one-row-per-data element approach. I've seen that tried before and it's NOT pretty on a large scale.

    Those are the only thoughts that occur to me at the moment. Without really digging into your system, it's hard to do anything other than throw out ideas. Good luck to you and your team.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills

  • Yeah, I'd stay away from defining your metadata at the vendor level. Bob is right - it's a nightmare waiting to happen. Billing companies could end up with thousands of clients.

    Stan, I do most of my work in healthcare as well so I can probably address your specific taxonomies. Let me know if I can help.

    Tim Mitchell, Microsoft Data Platform MVP
    Data Warehouse and ETL Consultant
    TimMitchell.net | @Tim_Mitchell | Tyleris.com
    ETL Best Practices

• How about storing the data in XML format and then using XQuery against whatever elements are available...?

    --XML For Client 1 (note: the original XML literal was stripped when
    --this was posted; the sample values below are illustrative only)

    Declare @vXML XML

    Set @vXML = '<Main ClientID="1" C1="a" C2="b" C3="c" C4="d" C5="e" C6="f" C7="g" />'

    Select x.header.value('@ClientID','int') ClientID,
           x.header.value('@C1','nvarchar(50)') Col1,
           x.header.value('@C2','nvarchar(50)') Col2,
           x.header.value('@C3','nvarchar(50)') Col3,
           x.header.value('@C4','nvarchar(50)') Col4,
           x.header.value('@C5','nvarchar(50)') Col5,
           x.header.value('@C6','nvarchar(50)') Col6,
           x.header.value('@C7','nvarchar(50)') Col7
    from @vXML.nodes('/Main') as x(header)

    --XML For Client 2

    Set @vXML = '<Main ClientID="2" C1="a" C2="b" C3="c" C4="d" C5="e" C6="f" C7="g" />'

    Select x.header.value('@ClientID','int') ClientID,
           x.header.value('@C1','nvarchar(50)') Col1,
           x.header.value('@C2','nvarchar(50)') Col2,
           x.header.value('@C3','nvarchar(50)') Col3,
           x.header.value('@C4','nvarchar(50)') Col4,
           x.header.value('@C5','nvarchar(50)') Col5,
           x.header.value('@C6','nvarchar(50)') Col6,
           x.header.value('@C7','nvarchar(50)') Col7
    from @vXML.nodes('/Main') as x(header)

All you have to do is keep your Select query updated with the available columns...

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Sometimes, winning is not an issue but trying.
You can check my BLOG here.

  • Bob,

    Thanks for the ideas!

As to more details, I think it would get boring pretty quickly. It is about insurance. :-)

    Stan

    WABALUBADUBDUB

  • Tim,

    Great! How do you guys handle the storage of the NCPDP field data for claims? This is exactly what I am working on.

    Stan

    WABALUBADUBDUB

  • Atif,

That would be a good way to go, but I cannot do that one easily, because I will also have to do financial reporting on the data, and I think that storing the data as XML will slow the reporting down quite a bit.

    Great idea though. I will keep that one in mind for other systems.

    Stan

    WABALUBADUBDUB

  • If you strongly type the XML, index it, and define it with schemas, it can be pretty efficient and can query pretty rapidly. Plus, if reporting becomes a big deal, you'll probably pull the data out of there into cubes (or something similar) anyway.
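A minimal sketch of what "strongly typed, indexed, schema-bound XML" can look like in SQL Server 2005 (schema contents and names are hypothetical):

```sql
-- Hypothetical sketch: typed XML bound to a schema collection, plus
-- a primary XML index so queries don't shred the XML on every read.
CREATE XML SCHEMA COLLECTION dbo.ClaimFieldsSchema AS
'<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
   <xs:element name="Claim">
     <xs:complexType>
       <xs:attribute name="DiagnosisCode" type="xs:string"/>
       <xs:attribute name="BilledAmount"  type="xs:decimal"/>
     </xs:complexType>
   </xs:element>
 </xs:schema>';

CREATE TABLE dbo.ClaimXml (
    ClaimID INT IDENTITY(1,1) PRIMARY KEY,
    Fields  XML(dbo.ClaimFieldsSchema) NULL
);

CREATE PRIMARY XML INDEX IX_ClaimXml_Fields ON dbo.ClaimXml (Fields);
```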

I've used the XML-type solution, where each client wanted their own set of columns, and it's more efficient than a bunch of variant columns, but you do have to watch out, because it's basically a violation of 1NF and can turn into a maintenance nightmare unless you manage and document it really well.

    If managed and documented properly, it's easy on you as a DBA and on the devs who can use the schemas to do things like build online forms and such.

More relationally, the way to go is to vertically partition the table into groups of related data. This might be by customer, or it might be by category of column. What you don't want to end up with is 650 tables with an ID and one column each, though that would be the most flexible.
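A minimal sketch of that vertical partitioning (hypothetical table and column names): related columns split into group tables sharing the claim key, stitched back together with a view.

```sql
-- Hypothetical vertical-partitioning sketch: one narrow table per
-- category of column, all keyed on ClaimID, unified by a view.
CREATE TABLE dbo.ClaimCore    (ClaimID INT PRIMARY KEY, InsurerID INT NOT NULL);
CREATE TABLE dbo.ClaimDates   (ClaimID INT PRIMARY KEY, AdmitDate DATETIME, DischargeDate DATETIME);
CREATE TABLE dbo.ClaimAmounts (ClaimID INT PRIMARY KEY, BilledAmount DECIMAL(12,2), PaidAmount DECIMAL(12,2));
GO
CREATE VIEW dbo.vClaim AS
SELECT c.ClaimID, c.InsurerID,
       d.AdmitDate, d.DischargeDate,
       a.BilledAmount, a.PaidAmount
FROM dbo.ClaimCore c
LEFT JOIN dbo.ClaimDates   d ON d.ClaimID = c.ClaimID
LEFT JOIN dbo.ClaimAmounts a ON a.ClaimID = c.ClaimID;
```

LEFT JOINs let a claim exist without a row in every group table, which matches "not every insurer uses the same 45 fields."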

    SQL 2008 sparse columns are pretty much designed for exactly the situation you're looking at. If you can move that direction, it should work really well. I'm assuming you're currently on 2005, from the forum you posted in.

    Without the sparse columns option, I'd be inclined towards a base table with as many well-defined columns as is reasonable, and an XML column for the rest. The ones that all/most of the customers will use should be well-defined, relational columns with defined data types, and the rest in an XML "junkyard column".
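A minimal sketch of that hybrid (names are hypothetical): common fields as typed relational columns, the insurer-specific remainder in one XML column.

```sql
-- Hypothetical hybrid sketch: widely used fields get real columns
-- with real data types; the rest goes in the XML "junkyard column".
CREATE TABLE dbo.ClaimHybrid (
    ClaimID      INT IDENTITY(1,1) PRIMARY KEY,
    InsurerID    INT           NOT NULL,
    ServiceDate  DATETIME      NOT NULL,
    BilledAmount DECIMAL(12,2) NOT NULL,
    ExtraFields  XML           NULL
);

-- Pulling an occasional field back out of the XML:
-- SELECT ExtraFields.value('(/Extra/@PriorAuthNumber)[1]', 'varchar(20)')
-- FROM dbo.ClaimHybrid WHERE ClaimID = @id;
```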

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Code4Cash,

I don't know the details of XML in SQL Server 2000, but there is a lot you can do in SQL Server 2005.

    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------
    Sometimes, winning is not an issue but trying.
You can check my BLOG here.

  • Atif,

We are on 2005, but, as GSquared points out, an XML design like that would be a violation of 1NF, and I just cannot bring myself to design a solution that does not rise to that level. (Actually, I have a problem with anything less than 3NF, but I have been called an A-Hole about it on more than one occasion, so I may be expecting too much of others. 🙂 )

    Thanks again for the ideas.

    Code4Cash

    WABALUBADUBDUB

  • The normal forms have a purpose. There are times when it's better to violate them.

    Like every rule that isn't a natural law, there are times when it's best to break it.

186,282 miles per second: it's not a suggestion, it's the law! Outside of absolutes like that, there are times when it's a good idea to break rules or standards.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • Also keep in mind that normalisation is a logical concept, not a physical one.

    Multiple indexes and summary tables are everyday examples of redundancy that don't seem to offend anyone.

    __________________________________________________

    Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
    Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
