Existing Table division

  • I have a table "Claim" which contains 85+ columns. records in thousand(Not High-transactional table). The data get populated with the help of application (users enter it ) in around 60 columns and for other column's population(updated) based on some calculations (by ssis package) .

    first table "Claim" will contain all the columns which get directly populated by users.

    second table "claim_summary" will contain all calculation based columns and will get populated by ssis and i am thinking to putting it on another filergroup(separate disk).

    There is another the reason of spliting the table that is some basic level recports get the data directly from "claim" table while advanced level reports from "claim_summary" + couple of columns from claims table .

    so below are the advantages i can get here

    1) better management/maintenance

    2) performance gain as individual indexes for basic and advance reports

    3) in case of basic reports only claim table will get touched.

    4) when SSIS package will run it wil also touch "claim_summary" table.

    5) readabiliy will also get increased.

    Please correct me if i am wrong.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Sounds like a solid plan. Is it true that claim to claim_summary will be a 1 to 1, i.e. you're just offloading some columns from claim into this new table claim_summary?

    The tradeoff I am seeing is that while you're reducing the size of the claim you may be adding a JOIN into your batch process and reports to bring the whole row back together. That could amount to a minor amount of complexity and overhead but its worth considering how that would affect things.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • opc.three (2/1/2013)


    Is it true that claim to claim_summary will be a 1 to 1, i.e. you're just offloading some columns from claim into this new table claim_summary?

    Yes there would be 1 to 1 relationship

    opc.three (2/1/2013)


    The tradeoff I am seeing is that while you're reducing the size of the claim you may be adding a JOIN into your batch process and reports to bring the whole row back together. That could amount to a minor amount of complexity and overhead but its worth considering how that would affect things.

    Yes that is my concern too. IF i place these two tables on different disk . will i get benefit in IO context ? or should i place them on same disk ? one more advantage which i am seeing here is during the ssis package only claim_summary table/disk will get affected

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • opc.three (2/1/2013)


    The tradeoff I am seeing is that while you're reducing the size of the claim you may be adding a JOIN into your batch process and reports to bring the whole row back together. That could amount to a minor amount of complexity and overhead but its worth considering how that would affect things.

    Yes that is my concern too. IF i place these two tables on different disk . will i get benefit in IO context ? or should i place them on same disk ? one more advantage which i am seeing here is during the ssis package only claim_summary table/disk will get affected

    [/quote]

    You may see a benefit but I am not sure how much. Also, a trade-up in one area usually means a trade-down in another. Do some reading on vertical partitioning and test to see which method benefits you most.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • For thousands of records performance will not be an issue. But it is good that you are thinking about it this early.

  • i just came to know about one problem while reading an steve jone's article http://www.sqlservercentral.com/blogs/steve_jones/2010/02/14/saving-space-_1320_-vertical-partitioning/

    lets say we have one table

    create table Customers( CustomerID int identity(1,1), Firstname varchar(50), Lastname varchar(50), Address1 varchar(50), Address2 varchar(50), city varchar(50) )

    and we split this table into two

    create table Customers ( CustomerID int identity(1,1), Firstname varchar(50), Lastname varchar(50))

    And

    create table Customer_Details ( CustomerID int , Address1 varchar(50), Address2 varchar(50), city varchar(50))

    How we are going to manage customerID values in both tables when in one of table it is identity ?

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • The stored procedure (or application layer object) that creates the record takes care of that. For example you create the record in the first table, use SCOPE_IDENTITY() to get the identity just created, and use it in the next table.

    But seriously you are not going to require vertical partitioning for a table with thousands of records. In fact it's quite an extreme design measure to take, it adds a lot of complexity so you need some very good reasnos for doing so.

  • nick.mcdermaid (2/5/2013)


    But seriously you are not going to require vertical partitioning for a table with thousands of records. In fact it's quite an extreme design measure to take, it adds a lot of complexity so you need some very good reasnos for doing so.

    The prime objective is tio divide the load of SSIS package's population and manual population plus reports will also pull the records.Readability will also have benefit

    Suggestion please

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Why do want to divide the load? Do you have a performance or design issue requiring this? For population of summary table you could also use views, triggers and/or a stored proc which makes the data real time as opposed to having to run a scheduled external SSIS Package.

    I don't really understand the problem you're trying to solve.

  • nick.mcdermaid (2/6/2013)


    Why do want to divide the load? Do you have a performance or design issue requiring this? For population of summary table you could also use views, triggers and/or a stored proc which makes the data real time as opposed to having to run a scheduled external SSIS Package.

    I don't really understand the problem you're trying to solve.

    so below are the advantages i can get here

    1) better management/maintenance

    2) performance gain as individual indexes for basic and advance reports

    3) in case of basic reports only claim table will get touched.

    4) when SSIS package will run it wil also touch "claim_summary" table.

    5) readabiliy will also get increased.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Those are very general benefits but it sounds like you've convinced yourself anyway.

    If you are talking about having a seperate detail and summary table then it's not vertical partitioning because it isn't a 1:1 relationship.

  • nick.mcdermaid (2/6/2013)


    Those are very general benefits but it sounds like you've convinced yourself anyway.

    Honestly speaking database design is under physical design phase so cant tell whether my approach is good or not ? thats the reason i posted the question here.

    nick.mcdermaid (2/6/2013)


    If you are talking about having a seperate detail and summary table then it's not vertical partitioning because it isn't a 1:1 relationship.

    i didnt get this , please explain .what i inderstand here is for every claimID from claim table there must be a record in claim_summary table.

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Honestly speaking database design is under physical design phase so cant tell whether my approach is good or not ? thats the reason i posted the question here.

    I really think you're over analysing it for that number of records. As far as DBA performance goes nowadays the focus is less on splitting components of tables over physical disks and more about getting everything into RAM. (as compared to when I did the SQL 2000 MCDBA) You will seriously see no performance advantage in vertically splitting data accross disks if you have only a couple thousand records.

    i didnt get this , please explain .what i inderstand here is for every claimID from claim table there must be a record in claim_summary table.

    If that's the case then yes it's 1:1. My mistake - usually a 'summary' table is a summary of some kind of detail and has less rows and less detail but in your case the summary appears to be additional generated information.

    Looking at your original question, are the claim summary calculations based solely on columns in the claims table? Because you can simply use calculated columns in the same table for that. Then you don't need SSIS at all (and I would argue you still don't need it anyway)

    I am of the opinion that a simple design is a good design and adding SSIS and filegroups and vertical table partitioning to a table with a few thousand rows is way over the top. But everyone has made good contributions here and some valid reasons have indeed been given to split it.

  • I tend to agree that for a few thousand rows you'll find that the effort to split the tables will not be worth the gains you might receive, not to mention that it may actually perform poorly in some scenarios. That said it may still be worth trying to implement if you have the time to code it and test it. If you decide not to implement it in production you will at least have gone through the process of making an informed decision and there is value in that. Just make sure you're weighing the maintenance costs going forward with the performance gains you may achieve.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply