SSIS Data Tuning Question

  • The section Estimated Row Size/Length of this article seems to contradict that.

    It says that the length is determined by the largest possible values in the columns. Which makes sense I think. The rows in the buffer have always the same size (unless they are all made smaller or bigger by some component), so it should be big enough to hold a row where all the columns contain their largest value. In the database however you can play with the row length in the pages. But unfortunately for you, buffers are not the same as pages.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • damn it. I suspected as much.

    The 'largest values' row size I have to deal with is a horendous 4276 bytes! the actual data contained in the fields is probably more like 500-1000 bytes.

    Thanks for answer my question.

    _____________________________________________________________________________MCITP: Business Intelligence Developer (2005)

  • Can every column contain it's maximum size? There's no chance in shrinking the size?

    If not, just don't use blocking components 🙂

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • But adding a semi blocking component will improve the performance. Check the following article. Hope this helps.

    http://msdn.microsoft.com/en-us/library/cc966529.aspx

    Regards,
    Pravasis

  • pravasis (11/16/2010)


    But adding a semi blocking component will improve the performance. Check the following article. Hope this helps.

    http://msdn.microsoft.com/en-us/library/cc966529.aspx

    Thanks for the very interesting article.

    But I have trouble finding the section that says partial blocking transformations will improve performance. They introduce a new buffer and possibly a new worker thread, but does this automatically mean a performance improvement?

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Well, it depends on the number of CPUs you are having. In Figure 6 of "Design approaches" section, you will find two additional union all which are redundant. This has been added to add an additional thread to improve the performance. Hope its clear now.

    Regards,
    Pravasis

  • da-zero (11/16/2010)


    Can every column contain it's maximum size? There's no chance in shrinking the size?

    If not, just don't use blocking components 🙂

    highly unlikely that they will - it's a poorly designed database, pretty much from the ground up, but hey, we are where we are - just got to live with it unfortunately.

    the rest of the process is pretty clean a derived column task to add some audit info and a row count - thats it.

    my next plan of attack is to look at reducing the field sizes, but it should be a one off move data migration, so it the effort may be worth more than the benefit...

    _____________________________________________________________________________MCITP: Business Intelligence Developer (2005)

  • pravasis (11/16/2010)


    Well, it depends on the number of CPUs you are having. In Figure 6 of "Design approaches" section, you will find two additional union all which are redundant. This has been added to add an additional thread to improve the performance. Hope its clear now.

    Hmmm. The section you are referring to describes the use of parallelism to speed up a slow component. The union alls are used to reduce buffer size (I guess, there's not really an explanation) in each thread and the final union all is used to merge the two threads back together.

    But it is not the same as saying "adding a semi blocking component will improve the performance". That is simply not true. It can improve performance in some occasions, but not always.

    Need an answer? No, you need a question
    My blog at https://sqlkover.com.
    MCSE Business Intelligence - Microsoft Data Platform MVP

  • Very correct. Probably thats the reason I suggested to go through the link to get a complete understanding and design the package accoring to the available resources and selecting best approach. Any ways, thanks for clarifying the point.

    Regards,
    Pravasis

  • pravasis (11/16/2010)


    But adding a semi blocking component will improve the performance. Check the following article. Hope this helps.

    http://msdn.microsoft.com/en-us/library/cc966529.aspx

    Thank you for that article! It helped clarify some of the "unknown areas" of SSIS for me. In depth, yet easy to read - perfect! It definitely provoked some future awareness in this area for me.

  • Even it helped me to understand the internal principles of SSIS in a better way and how we can tune the SSIS package to get the best performance.

    Regards,
    Pravasis

Viewing 11 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply