SSIS Slower for Data Imports??

  • I have been told by my manager that SSIS 2008 is slower than Hibernate for data migration, apparently because of cache problems..how do I reason that SSIS is better. As for the requirement, we are moving from a Object Model to SQL Server 2008. SSIS will be used for data migration

    Any suggestions are welcome...

  • It is very possible he is right. Set up representative tests and prove it one way or the other. Be sure to include development effort and package creation in the 'end to end' scenario evaluation.

  • Thanks for the reply. I used SSIS already and the results were convincing with a million odd records, but the daily job only processes somewhere around 100K records, which is not a big volume. I am not familiar with Hibernate, and they have not set up any job on it to show me that it indeed is faster. But I am not using lookup transformations in SSIS, so where could be an issue with cache...? Am I missing something??

  • I believe the implication is that Hibernate, which is a RedGate product (and they are known for their performance oriented applications), is more efficient with its memory management than SSIS, which, if you look strictly at Hibernate vs SSIS...is probably true. My take on this is that RedGate and their utility is not dependent on the SQL OS and does not manage the entire database so, by virtue of its limited requirements, RedGate developers can optimize Hibernate's memory management routines for this very specific purpose. SSIS has to fall in line with the rest of the SQL Server suite which implies that there are many compromises in an effort to satisfy a much larger set of requirements. That is just a guess...my opinion only.

    More to your point and problem at hand...100k rows is tiny and, if you have a server with any beef at all, the actual migration/transformations will execute in a few seconds.

  • I think a bigger question is do you own Hibernate already? So faster/slower may not be any issue. Also, 100K records is not much, I doubt I would ever spend $$$ on a tool for such a small payoff.

    CEWII

  • Hibernate was there for the last one year, but the change is that it will be used for the data migration to new system based on SQL Server 2008, apparently to avoid some issues I mentioned in my OP

    Thanks.

  • There a many different reasons why the data flow tranformation may seem sluggish.

    1.) Using Fast load option in the OLEDB Destination should always be used when inserting data to SQL table unless you have a specific reason not to.

    2.) A common misconfiguration on SQL Servers is the min/max memory allocations. By default SQL will set its maximum to the full amount of RAM on the server. Generally, its a good idea to leave 1-2GB out of SQL's reach so the OS and any additional services have space to run in. SSIS is it's own service and has it's own memory space to operate in. The number you choose is proportional to the amount of RAM on the server and what those additional applications will likely need. If the engine is holding all the RAM then the OS and SSIS will be starved for resources. This will result in both of them relying heavily on the page file to buffer data which is much slower than RAM. That will kill performance very quickly.

    3.) Network throughput between the SSIS instance and the Database engine, as well as the throughput between SSIS and the data source you are loading the data from can definitely result in slow performance if you pulling from a remote source.

    4.) Indexes on the tables your loading the data to. If the table has large or significant number of indexes this can greatly increase the time needed to commit the data to the table. This can be mitigated by setting the table not to automatically update its indexes stats. It's more a application question whether that will cause any problems. If you opt for this option you should incorporate some a reindex task into you package so the table is optimized after loading. You should also audit the indexes. See if there any that are unnecessary. I have seen some cases where people have actually disabled all the indexes at package start..load their data..then re-enable the indexes. You can really ram the data in quickly with the indexes out of the way. The major concern with that method is if you have any applications that might try to query the table during the loading process. This can result in errors being returned by SQL to the calling application.

  • T-SQL_NewBie (3/4/2010)


    I have been told by my manager that SSIS 2008 is slower than Hibernate for data migration, apparently because of cache problems..how do I reason that SSIS is better. As for the requirement, we are moving from a Object Model to SQL Server 2008. SSIS will be used for data migration

    Any suggestions are welcome...

    WHICH Hibernate??? The one associated with Java or something else?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • This hibernate? https://www.hibernate.org/

    That's from Red HAT, not Red Gate. Red Gate doesn't have a product to do this, AFAIK (http://www.red-gate.com/products/index.htm)

    As far as which is faster, it depends on what you do, your environment, and your skill. None of these products that you have to "modify" or program, is necessarily faster. If you don't develop the process well, or the code, it won't be faster.

    SSIS is extremely fast for loading data, but whether it's faster in your situation than Hibernate is something you test. Going to 1mm rows as a test for 100k rows seems reasonable.

  • mike 84768 (3/5/2010)


    I believe the implication is that Hibernate, which is a RedGate product (and they are known for their performance oriented applications), is more efficient with its memory management than SSIS, which, if you look strictly at Hibernate vs SSIS...is probably true. My take on this is that RedGate and their utility is not dependent on the SQL OS and does not manage the entire database so, by virtue of its limited requirements, RedGate developers can optimize Hibernate's memory management routines for this very specific purpose. SSIS has to fall in line with the rest of the SQL Server suite which implies that there are many compromises in an effort to satisfy a much larger set of requirements. That is just a guess...my opinion only.

    More to your point and problem at hand...100k rows is tiny and, if you have a server with any beef at all, the actual migration/transformations will execute in a few seconds.

    You have lost me here because Hibernate was an opensource project that is now owned by Red Hat a Linux distributor. And Hibernate which is a data access tool and SSIS which is an ETL tool are not really related because data access is just one use of a tool like SSIS.

    In my experience most applications that uses ORM (object relational mapper) like Hibernate do not use constraints and DRI (declarative referential integrity) things that tools like SSIS maintains and uses. So this is comparing apples and oranges which are not related.

    https://www.hibernate.org/

    Kind regards,
    Gift Peddie

  • Jeff Moden (3/7/2010)


    T-SQL_NewBie (3/4/2010)


    I have been told by my manager that SSIS 2008 is slower than Hibernate for data migration, apparently because of cache problems..how do I reason that SSIS is better. As for the requirement, we are moving from a Object Model to SQL Server 2008. SSIS will be used for data migration

    Any suggestions are welcome...

    WHICH Hibernate??? The one associated with Java or something else?

    Yeah, the one associated with Java. I am not aware of the way it exactly works, but I reckon it can genarate HQL scripts based off the database structure.

    Thanks..

  • T-SQL_NewBie (3/8/2010)


    Jeff Moden (3/7/2010)


    T-SQL_NewBie (3/4/2010)


    I have been told by my manager that SSIS 2008 is slower than Hibernate for data migration, apparently because of cache problems..how do I reason that SSIS is better. As for the requirement, we are moving from a Object Model to SQL Server 2008. SSIS will be used for data migration

    Any suggestions are welcome...

    WHICH Hibernate??? The one associated with Java or something else?

    Yeah, the one associated with Java. I am not aware of the way it exactly works, but I reckon it can genarate HQL scripts based off the database structure.

    Thanks..

    It and all other ORM(object relational mapper) tools access columns as properties, in SQL Server the only such property is IDENTITY so saying such a tool can replace ETL( extraction transformation and loading) tool like SSIS is just plain lack of understanding.

    SQL Server 2008 automated SSIS package using CDC(change data capture) can collect and exchange data with any data source, no ORM tool can do that.

    Kind regards,
    Gift Peddie

  • Gift Peddie (3/7/2010)

    In my experience most applications that uses ORM (object relational mapper) like Hibernate do not use constraints and DRI (declarative referential integrity) things that tools like SSIS maintains and uses. So this is comparing apples and oranges which are not related.

    https://www.hibernate.org/

    Thanks for the input. If Hibernate cannot use constraints and DRI then I just wonder how it can be used for inserting data into a relational database which is normalized...:unsure:

  • T-SQL_NewBie (3/8/2010)


    Gift Peddie (3/7/2010)

    In my experience most applications that uses ORM (object relational mapper) like Hibernate do not use constraints and DRI (declarative referential integrity) things that tools like SSIS maintains and uses. So this is comparing apples and oranges which are not related.

    https://www.hibernate.org/

    Thanks for the input. If Hibernate cannot use constraints and DRI then I just wonder how it can be used for inserting data into a relational database which is normalized...:unsure:

    These tools use a lot of lookup tables, SELECT(*) and OUTER JOIN which are usually technically not needed. Developers can use such tools without knowing anything relational.

    Kind regards,
    Gift Peddie

  • T-SQL_NewBie (3/8/2010)


    Jeff Moden (3/7/2010)


    T-SQL_NewBie (3/4/2010)


    I have been told by my manager that SSIS 2008 is slower than Hibernate for data migration, apparently because of cache problems..how do I reason that SSIS is better. As for the requirement, we are moving from a Object Model to SQL Server 2008. SSIS will be used for data migration

    Any suggestions are welcome...

    WHICH Hibernate??? The one associated with Java or something else?

    Yeah, the one associated with Java. I am not aware of the way it exactly works, but I reckon it can genarate HQL scripts based off the database structure.

    Thanks..

    We ("they") used the "Java Version" of Hibernate to "simplify" transfers between Oracle and SQL Server at one company I worked at. It usually worked fairly well but, like anything else, "It Depends". Same goes with NHibernate in the "C#" world. We had one developer who decided to "get" everything (he called it an "Aggressive GET") that a rather complicated screen would ever need even if it wasn't needed at the time to "save trips to the server". It turned out to be a monster 30 table join. The 30 tables wasn't the "monster"... the way they were joined was... it took 7 minutes to run on a single user development server and the developer (actually... Senior Application Developer) thought that was "ok" because it was "getting a whole lot of data". I had to leave the room or the man would have died from instant pork chop poisoning. 😉

    Can it beat SSIS? Dunno... I don't use SSIS if I can help it (and, so far, I've been able to help it :-P).

    Of course, I also won't use Hibernate because of the problems I've seen in the HQL behind the scenes. It has no sensibilities when it comes to making a sargeable join or where clause.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 1 through 15 (of 16 total)

You must be logged in to reply to this topic. Login to reply