xml openrowset performance

  • I am bulk inserting a whole bunch of xml files into a sql table using OPENROWSET. The problem is that this process is really really slow. Any idea if this can be sped up?

  • peacesells (10/15/2014)


    I am bulk inserting a whole bunch of xml files into a sql table using OPENROWSET. The problem is that this process is really really slow. Any idea if this can be sped up?

    Quick question, what version and edition of SQL server are you on?

    😎

  • Eirikur Eiriksson (10/15/2014)


    peacesells (10/15/2014)


    I am bulk inserting a whole bunch of xml files into a sql table using OPENROWSET. The problem is that this process is really really slow. Any idea if this can be sped up?

    Quick question, what version and edition of SQL server are you on?

    😎

    Microsoft SQL Server 2012 , Enterprise

  • Done this on exactly the same version/edition and the performance is brilliant so no worries there. One thing to check though is the configuration of tempdb, rule of thumb, recommend having the number of files the same as the cpu cores available for the instance and the files sizes fixed and equal.

    😎

    There are few other factors which can affect the performance, you'll have to provide some more information here;-) I'll start with some questions, don't limit the answers to the questions if you think I'm missing anything important.

    1) How large are the files?

    2) Any file larger than 2Gb?

    3) Are the files loaded into a staging table before shredding?

    4) Is the XML typed or untyped?

    5) Is the XML structure simple-medium-complex?

    6) Is the BulkColumn used directly in any XQuery?

    7) Any XML Indices?

    8) Are the files on a local drive (server)?

    9) Has the file-system performance been checked?

  • Eirikur Eiriksson (10/15/2014)


    Done this on exactly the same version/edition and the performance is brilliant so no worries there. One thing to check though is the configuration of tempdb, rule of thumb, recommend having the number of files the same as the cpu cores available for the instance and the files sizes fixed and equal.

    😎

    There are few other factors which can affect the performance, you'll have to provide some more information here;-) I'll start with some questions, don't limit the answers to the questions if you think I'm missing anything important.

    1) How large are the files?

    2) Any file larger than 2Gb?

    3) Are the files loaded into a staging table before shredding?

    4) Is the XML typed or untyped?

    5) Is the XML structure simple-medium-complex?

    6) Is the BulkColumn used directly in any XQuery?

    7) Any XML Indices?

    8) Are the files on a local drive (server)?

    9) Has the file-system performance been checked?

    1. The files are not big around 300 KB on average.

    2. No

    3.Yes. Loading into the staging is what is taking a long time right now. The shredding would take as long if not longer.

    4.Mostly untyped

    5.The structure is complex

    6.Yes, we get that column and shred it into sql tables.

    7.Not yet (need suggestions on that one as well)

    8.No on the network.

    9.No idea and i don't know what that is.

    Thank you, your suggestions would be much appreciated.

  • First suggestion is to add one step where the files are directly loaded into a staging table as an XML blob, without any shredding or processing. This will give you the actual performance of the file share minus the write performance of the SQL Server. In my experience, 5-10Mb/sec at least. Using the BulkColumn directly has very adverse affects on the performance so lets try to avoid that one.

    From that initial staging table, do the shredding, if you share some more details here we can help you optimize that step.

    On the performance side of things, what you can expect is an overhead of around 50ms for initialization in addition to 5-10Mb/sec transfer rate for each file, somehow there seams to be a barrier there although using SSD type subsystems I've managed to get this much much higher. The shredding part normally is where the low hanging fruits resides so I'm looking forward to analyse your approach and hopefully improve it.

    To sum up, can you post some XML examples and the code you are using to shred those?

    😎

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply