Sideout1972 (1/31/2012)
TedT,It ran horribly from the command line as well. After taking steps to eliminate memory issues by putting the code that processes an XML doc into its own package, and using a for each container to call that package for each xml, I was able to eliminate some of the memory issues. While I could get it to completion, the complex XML docs took upwards of 20 seconds each to process - not good when you've got hundreds of thousands to process. Command line shaved about 10% of processing time. Good but not good enough.
As an update - I eliminated the use of the XML Source all together. We are parsing the xml using a stored procedure instead using something like this:
DECLARE @xml_table TABLE(xml_data xml)
SET @sqlstmt= 'SELECT * FROM OPENROWSET ( BULK ''' + @l_xml_path + ''', SINGLE_CLOB ) AS XMLDATA'
INSERT INTO @xml_table EXEC (@sqlstmt)
Then, once the XML doc is in the table:
INSERT INTO MyTable
(
[Field1]
,[Field2]
,[Field3]
)
SELECT
x.value('(Field1/text())[1]', 'nvarchar(255)') Field1
,x.value('(Field2/text())[1]', 'nvarchar(255)') Field2
,x.value('(Field3/text())[1]', 'nvarchar(255)') Field3
FROM
@xml_table tbl
CROSS APPLY tbl.xml_data.nodes('/root/Subnode/Subsubnode') e(x)
This technique allows us to process each XML doc, almost regardless of size, in under 2 seconds.
We are still in development, but signs are positive. I am now convinced, after months of agony, that SSIS can't process XML efficiently.
Not that you need confirmation of your success but I avoid SSIS entirely in favor of things like you've done above. Well done!
--Jeff Moden
Change is inevitable... Change for the better is not.