SSIS XML import

  • I am importing xml multiple times a day from a vendor. However when SSIS created the ID's for nested XML data it is not unique. So importing the first time and I get 3-4 records it looks fine. However subsequent imports all use the same ID's so it isn't unique, how do I go about changing this as I cant find anything about it.

    I can provide more info if needed, and searching on google bring lots of results but no clear answers.

    Any help is appreciated!

  • What method are you using to import the xml (within SSIS)? How are you calculating and assigning these IDs?

  • I dont generate them. I am using the XML source and it generates all the child ID's automatically. However running the same import again (say 400 records) it reuses the exact same ID's over again.

  • This was removed by the editor as SPAM

  • Hi, thanks for trying to help but that link is just a basic guide on importing. It has no information on the auto generation of ID's for nested data etc.

  • Just throwing in my 2 cents, my preferred way is to use SSIS to import the XML into a staging table and shred it with XQuery into the destination tables.

    😎

  • The ids created by the XML Source in Dataflow aren't guaranteed to be unique between calls for different xml files. How could they be?

    Add a fileId column to your tables, make it a primary key or unique constraint with the other column and include it in the pipeline with a Derived Column task. Here's an example from my SSIS package that imports deadlock xml. The deadlock_Id column is created by the XML Source and the fileId added in via Derived Column:

    CREATE TABLE dbo.deadlock (

    fileIdINTNOT NULL,

    deadlock_IdNUMERIC(20, 0) NOT NULL,

    victimNVARCHAR(255) NULL,

    CONSTRAINT PK_deadlock PRIMARY KEY ( fileId, deadlock_Id )

    ) ON [PRIMARY]

    GO

  • dan.james (8/1/2014)


    I dont generate them. I am using the XML source and it generates all the child ID's automatically. However running the same import again (say 400 records) it reuses the exact same ID's over again.

    Use a derived column with a datetime stamp as a secondary column for your key. Alternatives include GUIDs or detecting the next secondary key out of the table with a MAX()+1 or something similar. In pretty much all cases, you end up with a dual (or more) column key.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply