Modeling XML data into SQL tables via SSIS - ID attributes coming in incorrectly

  • I have an ssis package that imports data from individual xml files into sql server. in the data flow, xml source, it shows me 3 different layers, which is fine.  the 2 main layers that i care about have a bit of xml code that have IDs, like ID=2 or ID=1.
    I would have assumed this helps me tie together the data from the resulting 2-3 different sql tables so that I can properly join them.  

    But the ID values from xml are coming in totally wrong.  they might be 1 or 2, but them come in as 5, or 3.  I can't quite put my finger on a pattern, although there probably is one which if I spotted it, would help me pinpoint the root cause....or conversely, if I knew more about the potential root cause, I would probably immediately be able to see the pattern.

    I'm fairly new to doing this with XML, so please speak in simple terms to me.  Any ideas?

  • pisorsisaac - Monday, February 6, 2017 9:43 AM

    I have an ssis package that imports data from individual xml files into sql server. in the data flow, xml source, it shows me 3 different layers, which is fine.  the 2 main layers that i care about have a bit of xml code that have IDs, like ID=2 or ID=1.
    I would have assumed this helps me tie together the data from the resulting 2-3 different sql tables so that I can properly join them.  

    But the ID values from xml are coming in totally wrong.  they might be 1 or 2, but them come in as 5, or 3.  I can't quite put my finger on a pattern, although there probably is one which if I spotted it, would help me pinpoint the root cause....or conversely, if I knew more about the potential root cause, I would probably immediately be able to see the pattern.

    I'm fairly new to doing this with XML, so please speak in simple terms to me.  Any ideas?

    Question, do you have a schema definition for the XML or is this a result of SSIS reverse engineering?
    😎

    My normal approach is to load the XML into a staging table and parse it using XQuery, find it much easier and more efficient in majority of circumstances. The SSIS schema reverse engineering is inheritently bad at it's best, wouldn't use it.

  • Hi, thanks for helping me out.  It's true, I did allow SSIS to generate an XSD file, from which it then reads.  Before doing this task, I researched for a while and was warned about the errors that might occur.  I tried to correct them by editing the generated XSD file to specify max length on strings, but SSIS then told me something about how that type of code was not allowed in the schema specified.  That's about where my XML "expertise" ended, (joking on expertise), so I gave up on correcting the error and just did a data conversion and allowed the truncation to exist, which in my case seemed OK.  (but I would prefer not to do that, really).

    Regardless of all of this, from what I see so far, I think you are right about SSIS being very poorly designed to work with XML (which seems sad, as XML is so common and popular these days).  

    By using xQuery for this, do you mean for the loading itself, too?  Or just the parsing?

    Because I think right now my problem is coming into the picture even at the loading stage.  If only the values from the ID attribute would load correctly, I might be OK.  (though would like to take your advice on the rest of it too).

    The ID's show up in the data flow task (input, output, etc) just fine.....and they look like they will come in correctly....but then the values that get into sql are wrong -really unexpected to me.

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply