XML source with MaxOccurs=Unbounded subelements

  • I have an XML source designed something like this:

    <xs:element name="Resident" maxOccurs="unbounded">

    <xs:complexType>

    <xs:sequence>

    <xs:element name="ResidentID" type="xs:string"/>

    <xs:element name="ResidentName" type="xs:string"/>

    <xs:element name="Child" maxOccurs="unbounded" type="xs:string"/>

    <xs:element name="Car" maxOccurs="unbounded" type="xs:string"/>

    <xs:element name="Spouse" type="xs:string"/>

    </xs:sequence>

    </xs:complexType>

    </xs:element>

    When I load this into an XML Source Component, it creates two outputs: one for Item and one for RepeatingSub3. I have a table as my destination which has four columns: Sub1, Sub2, RepeatingSub3, and Sub4. Ultimately, I want each Item to map to one row in my table.

    The only way I can think to get this done is to use a script component to transpose RepeatingSub3 to comma separated values, then merge the results with the rest of the subelements to create a single row output for each Item. The problem is that my real schema has 5 of these repeating subelements, which means I've got to transform 5 outputs, then do 5 or 6 separate Merge Joins to get all the pieces back together (since I can't find a way to join more than two sources at a time).

    Is there an easier way to get this done?

    Thanks in advance!

  • I would like to help but am having a tough time picturing what the XML looks like that models what you have on your side since I can't see your screen 🙂

    Can you post some sample XML for us to try?

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • An example of the XML Source is something like this:

    <Root>

    <Resident>

    <ResidentID>11111</ResidentID>

    <ResidentName>John</ResidentName>

    <Child>Jimmy</Child>

    <Child>Mary</Child>

    <Child>Johnny</Child>

    <Car>Pinto</Car>

    <Car>Gremlin</Car>

    <Car>Vespa</Car>

    <Spouse>Jill</Spouse>

    </Resident>

    </Root>

    Obviously this isn't the actual data, but hopefully it represents what I'm trying to accomplish. Essentially, I want my output for this Resident to return a single row:

    ResidentID | ResidentName | Children | Cars | Spouse

    11111 | John | Jimmy,Mary,Johnny | Pinto, Gremlin, Vespa | Jill

  • I think I am close...but not quite there...could you adapt your initial schema doc to your example XML please so the names and elements match up? I think Child and Car both map to an element like RepeatingSub3 but it's missing from your schema and I want to be sure I am not making any assumptions.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Updated the schema, thanks for taking a look!

  • Thanks for playing along too...I thought this might be doable using the Pivot transformation however I have decided 'no'. If the elements were Child1, Child2, Child3, Car1, Car2, Car3, etc. then yes, but I don't see a way to do this through the Pivot, or another basic transformation. I am hoping another SSIS guru will step in and set the record straight.

    Given that I cannot see the path forward with an existing transformation I would try tackling this with a script:

    1. From your XML source you have three outputs: Resident (one flat row, your anchor), Child (0-n rows) and Car (0-n rows).

    2. Send the Car output into a Transformation Script Component. In your script you can iterate through the input buffer keeping track of the Resident_Id from the previous row and if its the same concatenating onto the Cars column until you reach a new Id...whereby you'll send the row you've been working on to the output buffer and start working on the new row.

    3. Connect the output of the Car Transformation Script Component to a Sort component and sort it on Resident_Id.

    4. Send the Resident output from your XML source to a Sort component and sort it on Resident_Id.

    5. Now send the sorted Resident and Car outputs into a (Left) Merge Join transformation to do "Resident LEFT JOIN Car" to essentially append the concatenated Cars column onto the Resident DataSet.

    6. Repeat steps 2-5 for the Children XML output. The output from your second Merge Join should be the resultset you're looking for.

    I see it being somewhat easier to shred the XML and concatenate the data using T-SQL if that's an option for you...but as I said maybe someone with more SSIS skill than I will pickup the thread and chime in with some magic.

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

  • Yea, that's pretty much the only way I could see to get it done. It just seems somewhat ridiculous to split it out into 5 different outputs and then have to use 5 subsequent merges to get it all back together again. My data flow diagram is going to be a complete mess...hopefully I'll never have to look at it again.

    In the future I think I'll just stick with doing this sort of thing in .NET (a co-worker convinced me that SSIS was the way to go for this). 🙂

    Thanks again for your time, I appreciate the second set of eyes, at least to validate that I'm not missing something completely obvious.

  • I was a full-time C#/SQL dev 4 years ago (80/20 split) and I could not stand SSIS when it first hit the scene because it seemingly made things so much more difficult and convoluted but SSIS has grown on me a little. Coding up these kinds of transformation in .NET can get tedious...although I find it much easier to debug and integrate non-data services (FTP, SMTP, SOAP/REST, etc). All things being equal for simpler tasks I prefer .NET as well. Do you know about FileHelpers? If you process a lot of files in .NET it's worth it's weight in gold. It's an open source project that became a centerpiece in a framework I designed so I give them a plug whenever I can 😀

    One note from your comment about 5 outputs in case it matters...I only had 3 outputs coming out from the XML source, therefore only 2 Merge Joins to reconstitute the DataSet you wanted to see in the end.

    Happy SSIS'ing !

    There are no special teachers of virtue, because virtue is taught by the whole community.
    --Plato

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply