Parsing XML file to import data into SQL 2008

  • I have a client who needs a DNN module created to import data from flat files into SQL 2008. I didn't see a 2008 forum but 2005/2008 are close buddies.

    The issue is more ,Net but it's data specific.

    If in a record on an XML file there is no cell data, then within tags, so child element is made. This seems strange ... let's say I have 5 cells per record and in some cases only 3 cells have data, thus there are 2 blanks:

    1

    2

    3

    4

    5

    We're all good on this row but let's say our next one has no data in cells 2 and 3:

    1

    4

    5

    I get this. The only way I saw around it was to insert "NA" for Not Applicable as obviously if there was no data for this record and nulls/empties were an option, then NA applies.

    Clearly this is an error with how Microsoft does an xml export from Excel. OBVIOUSLY to preserve data sequence YOU MAKE CHILD CELL ENTRIES WHERE BLANKS EXIST.

    I looked for different types of .xml save options and so on.

    If anyone is thinking save as .csv, even with saving to throw out incompatible items for CSV, I still get corruption. I get a square showing up in front of the first data item in a row in the database.

    I'm sure this is basic but I don't see what is going on.

    As Dr. Evil says ... "Throw me a bone."

  • It's a little bit confusing to me what you're trying to achieve...

    What I figured is that you want to import xml structure from a flat file.

    Right after that I'm lost: "If in a record on an XML file there is no cell data, then within tags, so child element is made."

    I'd recommend that you read the article referenced in my signature on how to post data.

    Fort posting xml data please use [ code="xml" ] without blanks.

    What I'd like to see is the data structure you're coming from (e.g. flat file with sample data), the target structure (table def?) the data need to be stored in, what source code you've tried so far, where you get stuck and what result you expect.

    Providing information in the way described above will help us to help you.



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • OK, here is a screen shot of the file, attached.

    You'll see on the Excel file there are many blanks for Customer Product. This is the OEM Part number. (Page one of attachment)

    When this is saved as an xml spreadsheet (and this may be my error, I want a full XML file ... :cool:) that's on the second page of the attachment.

    What happens is this -- all Row elements should have 6 child cell elements. Well they don't. If a cell is blank there is no element.

    Thus when parsing the record to find the next and in this element read to following , etc. to extract data I get mismatches as a result.

    To get around this, I sorted out all blanks and then filled in "NA" or some such in "Customer Part" so that the data would line up.

    Have I described it well? Is it as I suspect ... using an xml spreadsheet is the error?

  • Wow, it seems when I was trying to write in the XML tags in my first post they didn't show up. No wonder there's confusion.

    My screen shots will clear it all up. Let me know if it isn't clear.

  • Please read and follow the advice I refer to in my signature. It's still not clear what you're looking for. Please provide usable sample data.

    Btw: I did see your original xml structure (when you click "quote" you'll see the original format...)



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • First of all, thank you for taking your time.

    I must be honest -- I don't know how to describe it clearer.

    When saving an Excel file as XML so that it can be parsed and imported into a SQL Table, I'm running into a problem. I'm sure this is a basic mistake.

    I'm parsing the XML file by readtofollowing("Row"), then reading data the child elements similarly.

    The problem I ran into is, since I have x number of fields in each row in the original excel file, I'm expecting x number of elements to read in each row.

    Well, if there is no data in a cell in the original Excel file, the element within the parent tag won't ever get made.

    It throws my import routine off.

    I can't see how I'm failing to put this across

  • Now it's starting to get a little less cloudy...

    The question is: Why would you try to save an Excel file using xml just to be able to import it into SQL Server?

    Why don't you just take the Excel file as it is, import it into a table and start from there?

    At the moment I can't see any reason not to do so.

    I'd recommend to use the Excel file as it is and go from there.

    To get Excel data in SQL Server you can import the data using OPENROWSET, import wizard or simply copy and paste using Management Studio (assuming to have a table created with required columns) or you can link the XLS file directly as a data source using linked server.

    What method are you trying to use to get the data into SQL Server? Since you're trying to force it into xml structure I'm under the impression that you're using some Java stuff... If so, you should consider doing it directly without any disturbing "middleware"...

    But to help further, we'd need to know more details. I'm starting to guess already...



    Lutz
    A pessimist is an optimist with experience.

    How to get fast answers to your question[/url]
    How to post performance related questions[/url]
    Links for Tally Table [/url] , Cross Tabs [/url] and Dynamic Cross Tabs [/url], Delimited Split Function[/url]

  • As first let me make your XML readable for other people here.

    Usually you get:

    1

    2

    3

    4

    5

    But sometimes you get:

    1

    4

    5

    Now, let me join the clan of confused people.

    Most important question is already asked by Lutz. How do you import your data into SQL Server?

    Since your XML screen-shot showed Visual Studio I suppose you work with .NET to import your data into SQL Server. Which methods do you use?

    * XmlReader

    * XmlDocument

    * XDocument

    You can use an XSLT to add your missing "Cell" elements or handle this manual in each of my previously listed options. Could you provide a piece of the source you are currently working with please?

    If you work with XML features could you this script you're currently working with please?

    Flo

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply