Dealing with the presence of an XML tag rather than the data within the tag

  • I'm having a head banging moment where I am sure the answer is painfully obvious but I just can't seem to get there. First, some context:

    Retail Transaction log in XML being sucked into an SQL Server 2005 database. At it's simplest the TLog looks something like:

    [PointOfSale]

    [Contents]

    [TLOG]

    [Transactions]

    [Transaction]

    [Items]

    ...

    [/Items]

    [Tenders]

    ...

    [/Tenders]

    [Customer]

    ...

    [/Customer]

    [/Transaction]

    [/Transactions]

    [/TLOG]

    [/Contents]

    [/PointOfSale]

    Following I have a code snippet to check starting at an expected node and building what is my transaction header table; essentially specific information pertaining to the transaction. I'm not having any problems doing this when the start and end tags each have data within them.

    DECLARE @iDoc INT

    DECLARE @xmlDoc XML

    SET @xmlDoc = (SELECT * FROM OPENROWSET ( BULK 'C:\tlog.xml', SINGLE_CLOB ) AS xmlData)

    --Prepare the XML Document by executing a system stored procedure

    EXEC SP_XML_PREPAREDOCUMENT @iDoc OUTPUT, @xmlDoc

    -- INSERT stmt for header table

    INSERT INTO [Header_Table]

    (

    [Type],TrxID,Date,Register,[User],Store,Charges,Taxes,Tenders,CustID,LName,FName

    )

    -- SELECT stmt using OPENXML rowset provider

    SELECT * FROM OPENXML (@iDoc, '/PointOfSale/Contents/TLOG/Transactions/Transaction',2)

    WITH([Type] VARCHAR(6),

    [GUID] VARCHAR(25),

    Date DATETIME,

    RegisterCode VARCHAR(6),

    [User] BIGINT,

    StoreUID VARCHAR(6),

    Charges MONEY,

    Taxes MONEY,

    Tenders MONEY,

    SyncID VARCHAR(25) 'Customer/SyncID',

    NameLast VARCHAR(25) 'Customer/NameLast',

    NameFirst VARCHAR(25) 'Customer/NameFirst'

    )

    WHERE NOT EXISTS (SELECT * FROM [Header_Table] WHERE TrxID = [GUID])

    -- Free up the memory used by the XML document.

    EXEC SP_XML_REMOVEDOCUMENT @iDoc

    My problem starts when I have tags with no data; in fact ending tags with no start tags and no data. It looks like this:

    ...

    [IsPostVoid /]

    [Transaction]

    [Items]

    ...

    [/Items]

    [Tenders]

    ...

    [/Tenders]

    [Customer]

    ...

    [/Customer]

    [/Transaction]

    ...

    The presence of the tag [IsPostVoid /] denotes this is a void transaction, so I need to record into some field the transaction "Mode" if you like. I'll create a new field on my [Header_Table] and call it Mode and would like to store in this field the word "Post Void" if [IsPostVoid /] exists in a transaction.

    I'm stumped though. Ideas? Thank you in advance!

  • See if this helps - the convoluted xquery below gets the name of the previous element

    Note that if you are using SQL Server 2005, you may find it easier to use the "nodes" syntax.

    DECLARE @xmlDoc XML

    SET @xmlDoc='

    [PointOfSale]

    [Contents]

    [TLOG]

    [Transactions]

    [Transaction]

    [Items]I1[/Items]

    [Tenders]T1[/Tenders]

    [Customer]C1[/Customer]

    [/Transaction]

    [Transaction]

    [Items]I2[/Items]

    [Tenders]T2[/Tenders]

    [Customer]C2[/Customer]

    [/Transaction]

    [IsPostVoid/]

    [Transaction]

    [Items][/Items]

    [Tenders][/Tenders]

    [Customer][/Customer]

    [/Transaction]

    [Transaction]

    [Items]I4[/Items]

    [Tenders]T4[/Tenders]

    [Customer]C4[/Customer]

    [/Transaction]

    [/Transactions]

    [/TLOG]

    [/Contents]

    [/PointOfSale]'

    SELECT d.value('Items[1]','VARCHAR(10)') AS Item,

    d.value('Tenders[1]','VARCHAR(10)') AS Tender,

    d.value('Customer[1]','VARCHAR(10)') AS Customer,

    d.value('for $a in . return local-name($a/../*[. << $a][last()])','VARCHAR(20)') AS PreviousElementName

    FROM @xmlDoc.nodes('/PointOfSale/Contents/TLOG/Transactions/Transaction') AS x(d)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Mark,

    Cheers for the excellent response. Using my example that I provided I see how this caters to my needs. Unfortunately I dumbed down my example, here's a sample chunk of actual transaction code I am looking at:

    [Transaction]

    [Type]SREG[/Type]

    [State]C[/State]

    [Date]3/10/2008 2:13:00 PM[/Date]

    [RegisterCode]9415[/RegisterCode]

    [User]255[/User]

    [UserTimeClockID]54595[/UserTimeClockID]

    [UserLogin]cstine[/UserLogin]

    [GUID]00415-9415-4584[/GUID]

    [OpeningTender]0[/OpeningTender]

    [BalanceFwd]0[/BalanceFwd]

    [Charges]24.9900[/Charges]

    [Taxes]1.62[/Taxes]

    [Tenders]0[/Tenders]

    [BalanceDue]26.6100[/BalanceDue]

    [Voided /]

    [PostVoid /]

    [VoidDate]3/10/2008 4:45:45 PM[/VoidDate]

    [VoidReasonCode /]

    [VoidReasonOther /]

    [VoidingTransactionGUID]00415-9415-4587[/VoidingTransactionGUID]

    [NextItemNumber]6[/NextItemNumber]

    [StoreUID]00415[/StoreUID]

    [RegisterSequenceNumber]4584[/RegisterSequenceNumber]

    [Sale]

    [PaymentReqd]0[/PaymentReqd]

    [MerchandiseIn]0[/MerchandiseIn]

    [MerchandiseOut]1[/MerchandiseOut]

    [/Sale]

    [Items]

    [Item]

    [Type]MDSE[/Type]

    [Charges]24.9900[/Charges]

    [Taxes]0[/Taxes]

    [Tenders]0[/Tenders]

    [SequenceNo]1[/SequenceNo]

    [Merchandise]

    [Department]17[/Department]

    [Class]17655[/Class]

    [Style]276[/Style]

    [Color]93[/Color]

    [SKU]7629943[/SKU]

    [Description]BASIC DENIM SHORTS[/Description]

    [Quantity]1[/Quantity]

    [Price]24.9900[/Price]

    [QtyAtPrice]1[/QtyAtPrice]

    [IsTaxable]true[/IsTaxable]

    [MerchandiseAmount]24.9900[/MerchandiseAmount]

    [Disposition]S[/Disposition]

    [IsSpecialOrder]false[/IsSpecialOrder]

    [/Merchandise]

    [/Item]

    [/Items]

    [/Transaction]

    I'd like to specifically look for the existence of the [PostVoid /] empty tag that may or may not appear in the highest level of the Transaction. It appears to always occur after the [BalanceDue][/BalanceDue] tags.

    How can I go back further than the most recent element?

    Thanks again!

  • Maybe this?

    declare @x xml

    set @x='

    ... your xml

    '

    select

    r.value('Type[1]','VARCHAR(6)') as Type,

    r.value('GUID[1]','VARCHAR(25)') as GUID,

    r.value('Date[1]','DATETIME') as Date,

    r.value('RegisterCode[1]','VARCHAR(6)') as RegisterCode,

    r.value('User[1]','BIGINT') as [User],

    r.value('StoreUID[1]','VARCHAR(6)') as StoreUID,

    r.value('Charges[1]','MONEY') as Charges,

    r.value('Taxes[1]','MONEY') as Taxes,

    r.value('Tenders[1]','MONEY') as Tenders,

    case when r.value('PostVoid[1]','VARCHAR(20)') is not null then 'PostVoid found' else 'No PostVoid' end

    from @x.nodes('/PointOfSale/Contents/TLOG/Transactions/Transaction') as x(r)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Brilliant! I am off and running. For anyone in future that follows this thread this is what I ended up with:

    DECLARE @iDoc INT

    DECLARE @xmlDoc XML

    SET @xmlDoc = (SELECT * FROM OPENROWSET ( BULK 'C:\tlog.xml', SINGLE_CLOB ) AS xmlData)

    --Prepare the XML Document by executing a system stored procedure

    EXEC SP_XML_PREPAREDOCUMENT @iDoc OUTPUT, @xmlDoc

    -- INSERT statement for header table

    INSERT INTO [Header_Table]

    (

    [Type],Mode,TrxID,OrigTrxID,Date,Register,[User],UserName,Store,

    BalanceFwd,Charges,Taxes,Tenders,BalanceDue,CustomerID,LName,

    FName,EmployeeID,EmpLName,EmpFName,ReasonCode

    )

    -- SELECT statement using XQUERY

    SELECT

    r.value('Type[1]','VARCHAR(6)') AS [Type],

    CASE

    WHEN r.value('PostVoid[1]','VARCHAR(20)') IS NOT NULL THEN 'Post Void'

    WHEN r.value('Voided[1]','VARCHAR(20)') IS NOT NULL AND r.value('PostVoid[1]','VARCHAR(20)') IS NULL THEN 'Void'

    WHEN r.value('(Sale/IsEmployeeTrans)[1]','VARCHAR(20)') IS NOT NULL THEN 'Emp Sale'

    WHEN r.value('(TrainingMode)[1]','VARCHAR(20)') IS NOT NULL THEN 'Training Mode'

    WHEN r.value('Type[1]','VARCHAR(20)') = 'NS' THEN 'No Sale'

    WHEN r.value('Type[1]','VARCHAR(20)') = 'SLWAY' THEN 'Layaway'

    WHEN r.value('Type[1]','VARCHAR(20)') = 'SREG' THEN 'Sales'

    WHEN r.value('Type[1]','VARCHAR(20)') = 'Event' THEN 'Event'

    ELSE 'Other' END AS Mode,

    r.value('GUID[1]','VARCHAR(25)') AS [GUID],

    r.value('OrigTransactionGUID[1]','VARCHAR(25)') AS OrigTransactionGUID,

    r.value('Date[1]','DATETIME') AS Date,

    r.value('RegisterCode[1]','VARCHAR(6)') AS RegisterCode,

    r.value('User[1]','BIGINT') AS [User],

    r.value('UserLogin[1]','VARCHAR(25)') AS UserLogin,

    r.value('StoreUID[1]','VARCHAR(6)') AS StoreUID,

    r.value('BalanceFwd[1]','MONEY') AS BalanceFwd,

    r.value('Charges[1]','MONEY') AS Charges,

    r.value('Taxes[1]','MONEY') AS Taxes,

    r.value('Tenders[1]','MONEY') AS Tenders,

    r.value('BalanceDue[1]','MONEY') AS BalanceDue,

    r.value('(Customer/SyncID)[1]','VARCHAR(25)') AS SyncID,

    r.value('(Customer/NameLast)[1]','VARCHAR(25)') AS NameLast,

    r.value('(Customer/NameFirst)[1]','VARCHAR(25)') AS NameFirst,

    r.value('(Sale/EmployeeID)[1]','VARCHAR(12)') AS EmployeeID,

    r.value('(Sale/EmployeeNameFirst)[1]','VARCHAR(25)') AS EmpLName,

    r.value('(Sale/EmployeeNameLast)[1]','VARCHAR(25)') AS EmpFName,

    CASE

    WHEN r.value('Type[1]','VARCHAR(20)') = 'NS' THEN r.value('(NoSale/ReasonCode)[1]','VARCHAR(25)')

    ELSE '' END AS ReasonCode

    FROM @xmlDoc.nodes('/PointOfSale/Contents/TLOG/Transactions/Transaction') AS x(r)

    WHERE NOT EXISTS (SELECT * FROM [Header_Table] WHERE TrxID = r.value('GUID[1]','VARCHAR(25)'))

    An observation though. While the query (without insert) was running a couple seconds on a 30,000 line XML file; it's taking a few minutes to write to a new empty table. And I expect it to get worse as the table grows (obviously) and as I tweak the indexing I choose to use.

    I suppose the extra leg-work to read through the file to honour my CASE statement is the main culprit. I can handle a few minutes no problem, it will be interesting to monitor as I go down the road.

    Thanks Mark!

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply