Inserting data from an XML string

  • Hi guys,

    I have a situation whereby I'm passing an XML string to a stored procedure and it extracts the data items from each "XML table". We have developed a generic process, so unfortunately I don't actually know what columns/data I'll be receiving at runtime. I am using two tables, one called UploadFileDataRows (DataRowID PK) and one called UploadFileDataItems (DataRowID FK). Each "XML table" creates a new data row id and all the elements (TICKER, NAME, etc) are inserted as rows into the DataItems table using the DataRowID foreign key.

    The definition of the UploadFileDataRows table is:

    IF OBJECT_ID('dbo.UploadFileDataRows') IS NOT NULL

    DROP TABLE dbo.UploadFileDataRows

    GO

    CREATE TABLE [dbo].[UploadFileDataRows](

    [DataRowID] [int] NOT NULL,

    [FileEventID] [int] NOT NULL,

    [ValidationStatus] [varchar](1) NOT NULL CONSTRAINT [DF_UploadFileDataRows_ValidationStatus] DEFAULT ('I'),

    [ValidationNotes] [varchar](MAX) NULL,

    [CreateDate] [datetime] NOT NULL CONSTRAINT [DF_UploadFileDataRows_CreateDate] DEFAULT (getdate()),

    [CreateUser] [varchar](30) NOT NULL CONSTRAINT [DF_UploadFileDataRows_CreateUser] DEFAULT (SYSTEM_USER),

    [UpdateDate] [datetime] NULL,

    [UpdateUser] [varchar](30) NULL,

    CONSTRAINT [PK_UploadFileDataRows] PRIMARY KEY CLUSTERED

    (

    [DataRowID] ASC

    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ,CONSTRAINT CK_UploadFileData_ValidationStatus CHECK (ValidationStatus IN('I', 'V', 'A', 'P','G','L'))

    ) ON [PRIMARY]

    and the definition of the UploadFileDataItems table is:

    IF OBJECT_ID('dbo.UploadFileDataItems') IS NOT NULL

    DROP TABLE dbo.UploadFileDataItems

    GO

    CREATE TABLE [dbo].[UploadFileDataItems](

    [DataRowID] [int] NOT NULL,

    [FileClassElementCode] [varchar](20) NOT NULL,

    [OriginalData] [varchar](MAX) NOT NULL,

    [TranslatedData] [sql_variant] NULL,

    [ValidationStatus] [varchar](1) NOT NULL CONSTRAINT [DF_UploadFileDataItems_ValidationStatus] DEFAULT ('I'),

    [ValidationMessage] [varchar](MAX) NULL,

    [CreateDate] [datetime] NOT NULL CONSTRAINT [DF_UploadFileDataItems_CreateDate] DEFAULT (getdate()),

    [CreateUser] [varchar](30) NOT NULL CONSTRAINT [DF_UploadFileDataItems_CreateUser] DEFAULT (SYSTEM_USER),

    [UpdateDate] [datetime] NULL,

    [UpdateUser] [varchar](30) NULL,

    CONSTRAINT [PK_UploadFileDataItems] PRIMARY KEY CLUSTERED

    (

    [DataRowID] ASC,

    [FileClassElementCode] ASC

    ) WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

    ,CONSTRAINT CK_UploadFileDataItems_ValidationStatus CHECK (ValidationStatus IN('I', 'V'))

    ) ON [PRIMARY]

    GO

    ALTER TABLE [dbo].[UploadFileDataItems] WITH CHECK ADD CONSTRAINT [FK_UploadFileDataItems_UploadFileDataRows] FOREIGN KEY([DataRowID])

    REFERENCES [dbo].[UploadFileDataRows] ([DataRowID])

    GO

    ALTER TABLE [dbo].[UploadFileDataItems] CHECK CONSTRAINT [FK_UploadFileDataItems_UploadFileDataRows]

    GO

    For example, using the data below I'd like to insert the items into UploadFileDataItems as follows:

    DataRowID FileClassElementCode OriginalData

    --------------------------------------------

    1 TICKER TTS AU Equity

    1 NAME TATTS GROUP LTD

    etc...

    2 TICKER SGM AU Equity

    2 NAME SIMS METAL MANAGEMENT LTD

    etc...

    The definition of the data is as follows:

    <FileData>

    <Table>

    <TICKER>TTS AU Equity</TICKER>

    <NAME>TATTS GROUP LTD</NAME>

    <SECURITY_DES>TTS</SECURITY_DES>

    <EXCH_CODE>AU</EXCH_CODE>

    <COUNTRY_ISO>AU</COUNTRY_ISO>

    <EQY_PRIM_EXCH_SHRT>AT</EQY_PRIM_EXCH_SHRT>

    <INDUSTRY_GROUP>Entertainment</INDUSTRY_GROUP>

    <MARKET_SECTOR_DES>Equity</MARKET_SECTOR_DES>

    <CRNCY>AUD</CRNCY>

    <PX_POS_MULT_FACTOR>1</PX_POS_MULT_FACTOR>

    <EXCH_TRADE_STATUS>N</EXCH_TRADE_STATUS>

    </Table>

    <Table>

    <TICKER>SGM AU Equity</TICKER>

    <NAME>SIMS METAL MANAGEMENT LTD</NAME>

    <SECURITY_DES>SGM</SECURITY_DES>

    <EXCH_CODE>AU</EXCH_CODE>

    <COUNTRY_ISO>US</COUNTRY_ISO>

    <EQY_PRIM_EXCH_SHRT>AT</EQY_PRIM_EXCH_SHRT>

    <INDUSTRY_GROUP>Metal Fabricate/Hardware</INDUSTRY_GROUP>

    <MARKET_SECTOR_DES>Equity</MARKET_SECTOR_DES>

    <CRNCY>AUD</CRNCY>

    <PX_POS_MULT_FACTOR>1</PX_POS_MULT_FACTOR>

    <EXCH_TRADE_STATUS>N</EXCH_TRADE_STATUS>

    </Table>

    </FileData>

    At present my stored procedure is incredibly inefficient and is taking ages to process a 30-table XML string. In the future we're hoping to process 1000+ xml table strings, so I need something a bit better!! 🙂

    If anyone has any ideas of improving the process, or SQL that extracts XML tables/elements efficiently I'll be really interested!!

    As always, many thanks in advance.

    Kev.

  • Ok, so I've made a few changes and now have the data but with a different issue. Here's my SQL at present:

    DECLARE @p_FileDataXML,@p_FileClassElementCount INTEGER

    SET @p_FileClassElementCount = 25

    SET @p_FileData = '<FileData>

    <Table>

    <TICKER>TTS AU Equity</TICKER>

    <ID_ISIN>AU000000TTS5</ID_ISIN>

    <ID_SEDOL1>B0CRCP7</ID_SEDOL1>

    <NAME>TATTS GROUP LTD</NAME>

    <SECURITY_DES>TTS</SECURITY_DES>

    <EXCH_CODE>AU</EXCH_CODE>

    <COUNTRY_ISO>AU</COUNTRY_ISO>

    <EQY_PRIM_EXCH_SHRT>AT</EQY_PRIM_EXCH_SHRT>

    <INDUSTRY_GROUP>Entertainment</INDUSTRY_GROUP>

    <MARKET_SECTOR_DES>Equity</MARKET_SECTOR_DES>

    <CRNCY>AUD</CRNCY>

    <PX_POS_MULT_FACTOR>1</PX_POS_MULT_FACTOR>

    <BS_PAR_VAL>#N/A N/A</BS_PAR_VAL>

    <EXCH_TRADE_STATUS>N</EXCH_TRADE_STATUS>

    <DVD_CRNCY>AUD</DVD_CRNCY>

    <DVD_FREQ>Semi-Anl</DVD_FREQ>

    <LAST_DPS_GROSS>0.15</LAST_DPS_GROSS>

    <CPN>#N/A Field Not Applicable</CPN>

    <CPN_CRNCY>#N/A Field Not Applicable</CPN_CRNCY>

    <CPN_FREQ>#N/A Field Not Applicable</CPN_FREQ>

    <PRIOR_CLOSE_BID>2.34</PRIOR_CLOSE_BID>

    <PRIOR_CLOSE_ASK>2.35</PRIOR_CLOSE_ASK>

    <PX_YEST_CLOSE>2.32</PX_YEST_CLOSE>

    <CLASSIFICATION>X</CLASSIFICATION>

    </Table>

    <Table>

    <TICKER>SGM AU Equity</TICKER>

    <ID_ISIN>AU000000SGM7</ID_ISIN>

    <ID_SEDOL1>B0LCW75</ID_SEDOL1>

    <NAME>SIMS METAL MANAGEMENT LTD</NAME>

    <SECURITY_DES>SGM</SECURITY_DES>

    <EXCH_CODE>AU</EXCH_CODE>

    <COUNTRY_ISO>US</COUNTRY_ISO>

    <EQY_PRIM_EXCH_SHRT>AT</EQY_PRIM_EXCH_SHRT>

    <INDUSTRY_GROUP>Metal Fabricate/Hardware</INDUSTRY_GROUP>

    <MARKET_SECTOR_DES>Equity</MARKET_SECTOR_DES>

    <CRNCY>AUD</CRNCY>

    <PX_POS_MULT_FACTOR>1</PX_POS_MULT_FACTOR>

    <BS_PAR_VAL>#N/A N/A</BS_PAR_VAL>

    <EXCH_TRADE_STATUS>N</EXCH_TRADE_STATUS>

    <DVD_CRNCY>AUD</DVD_CRNCY>

    <DVD_FREQ>Semi-Anl</DVD_FREQ>

    <LAST_DPS_GROSS>0.1416</LAST_DPS_GROSS>

    <CPN>#N/A Field Not Applicable</CPN>

    <CPN_CRNCY>#N/A Field Not Applicable</CPN_CRNCY>

    <CPN_FREQ>#N/A Field Not Applicable</CPN_FREQ>

    <PRIOR_CLOSE_BID>17.82</PRIOR_CLOSE_BID>

    <PRIOR_CLOSE_ASK>18</PRIOR_CLOSE_ASK>

    <PX_YEST_CLOSE>17.47</PX_YEST_CLOSE>

    <CLASSIFICATION>X</CLASSIFICATION>

    </Table></FileData>'

    CREATE TABLE #ProcessUploadFileTemp (

    DataRowIDINTEGERNOT NULL

    ,FileClassElementCodeVARCHAR(20)NOT NULL

    ,OriginalDataVARCHAR(100)NOT NULL)

    INSERT INTO #ProcessUploadFileTemp ([DataRowID],[FileClassElementCode],[OriginalData])

    SELECT CASE WHEN ROW_NUMBER() OVER (ORDER BY T.n.value('parent[1]','varchar(100)')) % @p_FileClassElementCount > 0

    THEN ROW_NUMBER() OVER (ORDER BY T.n.value('parent[1]','varchar(100)')) % @p_FileClassElementCount

    ELSE 0 END

    ,T.n.value('localname[1]', 'varchar(100)')-- FileClassElementCode

    ,T.n.value('value[1]', 'VARCHAR(100)')-- OriginalData

    --,T.n.value('parent[1]', 'VARCHAR(100)') AS parent

    FROM ( SELECT

    x.query('

    for $node in /descendant::node()[local-name() != "xxx"]

    return <node>

    <namespace>{ namespace-uri($node) }</namespace>

    <localname>{ local-name($node) }</localname>

    <value>{ $node }</value>

    <parent>{ local-name($node/..) }</parent>

    </node>') AS nodes

    FROM @p_FileData.nodes('/FileData') x(x)

    ) q1

    CROSS APPLY q1.nodes.nodes('/node') AS T ( n )

    WHERET.n.value('parent[1]', 'VARCHAR(100)') = 'Table'

    SELECT*

    FROM#ProcessUploadFileTemp

    This produces a nice little resultset, with incrementing DataRowID, the correct FileElementClassCode and correct OriginalData. I identify the start of each XML record in #ProcessUploadFileTemp with a DataRowID of zero. Now, my current issue is that I would like this value to be a single value for all the associated items rows within a table, and then increment for each further table. For example, the first 24 rows would have a DataRowID = 1, the second set would be 2, and so on.

    Does anyone have any idea of how to do this please?

    Thanks again!!

  • I modified your temp table and then performed a 'quirky update' to get the DataRowID. You'll have to test this with the volume of data you described.

    DECLARE @p_FileData XML,

    @p_FileClassElementCount INTEGER,

    @p_count integer = 0

    SET @p_FileClassElementCount = 25

    SET @p_FileData = '<FileData>

    <Table>

    <TICKER>TTS AU Equity</TICKER>

    <ID_ISIN>AU000000TTS5</ID_ISIN>

    <ID_SEDOL1>B0CRCP7</ID_SEDOL1>

    <NAME>TATTS GROUP LTD</NAME>

    <SECURITY_DES>TTS</SECURITY_DES>

    <EXCH_CODE>AU</EXCH_CODE>

    <COUNTRY_ISO>AU</COUNTRY_ISO>

    <EQY_PRIM_EXCH_SHRT>AT</EQY_PRIM_EXCH_SHRT>

    <INDUSTRY_GROUP>Entertainment</INDUSTRY_GROUP>

    <MARKET_SECTOR_DES>Equity</MARKET_SECTOR_DES>

    <CRNCY>AUD</CRNCY>

    <PX_POS_MULT_FACTOR>1</PX_POS_MULT_FACTOR>

    <BS_PAR_VAL>#N/A N/A</BS_PAR_VAL>

    <EXCH_TRADE_STATUS>N</EXCH_TRADE_STATUS>

    <DVD_CRNCY>AUD</DVD_CRNCY>

    <DVD_FREQ>Semi-Anl</DVD_FREQ>

    <LAST_DPS_GROSS>0.15</LAST_DPS_GROSS>

    <CPN>#N/A Field Not Applicable</CPN>

    <CPN_CRNCY>#N/A Field Not Applicable</CPN_CRNCY>

    <CPN_FREQ>#N/A Field Not Applicable</CPN_FREQ>

    <PRIOR_CLOSE_BID>2.34</PRIOR_CLOSE_BID>

    <PRIOR_CLOSE_ASK>2.35</PRIOR_CLOSE_ASK>

    <PX_YEST_CLOSE>2.32</PX_YEST_CLOSE>

    <CLASSIFICATION>X</CLASSIFICATION>

    </Table>

    <Table>

    <TICKER>SGM AU Equity</TICKER>

    <ID_ISIN>AU000000SGM7</ID_ISIN>

    <ID_SEDOL1>B0LCW75</ID_SEDOL1>

    <NAME>SIMS METAL MANAGEMENT LTD</NAME>

    <SECURITY_DES>SGM</SECURITY_DES>

    <EXCH_CODE>AU</EXCH_CODE>

    <COUNTRY_ISO>US</COUNTRY_ISO>

    <EQY_PRIM_EXCH_SHRT>AT</EQY_PRIM_EXCH_SHRT>

    <INDUSTRY_GROUP>Metal Fabricate/Hardware</INDUSTRY_GROUP>

    <MARKET_SECTOR_DES>Equity</MARKET_SECTOR_DES>

    <CRNCY>AUD</CRNCY>

    <PX_POS_MULT_FACTOR>1</PX_POS_MULT_FACTOR>

    <BS_PAR_VAL>#N/A N/A</BS_PAR_VAL>

    <EXCH_TRADE_STATUS>N</EXCH_TRADE_STATUS>

    <DVD_CRNCY>AUD</DVD_CRNCY>

    <DVD_FREQ>Semi-Anl</DVD_FREQ>

    <LAST_DPS_GROSS>0.1416</LAST_DPS_GROSS>

    <CPN>#N/A Field Not Applicable</CPN>

    <CPN_CRNCY>#N/A Field Not Applicable</CPN_CRNCY>

    <CPN_FREQ>#N/A Field Not Applicable</CPN_FREQ>

    <PRIOR_CLOSE_BID>17.82</PRIOR_CLOSE_BID>

    <PRIOR_CLOSE_ASK>18</PRIOR_CLOSE_ASK>

    <PX_YEST_CLOSE>17.47</PX_YEST_CLOSE>

    <CLASSIFICATION>X</CLASSIFICATION>

    </Table></FileData>'

    Drop TABLE #ProcessUploadFileTemp

    CREATE TABLE #ProcessUploadFileTemp (

    RowIdentity int identity(1,1) not null

    ,DataRowID INTEGER

    ,FileClassElementCode VARCHAR(20) NOT NULL

    ,OriginalData VARCHAR(100) NOT NULL)

    INSERT INTO #ProcessUploadFileTemp ([FileClassElementCode],[OriginalData])

    SELECT T.n.value('localname[1]', 'varchar(100)') -- FileClassElementCode

    ,T.n.value('value[1]', 'VARCHAR(100)') -- OriginalData

    --,T.n.value('parent[1]', 'VARCHAR(100)') AS parent

    FROM ( SELECT

    x.query('

    for $node in /descendant::node()[local-name() != "xxx"]

    return <node>

    <namespace>{ namespace-uri($node) }</namespace>

    <localname>{ local-name($node) }</localname>

    <value>{ $node }</value>

    <parent>{ local-name($node/..) }</parent>

    </node>') AS nodes

    FROM @p_FileData.nodes('/FileData') x(x)

    ) q1

    CROSS APPLY q1.nodes.nodes('/node') AS T ( n )

    WHERE T.n.value('parent[1]', 'VARCHAR(100)') = 'Table'

    --SELECT *

    --FROM #ProcessUploadFileTemp

    --Quirky update to modify DataRowId based on Ticker Class Element Code

    update #ProcessUploadFileTemp

    set @p_count = DataRowID = Case when FileClassElementCode = 'TICKER' then

    @p_count + 1

    else @p_count

    end

    OPTION (MAXDOP 1)

    select DataRowID, FileClassElementCode, OriginalData

    from #ProcessUploadFileTemp

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

  • Mike01,

    Thanks for the update and the code looks great, however....I don't actually know what the name of starting element will be! 🙁 We have 15 or so templates, all of which are a subset of a larger field set, and thus I don't know which fields will be included, or whereabouts each element will appear.

    Is there any way I can modify the code to represent this?

    Thanks for the reply though!

    Cheers,

    Kev.

    EDIT: I changed the code like this, and although not particularly elegant, it works!

    DECLARE @TopElementClassCodeVARCHAR(20)

    SELECT @TopElementClassCode = FileClassElementCode

    FROM #ProcessUploadFileTemp

    WHERERowIdentity = 1

    --Quirky update to modify DataRowId based on Ticker Class Element Code

    update #ProcessUploadFileTemp

    set @p_count = DataRowID = Case when FileClassElementCode = @TopElementClassCode then

    @p_count + 1

    else @p_count

    end

    OPTION (MAXDOP 1)

  • How about getting the first record in the temp table since the identity column is being used. Something like this

    declare @TopElementClassCode varchar(30)

    select @TopElementClassCode = FileClassElementCode

    from #ProcessUploadFileTemp

    where RowIdentity = 1

    --Quirky update to modify DataRowId based on Ticker Class Element Code

    update #ProcessUploadFileTemp

    set @p_count = DataRowID = Case when FileClassElementCode = @TopElementClassCode then

    @p_count + 1

    else @p_count

    end

    OPTION (MAXDOP 1)

    select DataRowID, FileClassElementCode, OriginalData

    from #ProcessUploadFileTemp

    For better, quicker answers, click on the following...
    http://www.sqlservercentral.com/articles/Best+Practices/61537/

    For better answers on performance questions, click on the following...
    http://www.sqlservercentral.com/articles/SQLServerCentral/66909/

Viewing 5 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply