xml

  • hi,

    i have an xml file contains title and abstract which is stored s in tables...now i want to index an abstract...how do i do it?kindly help me..

  • Hi and welcome to the forum. Could you do us a favor and post a create table script and some sample data, makes it much easier to answer your question.

    😎

  • CREATE DATABASE pap17

    GO

    USE pap17

    GO

    CREATE TABLE paptable17

    (

    Id INT IDENTITY PRIMARY KEY,

    XMLData XML,

    LoadedDateTime DATETIME

    )

    INSERT INTO paptable17(XMLData, LoadedDateTime)

    SELECT CONVERT(XML, BulkColumn) AS BulkColumn, GETDATE()

    FROM OPENROWSET(BULK 'D:\test.xml', SINGLE_BLOB) AS x;

    SELECT * FROM paptable17

    while clicking test.xml the following code appears...

    <ROOT>

    <Customers>

    <Customer CustomerID="C001" CustomerName="Arshad Ali">

    <Orders>

    < Order OrderID="Over-expression of miR-146a in basal-like breast cancer cells confers enhanced tumorigenic potential in association with altered p53 status" OrderDate="2012-07-04T00:00:00">

    <Abstract>The tumor suppressor p53 is the most frequently mutated gene in human cancers.

    </Abstract>

    </Order >

    </Orders>

    </Customer>

    </Customers>

    </ROOT>

    now i want to index an abstract....help me...

  • Quick questions,

    1. What Version and Edition of SQL Server are you on (SELECT @@VERSION) ?

    2. Do you have a XSD (XML Schema Definition) for the XML date?

    Here is a quick sample, only a skeleton as I'll need the answers to the questions in order to advice further.

    USE tempdb;

    GO

    DECLARE @TXML XML = N'<ROOT>

    <Customers>

    <Customer CustomerID="C001" CustomerName="Arshad Ali">

    <Orders>

    <Order OrderID="Over-expression of miR-146a in basal-like breast cancer cells confers enhanced tumorigenic potential in association with altered p53 status" OrderDate="2012-07-04T00:00:00">

    <Abstract>The tumor suppressor p53 is the most frequently mutated gene in human cancers.

    </Abstract>

    </Order >

    </Orders>

    </Customer>

    </Customers>

    </ROOT>

    ';

    CREATE TABLE dbo.paptable17

    (

    Id INT IDENTITY(1,1) PRIMARY KEY,

    XMLData XML,

    LoadedDateTime DATETIME DEFAULT(GETDATE())

    )

    INSERT INTO dbo.paptable17 (XMLData)

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML;

    GO

    /* Create primary xml index

    */

    CREATE PRIMARY XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] ON [dbo].[paptable17]

    (

    [XMLData]

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    GO

    /* Secondary Index, DON'T run this one yet

    CREATE XML INDEX [SCNDXMLIDX_DBO_PAPTABLE7_XMLDATA_ABSTRACT] ON [dbo].[paptable17]

    (

    [XMLData]

    )

    USING XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] FOR VALUE WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    */

    SELECT

    TX.Id

    ,TX.LoadedDateTime

    ,CUSTOMER.DATA.value('@CustomerID' ,'VARCHAR(10)' ) AS CustomerID

    ,CUSTORDER.DATA.value('@OrderDate' ,'DATETIME' ) AS OrderDate

    ,ABSTRACT.DATA.value('.[1]' ,'NVARCHAR(1000)') AS Abstract

    FROM dbo.paptable17 TX

    OUTER APPLY TX.XMLData.nodes('ROOT/Customers/Customer') AS CUSTOMER(DATA)

    OUTER APPLY CUSTOMER.DATA.nodes('Orders/Order') AS CUSTORDER(DATA)

    OUTER APPLY CUSTORDER.DATA.nodes('Abstract') AS ABSTRACT(DATA)

    WHERE ABSTRACT.DATA.value('.[1]' ,'NVARCHAR(1000)') LIKE '%suppressor%'

  • Bunch of thanks for your fast reply....

    1.2008 r2

    2.don hav XSD file..

    suppose if we index an Xml data....how do we can view those indexes????

  • sakthikarajen (9/4/2014)


    Bunch of thanks for your fast reply....

    1.2008 r2

    2.don hav XSD file..

    suppose if we index an Xml data....how do we can view those indexes????

    There are few options and which is best depends on the usage. An alternative to an XML index is a computed column, searching or fetching a single element text or attribute value is many times quicker than parsing, even with an XML index in place (in general, read almost and depends).

    😎

    Here is a sample of most options you have

    USE tempdb;

    GO

    DECLARE @TXML XML = N'<ROOT>

    <Customers>

    <Customer CustomerID="C001" CustomerName="Arshad Ali">

    <Orders>

    <Order OrderID="Over-expression of miR-146a in basal-like breast cancer cells." OrderDate="2012-07-04T00:00:00">

    <Abstract>The tumor suppressor p53 is the most frequently mutated gene in human cancers.

    </Abstract>

    </Order >

    </Orders>

    </Customer>

    </Customers>

    </ROOT>

    ';

    /* Create the table with an XML column*/

    CREATE TABLE dbo.paptable17

    (

    Id INT IDENTITY(1,1) PRIMARY KEY,

    XMLData XML,

    LoadedDateTime DATETIME DEFAULT(GETDATE())

    );

    /* Insert some sample data */

    INSERT INTO dbo.paptable17 (XMLData)

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML;

    GO

    /* Create primary xml index

    */

    CREATE PRIMARY XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] ON [dbo].[paptable17]

    (

    [XMLData]

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    /* Secondary Index FOR VALUE

    */

    CREATE XML INDEX [SCNDXMLIDX_DBO_PAPTABLE7_XMLDATA_ABSTRACT_VALUE] ON [dbo].[paptable17]

    (

    [XMLData]

    )

    USING XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] FOR VALUE WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    /* Secondary Index FOR PATH

    */

    CREATE XML INDEX [SCNDXMLIDX_DBO_PAPTABLE7_XMLDATA_ABSTRACT_PATH] ON [dbo].[paptable17]

    (

    [XMLData]

    )

    USING XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] FOR PATH WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    /* Secondary Index FOR PROPERTY

    */

    CREATE XML INDEX [SCNDXMLIDX_DBO_PAPTABLE7_XMLDATA_ABSTRACT_PROPERTY] ON [dbo].[paptable17]

    (

    [XMLData]

    )

    USING XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] FOR PROPERTY WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    /* Create a user defined function which returns the

    Absract column for a given Id

    */

    CREATE FUNCTION dbo.STVFN_GET_PAPTABLE7_ABSTRACT

    (

    @Id INT

    ) RETURNS NVARCHAR(250)

    WITH SCHEMABINDING

    AS

    BEGIN

    RETURN (

    SELECT

    ABSTRACT.DATA.value('.[1]' ,'NVARCHAR(500)') AS Abstract

    FROM dbo.paptable17 TX

    OUTER APPLY TX.XMLData.nodes('ROOT/Customers/Customer/Orders/Order/Abstract') ABSTRACT(DATA)

    WHERE TX.Id = @Id

    )

    END

    GO

    /*

    Add a computed column using the user defined function

    */

    ALTER TABLE dbo.paptable17 ADD

    Abstract AS ([dbo].[STVFN_GET_PAPTABLE7_ABSTRACT]([Id])); -- Cannot be PERSISTED;

    GO

    /* view the xml indexes */

    SELECT * FROM sys.xml_indexes;

    /* Cleanup

    ALTER TABLE dbo.paptable17 DROP COLUMN Abstract;

    DROP FUNCTION dbo.STVFN_GET_PAPTABLE7_ABSTRACT;

    DROP TABLE dbo.paptable17;

    */

  • thank u...you helped me a lot..:-)

    one more query....

    now in case i want to update my xmlfile i.e i want to add an <result> after <abstract>...how do i perform?

    if i want to index result also along with abstract...how can it be done?

  • sakthikarajen (9/7/2014)


    thank u...you helped me a lot..:-)

    one more query....

    now in case i want to update my xmlfile i.e i want to add an <result> after <abstract>...how do i perform?

    if i want to index result also along with abstract...how can it be done?

    You'll do it exactly the same way as before, here is an extended code sample

    😎

    USE tempdb;

    GO

    DECLARE @TXML XML = N'<ROOT>

    <Customers>

    <Customer CustomerID="C001" CustomerName="Arshad Ali">

    <Orders>

    <Order OrderID="Over-expression of miR-146a in basal-like breast cancer cells." OrderDate="2012-07-04T00:00:00">

    <Abstract>The tumor suppressor p53 is the most frequently mutated gene in human cancers.

    </Abstract>

    <!-- new node Result -->

    <Result>The result goes here</Result>

    </Order >

    </Orders>

    </Customer>

    </Customers>

    </ROOT>

    ';

    /* Create the table with an XML column*/

    CREATE TABLE dbo.paptable17

    (

    Id INT IDENTITY(1,1) PRIMARY KEY,

    XMLData XML,

    LoadedDateTime DATETIME DEFAULT(GETDATE())

    );

    /* Insert some sample data */

    INSERT INTO dbo.paptable17 (XMLData)

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML UNION ALL

    SELECT @TXML;

    GO

    /* Create primary xml index

    */

    CREATE PRIMARY XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] ON [dbo].[paptable17]

    (

    [XMLData]

    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    /* Secondary Index FOR VALUE

    */

    CREATE XML INDEX [SCNDXMLIDX_DBO_PAPTABLE7_XMLDATA_ABSTRACT_VALUE] ON [dbo].[paptable17]

    (

    [XMLData]

    )

    USING XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] FOR VALUE WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    /* Secondary Index FOR PATH

    */

    CREATE XML INDEX [SCNDXMLIDX_DBO_PAPTABLE7_XMLDATA_ABSTRACT_PATH] ON [dbo].[paptable17]

    (

    [XMLData]

    )

    USING XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] FOR PATH WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    /* Secondary Index FOR PROPERTY

    */

    CREATE XML INDEX [SCNDXMLIDX_DBO_PAPTABLE7_XMLDATA_ABSTRACT_PROPERTY] ON [dbo].[paptable17]

    (

    [XMLData]

    )

    USING XML INDEX [PRXMLIDX_DBO_PAPTABLE7_XMLDATA] FOR PROPERTY WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON)

    GO

    /* Create a user defined function which returns the

    Absract column for a given Id

    */

    CREATE FUNCTION dbo.STVFN_GET_PAPTABLE7_ABSTRACT

    (

    @Id INT

    ) RETURNS NVARCHAR(250)

    WITH SCHEMABINDING

    AS

    BEGIN

    RETURN (

    SELECT

    ABSTRACT.DATA.value('.[1]' ,'NVARCHAR(500)') AS Abstract

    FROM dbo.paptable17 TX

    OUTER APPLY TX.XMLData.nodes('ROOT/Customers/Customer/Orders/Order/Abstract') ABSTRACT(DATA)

    WHERE TX.Id = @Id

    )

    END

    GO

    /* Create a user defined function which returns the

    Reusult column for a given Id

    */

    CREATE FUNCTION dbo.STVFN_GET_PAPTABLE7_RESULT

    (

    @Id INT

    ) RETURNS NVARCHAR(250)

    WITH SCHEMABINDING

    AS

    BEGIN

    RETURN (

    SELECT

    ABSTRACT.DATA.value('.[1]' ,'NVARCHAR(500)') AS Abstract

    FROM dbo.paptable17 TX

    OUTER APPLY TX.XMLData.nodes('ROOT/Customers/Customer/Orders/Order/Result') ABSTRACT(DATA)

    WHERE TX.Id = @Id

    )

    END

    GO

    /*

    Add a computed column using the user defined function

    */

    ALTER TABLE dbo.paptable17 ADD

    Abstract AS ([dbo].[STVFN_GET_PAPTABLE7_ABSTRACT]([Id])); -- Cannot be PERSISTED;

    GO

    /*

    Add a computed column using the user defined function

    */

    ALTER TABLE dbo.paptable17 ADD

    Result AS ([dbo].[STVFN_GET_PAPTABLE7_RESULT]([Id])); -- Cannot be PERSISTED;

    GO

    /* view the xml indexes */

    SELECT * FROM sys.xml_indexes;

    /* Cleanup

    ALTER TABLE dbo.paptable17 DROP COLUMN Abstract;

    ALTER TABLE dbo.paptable17 DROP COLUMN Result;

    DROP FUNCTION dbo.STVFN_GET_PAPTABLE7_ABSTRACT;

    DROP FUNCTION dbo.STVFN_GET_PAPTABLE7_RESULT;

    DROP TABLE dbo.paptable17;

    */

  • whether it is optable for very large xml file?

  • sakthikarajen (9/7/2014)


    whether it is optable for very large xml file?

    Depends on how large, system specs, tempdb configuration etc. A general answer given that it is an untyped XML would be to skip the indexes and use only the calculated columns if sizes are too great for the system to handle, something you'll have to try out.

    Quick question, what edition of SQL Server (Standard, Enterprise etc.) do you have?

    😎

  • sqlserver 2008 r2 enterprise edition

    MICROSOFT SQL SERVER COMPACT 3.5 WITH SERVICE PACK 2

  • sql server enterprise edition 2008 r2

    MICROSOFT SQL SERVER COMPACT 3.5 WITH SERVICE PACK 2

  • sakthikarajen (9/7/2014)


    sql server enterprise edition 2008 r2

    Some options on the Enterprise edition, in case of a very large set, an option would be to export the content of the attribute columns to a table with data compression and index compression.

    MICROSOFT SQL SERVER COMPACT 3.5 WITH SERVICE PACK 2

    Not much you can do here apart of what we have already gone through.

    To improve performance, my advice is to assert whether it is possible to either obtain or create an XSD in order to use the benefits of a typed XML .

    😎

  • Found an amazing website http://techgurulab.com/study-materials having combination of Quizzes, Test, Tutorials and Study Materials.

Viewing 14 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply