Why XML Datatype present in SQL Server.

  • Hi all experts,

    Every now and then i am learning something new in SQL Server. Latest in that series is XML Datatype. While learning XML datatype i came to know that they are used to stored the XML Document, but being a newbie its hard for me to relate this to a real life scenario where we would be using it. Like when we have to stored number its int, for text its nvarchar, we do have datatype for storing the image also. But really can't get storing XML Document! :w00t:

    Could you guys please help on this.

  • probably this would help:

    http://msdn.microsoft.com/en-us/library/bb522493(SQL.105).aspx

    _____________
    Code for TallyGenerator

  • Shadab Shah (5/7/2013)


    Hi all experts,

    Every now and then i am learning something new in SQL Server. Latest in that series is XML Datatype. While learning XML datatype i came to know that they are used to stored the XML Document, but being a newbie its hard for me to relate this to a real life scenario where we would be using it. Like when we have to stored number its int, for text its nvarchar, we do have datatype for storing the image also. But really can't get storing XML Document! :w00t:

    Could you guys please help on this.

    "Measure all that is measurable and attempt to make measurable that which is not yet so.

    -- Galileo (1564 to 1642)"

    The XML datatype makes a bunch of unmeasurable stuff measurable because it allows you to store and read XML. That may not sound like a big deal until you look under the covers you will discover that almost everything is stored as XML. Thanks to the XML datatype Word Documents*, Excel Files*, web pages (you are looking at XML right now)**, SQL traces, SSRS Report files (RDL), SSIS packages, RSS feeds, whatever.config, etc, etc, etc, etc, etc can be read, manipulated and created using T-SQL.

    For fun, here's a query that will read data from a table and return an MS Word Document:

    First let's setup the sample data:

    IF OBJECT_ID('tempdb..#testdata') IS NOT NULL

    DROP TABLE #testdata;

    CREATE TABLE #testdata (id int primary key, txt varchar(100) NOT NULL);

    INSERT INTO #testdata

    SELECT 1,'Paragraph 1...' UNION ALL

    SELECT 2,'Paragraph 2...' UNION ALL

    SELECT 3,'Paragraph 3...';

    Query that turns that data into a Word Doc:

    DECLARE @pars AS varchar(1000)='',

    @wordDoc varchar(1000)='<!--?-->

    <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

    <w:body><!--x--></w:body>

    </w:wordDocument>';

    WITH data(d) AS

    (SELECT '<w:p><w:r><w:t>'+CAST(id AS char(1))+': '+txt+'</w:t></w:r></w:p>'+CHAR(13)

    FROM #testdata)

    SELECT @pars=@pars+d+CHAR(13)

    FROM data;

    SELECT REPLACE(REPLACE( @wordDoc,'<!--?-->',

    '<?xml version="1.0" encoding="ISO-8859-1"?>'+CHAR(13)+

    '<?mso-application progid="Word.Document"?>'),

    '<!--x-->',CHAR(13)+@pars);

    Results:

    <?xml version="1.0" encoding="ISO-8859-1"?>

    <?mso-application progid="Word.Document"?>

    <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

    <w:body>

    <w:p><w:r><w:t>1: Paragraph 1...</w:t></w:r></w:p>

    <w:p><w:r><w:t>2: Paragraph 2...</w:t></w:r></w:p>

    <w:p><w:r><w:t>3: Paragraph 3...</w:t></w:r></w:p>

    </w:body>

    </w:wordDocument>

    If you copy/paste the result set into notepad and save it as, say, wordDoc.xml you will see this (if you have MS office installed):

    When you open it will look like this:

    You could save it as a docx file or you could automate the whole thing.

    * Find a .docx (Word) or .xlsx (Excel) file > change the extension to .zip and open it up :w00t:

    ** HTML 4.01 and higher must be well-formed XML

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

  • Alan above walked you through XML in general, but... um... hm. A few comments (not on Alan's work, that was pretty thorough).

    for text its nvarchar

    No. For text it's VARCHAR(), unless you happen to really need asian characters. Otherwise you're doubling your storage space for no reason.

    INT is only one form of number. Usually you end up with DECIMAL for real values, INTs for IDs.

    The datatype image (and text, for that matter) are going obsolete. You want the VARCHAR(MAX) and VARBINARY(MAX) datatypes now. It'll save you a headache later, and they're more versatile to boot.

    However, even though Alan mentioned all the places where XML can come from, in general XML comes from a bare few places in your DB career in my particular experience.

    - Your front end folks insist on storing it with some other data.

    - You're passing table data back and forth across a linked server trying to reduce round trips.

    - Service Broker

    There are of course exceptions to every rule but you typically won't know nor care about the XML datatype until you need it. Similar to the (MAX) datatypes. They're very rarely used, if at all, in most systems.

    EDIT: Think of it as a left-handed smoke bender. It's a tool you'll almost never use, but when you need it, nothing else will compare.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Alan.B,

    I think you missing the point.

    The question was not about use of XML in real life, but about why we cannot store XML files using good old nvarchar/ntext data type.

    Your example does not show any use of XML data processing.

    Yes, the outcome is an XML file, but you totally operate with varchar values.

    You could read, parse, build XML files in SQL Server 2000, having no XML data in place.

    Out of curiosity I put your script into Query Analyzer and after minor syntax tweaks it worked fine in SQL 2000:

    IF OBJECT_ID('tempdb..#testdata') IS NOT NULL

    DROP TABLE #testdata

    CREATE TABLE #testdata (id int primary key, txt varchar(100) NOT NULL)

    INSERT INTO #testdata

    SELECT 1,'Paragraph 1...' UNION ALL

    SELECT 2,'Paragraph 2...' UNION ALL

    SELECT 3,'Paragraph 3...'

    GO

    DECLARE @pars AS varchar(1000),

    @wordDoc varchar(1000)

    SET @pars=''

    SET @wordDoc ='<!--?-->

    <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

    <w:body><!--x--></w:body>

    </w:wordDocument>'

    SELECT @pars=@pars+d+CHAR(13)

    FROM (

    SELECT '<w:p><w:r><w:t>'+CAST(id AS char(1))+': '+txt+'</w:t></w:r></w:p>'+CHAR(13)

    FROM #testdata) data(d)

    SELECT REPLACE(REPLACE( @wordDoc,'<!--?-->',

    '<?xml version="1.0" encoding="ISO-8859-1"?>'+CHAR(13)+

    '<?mso-application progid="Word.Document"?>'),

    '<!--x-->',CHAR(13)+@pars)

    You see? You do not need XML data type for this at all.

    _____________
    Code for TallyGenerator

  • ...

    The question was not about use of XML in real life, but about why we cannot store XML files using good old nvarchar/ntext data type.

    ...

    Yes, the outcome is an XML file, but you totally operate with varchar values.

    You could read, parse, build XML files in SQL Server 2000, having no XML data in place.

    ...

    Why SQL has INT, SMALLINT etc, while it has DECIMAL which also can be used to stored integer values?

    Why we need VARCHAR, CHAR and NCHAR if NVARCHAR will be capable of storing any character values?

    I guess, it's because some more restricted datatype provide benefits of being more tuned for the relevant purpose. In case of XML datatype it provides explicit validation that XML is in well-format form. Also, it allows you to enforce that your XML is compliant to the required schema (Typed XML).

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Eugene Elutin (5/9/2013)


    Why SQL has INT, SMALLINT etc, while it has DECIMAL which also can be used to stored integer values?

    I guess because DECIMAL (9,0) (same capacity as INT) takes twice as much storage space and N times more time for calculations, as it requires floating point arythmetics. 🙂

    In case of XML datatype it provides explicit validation that XML is in well-format form. Also, it allows you to enforce that your XML is compliant to the required schema (Typed XML).

    Also it provides "in-XML" indexing (full-text index kinda stuff), selecting subsets from XML data, reformatting XML and lots of other things.

    Which are wery expensive in terms of resourses, very time consuming and, strictly speaking, have nothing to do with relational database.

    But, on another hand, allow easy handling of small XML messages without spending time on developing numerous inerfaces for parsing XMLs and placing data into proper relational structure. Which (adding to the trouble) may change on very short notice. In such situation it's easier to alter the scema assocoated with XML column then redo database structure.

    In this case it makes sense to sacrifice computing resources of the server to gain flexibility.

    _____________
    Code for TallyGenerator

  • Sergiy (5/8/2013)


    Alan.B,

    I think you missing the point.

    The question was not about use of XML in real life, but about why we cannot store XML files using good old nvarchar/ntext data type.

    Your example does not show any use of XML data processing.

    Yes, the outcome is an XML file, but you totally operate with varchar values.

    You could read, parse, build XML files in SQL Server 2000, having no XML data in place.

    Out of curiosity I put your script into Query Analyzer and after minor syntax tweaks it worked fine in SQL 2000:

    IF OBJECT_ID('tempdb..#testdata') IS NOT NULL

    DROP TABLE #testdata

    CREATE TABLE #testdata (id int primary key, txt varchar(100) NOT NULL)

    INSERT INTO #testdata

    SELECT 1,'Paragraph 1...' UNION ALL

    SELECT 2,'Paragraph 2...' UNION ALL

    SELECT 3,'Paragraph 3...'

    GO

    DECLARE @pars AS varchar(1000),

    @wordDoc varchar(1000)

    SET @pars=''

    SET @wordDoc ='<!--?-->

    <w:wordDocument xmlns:w="http://schemas.microsoft.com/office/word/2003/wordml">

    <w:body><!--x--></w:body>

    </w:wordDocument>'

    SELECT @pars=@pars+d+CHAR(13)

    FROM (

    SELECT '<w:p><w:r><w:t>'+CAST(id AS char(1))+': '+txt+'</w:t></w:r></w:p>'+CHAR(13)

    FROM #testdata) data(d)

    SELECT REPLACE(REPLACE( @wordDoc,'<!--?-->',

    '<?xml version="1.0" encoding="ISO-8859-1"?>'+CHAR(13)+

    '<?mso-application progid="Word.Document"?>'),

    '<!--x-->',CHAR(13)+@pars)

    You see? You do not need XML data type for this at all.

    Thanks and sorry for the late reply (been on vacation).

    I think that the link you included above pretty much summed it up. I was just adding my 2 cents on XML, the XML datatype, etc. Perhaps I should have included a couple examples.

    Below are five real-life examples of where you would use the XML datatype.

    /*******************************************************

    (0) Create some sample data

    Note: For simplicity I am only including XML fragments and

    I removed the namespaces, namespace declarations, etc.

    *******************************************************/

    IF OBJECT_ID('tempdb..#data') IS NOT NULL

    DROP TABLE #data;

    CREATE TABLE #data (id int primary key, doc xml not null);

    DECLARE @worddoc xml='

    <wordDocument>

    <body>

    <p><r><t>1: Paragraph 1...</t></r></p>

    <p><r><t>2: Paragraph 2...</t></r></p>

    <p><r><t>3: Paragraph 3...</t></r></p>

    </body>

    </wordDocument>';

    DECLARE @ssrs_report xml=

    REPLACE(CAST(

    '<Report>

    <DataSources>

    <DataSource Name="QuestNTDev">

    <DataSourceReference>Something_Dev</DataSourceReference>

    <rd:SecurityType>None</rd:SecurityType>

    <rd:DataSourceID>54f84859-75fd-45a8-a5b7-5f6fec3ede7b</rd:DataSourceID>

    </DataSource>

    </DataSources>

    <DataSets>

    <DataSet Name="ds_p_major_region"/>

    <DataSet Name="ds_p_country_code"/>

    <DataSet Name="ds_p_acct_group"/>

    <DataSet Name="ds_p_currency_code"/>

    <DataSet Name="ds_p_profit_center"/>

    <DataSet Name="ds_p_office_id"/>

    <DataSet Name="ds_p_total_by_office"/>

    <DataSet Name="ds_p_subtotal_on_summary"/>

    <DataSet Name="ds_get_spid"/>

    <DataSet Name="ds_output"/>

    </DataSets>

    </Report>' AS varchar(1000)),'rd:','');

    DECLARE @events xml='

    <Events>

    <Event>

    <System>

    <Provider Name="MSSQL$SQL2012"/>

    <EventID Qualifiers="16384">17137</EventID>

    <TimeCreated SystemTime="2012-05-10T16:18:53.000000000Z"/>

    <EventRecordID>16563</EventRecordID>

    <Channel>Application</Channel>

    <Computer>SQLSERVER1</Computer>

    </System>

    <EventData>

    <Data>ReportServer$SQL2012TESTTempDB</Data>

    <Binary>F14200000A000000...</Binary>

    </EventData>

    </Event>

    <Event>

    <System>

    <Provider Name="MSSQL$SQL2012"/>

    <EventID Qualifiers="19999">1000225</EventID>

    <TimeCreated SystemTime="2013-05-6T16:18:53.000000000Z"/>

    <EventRecordID>16999</EventRecordID>

    <Channel>Application</Channel>

    <Computer>SQLSERVER1</Computer>

    </System>

    <EventData>

    <Data>blah</Data>

    <Binary>F14200000A000000...</Binary>

    </EventData>

    </Event>

    </Events>';

    DECLARE @ssis_package xml=

    REPLACE(CAST(

    '<DTS:Executable DTS:ExecutableType="SSIS.Package.2">

    <DTS:Property DTS:Name="PackageFormatVersion">2</DTS:Property>

    <DTS:Property DTS:Name="VersionComments"></DTS:Property>

    <DTS:Property DTS:Name="CreatorName">DOMAIN\username</DTS:Property>

    <DTS:Property DTS:Name="CreatorComputerName">AJBTEST</DTS:Property>

    <DTS:Property DTS:Name="CreationDate"

    DTS:DataType="7">11/25/2012 3:35:28 PM</DTS:Property>

    </DTS:Executable>

    ' AS varchar(1000)),'DTS:','');

    DECLARE @sqlTrace xml='

    <TraceData>

    <Events>

    <Event id="65534" name="Trace Start">

    <Column id="14" name="StartTime">2013-05-10T17:21:28.007-05:00</Column>

    </Event>

    <Event id="13" name="SQL:BatchStarting">

    <Column id="1" name="TextData">SELECT 1</Column>

    <Column id="3" name="DatabaseID">7</Column>

    <Column id="35" name="DatabaseName">ajbTest</Column>

    <Column id="12" name="SPID">60</Column>

    <Column id="14" name="StartTime">2013-05-10T17:21:48.917-05:00</Column>

    <Column id="26" name="ServerName">MYSERVER\SQL2012Express</Column>

    </Event>

    <Event id="65533" name="Trace Stop">

    <Column id="14" name="StartTime">2013-05-10T17:21:54.46-05:00</Column>

    </Event>

    </Events>

    </TraceData>'

    INSERT INTO #data

    SELECT 1, @worddoc UNION ALL SELECT 2, @ssrs_report UNION ALL

    SELECT 3, @events UNION ALL SELECT 4, @ssis_package UNION ALL SELECT 5, @sqlTrace;

    /*******************************************************

    (1) Reading from a word doc

    *******************************************************/

    SELECT p.value('(r/t/text())[1]', 'varchar(20)') AS p_num

    FROM #data

    CROSS APPLY doc.nodes('wordDocument/body/p') AS doc(p)

    WHERE id=1;

    /*******************************************************

    (2) Reading from an SSRS Report

    *******************************************************/

    SELECTdatasource.value('(DataSourceReference/text())[1]', 'varchar(20)') AS DataSource,

    dataset.value('(@Name)', 'varchar(20)') AS DataSet

    FROM #data

    CROSS APPLY doc.nodes('Report/DataSources/DataSource') AS datasources(datasource)

    CROSS APPLY doc.nodes('Report/DataSets/DataSet') AS datasets(dataset)

    WHERE id=2;

    /*******************************************************

    (3) Reading from extracted event viewer data

    *******************************************************/

    SELECT[System].value('(Computer/text())[1]', 'varchar(20)') AS SQLBox,

    [System].value('(../EventData/Data/text())[1]', 'varchar(50)') AS DB,

    [System].value('(TimeCreated/@SystemTime)[1]', 'varchar(20)') AS StartupTime

    FROM #data

    CROSS APPLY doc.nodes('Events/Event/System') AS events([System])

    WHERE id=3

    AND [System].value('(EventRecordID/text())[1]', 'int')=16563

    AND [System].value('(Channel/text())[1]', 'varchar(20)')='Application';

    /*******************************************************

    (4) Reading from a SSIS package

    *******************************************************/

    SELECTx.value('(@ExecutableType)[1]', 'varchar(100)') AS PkgType,

    x.value('(Property[@Name="CreatorName"])[1]', 'varchar(100)') AS Creator,

    x.value('(Property[@Name="CreationDate"])[1]', 'varchar(100)') AS CreatedOn,

    x.value('(Property[@Name="CreatorComputerName"])[1]', 'varchar(100)') AS Computer

    FROM #data

    CROSS APPLY doc.nodes('Executable') AS ex(x)

    WHERE id=4;

    /*******************************************************

    (5) Reading from an SQL trace (XML format)

    *******************************************************/

    SELECT e.value('(@name)[1]', 'varchar(100)') AS [event],

    e.value('(Column[@name="StartTime"]/text())[1]', 'varchar(100)') AS StartTime,

    e.value('(Column[@id="1"]/text())[1]', 'varchar(100)') AS Command

    FROM #data

    CROSS APPLY doc.nodes('TraceData/Events/Event') AS events(e)

    WHERE id=5;

    --Cleanup

    DROP TABLE #data;

    "I cant stress enough the importance of switching from a sequential files mindset to set-based thinking. After you make the switch, you can spend your time tuning and optimizing your queries instead of maintaining lengthy, poor-performing code."

    -- Itzik Ben-Gan 2001

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply