XML Schema Validation with SQLXMLBulkLoad.3.0

  • Hi,

    I have created a DTS package with an Activex Script Task using SQLXMLBulkLoad.3.0, to import XML files into the database. Before executing the bulk load I want to validate the XML files against a Schema file. If there are any errors I want to list all these errors in a seperate file, so that I can inform the respective sender about the problems in the XML file. How do I do this?

    Note: I am using SQLServer 2000 on Windows 2003 server.

    Thanks

  • When you use SQLXMLBulkLoad the XML file will be validated against the schema file, and you can specify an error file that will record any failures including schema problems...

    Here's some code that I've recently implemented within a DTS package - I use global variables to specify things like destination database and the error file name...

    '**********************************************************************

    ' Visual Basic ActiveX Script

    '************************************************************************

    Function Main()

    Dim objSQLXMLBL ' Bulk Load Object.

    Set objSQLXMLBL = CreateObject("SQLXMLBULKLOAD.SQLXMLBulkLoad.3.0")

    ' Because we want control over the transaction, we're using the

    ' ConnectionCommand to get that control.

    Dim objConn ' ADODB.Connection for Bulk Loading

    Dim objCmd ' ADODB.Command for Bulk Loading

    dim strConn

    strConn = "Provider=SQLOLEDB;Server=(local);Database=" & DTSGlobalVariables("gv_Destination_DB").Value & ";Integrated Security=SSPI"

    Set objConn = CreateObject("ADODB.Connection")

    objConn.Open strConn

    Set objCmd = CreateObject("ADODB.Command")

    Set objCmd.ActiveConnection = objConn

    objSQLXMLBL.ConnectionCommand = objCmd ' Let, not Set for Properties

    objSQLXMLBL.Transaction = True ' Required with ConnectionCommand

    objSQLXMLBL.TempFilePath = DTSGlobalVariables("gv_Temp_Folder").Value ' Must have enough space.

    objSQLXMLBL.BulkLoad = True ' Really do it, don't just create the tables.

    ' Create a named XML Error Log file so we may see any messages.

    objSQLXMLBL.ErrorLogFile = DTSGlobalVariables("gv_Error_File_Name").Value

    objSQLXMLBL.ForceTableLock = True ' For better performance.

    objSQLXMLBL.CheckConstraints = True ' Do the foreign key checking

    objSQLXMLBL.KeepIdentity = True ' The default we don't want new values

    objSQLXMLBL.KeepNulls = True ' Don't use column defaults

    objConn.BeginTrans ' Start a transaction

    objSQLXMLBL.Execute DTSGlobalVariables("gv_XSD_File").Value, DTSGlobalVariables("gv_XML_File").Value

    ' Commit only if there are no errors. Otherwise rollback.

    If Err.Number = 0 Then

    objConn.CommitTrans ' Finalize the result

    Set objCmd = Nothing

    objConn.Close ' Close before setting to Nothing

    Set objCmd = Nothing

    Set objSQLXMLBL = Nothing ' free the object.

    Main = DTSTaskExecResult_Success

    Else

    objConn.RollbackTrans ' undo everything in the load

    ' Close and clean up all COM objects.

    Set objCmd = Nothing

    objConn.Close ' Close before setting to Nothing

    Set objCmd = Nothing

    Set objSQLXMLBL = Nothing ' free the object.

    Main = DTSTaskExecResult_Failure

    End If

    End Function

    HTH...

  • this is something I do to get the schema without having to create it myself:

        'first the schema

        oComm.CommandText = "SELECT top 1 * from [owner].

    with (nolock) where 1=2 for xml auto,xmldata,binary base64"

        oComm.Properties("Output Stream") = stOutput

        oComm.Execute , , adExecuteStream

        'make sure the file doesn't exist!

        sXMLSchemaFile = "C:\XML_Schema.xml"

        If fso.FileExists(sXMLSchemaFile) Then fso.DeleteFile sXMLSchemaFile, True

        stOutput.SaveToFile sXMLSchemaFile

    It may work for you depending on your input of course, I thought it was handy to know. 

     

    regards,

    Mark Baekdal

    http://www.dbghost.com

    http://www.innovartis.co.uk

    +44 (0)208 241 1762

    Build, Comparison and Synchronization from Source Control = Database change management for SQL Server

     

     

     

  • Thank you guys for your replies.

    Stewart, I don't think SQLXMLBulkLoad validates the XML file against a schema file. Let me know if I am missing anything.

    Here is my sample XML file:

    <?xml version="1.0" ?>

    <account-house ah-id="389" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Statement.xsd">

     <ah-name>Account House Name</ah-name>

     <statement>

      <start-date>2005-06-01</start-date>

      <end-date>2005-06-30</end-date>

     </statement>

    </account-house>

    and the Schema file:

    <?xml version="1.0" ?>

    <xs:schema id="AccountHouse" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:sql="urn:schemas-microsoft-com:mapping-schema"

     version="1.0">

     <xs:element name="account-house" sql:relation="Table3">

      <xs:complexType>

       <xs:sequence>

        <xs:element name="ah-name" type="xs:string" sql:field="ah_name" />

        <xs:element name="statement" minOccurs="1" maxOccurs="unbounded" sql:is-constant="1">

         <xs:complexType>

          <xs:sequence>

           <xs:element name="start-date" type="xs:date" sql:field="start_date" />

           <xs:element name="end-date" type="xs:date" sql:field="end_date" />

          </xs:sequence>

         </xs:complexType>

        </xs:element>

       </xs:sequence>

       <xs:attribute name="ah-id" type="xs:int" sql:field="ah_id" default="389" />

      </xs:complexType>

     </xs:element>

    </xs:schema>

    The script for this:

    '**********************************************************************

    '  Visual Basic ActiveX Script

    '************************************************************************

    Function Main()

    Dim objXBulkLoad

    Set objXBulkLoad = CreateObject("SQLXMLBulkLoad.SQLXMLBulkLoad.3.0")

    objXBulkLoad.ConnectionString = "PROVIDER=SQLOLEDB.1;SERVER=local;UID=sa;PWD=dev;DATABASE=development;"

    objXBulkLoad.KeepIdentity = False

    'Optional Settings

    objXBulkLoad.ErrorLogFile = "D:\XML\test\Error.LOG"

    objXBulkLoad.TempFilePath = "c:\temp"

    'Executing the bulk-load

    objXBulkLoad.Execute "D:\XML\test1.xsd", "D:\XML\test1.xml"

    Main = DTSTaskExecResult_Success

    End Function

    In this case it is a valid xml file, so no errors, perfect..

    Let's say, we take out the start-date in the xml file.

    i.e.,

    <?xml version="1.0" ?>

    <account-house ah-id="389" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="Statement.xsd">

     <ah-name>Account House Name</ah-name>

     <statement>

      <end-date>2005-06-30</end-date>

     </statement>

    </account-house>

     
    Now again, if we run the job it is success, where as it should have failed b'cos, start-date is missing.
     
    If use other tool to validate the xml it gives the error promptly by returning:
     

    Validation error:

    Account House NameError at (5,4): The element 'statement' has invalid child element 'end-date'. Expected 'start-date'. An error occurred at , (5, 4).2005-06-30 Let me know if I am missing here.Thanks,

  • It looks as if the 'validation' that I have previously seen is probably not schema validation as such, but results when supplied data does not conform to the data-types specified for the mapped database columns.

    I have seen data imports fail where data types do not match or data is too long, but have never tested for missing mandatory data.

    Within the schema you can specify that an element is required (e.g. xsd:attribute name="status" type="xsd:string" sql:field="status" use="required"), but when I tried this, no errors were generated when the data was not present. The database column was populated with null. I even tried changing the column to not allow null, but the import process just populated the column with a space character.

    I’m not sure if some property of the SQLXMLBulkLoad object needs setting to enforce the use="required" for the elements… If there is, I can’t find it – I’ve tried setting KeepNulls=true and CheckConstraints=True, but no luck…

    Sorry I couln't be more help...

  • When I tested "missing mandatory data" earlier, I was using a character field. I've now retried this on a date field and am now generating errors. All I've done is set the mapped database column to not allow nulls, the SQLXMLBulkLoad properties are set as follows : keepnulls=true and CheckConstraints=True.

    The errors generated are not strictly schema validation errors, but are generated by the OLE DB Provider... [Bulk Insert failed. Unexpected NULL value in data file row 1, column 4. Destination column (shipDate) is defined NOT NULL.]

    Hope this helps...

Viewing 6 posts - 1 through 5 (of 5 total)

You must be logged in to reply to this topic. Login to reply