Zip XML data

  • You're welcome.

    But it's still going to be too wasteful.

    The size of the backups is only part of the issue.

    Ok.

    See it for yourself.

    Go to your local/DEV instance of SQL Server.

    Create 2 empty databases.

    In one of them, create a table with your structure: 3 columns + JSON.

    In the other one, create a table with the same 3 columns, but make the 4th one a varchar(300) to hold the path\name of the JSON files.
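
    A minimal sketch of this setup, assuming pyodbc, Windows authentication, hypothetical database names JsonInDb and JsonOnDisk, and three made-up descriptive columns standing in for your actual ones:

        import pyodbc

        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;Trusted_Connection=yes;",
            autocommit=True)
        cur = conn.cursor()

        cur.execute("CREATE DATABASE JsonInDb")    # DB 1: JSON stored in-row
        cur.execute("CREATE DATABASE JsonOnDisk")  # DB 2: JSON stored as files

        cur.execute("""
            CREATE TABLE JsonInDb.dbo.Messages (
                MessageId   INT IDENTITY PRIMARY KEY,
                ReceivedAt  DATETIME2     NOT NULL,
                MessageType VARCHAR(50)   NOT NULL,
                JsonBody    NVARCHAR(MAX) NOT NULL   -- the JSON itself
            )""")

        cur.execute("""
            CREATE TABLE JsonOnDisk.dbo.Messages (
                MessageId   INT IDENTITY PRIMARY KEY,
                ReceivedAt  DATETIME2    NOT NULL,
                MessageType VARCHAR(50)  NOT NULL,
                JsonPath    VARCHAR(300) NOT NULL    -- path\name of the file
            )""")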

    Extract 1000 (or maybe just 100 for the initial test) JSON files from your actual database and store them locally.
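
    Something along these lines could do the extraction; the source table and column names here are pure assumptions about your schema, as is the local folder:

        import pathlib

        import pyodbc

        out_dir = pathlib.Path(r"C:\JsonTest\messages")
        out_dir.mkdir(parents=True, exist_ok=True)

        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;Trusted_Connection=yes;")
        cur = conn.cursor()
        # YourDb.dbo.YourMessageTable and JsonBody are placeholder names.
        cur.execute("SELECT TOP (1000) MessageId, JsonBody "
                    "FROM YourDb.dbo.YourMessageTable")

        for msg_id, body in cur:
            (out_dir / f"{msg_id}.json").write_text(body, encoding="utf-8")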

    Now, write 2 scripts to emulate the BizTalk behaviour.

    One to populate DB 1 with the JSONs, as your BizTalk does now.
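
    Script 1 might look roughly like this, reusing the folder and table names from the sketches above ("Invoice" is just a placeholder message type):

        import datetime
        import pathlib
        import time

        import pyodbc

        src = pathlib.Path(r"C:\JsonTest\messages")
        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;DATABASE=JsonInDb;Trusted_Connection=yes;")
        cur = conn.cursor()

        t0 = time.perf_counter()
        for f in sorted(src.glob("*.json")):
            # Store the whole JSON body in the NVARCHAR(MAX) column.
            cur.execute(
                "INSERT INTO dbo.Messages (ReceivedAt, MessageType, JsonBody) "
                "VALUES (?, ?, ?)",
                datetime.datetime.now(), "Invoice",
                f.read_text(encoding="utf-8"))
        conn.commit()
        print(f"In-table load: {time.perf_counter() - t0:.2f} s")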

    Another one to copy the JSON files into a folder and save the links in the 4th column of the table.
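
    And script 2, same assumptions, with a hypothetical storage folder:

        import datetime
        import pathlib
        import shutil
        import time

        import pyodbc

        src   = pathlib.Path(r"C:\JsonTest\messages")
        store = pathlib.Path(r"C:\JsonTest\store")
        store.mkdir(parents=True, exist_ok=True)

        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;DATABASE=JsonOnDisk;Trusted_Connection=yes;")
        cur = conn.cursor()

        t0 = time.perf_counter()
        for f in sorted(src.glob("*.json")):
            dest = store / f.name
            shutil.copyfile(f, dest)          # file goes to the folder...
            cur.execute(                      # ...only the link goes to the DB
                "INSERT INTO dbo.Messages (ReceivedAt, MessageType, JsonPath) "
                "VALUES (?, ?, ?)",
                datetime.datetime.now(), "Invoice", str(dest))
        conn.commit()
        print(f"File + link load: {time.perf_counter() - t0:.2f} s")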

    Start your timer and launch the scripts one after another.

    You may wish to monitor performance metrics of the server while the scripts execute, like memory usage, PLE (page life expectancy), etc.
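
    If it helps, PLE can be sampled from the standard performance-counter DMV while the scripts run; a rough polling loop:

        import time

        import pyodbc

        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;Trusted_Connection=yes;")
        cur = conn.cursor()
        for _ in range(60):                   # one sample per second
            cur.execute("""
                SELECT cntr_value
                FROM sys.dm_os_performance_counters
                WHERE counter_name = 'Page life expectancy'
                  AND object_name LIKE '%Buffer Manager%'""")
            print("PLE:", cur.fetchone()[0])
            time.sleep(1)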

    After it's finished, compare the size of database 1 with the total storage taken by database 2 plus the JSON files on disk.
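
    One way to do the comparison, assuming the names used above: sp_spaceused reports each database size, and the folder is just summed up.

        import pathlib

        import pyodbc

        def db_size(db):
            conn = pyodbc.connect(
                "DRIVER={ODBC Driver 17 for SQL Server};"
                f"SERVER=localhost;DATABASE={db};Trusted_Connection=yes;")
            row = conn.cursor().execute("EXEC sp_spaceused").fetchone()
            return row.database_size          # e.g. '120.00 MB'

        folder_mb = sum(
            f.stat().st_size
            for f in pathlib.Path(r"C:\JsonTest\store").glob("*.json")
        ) / 1024 / 1024

        print("Database 1:", db_size("JsonInDb"))
        print("Database 2:", db_size("JsonOnDisk"),
              f"+ {folder_mb:.1f} MB of JSON files")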

    After that you may wish to try zipping the JSON messages.

    See how long it takes to zip the messages stored in the table compared to reading the links from the DB and zipping the files in the folder.
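
    A sketch of timing both, with the standard zipfile module and the names assumed above:

        import pathlib
        import time
        import zipfile

        import pyodbc

        def timed_zip(zip_path, items):       # items: iterable of (name, data)
            t0 = time.perf_counter()
            with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as z:
                for name, data in items:
                    z.writestr(name, data)
            return time.perf_counter() - t0

        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;Trusted_Connection=yes;")
        cur = conn.cursor()

        # Variant 1: the message bodies come out of the table.
        cur.execute("SELECT MessageId, JsonBody FROM JsonInDb.dbo.Messages")
        t_table = timed_zip(r"C:\JsonTest\from_table.zip",
                            ((f"{i}.json", b) for i, b in cur))

        # Variant 2: only the links come out of the DB; bodies come from disk.
        cur.execute("SELECT JsonPath FROM JsonOnDisk.dbo.Messages")
        t_files = timed_zip(r"C:\JsonTest\from_files.zip",
                            ((pathlib.Path(p).name,
                              pathlib.Path(p).read_bytes()) for (p,) in cur))

        print(f"zip from table: {t_table:.2f} s, "
              f"zip from files: {t_files:.2f} s")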

    At the end, imitate a cleanup process.

    Delete the half of the JSONs which are "expired" from both storages. In the second storage, you're going to need to delete both the links and the "expired" files they point to.
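
    A cleanup sketch, approximating the "expired" half as everything up to the midpoint MessageId:

        import os

        import pyodbc

        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;Trusted_Connection=yes;",
            autocommit=True)
        cur = conn.cursor()

        # Storage 1: a single DELETE removes the JSON along with the rows.
        cur.execute("""
            DELETE FROM JsonInDb.dbo.Messages
            WHERE MessageId <=
                  (SELECT AVG(MessageId) FROM JsonInDb.dbo.Messages)""")

        # Storage 2: collect the expired links first, delete the rows,
        # then remove the files the links point to.
        cur.execute("""
            SELECT JsonPath FROM JsonOnDisk.dbo.Messages
            WHERE MessageId <=
                  (SELECT AVG(MessageId) FROM JsonOnDisk.dbo.Messages)""")
        expired = [p for (p,) in cur.fetchall()]
        cur.execute("""
            DELETE FROM JsonOnDisk.dbo.Messages
            WHERE MessageId <=
                  (SELECT AVG(MessageId) FROM JsonOnDisk.dbo.Messages)""")
        for p in expired:
            os.remove(p)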

    You must already have most of the infrastructure ready, so putting together such a test should not take long.

    But it should give you plenty of points to discuss with your BizTalk team.

    _____________
    Code for TallyGenerator

  • Sergiy - Friday, January 19, 2018 6:12 AM

    One column value is JSON; there are 4 other columns. The JSON value is stored as nvarchar(max). I will convert the whole data for a day into 1 XML record and am planning to zip it.

    Ok! Let's see.
    You take copies of outgoing messages generated by BizTalk and save them into a table (what's wrong with the file system?).
    Then you collect all the saved messages for a day and merge them into a huge XML. That CPU-intensive process requires all the inbound JSON messages and the outbound XML message to be placed in SQL Server memory, not to mention that they'll have to go through the buffer pool, which will make SQL Server resources unavailable for other, actual data manipulation tasks.
    And at the end of it you want to dump the bulky output file by zipping it (Lynn mentioned that zip is quite useless at compressing XMLs) into an archive on some remote location (returning to the roots: apparently there is nothing wrong with file system storage after all), so you can never see it again.

    And all of that is instead of having a log table with a description of the JSON messages (those 3 columns you mentioned) and PK link(s) to the data which went into each of them, so any message can be recreated when requested.
    It should not exceed 50 bytes per record.
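
    For comparison, such a log table might be sketched like this; the column names, sizes, and byte counts are illustrative only:

        import pyodbc

        conn = pyodbc.connect(
            "DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=localhost;DATABASE=JsonOnDisk;Trusted_Connection=yes;",
            autocommit=True)
        conn.cursor().execute("""
            CREATE TABLE dbo.MessageLog (
                MessageId   INT IDENTITY PRIMARY KEY,  -- 4 bytes
                SentAt      DATETIME2(0) NOT NULL,     -- 6 bytes
                MessageType VARCHAR(20)  NOT NULL,     -- up to 22 bytes
                SourceRowId INT          NOT NULL      -- 4 bytes: PK of the
            )""")                                      -- data that built it
        # Roughly 40 bytes per row including overhead: well under 50.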

    Look, if you have so much spare computing power at your disposal, why not consider other, more productive uses of it?
    They say bitcoin mining is quite profitable.

    hahahahaha....."was" quite profitable.....

    -------------------------------------------------------------------------------------------------------------------------------------
    Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses
