Quickly Copy Data

  • EdVassie (6/10/2013)


    There are other ways of tackling this issue, but they do not exist yet for SQL Server.

    Back in the late 1990s, a number of DB2 shops stopped taking full backups of their databases. They used functionality in some third-party backup products called 'offline database merge'.

    The idea of this approach was to take an existing full backup and apply either a differential or a log backup to it, giving you a new full backup. If you applied a log backup, you could choose any point in time in the log that the new full backup should relate to.

    One advantage of this approach is that the merge process could take place away from the database server. If you already had your full backup on an offsite server, you only needed to get your log backup sent there, and you could do the backup merge. The only time this process needed to connect to a database server was to register the new full backup.

    IMHO this is a process that vendors of SQL backup products could add to their product, assuming they can work around any patent issues that may exist. Anyone want to take up this challenge?
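    To make the merge idea concrete, here's a minimal, purely illustrative Python simulation (no real backup format involved; all names are hypothetical). A full backup is modeled as a dict of page contents, and a log backup as an ordered list of timestamped page writes; merging replays log records up to a chosen point in time, producing a new "full backup" without touching the database server.

```python
# Purely illustrative model of an 'offline database merge':
# a full backup is a snapshot of pages; a log backup is an ordered
# list of (timestamp, page_id, new_contents) records.

def merge_full_with_log(full_backup, log_records, up_to_time):
    """Apply log records up to `up_to_time` to a copy of the full
    backup, yielding a new full backup for that point in time.
    Runs anywhere the backup files are - no database server needed."""
    new_full = dict(full_backup)  # work on a copy; original stays intact
    for ts, page_id, contents in log_records:
        if ts > up_to_time:
            break  # log records are ordered by time
        new_full[page_id] = contents
    return new_full

full = {1: "alpha", 2: "bravo"}
log = [(10, 2, "bravo-v2"), (20, 3, "charlie"), (30, 1, "alpha-v2")]

# New full backup as of time 25: includes the first two changes only.
print(merge_full_with_log(full, log, 25))
# {1: 'alpha', 2: 'bravo-v2', 3: 'charlie'}
```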

    Gosh... sounds as simple as "merge replication" or "log shipping".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sounds as simple as "merge replication" or "log shipping".

    More like log shipping than merge replication, certainly in terms of complexity and effort.

    Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017, 2016, 2014, 2012, 2008 R2, 2008 and 2005.

    When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara

  • Jeff Moden (6/6/2013)


    After the systems were down, I made a tape backup and FedEx'd it. We had everything back up and online with nearly half a day to spare.

    Just goes to show that the solution doesn't always involve technology.

    We do have electronic scrum boards, but the old postcard and Blu-Tack method seems to work well.

    Post-It notes for defects are good as well though for some reason I've started to have nightmares about Sesame Street.

  • To start with, here's an example of what a 7z compression command line might look like:

    7z a -t7z -mx1 -m0=LZMA2 -mmt8 -ssw -bd -y -pPassword -r My7zFile.7z "C:\test\*"

    -t7z means use 7z file format (excellent compression at -mx9, and good speed with -mx1)

    -mx1 means use very fast compression, while -mx2..8 are intermediate steps, and -mx9 is very tight compression. More compression = more RAM required

    -mmt8 means 8 threads, -mmt3 means 3 threads, -mmt1 means 1 thread, etc. More threads = more RAM required (usage steps up at 4, 6, 8, etc. threads, so 3/5/7 are good choices)

    -ssw means also try to back up files with share locks, i.e. some types of opened/locked files. Consider this one carefully.

    -bd means no percentage indicator

    -y means no prompts

    -pPassword means use encryption (for -t7z files, that's AES), and use Password as the password. You could use -psWO4UXZRaykA and have sWO4UXZRaykA as your password.

    -r means recursive - take it off if you're specifying individual files
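    If you're scripting this, the switches above can be composed per job. Here's a small, hypothetical Python helper (names are my own) that builds the same command line; it only prints the command rather than executing it, and the password is a placeholder - don't hard-code a real one.

```python
# Sketch: compose the 7z command line above from its parts, so the
# switches can be toggled per job. The command is printed, not run.

def build_7z_command(archive, source, threads=8, level=1,
                     password=None, recursive=True, shared_files=True):
    cmd = ["7z", "a", "-t7z", f"-mx{level}", "-m0=LZMA2",
           f"-mmt{threads}", "-bd", "-y"]
    if shared_files:
        cmd.append("-ssw")     # also try files opened with share locks
    if password:
        cmd.append(f"-p{password}")
    if recursive:
        cmd.append("-r")       # drop this when naming individual files
    cmd += [archive, source]
    return cmd

print(" ".join(build_7z_command("My7zFile.7z", r"C:\test\*",
                                password="Password")))
```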

    As another post suggested, you must break each situation down.

    First, you have requirements:

    A) Maximum timeframe available (starting)

    Ai) On the source side

    Aii) On the target side

    B) Maximum timeframe available (change set)

    Bi) On the source side

    Bii) On the target side

    C) Allowable impacts (slowness, offline, shutdown, etc.)

    Ci) On the source side

    Cii) On the target side

    D) Privacy, confidentiality, and other encryption/security requirements

    Di) In some cases, this means encryption

    Dii) In others, this means a licensed, bonded courier

    Then, you have your situation - note that "compressed" values depend on your particular dataset - you'll have to test them.

    1) Starting data size (uncompressed)

    1a) Accounting for growth over N years

    2) Starting data size (compressed)

    2a) Accounting for growth over N years

    3) Change set over time size (uncompressed) (minimum, average, maximum)

    3a) Accounting for growth over N years

    4) Change set over time size (compressed) (minimum, average, maximum)

    4a) Accounting for growth over N years

    5) Latency for each transfer method (uncompressed)

    6) Latency for each transfer method (compressed)

    7) Throughput for each transfer method (uncompressed)

    8) Throughput for each transfer method (compressed)

    9) Restartability

    9a) Impact of restartability if not required, and if required
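    Once you have those numbers, the feasibility check is simple arithmetic: does the data size divided by the method's throughput fit inside the allowed window? A quick sketch, with placeholder numbers:

```python
# Rough feasibility check for the checklist above: given a data size
# (compressed or uncompressed) and a transfer method's throughput,
# does the job fit in the allowed window?

def transfer_hours(size_gb, throughput_mb_s):
    """Wall-clock hours to move size_gb at throughput_mb_s."""
    return (size_gb * 1024) / throughput_mb_s / 3600

def fits_window(size_gb, throughput_mb_s, window_hours):
    return transfer_hours(size_gb, throughput_mb_s) <= window_hours

# 4000 GB over local gigabit (~110 MB/s) in a 12-hour window?
print(round(transfer_hours(4000, 110), 1), "hours")   # ~10.3
print(fits_window(4000, 110, 12))                     # True

# The same 4000 GB over a 15 Mbps link (~1.5 MB/s) takes weeks.
print(round(transfer_hours(4000, 1.5)), "hours")
```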

    If we look at a few possible transfer options, remembering that the physical drives can be easily set up in parallel if your machines have the ports:

    I) Local gigabit

    I5) 5ms latency uncompressed

    I7) 110MB/s throughput uncompressed

    II) 15Mbps internet link

    II5) 50ms latency uncompressed

    II7) 1.5MB/s throughput uncompressed

    III) 4TB USB2.0 drive, local

    III5) 9ms latency uncompressed

    III7) 30MB/s throughput uncompressed

    IV) 4TB USB3.0 or eSATA or SATA or SAS drive, local

    IV5) 9ms-22hr latency uncompressed

    IV7) 110MB/s throughput uncompressed

    V) 4TB USB3.0 or eSATA or SATA or SAS drive, shipped, 12 hour transit time

    V5) 12-34 hour latency uncompressed (0-11 hours to fill, 12 hours to ship, 0-11 hours to read)

    V7) 0-32MB/s throughput uncompressed (0-11 hours to fill, 12 hours to ship, 0-11 hours to read)

    VI) 10.3TB Fusion ioDrive Octal, local

    VI5) negligible latency uncompressed

    VI7) 3.9GB/s write, 6.7GB/s read

    VII) 10.3TB Fusion ioDrive Octal, shipped, 12 hour transit time

    VII5) 0-12 hr latency uncompressed

    VII7) 0-215MB/s throughput uncompressed (.75 hrs to fill, 12 hours to ship, .5 hours to read)
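    The shipped-drive figures above are just 'sneakernet' arithmetic: effective throughput is capacity divided by the total fill + transit + read time. A sketch that reproduces the same ballpark numbers (the function name is my own):

```python
# 'Sneakernet' arithmetic for the shipped-drive options above:
# effective throughput = capacity / (fill + transit + read) time.

def shipped_drive_throughput_mb_s(capacity_tb, write_mb_s, read_mb_s,
                                  transit_hours):
    size_mb = capacity_tb * 1024 * 1024
    fill_s = size_mb / write_mb_s          # time to fill the drive
    read_s = size_mb / read_mb_s           # time to read it back
    total_s = fill_s + transit_hours * 3600 + read_s
    return size_mb / total_s

# Option VII: 10.3TB ioDrive Octal, 3.9GB/s write, 6.7GB/s read, 12h transit
print(round(shipped_drive_throughput_mb_s(10.3, 3900, 6700, 12)))  # ~227

# Option V: 4TB drive at ~110MB/s each way, 12h transit
print(round(shipped_drive_throughput_mb_s(4.0, 110, 110, 12)))     # ~35
```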

    For transfer over networks, we can use copy, robocopy, Windows GUI, etc.

    For compression, I'm going to join in with the 7-zip crowd, and specify a few things:

    DO use 7z format, and LZMA2 mode.

    You do NOT have to compress, copy, uncompress. Instead, you can compress locally, and then uncompress from the remote machine, so only the compressed data goes over the network! Alternately, you can compress _to_ the remote machine.

    Note that I ran a quick, fake-data test - a 768MB file of all 0x00 values (trivially compressible).

    Straight Windows copy to a machine several hundred miles away, 250KB/s.

    7-Zip compression via GUI, .7z, LZMA2, 64KB dictionary size, Solid, 273 byte word size, Fastest, directly to the machine (\\remote\share\test.7z), 20MB/s BEFORE decompression.

    7-Zip compression via GUI, .7z, LZMA2, 64KB dictionary size, Solid, 273 byte word size, Fastest, directly to local and then the .7z file copied via Windows GUI, 96MB/s BEFORE decompression.
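    To put those numbers in perspective: effective throughput is the uncompressed size divided by wall-clock time, so for highly compressible data even a slow link moves "fast" when few bytes hit the wire. Using the measured rates above:

```python
# Effective speedup of the compress-then-transfer path over a plain
# copy, using the rates measured in the 768MB zero-file test above.

def effective_speedup(plain_mb_s, compressed_mb_s):
    """How much faster the compressed path is than the plain copy."""
    return compressed_mb_s / plain_mb_s

# Plain Windows copy: 250KB/s. Compressing straight to the remote
# share: 20MB/s before decompression - an 80x effective speedup.
print(round(effective_speedup(0.25, 20)))  # 80
```

    Remember this file was trivially compressible; real datasets will land well below this, which is why the earlier advice to test against your own data applies.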

  • David.Poole (6/10/2013)


    Post-It notes for defects are good as well though for some reason I've started to have nightmares about Sesame Street.

    Heh... and "Agile Scrums". 🙂 "Big Bird" attends both.

    --Jeff Moden


