RE: Quickly Copy Data – SQLServerCentral

SSC-Insane

Points: 20039

June 10, 2013 at 11:24 am

To start with, here's an example of a .7z compression command line might be:

7z a -t7z -mx1 -m0=LZMA2 -mmt8 -ssw -bd -y -pPassword -r My7zFile.7z "C:\test\*"

-t7z means use 7z file format (excellent compression at -mx9, and good speed with -mx1)

-mx1 means use very fast compression, while -mx2..8 are intermediate steps, and -mx9 is very tight compression More compression = more RAM required

-mmt8 means 8 threads. -mmt3 means 3 threads, -mmt1 means 1 thread, etc. More threads = more RAM require (at 4, 6, 8, etc., so 3/5/7 are good choices)

-ssw means also try to back up files with share locks, i.e. some types of opened/locked files. Consider carefully about this one.

-bd means no percentage indicator

-y means no prompts

-pPassword means use enryption (for -t7z files, that's AES), and use Password as the password. You could use -psWO4UXZRaykA and have sWO4UXZRaykA as your password.

-r means recursive - take if off if you're specifying files

As another post suggested, you must break each situation down.

First, you have requirements:

A) Maximum timeframe available (starting)

Ai) On the source side

Aii) On the target side

B) Maximum timeframe available (change set)

Bi) On the source side

Bii) On the target side

C) Allowable impacts (slowness, offline, shutdown, etc.)

Ci) On the source side

Cii) On the target side

D) Privacy, confidentiality, and other encryption/security requirements

Di) In some cases, this means encryption

Dii) In others, this means a licensed, bonded courier

Then, you have your situation - note that "compressed" values depend on your particular dataset - you'll have to test them.

1) Starting data size (uncompressed)

1a) Accounting for growth over N years

2) Starting data size (compressed)

2a) Accounting for growth over N years

3) Change set over time size (uncompressed) (minimum, average, maximum)

3a) Accounting for growth over N years

4) Change set over time size (compressed) (minimum, average, maximum)

4a) Accounting for growth over N years

5) Latency for each transfer method (uncompressed)

6) Latency for each transfer method (compressed)

7) Throughput for each transfer method (uncompressed)

8) Throughput for each transfer method (compressed)

9) Restartability

9a) Impact of restartability if not required, and if required

If we look at a few possible transfer options, remembering that the physical drives can be easily set up in parallel if your machines have the ports:

I) Local gigabit

I5) 5ms latency uncompressed

I7) 110MB/s throughput uncompressed

II) 15Mbps internet link

II5) 50ms latency uncompressed

II7) 1.5MB/s throughput uncompressed

III) 4TB USB2.0 drive, local

III5) 9ms latency uncompressed

III7) 30MB/s throughput uncompressed

IV) 4TB USB3.0 or eSATA or SATA or SAS drive, local

IV5) 9ms-22hr latency uncompressed

IV7) 110MB/s throughput uncompressed

V) 4TB USB3.0 or eSATA or SATA or SAS drive, shipped, 12 hour transit time

V5) 12-34 hour latency uncompressed (0-11 hours to fill, 12 hours to ship, 0-11 hours to read)

V7) 0-32MB/s throughput uncompressed (0-11 hours to fill, 12 hours to ship, 0-11 hours to read)

VI) 10.3TB Fusion ioDrive Octal, local

VI5) negligible latency uncompressed

VI7) 3.9GB/s write, 6.7GB/s read

VII) 10.3TB Fusion ioDrive Octal, shipped, 12 hour transit time

VII5) 0-12 hr latency uncompressed

VII7) 0-215MB/s throughput uncompressed (.75 hrs to fill, 12 hours to ship, .5 hours to read)

For transfer over networks, we can use copy, robocopy, Windows GUI, etc.

For compression, I'm going to join in with the 7-zip crowd, and specify a few things:

DO use 7z format, and LZMA2 mode.

You do NOT have to compress, copy, uncompress. Instead, you can compress locally, and then uncompress from the remote machine, so only the compressed data goes over the network! Alternately, you can compress _to_ the remote machine.

Note that I ran a quick, fake data test - a 768MB files of all 0x00 values (trivially compressible)

Straight Windows copy to a machine several hundred miles away, 250KB/s.

7-Zip compression via GUI, .7z, LZMA2, 64KB dictionary size, Solid, 273 byte word size, Fastest, directly to the machine (\\remote\share\test.7z), 20MB/s BEFORE decompression.

7-Zip compression via GUI, .7z, LZMA2, 64KB dictionary size, Solid, 273 byte word size, Fastest, directly to local and then the .7z file copied via Windows GUI, 96MB/s BEFORE decompression.