Backup Directly to a GCS Bucket in SQL Server 2022

  • Comments posted to this topic are about the item Backup Directly to a GCS Bucket in SQL Server 2022

  • Short question: WHY should I back up directly from my local server to cloud storage?

    Wouldn't it be much better / faster to create the backup locally and copy it to the storage when done?

    Google (or any other network storage) may not always be available because of strange network problems, which can be very annoying when it happens at 95% of a 1 TB backup. And of course everything that goes over the wire is slower than local, so you are blocking at least the transaction log (much) longer than if you did your backup locally first.
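    If you go the local-first route, a minimal T-SQL sketch could look like the following; the database name MyDb and the path D:\SQLBackup are placeholders, and the copy to the remote storage would be handled by a separate job afterwards:

      -- Local-first backup: write a compressed, checksummed backup to a local drive.
      -- MyDb and D:\SQLBackup are hypothetical names, not taken from the article.
      BACKUP DATABASE MyDb
      TO DISK = N'D:\SQLBackup\MyDb_FULL.bak'
      WITH COMPRESSION,   -- smaller file to copy off the server later
           CHECKSUM,      -- validate page checksums while writing
           STATS = 5;     -- progress message every 5 percent

      -- Optional sanity check before the file is copied / pulled to the other storage
      RESTORE VERIFYONLY
      FROM DISK = N'D:\SQLBackup\MyDb_FULL.bak'
      WITH CHECKSUM;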

    The same applies when you need to restore your database - local is much faster, and a network problem will not cancel your restore after 6 h at 95%.

    PS: of course you should / need to copy your local backups automatically to another server / storage / cloud / whatever, otherwise you'll have a problem if your server catches fire, gets infected by some ransomware etc. Ideally, though, your local server would not copy the stuff to the other server / cloud; instead the other server / cloud would pull the backups from your local server. This way you need no write permission on the other storage, which helps e.g. against ransomware or junior DBAs.

    God is real, unless declared integer.

  • Let me add that there is a limit on the size of a backup that can be stored the way described in the article. There are two limits:

    1. As per MS doc, "S3-compatible object storage provider must split files in multiple blocks called parts", "each part size ranges from 5 MB to 20 MB, this range is controlled by the T-SQL BACKUP command through the parameter MAXTRANSFERSIZE". The maximum supported size of a single file is the result of 10'000 parts * MAXTRANSFERSIZE. So the maximum size for a single backup file in this approach is 10'000 parts * 20 MB = 200'000 MB.
    2. As per another MS doc, "up to 64 backup devices might be specified" in the BACKUP command. So the maximum size for a backup into a GCS bucket is 200'000 MB * 64 = 12'800'000 MB, or about 12.2 TB (the sketch after this list shows both MAXTRANSFERSIZE and striping over multiple URLs).
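    To make the two knobs concrete, here is a hedged sketch of a striped backup straight to a GCS bucket over the S3-compatible endpoint; the bucket name sqlbackups-demo, the HMAC keys and the database name MyDb are placeholders, not values from the article:

      -- Credential whose name matches the S3 URL prefix of the bucket.
      -- The HMAC access key / secret are GCS interoperability keys (placeholders here).
      CREATE CREDENTIAL [s3://storage.googleapis.com/sqlbackups-demo]
      WITH IDENTITY = 'S3 Access Key',
           SECRET   = '<access_key_id>:<secret_access_key>';

      -- Striping over two URLs (up to 64 are allowed) and raising MAXTRANSFERSIZE
      -- to its 20 MB maximum pushes the per-file cap to 10'000 parts * 20 MB.
      BACKUP DATABASE MyDb
      TO URL = 's3://storage.googleapis.com/sqlbackups-demo/MyDb_stripe1.bak',
         URL = 's3://storage.googleapis.com/sqlbackups-demo/MyDb_stripe2.bak'
      WITH MAXTRANSFERSIZE = 20971520,  -- 20 MB parts
           COMPRESSION,
           CHECKSUM,
           FORMAT,
           STATS = 10;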

    Addressing the legit question "WHY" from the previous comment, I would say that for non-critical environments it's really tempting to back up directly to the bucket. The approach with an intermediate backup to a local drive and a copy to the bucket afterwards can take double the time and resources (and remember, the local drive costs extra money).

     

  • Regarding the costs of local storage:

    On "my" SQL Server we bought four 16 TB SSDs last year (there were no more free internal slots) that can store the ~30 TB of backups we create every week (~25 TB of that is the weekly full backup). The drives are not in a RAID; I split our backups into 4 files (per database) and direct each file to a different drive (see the sketch below).
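    For anyone who hasn't striped a backup across drives before, that 4-file layout looks roughly like this; the drive letters and the database name MyDb are placeholders, not my actual setup:

      -- One stripe per SSD; SQL Server writes to all four files in parallel.
      -- Drive letters E: to H: and MyDb are hypothetical.
      BACKUP DATABASE MyDb
      TO DISK = N'E:\Backup\MyDb_1of4.bak',
         DISK = N'F:\Backup\MyDb_2of4.bak',
         DISK = N'G:\Backup\MyDb_3of4.bak',
         DISK = N'H:\Backup\MyDb_4of4.bak'
      WITH COMPRESSION, CHECKSUM, STATS = 10;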

    On another server I have an asynchronous job that searches every x minutes for new backups on my SQL Server and copies them (pull, not push; this way only read permissions between the two servers are required, which makes it less likely that an attacker can encrypt the backups on the other server).

    It cost us about 5,000 USD (= almost nothing), including a spare disk that I can swap in in case of a failure. Smaller drives are even cheaper in USD per TB.

    PS: Before this I created the backups on the locally attached Dell PowerStore system (but it slowly ran out of space, so we moved the backups off it). Side effect: the duration of the 25 TB full backups went down from 7 to 5 hours; writing 25 TB of already compressed and encrypted backup files while at the same time reading 60 TB of databases seems not to be the kind of work the PowerStore is designed for (it really likes tons of smaller reads / writes, as are typical for databases).

    PPS: since I do SAN snapshots on the PowerStore and copy the SQL backup files almost immediately to another server, I can risk a disk failure (= loss of all local backups) on the local SSDs. But if I had had more free SSD slots, slightly fewer backups, or the option of bigger SSDs, I would have used RAID 10 instead.

    God is real, unless declared integer.
