What is the best method to upload a large text file (20 million rows) - SFTP?

  • We are running SQL Server 2012 (SP3) on Windows Server 2008 R2 Enterprise. We have an SSIS project that creates 57 nightly text files (.csv). This takes about 7 minutes, and we are okay with the file creation process. The problem appears to be the SFTP process: it takes 3 hours and 20 minutes to upload the files to the vendor. There is one huge file (20 million rows, with row sizes ranging from 75 to 350 bytes), and that file alone took 1 hour 13 minutes to upload.

    How can we decrease the time to upload this huge file? As a proof of concept, we zipped all 57 text files for one year of data: it took 20 minutes to create the zip file and approximately 1 hour to SFTP it.

    Would it be a more efficient method to extract the data that makes up the 57 text files from our main database into a new database (with 57 tables), back that database up, compress and/or zip the backup, and use SFTP to upload that instead?

    Any suggestions and recommendations are appreciated.

  • It sounds like you have a huge bottleneck somewhere. I download compressed 1 GB files of more than 20 million rows, and I'm able to unzip them, re-encode to a new file, and upload without compression in well under an hour.

    The obvious first step is to upload the files in parallel. Instead of uploading them one by one, upload several at the same time. You could also break the huge file into smaller pieces and build something that chunks it out and uploads those pieces in parallel as well (a rough sketch is at the end of this post).

    I would also look at your hardware and network throughput, because it sounds like you are bottlenecked somewhere other than file size. Yes, the files are somewhat large, but not that large.
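
    A minimal sketch of the compress-then-parallel-upload idea, assuming Python 3 with the paramiko library is available on the machine doing the transfer; the host name, credentials, paths, and worker count below are placeholders, not values from this thread:

        import concurrent.futures
        import gzip
        import pathlib
        import shutil

        import paramiko

        # Placeholder connection details and paths -- substitute your own.
        HOST = "sftp.vendor.example.com"
        PORT = 22
        USER = "upload_user"
        PASSWORD = "********"
        LOCAL_DIR = pathlib.Path(r"D:\Exports\Nightly")
        REMOTE_DIR = "/inbound"
        WORKERS = 4  # 4-6 concurrent streams is usually enough to fill a link


        def compress(src: pathlib.Path) -> pathlib.Path:
            """Stream-gzip one CSV so the 20-million-row file never sits in memory."""
            dest = src.with_suffix(src.suffix + ".gz")
            with open(src, "rb") as fin, gzip.open(dest, "wb", compresslevel=6) as fout:
                shutil.copyfileobj(fin, fout)
            return dest


        def upload(local_path: pathlib.Path) -> str:
            """Each worker opens its own SFTP session so transfers run independently."""
            transport = paramiko.Transport((HOST, PORT))
            try:
                transport.connect(username=USER, password=PASSWORD)
                sftp = paramiko.SFTPClient.from_transport(transport)
                sftp.put(str(local_path), f"{REMOTE_DIR}/{local_path.name}")
                return local_path.name
            finally:
                transport.close()


        def main() -> None:
            csv_files = sorted(LOCAL_DIR.glob("*.csv"))
            with concurrent.futures.ThreadPoolExecutor(max_workers=WORKERS) as pool:
                # Compress all exports in parallel, then upload the .gz files in parallel.
                gz_files = list(pool.map(compress, csv_files))
                for name in pool.map(upload, gz_files):
                    print(f"uploaded {name}")


        if __name__ == "__main__":
            main()

    Each worker holds its own SFTP session, so the files move as independent streams; whether compression, more streams, or both helps most depends on where the actual bottleneck is (disk, CPU, or the link to the vendor).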
