SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


SQL-only ETL using a bulk insert into a temporary table (SQL Spackle)


SQL-only ETL using a bulk insert into a temporary table (SQL Spackle)

Author
Message
Stan Kulp-439977
Stan Kulp-439977
Hall of Fame
Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)Hall of Fame (3.9K reputation)

Group: General Forum Members
Points: 3856 Visits: 1159
Comments posted to this topic are about the item SQL-only ETL using a bulk insert into a temporary table (SQL Spackle)
Luis Cazares
Luis Cazares
SSC Guru
SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)

Group: General Forum Members
Points: 65522 Visits: 20259
It's a nice article, short and simple.

However, I would have liked if you specified one of the main problems I've seen with beginners (including me) which is that the file must be on a location available to the server. Most of us have been bitten by this problem using a local address instead of the address the server needs (or putting the file on the server).
Another improvement would have been to include some references for more information such as how to make it dynamic or how to use format files.


Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
Ken Wymore
Ken Wymore
SSCertifiable
SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)SSCertifiable (7.9K reputation)

Group: General Forum Members
Points: 7907 Visits: 2418
Nice simple article. I may have to try this out sometime.
bapings
bapings
SSC-Enthusiastic
SSC-Enthusiastic (160 reputation)SSC-Enthusiastic (160 reputation)SSC-Enthusiastic (160 reputation)SSC-Enthusiastic (160 reputation)SSC-Enthusiastic (160 reputation)SSC-Enthusiastic (160 reputation)SSC-Enthusiastic (160 reputation)SSC-Enthusiastic (160 reputation)

Group: General Forum Members
Points: 160 Visits: 22
Very nice article! Thank you so much!
jcasement
jcasement
SSC Journeyman
SSC Journeyman (96 reputation)SSC Journeyman (96 reputation)SSC Journeyman (96 reputation)SSC Journeyman (96 reputation)SSC Journeyman (96 reputation)SSC Journeyman (96 reputation)SSC Journeyman (96 reputation)SSC Journeyman (96 reputation)

Group: General Forum Members
Points: 96 Visits: 127
Extremely academic -- since as was pointed out, the file must be on a disk local to that instance of the SQL Server. Doesn't happen often and most DBA's would argue against it.

It would be better if you could target local files to a Windows client and bulk those into SQL Server.

A look at SQLXMLBulkLoad may be worth the while since more and more files move from business to business in a XML format and this can run from any Windows client.
snh 51890
snh 51890
Grasshopper
Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)Grasshopper (19 reputation)

Group: General Forum Members
Points: 19 Visits: 5
Nice article. This is a typical operation when the server is in a SOHO type situation or maybe some older XP-era gear that is co-located somewhere. Basically anyplace where the person using SSMS also has access to the C:\ drive of the server.

As my career has progressed and my situations become more corporate, the server administration is heavily managed. So things that were simple like bulk importing a file located on the same server near impossible now.

I've grown fond of using Linked Servers for large CSV files I only need temporarily. Usually this means buying the server admin a beer to enable this feature. The benefits are -


  • Treats the data file as a database

  • Does not have to be imported into existing database which means its a little easier on the transaction logs

  • Can select against it using quad-addressing (server.schema.table.column)

  • When done, merely unlink the file (server)

  • File does not have to be on server

  • Done entirely by script using T-SQL



Sean
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)

Group: General Forum Members
Points: 340920 Visits: 42650
I'm actually very torn on this article...

On the one hand, I agree. As noted by some of the others, the article would certainly have benefitted from the inclusion of some extra information, such as which machine drive-letter reference files must live on or the fact that you can use UNCs instead, etc

On the other hand, this IS a "Spackle" article and, as it says at the beginning of the article, "These short articles are NOT meant to be complete solutions". Is it a 5 Star article in the classic sense? No. But, if you don’t know about BULK INSERT, it IS, most definitely, a MUST READ article because, although it doesn't fill an entire crack in knowledge of T-SQL, it DOES demonstrate the very important and academic concept that the Extract, Translate, and Load process of data doesn't need to be complicated. While that might not provide a benefit to those of us that already know about the likes of BULK INSERT, it does provide the start of an answer to the recent outbreak of posts on this and other sites that can be generically summarized as “How can I import a file into a table”?

What's, unfortunately, not stated in the article and requires someone to actually have enough intellectual curiosity to try the code out and do a little research on their own, is the fact that the code provided imports more than 122 THOUSAND rows and then extracts the required (501) rows in just over a second. And, the author provided the substantial amount of test data to prove it! The article clearly demonstrates that it doesn’t require learning a new language (such as PowerShell), doesn’t require the use of another system/program (such as SSIS or any of a thousand other programs), and doesn’t require much additional knowledge of T-SQL.

In other words, this is a very high altitude and incredibly simple "introduction" to BULK INSERT that says “Look! Importing files can be easy!” and, for those with enough intellectual curiosity to lookup BULK INSERT in Books Online or other web articles, provides a clue to an extremely effective ETL method that they might not have otherwise considered or might not have even been aware of.

With all of that in mind, I say “Well done, Mr. Kulp”. Add a little more detail and advantageous reasons to use a particular feature to future “Spackle” articles and keep ‘em coming!

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Luis Cazares
Luis Cazares
SSC Guru
SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)SSC Guru (65K reputation)

Group: General Forum Members
Points: 65522 Visits: 20259
Just to be clear, the file doesn't need to be on the same server as SQL Server. Typically, the files will be stored in a separate location and referenced by a network address (\\SomeFileServer\Path\file).


Luis C.
General Disclaimer:
Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?


How to post data/code on a forum to get the best help: Option 1 / Option 2
sgross 10581
sgross 10581
SSC Rookie
SSC Rookie (26 reputation)SSC Rookie (26 reputation)SSC Rookie (26 reputation)SSC Rookie (26 reputation)SSC Rookie (26 reputation)SSC Rookie (26 reputation)SSC Rookie (26 reputation)SSC Rookie (26 reputation)

Group: General Forum Members
Points: 26 Visits: 13
How could we extend this code to add the ability to convert datatypes, like parse a datetime.

Also so do an Upsert per row?
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)SSC Guru (340K reputation)

Group: General Forum Members
Points: 340920 Visits: 42650
sgross 10581 (7/23/2014)
How could we extend this code to add the ability to convert datatypes, like parse a datetime.

Also so do an Upsert per row?



Unless you can guarantee that the person(s) or application will always provide a proper date and time, it's best to import the data into a staging table (as was done in the article) as VARCHAR and validate the data using standard techniques for validating such things. Once validated, simply inserting the data from the staging table to the final data table will do many implicit conversions such as converting proper VARCHAR renditions of date and time to an actual DATETIME (for example) data type of the target column in the final table.

As for "Upserts", it should be handled as any other data. You have a source table (the staging table) and a target table (the final table). I'll typically pre-mark each row with "I" (for insert) or "U" (for update) (sometimes, "D" for delete but I normally don't delete data due to audit requirements) using simple joins like you would in any classic upsert, and then do a separate insert and update. You could also use MERGE between the staging and target tables. BULK INSERT is not meant to provide upsert capabilities on its own. It's meant to be relatively simple and very fast... and it is.

If the source of the data is very good, you can, in fact, import into a staging table that has the correct data types. Even if the source is questionable in quality, you can setup BULK INSERT to sequester any bad lines of data in a separate file for future analysis and repair.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search