Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««123

A Case For Concatenation (Building Delimited Strings ) In T-SQL Expand / Collapse
Author
Message
Posted Friday, February 27, 2009 8:17 AM


SSCarpal Tunnel

SSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal TunnelSSCarpal Tunnel

Group: General Forum Members
Last Login: Yesterday @ 4:24 PM
Points: 4,011, Visits: 6,082
I was referring to inadequacies of alleged IT professionals ;)

My war story is that our development group used to have a guy whose entire job was to import data. I found this out after we went through some rightsizing and he was let go. When I reviewed his code it turned out to be slow AND inaccurate. He was driving it through a VB program with maximum RBAR. We just scrapped it all. Now things he used to run for a day are handled in under a minute, without duplications or omissions. No brilliance required, it was just a sad example of developers (who don't understand database) thinking that knowing how to do an INSERT or an UPDATE is all there is to it.


__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? -- Stephen Stills
Post #665772
Posted Friday, February 27, 2009 6:19 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:06 PM
Points: 36,786, Visits: 31,243
Bruce W Cassidy (2/26/2009)
Any chance of you sharing some of the techniques you're using for that? I've been doing similar things, building an ODS for a client company. But I use bcp :D

I have issues with SSIS. Namely, to use it well, you have to be fluent in not one, but three programming languages at once (SQL for your database connections, VB for any plug-in code and C for the inline code). It's incredibly complex, and it's insanely hard to try and modify. I have met several companies who have SSIS packages that everyone flat out refuses to change simply because of the complexity of even just understanding what it does.

Give me well-written straight line code in T-SQL any day. It may not perform quite as well as well-written SSIS, but at least I can easily understand and change it.


Sure... most of it is just importing whole rows of the file into a 2 column table... one for an IDENTITY and one for the data. Of course, that requires the use of a BCP format file so I can skip over the identity.

Then, I simply do a Tally table split to an NVP (Name/Value Pair) table in order by the IDENTITY column (which gets saved in the NVP) and the order of the Tally table.

The reason I do this is two fold... that allows me to split the data into "columns" in the NVP without knowing how many there are and it also allows me to skip rows with BCP (Bulk Insert, actually). Neither like it very much when there are different numbers of delimiters in the header rows than the body of the file and they won't skip rows based just on the EOL character... the number of delimiters must match in all rows. So, I just treat each row as if it had no delimiters to allow the skip of 20 or 30 header rows.

Note that I do NOT use a splitter function for this... I split the whole table all at once. If you haven't seen that before, take a look at the code in the following article (near the end)...

http://www.sqlservercentral.com/articles/TSQL/62867/

The section that I'm talking about in the article is titled, "One Final "Split" Trick with the Tally Table".


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #666086
Posted Friday, February 27, 2009 6:22 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:06 PM
Points: 36,786, Visits: 31,243
Bob Hovious (2/26/2009)
files that have been mangled by people who don't have the faintest clue what they're doing


I just have to ask, how much of your coding efforts are spent on compensating for the inadequacies of others?




I make about 90-95% of my living at it. The other 5-10% is building new stuff and maintaining servers.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #666087
Posted Friday, February 27, 2009 6:30 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Yesterday @ 8:06 PM
Points: 36,786, Visits: 31,243
Bob Hovious (2/27/2009)
I was referring to inadequacies of alleged IT professionals ;)

My war story is that our development group used to have a guy whose entire job was to import data. I found this out after we went through some rightsizing and he was let go. When I reviewed his code it turned out to be slow AND inaccurate. He was driving it through a VB program with maximum RBAR. We just scrapped it all. Now things he used to run for a day are handled in under a minute, without duplications or omissions. No brilliance required, it was just a sad example of developers (who don't understand database) thinking that knowing how to do an INSERT or an UPDATE is all there is to it.


The really sad part is, some people probably thought he was doing a good job.

And, absolutely spot on... just because someone walked by a book store that sells SQL books, doesn't mean that they even know how to spell it. ;) I pretty much blame management for "bad" SQL Developers because it's those managers that are totally clueless... For example... you've seen the "want ads"... "Wanted, Web developer. Must have x number of years experience with Java/VB/C#/flavor-of-the-month and 1 year of experience with SQL." Such a person (usually, there are those wonderful exceptions) shouldn't even be allowed near the DB except by calling stored procedures that someone else wrote. But, sure enough, they get put into the position of solving ETL and other large batch problems.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #666089
Posted Monday, March 2, 2009 7:07 AM


SSCoach

SSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoachSSCoach

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 15,444, Visits: 9,596
Bob Hovious (2/27/2009)
I was referring to inadequacies of alleged IT professionals ;)

My war story is that our development group used to have a guy whose entire job was to import data. I found this out after we went through some rightsizing and he was let go. When I reviewed his code it turned out to be slow AND inaccurate. He was driving it through a VB program with maximum RBAR. We just scrapped it all. Now things he used to run for a day are handled in under a minute, without duplications or omissions. No brilliance required, it was just a sad example of developers (who don't understand database) thinking that knowing how to do an INSERT or an UPDATE is all there is to it.


Looks like were migrating back in the direction of The Thread. :)

My current job, the prior DBA was okay for some things, but for some aweful reason, he built a whole series of DTS packages that import text files into tables in SQL, and it's all done through VBS. Open file item, set a bunch of variables to the values in the first row, build an SQL string that does an insert statement with those variables, move to the next line in the file, rinse, repeat. Over 7-million times on one of the files. Some of these take as much as six hours to import a few hundred thousand lines of text. No transformations, just move from text to SQL, same columns and everything.

He also wrote this wonderfully complex archive process for another database. The source database is third-party and subject to change without notice. The process queried the schema data for the tables, built a select and insert statement from sys.columns, dynamically, then dumped the data into a holding table using Select...Into..., then finally moved the data into the final archive tables, with the same structure as the primary tables.

One problem was that it didn't compare the table structure and modify the target table, which completely defeated the purpose of the dynamic SQL. Errored out at least once because the target table didn't have a column that the source table did. The other problem was that it's a daily job, and the run-time was up to just over 20 hours, and the data is growing. So I changed it to a simple "Insert ... Select * from ... for XML raw, type" and an insert into a single XML column. No dynamic SQL, no staging tables, but handles schema changes without problem. Took the whole process down to about 3 minutes, and this copy is moving the data across a network to a different database server. Uses about 10% more disk space, and is somewhat more complex to query (XQuery instead of just straight XML), though views help with that, but worth it for not tying up a whole server for 20+ hours per day.

On the other hand, he built a number of reasonably good databases, made sure there was adequate automatic maintenance on them, managed the backups and such, and otherwise did his job. Just those two things were pretty messed up.

They also had a dev here who did a lot of "pay no attention to the man behind the curtain" type support for his applications. In other words, they don't run unless someone holds their hand and issues "back door" commands. Everyone thought he was great, till he left. His name was Dan, and now it's not uncommon to have people refer to broken things as "Dangineered" (pronounced like "engineered", but with his name incorporated).

So, yeah, some time spent on such things. But for me, it's not even close to the majority of my time. Most of it is spent constructively, either on new systems or new features for existing systems.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #666575
« Prev Topic | Next Topic »

Add to briefcase «««123

Permissions Expand / Collapse