Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase ««123»»

A Case For Concatenation (Building Delimited Strings ) In T-SQL Expand / Collapse
Author
Message
Posted Wednesday, February 25, 2009 9:14 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Sunday, August 17, 2014 3:10 PM
Points: 2,787, Visits: 6,098
Thanks Jeff:

I realize that I've got a double thread going now, and I just had a light bulb go on when I looked at elapsed time for writing to disk, as opposed to elapsed time for displaying data.

Essentially, when displaying data I was seeing elapsed times for simple display being double that of concatenation and I was attributing the bottleneck to the SQL Server. That made me want to stick with concatenation because it was taking 11 seconds to run as opposed to 24. But when I wrote the results to disk (on my laptop), I saw elapsed times similar to yours. So I hit the reset button in my brain.

Bob



__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? -- Stephen Stills
Post #664373
Posted Wednesday, February 25, 2009 11:06 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Sunday, August 17, 2014 3:10 PM
Points: 2,787, Visits: 6,098
I'm still curious as to why the elapsed times for you (displaying the data) are so dramatically different from the times I get running your code. Any thoughts?



__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? -- Stephen Stills
Post #664466
Posted Wednesday, February 25, 2009 8:02 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 12:09 AM
Points: 35,348, Visits: 31,886
Bob Hovious (2/25/2009)
I'm still curious as to why the elapsed times for you (displaying the data) are so dramatically different from the times I get running your code. Any thoughts?



I have a 7 year old desktop box with server quality cache (whatever that means... salesmen!) but with a single P4 1.8Ghz CPU, a single gig of ram, and twin IDE harddrives. When I boot it, I've got TempDB setup for 2 gig on the MDF and a half gig on the LDF. Dunno if that's it or not. I've found that some laptops do the damndest things especially when it comes to display and execution times.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #664733
Posted Thursday, February 26, 2009 7:13 AM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Sunday, August 17, 2014 3:10 PM
Points: 2,787, Visits: 6,098
Gracias.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? -- Stephen Stills
Post #664989
Posted Thursday, February 26, 2009 12:43 PM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Tuesday, July 22, 2014 4:12 PM
Points: 758, Visits: 1,031
One of the recent (i.e. in the last ten years) changes in data warehousing has been a move away from specialised ETL tools like Informatica, Data Stage, etc, and towards letting the RDBMS do the work. This approach has been called ELT (extract, load, transform).

For smaller data warehouses, this makes sense. There's no need to support two disctinct platforms for throwing data around. It's all done in the RDBMS.

Oracle have a product that does that (sorry, can't remember the name), and we use a product at work named Wherescape RED that does the same over the top of SQL Server 2005.

So the database does a lot of work in chopping, changing and reformatting data during the staging process. There's no real presentation layer, as the data just gets posted into another database.

That's a very valid reason to be looking at string operations within an RDBMS, from my perspective.
Post #665246
Posted Thursday, February 26, 2009 1:00 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 12:09 AM
Points: 35,348, Visits: 31,886
Bruce W Cassidy (2/26/2009)
One of the recent (i.e. in the last ten years) changes in data warehousing has been a move away from specialised ETL tools like Informatica, Data Stage, etc, and towards letting the RDBMS do the work. This approach has been called ELT (extract, load, transform).

For smaller data warehouses, this makes sense. There's no need to support two disctinct platforms for throwing data around. It's all done in the RDBMS.

Oracle have a product that does that (sorry, can't remember the name), and we use a product at work named Wherescape RED that does the same over the top of SQL Server 2005.

So the database does a lot of work in chopping, changing and reformatting data during the staging process. There's no real presentation layer, as the data just gets posted into another database.

That's a very valid reason to be looking at string operations within an RDBMS, from my perspective.


Very cool... I've been doing it that way since day 1 with SQL Server... currently working on a project that I've dubbed as "STEPS".... "Simplified Table Extraction, Parsing, and Summarization". It's all being done in T-SQL and it imports files not importable directly, have an unknown number of columns all with unknown names, and some of the columns must be pair with adjacent columns depending on name. Currently, I'm using a 100k Tally table to do the heavy lifting. I just got done testing it... old process with Perl script and some ActiveX took 40 minutes to process two particular files and that was just to get them ready to import... my process actually does everything including the import in 107 seconds and I haven't even tuned the temp tables involved, yet. We import hundreds of such files a day... I'm thinking we'll finally be able to disconnect the garden hose from the server. ;)

My mantra in the project has been, "We don' need no stinkin' DTS/SSIS".


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #665262
Posted Thursday, February 26, 2009 1:11 PM


Right there with Babe

Right there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with BabeRight there with Babe

Group: General Forum Members
Last Login: Tuesday, July 22, 2014 4:12 PM
Points: 758, Visits: 1,031
Any chance of you sharing some of the techniques you're using for that? I've been doing similar things, building an ODS for a client company. But I use bcp :D

I have issues with SSIS. Namely, to use it well, you have to be fluent in not one, but three programming languages at once (SQL for your database connections, VB for any plug-in code and C for the inline code). It's incredibly complex, and it's insanely hard to try and modify. I have met several companies who have SSIS packages that everyone flat out refuses to change simply because of the complexity of even just understanding what it does.

Give me well-written straight line code in T-SQL any day. It may not perform quite as well as well-written SSIS, but at least I can easily understand and change it.
Post #665276
Posted Thursday, February 26, 2009 1:13 PM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 13,872, Visits: 9,596
I've found SSIS is great for files with a relative degree of consistency, and has major problems with files that have been mangled by people who don't have the faintest clue what they're doing. What I've done in those cases is pretty much pull the whole thing into a table with 1 column of nvarchar(max) and an ID and nothing else, then process in T-SQL from there.

Didn't have a prior automated process. Before I took it on, it was manual, and took about an hour per file. I took it down, eventually, to about 10 seconds per file (average file size about 5k rows, up to about 20k for the biggest ones). Numbers table and such were key to the process. I could do even better with 2k5 and some of the XML functions, but this was on 2k and didn't have what I would have used.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #665277
Posted Thursday, February 26, 2009 4:47 PM


SSCrazy

SSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazySSCrazy

Group: General Forum Members
Last Login: Sunday, August 17, 2014 3:10 PM
Points: 2,787, Visits: 6,098
files that have been mangled by people who don't have the faintest clue what they're doing


I just have to ask, how much of your coding efforts are spent on compensating for the inadequacies of others?




__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? -- Stephen Stills
Post #665394
Posted Friday, February 27, 2009 8:01 AM


SSChampion

SSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampionSSChampion

Group: General Forum Members
Last Login: Friday, June 27, 2014 12:43 PM
Points: 13,872, Visits: 9,596
Bob Hovious (2/26/2009)
files that have been mangled by people who don't have the faintest clue what they're doing


I just have to ask, how much of your coding efforts are spent on compensating for the inadequacies of others?


Not much. For the most part, a customer not knowing how to format a spreadsheet for import into SQL isn't their problem. That's what I'm being paid for.


- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Post #665753
« Prev Topic | Next Topic »

Add to briefcase ««123»»

Permissions Expand / Collapse