SQL Clone
SQLServerCentral is supported by Redgate
 
Log in  ::  Register  ::  Not logged in
 
 
 


A Case For Concatenation (Building Delimited Strings ) In T-SQL


A Case For Concatenation (Building Delimited Strings ) In T-SQL

Author
Message
The Dixie Flatline
The Dixie Flatline
SSCertifiable
SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)

Group: General Forum Members
Points: 5390 Visits: 6900
Thanks Jeff:

I realize that I've got a double thread going now, and I just had a light bulb go on when I looked at elapsed time for writing to disk, as opposed to elapsed time for displaying data.

Essentially, when displaying data I was seeing elapsed times for simple display being double that of concatenation and I was attributing the bottleneck to the SQL Server. That made me want to stick with concatenation because it was taking 11 seconds to run as opposed to 24. But when I wrote the results to disk (on my laptop), I saw elapsed times similar to yours. So I hit the reset button in my brain.

Bob

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
The Dixie Flatline
The Dixie Flatline
SSCertifiable
SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)

Group: General Forum Members
Points: 5390 Visits: 6900
I'm still curious as to why the elapsed times for you (displaying the data) are so dramatically different from the times I get running your code. Any thoughts?

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)

Group: General Forum Members
Points: 85699 Visits: 41089
Bob Hovious (2/25/2009)
I'm still curious as to why the elapsed times for you (displaying the data) are so dramatically different from the times I get running your code. Any thoughts?



I have a 7 year old desktop box with server quality cache (whatever that means... salesmen!) but with a single P4 1.8Ghz CPU, a single gig of ram, and twin IDE harddrives. When I boot it, I've got TempDB setup for 2 gig on the MDF and a half gig on the LDF. Dunno if that's it or not. I've found that some laptops do the damndest things especially when it comes to display and execution times.

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
The Dixie Flatline
The Dixie Flatline
SSCertifiable
SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)

Group: General Forum Members
Points: 5390 Visits: 6900
Gracias.

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
Bruce W Cassidy
Bruce W Cassidy
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1277 Visits: 1033
One of the recent (i.e. in the last ten years) changes in data warehousing has been a move away from specialised ETL tools like Informatica, Data Stage, etc, and towards letting the RDBMS do the work. This approach has been called ELT (extract, load, transform).

For smaller data warehouses, this makes sense. There's no need to support two disctinct platforms for throwing data around. It's all done in the RDBMS.

Oracle have a product that does that (sorry, can't remember the name), and we use a product at work named Wherescape RED that does the same over the top of SQL Server 2005.

So the database does a lot of work in chopping, changing and reformatting data during the staging process. There's no real presentation layer, as the data just gets posted into another database.

That's a very valid reason to be looking at string operations within an RDBMS, from my perspective.

Jeff Moden
Jeff Moden
SSC Guru
SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)SSC Guru (85K reputation)

Group: General Forum Members
Points: 85699 Visits: 41089
Bruce W Cassidy (2/26/2009)
One of the recent (i.e. in the last ten years) changes in data warehousing has been a move away from specialised ETL tools like Informatica, Data Stage, etc, and towards letting the RDBMS do the work. This approach has been called ELT (extract, load, transform).

For smaller data warehouses, this makes sense. There's no need to support two disctinct platforms for throwing data around. It's all done in the RDBMS.

Oracle have a product that does that (sorry, can't remember the name), and we use a product at work named Wherescape RED that does the same over the top of SQL Server 2005.

So the database does a lot of work in chopping, changing and reformatting data during the staging process. There's no real presentation layer, as the data just gets posted into another database.

That's a very valid reason to be looking at string operations within an RDBMS, from my perspective.


Very cool... I've been doing it that way since day 1 with SQL Server... currently working on a project that I've dubbed as "STEPS".... "Simplified Table Extraction, Parsing, and Summarization". It's all being done in T-SQL and it imports files not importable directly, have an unknown number of columns all with unknown names, and some of the columns must be pair with adjacent columns depending on name. Currently, I'm using a 100k Tally table to do the heavy lifting. I just got done testing it... old process with Perl script and some ActiveX took 40 minutes to process two particular files and that was just to get them ready to import... my process actually does everything including the import in 107 seconds and I haven't even tuned the temp tables involved, yet. We import hundreds of such files a day... I'm thinking we'll finally be able to disconnect the garden hose from the server. Wink

My mantra in the project has been, "We don' need no stinkin' DTS/SSIS". Hehe

--Jeff Moden

RBAR is pronounced ree-bar and is a Modenism for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
If you think its expensive to hire a professional to do the job, wait until you hire an amateur. -- Red Adair

Helpful Links:
How to post code problems
How to post performance problems
Forum FAQs
Bruce W Cassidy
Bruce W Cassidy
Ten Centuries
Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)Ten Centuries (1.3K reputation)

Group: General Forum Members
Points: 1277 Visits: 1033
Any chance of you sharing some of the techniques you're using for that? I've been doing similar things, building an ODS for a client company. But I use bcp BigGrin

I have issues with SSIS. Namely, to use it well, you have to be fluent in not one, but three programming languages at once (SQL for your database connections, VB for any plug-in code and C for the inline code). It's incredibly complex, and it's insanely hard to try and modify. I have met several companies who have SSIS packages that everyone flat out refuses to change simply because of the complexity of even just understanding what it does.

Give me well-written straight line code in T-SQL any day. It may not perform quite as well as well-written SSIS, but at least I can easily understand and change it.

GSquared
GSquared
SSC-Insane
SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)

Group: General Forum Members
Points: 23395 Visits: 9730
I've found SSIS is great for files with a relative degree of consistency, and has major problems with files that have been mangled by people who don't have the faintest clue what they're doing. What I've done in those cases is pretty much pull the whole thing into a table with 1 column of nvarchar(max) and an ID and nothing else, then process in T-SQL from there.

Didn't have a prior automated process. Before I took it on, it was manual, and took about an hour per file. I took it down, eventually, to about 10 seconds per file (average file size about 5k rows, up to about 20k for the biggest ones). Numbers table and such were key to the process. I could do even better with 2k5 and some of the XML functions, but this was on 2k and didn't have what I would have used.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
The Dixie Flatline
The Dixie Flatline
SSCertifiable
SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)SSCertifiable (5.4K reputation)

Group: General Forum Members
Points: 5390 Visits: 6900
files that have been mangled by people who don't have the faintest clue what they're doing


I just have to ask, how much of your coding efforts are spent on compensating for the inadequacies of others?

__________________________________________________

Against stupidity the gods themselves contend in vain. -- Friedrich Schiller
Stop, children, what's that sound? Everybody look what's going down. -- Stephen Stills
GSquared
GSquared
SSC-Insane
SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)SSC-Insane (23K reputation)

Group: General Forum Members
Points: 23395 Visits: 9730
Bob Hovious (2/26/2009)
files that have been mangled by people who don't have the faintest clue what they're doing


I just have to ask, how much of your coding efforts are spent on compensating for the inadequacies of others?


Not much. For the most part, a customer not knowing how to format a spreadsheet for import into SQL isn't their problem. That's what I'm being paid for.

- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread

"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
Go


Permissions

You can't post new topics.
You can't post topic replies.
You can't post new polls.
You can't post replies to polls.
You can't edit your own topics.
You can't delete your own topics.
You can't edit other topics.
You can't delete other topics.
You can't edit your own posts.
You can't edit other posts.
You can't delete your own posts.
You can't delete other posts.
You can't post events.
You can't edit your own events.
You can't edit other events.
You can't delete your own events.
You can't delete other events.
You can't send private messages.
You can't send emails.
You can read topics.
You can't vote in polls.
You can't upload attachments.
You can download attachments.
You can't post HTML code.
You can't edit HTML code.
You can't post IFCode.
You can't post JavaScript.
You can post emoticons.
You can't post or upload images.

Select a forum

































































































































































SQLServerCentral


Search