switching address with Subtring & len funtion

  • Eugene Elutin (11/21/2012)


    Mark-101232 (11/21/2012)


    ...

    Another tweak helps a bit (maybe... assuming I've got this right)

    Change

    ,1, '.' + LEFT(a.email, CHARINDEX('.', a.email)-1) + '@')

    to

    ,0, '.' + LEFT(a.email, CHARINDEX('.', a.email)-1))

    Nope, it doesn't do much for performance on 1000,000 rows. May be if you have 100,000,000 of emails, then one less string concatenation would be visible...

    It actually does help but, like you said, you won't see much here because of the low usage. The biggest problem there was with the old DelimitedSplit8K function (for example) was the fact that it did do such concatenations. When you started talking about hundreds of elements to be split out per row, it got real expensive right around the 4k character mark.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I have read about "What is in your CLR" just now and noticed the below questions.

    The above Q&A is part of a teaser for the PASS conference but it does bring up a couple of questions of my own...

    1. What's in YOUR CLR? In other words, what have you written a CLR for? What did it do? If you don't use CLR's, why not?

    2. If you have written CLR's, why did you write them (or it)? Was it because you didn't know how to do item 1 above in T-SQL, it couldn't be done in T-SQL, or because it was more performant as a CLR or something else?

    3. Looking back at it, was it an appropriate thing to do?

    4. Even if you haven't written a CLR, what would you consider a CLR to be appropriate for? Please be a bit specific if you can and if you have the time. Saying something like "math intensive tasks" or "string manipulation" tasks is what most people say but there's a lot of those things that can easily be done in T-SQL.

    For this Friday poll, what would you consider a task that should done in a CLR instead of T-SQL and why?

    EE,

    Can you clarify some of the questions ? since you used CLR to resolve this problem, I am just asking this question.

    To answer this question

    If you don't use CLR's, why not?

    Poor knowledge on CLR.

    No idea about when to use / not to use.

    How to get rid out of this? Read abour CLR on BOL. This could be answer from the expeerts. Apart from that

    it couldn't be done in T-SQL

    ...If so, why should i learn CLR?

    The real question is

    No idea about when to use / not to use

    For a give problem, Again No idea which one should be used i.e T-SQL or CLR. On what basis this decision/estimation will be taken?

    karthik

  • dwain.c (11/21/2012)


    Jeff Moden (11/21/2012)


    I don't believe that COLLATE will help a STUFF (but I haven't tested it). Logically speaking, it should only help when string comparisons are being made.

    I didn't think so either. 🙂 That wasn't what I meant by impact.

    It does seem to be helping the CHARINDEX in this case though, but sometimes I've found no effect.

    These result I got when I remove COLLATE:

    Using SUBSTRING and CHARINDEX

    1953

    Using PARSENAME

    1936

    Using PARSENAME with CROSS APPLY

    1923

    Using STUFF (Dwain) no check for a@b.com and empty

    1046

    Using STUFF (Dwain) with check for a@b.com and empty

    2016

    Using CLR

    360

    It's dramatic difference: 5 to 7 times slower than with binary collation.

    At least that what I can see on my server (default collation Latin1_General_CI_AS)

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • ...

    I should also say, that while I commend you on your ability to quickly build working CLRs, in my opinion if you've got a reasonably close pure SQL solution it's probably better to simply go with it. Less in your CLR library to maintain. And in this particular case, the problem seems so specific I'd be surprised if the need ever came up again in the same application - it looks like a data migration (or student problem) kind of thing to me. To me "common library routines" should consist of a set of utility-oriented functions that are reusable many times under many different circumstances, rather than for one-off cases.

    Hard to disagree on the above!

    I guess my CLR function is just a good indicator of the performance difference between string operations in T-SQL and managed code, especially as my version of CLR does exactly the same as T-SQL version with CHARINDEX and SUBSTRING's.

    While the OP question is most likely an one-off data-fix (for which, I guess any T-SQL solution will do, even the most slowest one), in case where you need to do complex string manipulations on a regular bases, you should remember about what CLR option can give you in terms of performance.

    And just a bit about CLR maintainability: maintainability concern is regularly raised when there is a talk about CLR. However, it should be admitted, that most of the time this concern is raised by pure T-SQL developers/DBA's. Personally, I don't take it at all! (There many more things in SQL Server database, which require regular maintenance and attention). I guess it's more about change/release. However, when a proper change release processes are in place, having CLR's doesn't create any maintenance hassles.

    It's a great feature of SQL Server. Since SQL2005, together with some another features (eg. partitioning, SSAS and some other), ability of writing managed code for SQL Server makes SQL Server an enterprise level RDBMS product. Before that, not many serious players would use it instead of ORACLE for enterprise level systems.

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • karthik M (11/21/2012)


    ...

    ...If so, why should i learn CLR?

    ...

    No, you shouldn't. It's purely up to you what you would like to learn.

    For example, I used to work in team of around 20 great SQL programmers and no one except me used to write CLR. And of course the rest of team didn't even want to hear about CLR's, until rewriting their UDF's in C# gave almost 5 fold boost to the ETL processes which used them. Saying that, none wanted to learn c# do write them, so they were comfortable with me doing that 🙂

    You may give this task to .NET developer if you have one in your team.

    Actually, badly written CLR may not give any benefits, but I guess that could be said for any technology including T-SQL, haven't you ever seen a code that uses cursors for something which can be easily done in set-based operation?

    karthik M (11/21/2012)


    The real question is

    No idea about when to use / not to use

    For a give problem, Again No idea which one should be used i.e T-SQL or CLR. On what basis this decision/estimation will be taken?

    I don't think that I will be able to compile the full list of cases, but I'll try to get some, where I would always consider a CLR approach:

    1. Complex and long string transformation/concatenations (eg. string split)

    2. Pattern search and replacement.

    3. Complex formatting

    4. A vague one: something cannot be done in T-SQL

    Usually, you wouldn't write CLR for one-off tasks (as discussed in this post for example).

    Also, we should admit that not every system is so eager for the best possible performance in all cases... 😉

    Actually, I do remember writing aggregate CLR to concatenate a strings from rows, haven't tested it against "FOR XML PATH", may be, that I should try doing at my next "free-time".

    _____________________________________________
    "The only true wisdom is in knowing you know nothing"
    "O skol'ko nam otkrytiy chudnyh prevnosit microsofta duh!":-D
    (So many miracle inventions provided by MS to us...)

    How to post your question to get the best and quick help[/url]

  • Yes, my initial concern wasn't with performance per se but with readability/understandability.

    Since PARSENAME is available, fully tested and documented already, I would use it in preference to hand-coded SUBSTRING, etc.. I didn't believe it would perform noticeably worse than the hand coding.

    Likewise, when I want to remove leading spaces from a string, I use LTRIM() rather than

    SUBSTRING([column], PATINDEX('%[^ ]%', [column] + 'a'), LEN([column]))

    although functionally both will do the same thing.

    SQL DBA,SQL Server MVP(07, 08, 09) A socialist is someone who will give you the shirt off *someone else's* back.

Viewing 6 posts - 31 through 35 (of 35 total)

You must be logged in to reply to this topic. Login to reply