Is It Time To Clear Out Those Quirky Functions?

  • We got hit by ISNUMERIC early. It will convert text that happens to be valid hexadecimal into a positive hit. So we went to PATINDEX to find values that were numeric. So in the background it does conversion checks. :ermm:

  • Perhaps some kind of deprecation warning (with the ability to locally disable - equivalent of an old C styled # command) would be beneficial for script updates. I completely agree that any kind of deprecation should be 2 stepped just like Java and .NET i.e. first marked as deprecated in a release (with warnings raised) then a standard minimum number of releases occur before removal is even allowed, if at all.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • JustMarie (7/18/2014)


    We got hit by ISNUMERIC early. It will convert text that happens to be valid hexadecimal into a positive hit. So we went to PATINDEX to find values that were numeric. So in the background it does conversion checks. :ermm:

    I know I'm being a bit pedantic here but you didn't "get hit" by ISNUMERIC... you used ISNUMERIC incorrectly. 😉 I'll admit that the MS documentation isn't in-your-face clear about what it doesn't do (it was never meant to be an IsAllDigits function), but it does state exactly what it does do.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I would like to see a (T-)SQL function like normalize-space() from XPath, which I believe is slightly more useful than just TRIM, as it deals with all white space characters including sequences in between other characters. It is particularly useful when combining chunks of text.

    Possibly there are other scalar functions from XPath worth considering as well.

  • Tavis Reddick (8/18/2014)


    I would like to see a (T-)SQL function like normalize-space() from XPath, which I believe is slightly more useful than just TRIM, as it deals with all white space characters including sequences in between other characters. It is particularly useful when combining chunks of text.

    Possibly there are other scalar functions from XPath worth considering as well.

    Understanding the danger of showing my naivety, surely that would be better served by cleaner data. N'est pas?

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Gary Varga (8/18/2014)


    Tavis Reddick (8/18/2014)


    I would like to see a (T-)SQL function like normalize-space() from XPath, which I believe is slightly more useful than just TRIM, as it deals with all white space characters including sequences in between other characters. It is particularly useful when combining chunks of text.

    Possibly there are other scalar functions from XPath worth considering as well.

    Understanding the danger of showing my naivety, surely that would be better served by cleaner data. N'est pas?

    There are three scenarios I am thinking of where normalize-space() might be useful:

    1) where there is no input mask/validation preventing insertion of extra spaces (the simplest example I can think of is UK postcodes);

    2) where white space did not matter in the original text (like text from a HTML page which can have all sorts of white space characters that are typically treated as if normalize-space() was applied);

    3) where you are concatenating strings (particularly from char, nchar fixed columns), some of which can be optional/NULL (you could just add spaces to each and normalize the result without getting duplicate spaces). This is common when you have lots of columns making up a personal name, for example.

    There are some cases where a text editor assumes if you put more than one consecutive space, you really mean it, and it can replace these with no break spaces. Normalizing the text before dumping it in the editor might avoid this problem.

    Of course, you could also use such a function in data cleaning before or after it hits your database.

  • Tavis Reddick (8/18/2014)


    Gary Varga (8/18/2014)


    Tavis Reddick (8/18/2014)


    I would like to see a (T-)SQL function like normalize-space() from XPath, which I believe is slightly more useful than just TRIM, as it deals with all white space characters including sequences in between other characters. It is particularly useful when combining chunks of text.

    Possibly there are other scalar functions from XPath worth considering as well.

    Understanding the danger of showing my naivety, surely that would be better served by cleaner data. N'est pas?

    There are three scenarios I am thinking of where normalize-space() might be useful:

    1) where there is no input mask/validation preventing insertion of extra spaces (the simplest example I can think of is UK postcodes);

    2) where white space did not matter in the original text (like text from a HTML page which can have all sorts of white space characters that are typically treated as if normalize-space() was applied);

    3) where you are concatenating strings (particularly from char, nchar fixed columns), some of which can be optional/NULL (you could just add spaces to each and normalize the result without getting duplicate spaces). This is common when you have lots of columns making up a personal name, for example.

    There are some cases where a text editor assumes if you put more than one consecutive space, you really mean it, and it can replace these with no break spaces. Normalizing the text before dumping it in the editor might avoid this problem.

    Of course, you could also use such a function in data cleaning before or after it hits your database.

    Fair points but I cannot see anything that cannot be resolved by a combination of data cleansing and judicious use of TRIM and ISNULL/COALESCE (although the code might not be very pretty - but when is string building???).

    I am not sure that manipulating free format text on the way out is the best design, however, it might be an important requirement (regulatory compliance, for example?) to capture it "as typed" and store it unmanipulated.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • Gary Varga (8/18/2014)


    Fair points but I cannot see anything that cannot be resolved by a combination of data cleansing and judicious use of TRIM and ISNULL/COALESCE (although the code might not be very pretty - but when is string building???).

    The kind of code I am considering here is typified by a real-life, in-my-organization example of a course title composed of component columns, which may come from fixed width columns (due to contraints on extending the proprietary database):

    SELECT LTRIM(ISNULL(RTRIM(QualName), '') + ' ') + LTRIM(ISNULL(RTRIM(QualLevel), '') + ' ') + LTRIM(ISNULL(RTRIM(SupInfoPrefix), '') + ' ') + LTRIM(ISNULL(RTRIM(SubjectArea), '') + ' ') + ISNULL(RTRIM(SupInfoSuffix), '') AS title,

    which might be compared with an XPath-style function like:

    SELECT normalize-space(string-join((QualName, QualLevel, SupInfoPrefix, SubjectArea, SupInfoSuffix), ' ') AS title,

    where the string-join function is taking a sequence of strings (which might be NULL) and interposing the string (here, a space) from the second argument.

    Basically, it is not just about prettiness but increasing the understandability and editability of the code, and reducing error and debugging time. Certainly, I would regard the second phrasing as more elegant and more closely aligned with the intent of the code author.

  • Tavis Reddick (8/18/2014)


    Gary Varga (8/18/2014)


    Fair points but I cannot see anything that cannot be resolved by a combination of data cleansing and judicious use of TRIM and ISNULL/COALESCE (although the code might not be very pretty - but when is string building???).

    The kind of code I am considering here is typified by a real-life, in-my-organization example of a course title composed of component columns, which may come from fixed width columns (due to contraints on extending the proprietary database):

    SELECT LTRIM(ISNULL(RTRIM(QualName), '') + ' ') + LTRIM(ISNULL(RTRIM(QualLevel), '') + ' ') + LTRIM(ISNULL(RTRIM(SupInfoPrefix), '') + ' ') + LTRIM(ISNULL(RTRIM(SubjectArea), '') + ' ') + ISNULL(RTRIM(SupInfoSuffix), '') AS title,

    which might be compared with an XPath-style function like:

    SELECT normalize-space(string-join((QualName, QualLevel, SupInfoPrefix, SubjectArea, SupInfoSuffix), ' ') AS title,

    where the string-join function is taking a sequence of strings (which might be NULL) and interposing the string (here, a space) from the second argument.

    Basically, it is not just about prettiness but increasing the understandability and editability of the code, and reducing error and debugging time. Certainly, I would regard the second phrasing as more elegant and more closely aligned with the intent of the code author.

    I TOTALLY agree with your aims of producing the maintainability of code but sometimes I wonder if it still would be SQL. Are we trying to change what SQL is for? Should we change how SQL is used? Perhaps and, if so, then this sort of functionality should be built in.

    Gaz

    -- Stop your grinnin' and drop your linen...they're everywhere!!!

  • I'd almost rather MS stopped trying to improve T-SQL because they either leave out critical functionality of new features, deprecate and delete useful stuff, or things run slower than "Black Arts" methods and they also miss some really good stuff.

    For an example of leaving out "critical functionality", look what the first did to OVER... while its first rendition was very useful, indeed, it took them several revs to include Lead/Lag, which most other engines already had. Then, there things like PIVOT. If you've ever worked with Access, you know the PIVOT function there is absolutely incredibly easy to use and useful. The current rendition of PIVOT in SQL Server is almost a cruel joke.

    I know a whole lot of people will disagree but for the "deprecate and delete" example, I found that sp_MakeWebtask was incredibly useful for making DBA reports to email. There are many other examples but that was one of my favorites.

    For things that run "slower", OVER with Lead/Lag is a good example. The "Quirky Update" still whups even the fastest rendition.

    As to missing really good stuff, there's been a CONNECT item open for at least 7 years asking for a "Tally" generator. I forget how many years it's been that a BULK EXPORT request has been open. It would also be nice if they fixed BCP and BULK INSERT so that they're row sensitive and not just delimiter sensitive so that you could actually and easily skip disparate header rows. Adding a "footer skip" would be helpful, as well. Of course, MS says "Use SSIS" instead. Yeah... not me. Not sure it would be able to do those things any better anyway.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 10 posts - 46 through 54 (of 54 total)

You must be logged in to reply to this topic. Login to reply