Comparing Different Data Types

  • Ed Pollack

    Hall of Fame

    Points: 3107

    Comments posted to this topic are about the item Comparing Different Data Types

  • davoscollective

    SSCertifiable

    Points: 6325

    Good article, and very important.

    I noticed some queries only today being executed by very experienced analysts doing awful non-sargable sins like joining a date column to a datetime column, by converting the datetime to a varchar. In the same query they were casting the date column to a varchar in the where clause and comparing it to a string literal.

    I thought I was being nice in my pointers on how to improve the query & link to article on sargable queries, but the actuary in question didn't appreciate my advice, stating they were just rerunning another actuary's query. The fact that the query went from 6 minutes to 2 seconds with those very basic changes was beside the point.

    On the other hand this person can probably do things with R and SAS that would explode my head, but then, I'm not trying to do those things on their test machine!

  • Ed Pollack

    Hall of Fame

    Points: 3107

    davoscollective (5/19/2014)


    Good article, and very important.

    I noticed some queries only today being executed by very experienced analysts doing awful non-sargable sins like joining a date column to a datetime column, by converting the datetime to a varchar. In the same query they were casting the date column to a varchar in the where clause and comparing it to a string literal.

    I thought I was being nice in my pointers on how to improve the query & link to article on sargable queries, but the actuary in question didn't appreciate my advice, stating they were just rerunning another actuary's query. The fact that the query went from 6 minutes to 2 seconds with those very basic changes was beside the point.

    On the other hand this person can probably do things with R and SAS that would explode my head, but then, I'm not trying to do those things on their test machine!

    It definitely is tough to sell some people on optimization, but if you go armed with lots of facts (CPU, memory, reads, writes, duration, etc...), it becomes a bit easier to make your case, even if others may not like feeling corrected.

  • Jeff Moden

    SSC Guru

    Points: 994279

    Short, sweet, and to the point. This would be a great intro into what ORMs can do to do you if you don't pay attention. Nice article on this subject, Ed.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column.
    "If you think its expensive to hire a professional to do the job, wait until you hire an amateur."--Red Adair
    "Change is inevitable... change for the better is not."
    When you put the right degree of spin on it, the number 3|8 is also a glyph that describes the nature of a DBAs job. 😉

    Helpful Links:
    How to post code problems
    Create a Tally Function (fnTally)

  • Dave Vroman

    SSC Eights!

    Points: 821

    Great article. I do have a very small point in that all phone numbers can't be BIGINT for the reason that there are some international numbers which require the operator intervention and therefore require a leading zero.

  • Ed Pollack

    Hall of Fame

    Points: 3107

    Dave Vroman (5/19/2014)


    Great article. I do have a very small point in that all phone numbers can't be BIGINT for the reason that there are some international numbers which require the operator intervention and therefore require a leading zero.

    That is a good point---and I generally advocate for international numbers where symbols, extra digits, or formatting may be needed to use a "format type" along with your number. This additional piece of data would represent any additional instructions on how the number is formatted, if there are special dialing codes, etc...It could be presented in many ways, but is a way to avoid the inevitable usage of VARCHAR(alot) for phone numbers, credit card numbers, soc#, etc...

  • curious_sqldba

    SSC-Dedicated

    Points: 36266

    Ed Pollack (5/18/2014)


    Comments posted to this topic are about the item <A HREF="/articles/Data+Types/109214/">Comparing Different Data Types</A>

    Is there any benefit using CAST Vs Convert to explicitly convert the data types , also which one's should we be converting, is the one on right of operator?

  • Ed Pollack

    Hall of Fame

    Points: 3107

    curious_sqldba (5/21/2014)


    Ed Pollack (5/18/2014)


    Comments posted to this topic are about the item <A HREF="/articles/Data+Types/109214/">Comparing Different Data Types</A>

    Is there any benefit using CAST Vs Convert to explicitly convert the data types , also which one's should we be converting, is the one on right of operator?

    In terms of CAST vs. CONVERT, they will functionally do the same thing. The primary difference between the two is that CONVERT can be used for a variety of formatting functions. For example, you can convert a DATETIME to a 10 character VARCHAR using CONVERT like this:

    CONVERT(VARCHAR, MyDate, 10)

    Microsoft could easily bore you to death with their long explanation of CAST & CONVERT:

    http://msdn.microsoft.com/en-us/library/ms187928.aspx

    CAST is simpler, though, and I tend to prefer using it if I have no additional formatting needs.

    When choosing an operator to convert, your goal should be to always convert the scalar variable and leave the column from the table clean. If you are forced to convert a column, such as in the example from the article:

    SELECT

    EMPLOYEE_DATA.*

    FROM EmployeeData EMPLOYEE_DATA

    INNER JOIN HumanResources.Employee EMPLOYEE

    ON EMPLOYEE.NationalIDNumber = CAST(EMPLOYEE_DATA.NationalIDNumber AS NVARCHAR(15))

    WHERE EMPLOYEE_DATA.NationalIDNumber = 658797903

    I would honestly consider altering your database design or finding a way around this. How the query performs will be based somewhat on statistics and the optimizer, but you'll often suffer a table scan or index scan, rather than a seek. In this case, I'd definitely recommend experimentation and try out both to confirm if either performs better.

    More than anything else, look for any way to avoid having to do such an ugly join in the first place. Even if they perform acceptably now, there's no guarantee they will in the future as your row counts grow.

Viewing 8 posts - 1 through 8 (of 8 total)

You must be logged in to reply to this topic. Login to reply