CASE TRY_CONVERT causes error message

  • Hello All,

My first statement works but performs poorly. Subsequent statements generate errors.

    Any insight into why the error occurs? Any alternate syntaxes that I could try on my live datasets?

    DROP TABLE IF EXISTS #Test
    CREATE TABLE #Test (TestData VARCHAR(800),TestDataType VARCHAR(10))
    INSERT INTO #Test VALUES ('1','Integer')
    INSERT INTO #Test VALUES ('A','Integer')
    INSERT INTO #Test VALUES ('1.1','Decimal')
    INSERT INTO #Test VALUES ('1A','Decimal')
    INSERT INTO #Test VALUES ('2022-01-01','Date')
    INSERT INTO #Test VALUES ('1','Date')

    --This works but performs poorly with larger data sets
    SELECT * FROM #Test
    WHERE
    (TestDataType = 'Integer' AND TRY_CONVERT(INT,TestData) IS NULL) OR
    (TestDataType = 'Decimal' AND TRY_CONVERT(DECIMAL,TestData) IS NULL) OR
    (TestDataType = 'Date' AND TRY_CONVERT(DATE,TestData) IS NULL)

    --This does not work; I hoped it might perform better
    --SELECT * FROM (
    SELECT
    CASE TestDataType
    WHEN 'Integer' THEN TRY_CONVERT(INT,TestData)
    WHEN 'Decimal' THEN TRY_CONVERT(DECIMAL,TestData)
    WHEN 'Date' THEN TRY_CONVERT(DATE,TestData)
    END AS TryConvertedValue,
    *
    FROM #Test
    --) T WHERE TryConvertedValue IS NULL

    SELECT
    CASE
    WHEN TestDataType = 'Integer' THEN TRY_CONVERT(INT,TestData)
    WHEN TestDataType = 'Decimal' THEN TRY_CONVERT(DECIMAL,TestData)
    WHEN TestDataType = 'Date' THEN TRY_CONVERT(DATE,TestData)
    END AS TryConvertedValue,
    *
FROM #Test

The reason it produces an error is that a CASE expression must resolve to a single data type, which SQL Server picks by data type precedence across all of the THEN branches.  DATE has a higher precedence than DECIMAL or INT, so the resulting column is typed as DATE.  The INT and DECIMAL branches then require an implicit conversion to DATE, and since decimal cannot be implicitly converted to date, you get the operand type clash.

The only way to do exactly what your code intends is to convert the result to the SQL_VARIANT data type, and I don't wish SQL_VARIANT on anyone, with the possible exception of using it in a "By-Column" Audit Table, which I've done in the past and still hate. 😀
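For illustration, a minimal sketch of that SQL_VARIANT approach against the posted #Test table might look like this (sketch only, not a recommendation):

```sql
-- Casting every branch to SQL_VARIANT gives the CASE a single result type,
-- so DATE, DECIMAL, and INT values can share one output column.
SELECT
    CASE TestDataType
        WHEN 'Integer' THEN CAST(TRY_CONVERT(INT,     TestData) AS SQL_VARIANT)
        WHEN 'Decimal' THEN CAST(TRY_CONVERT(DECIMAL, TestData) AS SQL_VARIANT)
        WHEN 'Date'    THEN CAST(TRY_CONVERT(DATE,    TestData) AS SQL_VARIANT)
    END AS TryConvertedValue,
    *
FROM #Test
```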

    You've posted code according to WHAT you're trying to do... If you tell us WHY you're trying to do it, perhaps we could suggest a decent or better alternative.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Chrissy321 wrote:

    Hello All,

    My first statement works but performs poorly. Subsequent statements generate errors.

    Any insight into why the error? Any alternate syntaxes that might work that I can try on my live datasets?

    Quick questions:

    1.  What are you trying to achieve?
    2. What is the error?
    3. The structure of the actual dataset?
    4. Sample data that conforms with the actual dataset?

    😎


Thank you both for replying. I apologize for my delayed response; I thought I would get an email when follow-ups were posted.

I am trying to validate data during an ETL process. The column can contain text, dates, or integers (classic EAV form). I do have metadata that defines each column as Integer, Decimal, or Date. I am trying to use TRY_CONVERT to identify records to delete. My delete statement would then use the OUTPUT clause to log the deleted data so there would be a record of what was thrown away. The end goal is to remove and log data that would break the ETL process.
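As a rough sketch of that delete-and-log step (the dbo.RejectedRows log table and its columns are hypothetical placeholders):

```sql
-- Hypothetical log table; adjust columns to whatever needs to be kept.
-- CREATE TABLE dbo.RejectedRows
--     (TestData VARCHAR(800), TestDataType VARCHAR(10), DeletedAt DATETIME2);

-- Delete the unconvertible rows and capture them via OUTPUT in one statement.
DELETE T
OUTPUT DELETED.TestData, DELETED.TestDataType, SYSDATETIME()
    INTO dbo.RejectedRows (TestData, TestDataType, DeletedAt)
FROM #Test AS T
WHERE (T.TestDataType = 'Integer' AND TRY_CONVERT(INT,     T.TestData) IS NULL)
   OR (T.TestDataType = 'Decimal' AND TRY_CONVERT(DECIMAL, T.TestData) IS NULL)
   OR (T.TestDataType = 'Date'    AND TRY_CONVERT(DATE,    T.TestData) IS NULL)
```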

What are you trying to achieve? - See above.

    What is the error? - "Operand type clash: decimal is incompatible with date"

    The structure of the actual dataset? - Very similar to what I posted; the column data is VARCHAR(800).

    Sample data that conforms with the actual dataset? - Very similar to what I posted. I haven't seen any bad data yet, so I am just trying to be proactive. I can't provide exact DDL and data due to security concerns with my employer. I know this is not ideal, but I do attempt to accurately represent the problem.

    This article was helpful in getting better performance out of my actual data.

    https://www.brentozar.com/archive/2020/03/making-try_cast-and-try_convert-queries-faster-with-indexed-computed-columns/

In my record set of 600,000 records, adding the computed columns reduced query time from 90 seconds to 3 seconds. Interestingly, creating an index containing the computed columns did not seem to have any impact; the performance improvement occurs with just the creation of the computed columns.
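In case it helps anyone, the change boiled down to something like this against the #Test shape I posted earlier (column names here are illustrative):

```sql
-- Add non-persisted computed columns wrapping TRY_CONVERT, per the article.
-- In my tests, the columns alone improved the query even without an index.
ALTER TABLE #Test
    ADD TestDataTryInt     AS TRY_CONVERT(INT,     TestData),
        TestDataTryDecimal AS TRY_CONVERT(DECIMAL, TestData),
        TestDataTryDate    AS TRY_CONVERT(DATE,    TestData)
```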

Again, apologies for my late response and my inability to provide actual DDL, data, and execution plans. Hopefully my response is interesting or could help someone in the future.


  • Chrissy321 wrote:

Thank you both for replying. I apologize for my delayed response; I thought I would get an email when follow-ups were posted.

You thought correctly, except that functionality was broken starting on Oct 26th and they just fixed it a few days ago.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)
    Intro to Tally Tables and Functions

  • Shifting gears back to the problem, is there a UNIQUE column in the table to identify individual rows with?


    --Jeff Moden



  • Thanks Jeff

There is not a UNIQUE column in the table to identify individual rows with. It's an EAV slowly changing dimension.

DROP TABLE IF EXISTS #Test

    CREATE TABLE #Test
    (
    Entity VARCHAR(20),
    Attribute VARCHAR(800),
    AttributeType VARCHAR(10),--This is actually in a separate reference table keyed off Attribute
    Value VARCHAR(800),
    StartDate DATE NOT NULL,
    EndDate DATE NULL
    )

As an aside, I did not get an email with your last post. The last email I got from the forum was back in August, and my email has not changed. I am checking 'Notify me of follow-up replies via email'.


  • But, you DO have a UNIQUE set of column keys based on Entity, Attribute, and StartDate.

This looks suspiciously like a column-based audit table or Slowly Changing Dimension format. Is that what the ultimate use is? Where does all the Decimal, Integer, and VARCHAR stuff come into play here? Would that be the purpose of the AttributeType column?


    --Jeff Moden



Yes, a unique index can be created on Entity, Attribute, and StartDate, and yes, this is a Slowly Changing Dimension format. Refined code and data are below.

At this point my performance is acceptable with the computed columns created. I haven't created the index like in the Ozar article.

    So I think my problem is resolved but any additional insight is welcome.

    DROP TABLE IF EXISTS #Test

    CREATE TABLE #Test

    (
    Entity VARCHAR(20),
    Attribute VARCHAR(800),
    AttributeType VARCHAR(10),
    Value VARCHAR(800),
    StartDate DATE NOT NULL,
    EndDate DATE NULL,
    --These are the computed columns which improve performance on large data sets
    ValueTryConvertDate AS TRY_CONVERT(DATE,Value),
    ValueTryConvertDecimal AS TRY_CONVERT(DECIMAL,Value),
    ValueTryConvertInteger AS TRY_CONVERT(INT,Value)
    )

IF NOT EXISTS (SELECT * FROM tempdb.sys.indexes WHERE Name = 'UI_Test') CREATE UNIQUE INDEX UI_Test ON #Test (Entity,Attribute,StartDate)

INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('A','AppointmentDate','Date','2023-01-02','2022-11-05',NULL)
    INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('B','Temperature','Decimal','100.3','2022-11-04','2022-11-04')
    INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('B','Temperature','Decimal','100.1','2022-11-05',NULL)
    INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('C','Grade','Integer','8','2022-11-04','2022-11-04')
    INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('C','Grade','Integer','5','2022-11-05','2022-11-08')
    INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('C','Grade','Integer','9','2022-11-09',NULL)
    --These records should be flagged because TRY_CONVERT returns NULL
    INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('D','AppointmentDate','Date','0000-11-05','2022-11-05',NULL)
    INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('E','Temperature','Decimal','A','2022-11-04','2022-11-04')
    INSERT INTO #Test (Entity,Attribute,AttributeType,Value,StartDate,EndDate) VALUES ('F','Grade','Integer','9.9','2022-11-10',NULL)

    SELECT * FROM #Test
    WHERE
    (AttributeType = 'Integer' AND TRY_CONVERT(INT,Value) IS NULL) OR
    (AttributeType = 'Decimal' AND TRY_CONVERT(DECIMAL,Value) IS NULL) OR
    (AttributeType = 'Date' AND TRY_CONVERT(DATE,Value) IS NULL)
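The same validation can also be written against the computed columns defined in the CREATE TABLE above, which keeps the predicates simple:

```sql
-- Equivalent check using the computed columns instead of inline TRY_CONVERT.
SELECT * FROM #Test
WHERE (AttributeType = 'Integer' AND ValueTryConvertInteger IS NULL)
   OR (AttributeType = 'Decimal' AND ValueTryConvertDecimal IS NULL)
   OR (AttributeType = 'Date'    AND ValueTryConvertDate    IS NULL)
```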

