Is overpunch amenable to cross apply?

  • I have a file that needs to have its contents converted to money from overpunch characters. I started on an approach, but my conscience is nagging me that it is a dumb approach. My instinct is that this needs to be a function, but then I'd have to write that over 40x, so that doesn't sound to bright either.

    Another idea is to use an overpunch table, but joining on the end of the substring also didn't sound like a good idea to me either. the gist of the problem is like so;

    select NDCNumber

    ,cast( case

    when right(ingredientcost,1)='{' then substring(ingredientcost, 1,len(ingredientcost)-1)+'0'

    when right(ingredientcost,1)='A' then substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='B' then substring(ingredientcost, 1,len(ingredientcost)-1)+'2'

    when right(ingredientcost,1)='C' then substring(ingredientcost, 1,len(ingredientcost)-1)+'3'

    when right(ingredientcost,1)='D' then substring(ingredientcost, 1,len(ingredientcost)-1)+'4'

    when right(ingredientcost,1)='E' then substring(ingredientcost, 1,len(ingredientcost)-1)+'5'

    when right(ingredientcost,1)='F' then substring(ingredientcost, 1,len(ingredientcost)-1)+'6'

    when right(ingredientcost,1)='G' then substring(ingredientcost, 1,len(ingredientcost)-1)+'7'

    when right(ingredientcost,1)='H' then substring(ingredientcost, 1,len(ingredientcost)-1)+'8'

    when right(ingredientcost,1)='I' then substring(ingredientcost, 1,len(ingredientcost)-1)+'9'

    when right(ingredientcost,1)='J' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='K' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='L' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='M' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='N' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='O' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='P' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='Q' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='R' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'

    when right(ingredientcost,1)='}' then '-'+ substring(ingredientcost, 1,len(ingredientcost)-1)+'1'end as money)/100 IngredientCost

    .

    .

    .

    into #ESIStage From ESILanded

    so, the column named ingredientcost should be the parameter to the function, so it can be @ColumnName, but then, what kind of function? returns table? I don't think it should return a table because I need to work on selected columns. As above, I should be able to join to it, but how to do that for only 40 out of 200 columns?

    Can you suggest how can I organize the work to get the overpunches in over forty columns to be updated to numbers to fix each affected column?

    iow, I think I have disqualified

    select

    dbo.fnFixOverpunch(IngredientCost)IngredientCost

    dbo.fnFixOverpunch(AWP)AWP

    dbo.fnFixOverpunch(DiscountAmount)DiscountAmount

    into #ESIStage from ESILanded

    and

    select * from ESILanded a join overpunch b on

    substring(....all the columns that need to be fixed??)= b.overpunchCharacter

    is there a way to use cross apply to reference the columns that need to be fixed as well as the other columns that need to be brought along with them (there are over 200 columns in the import, the overpunch columns are different money attributes ), or do I need to split the work somehow so I can use cross apply on the money columns and plain old select into for the other columns, then join them up again (this idea also means managing six columns that determine uniqueness, NDCNumber, DateFilled, Refill, Member, Pharmacy & Prescriber)

    my gut tells me writing the same thing a zillion times is a nonstarter, but my head cannot feature how to do this with cross apply, which I think is what it really needs.

    Hope this is clear, and the design alternatives understandable.

    thanks in advance for any guidance you can provide

  • Hi Drew

    Not sure if I have this right (first time I've heard of overpunch), but this may be what you are after

    -- An inline table function to convert overpunch to money

    CREATE FUNCTION itvfDeOverPunch(@op AS varchar(40)) RETURNS TABLE

    WITH SCHEMABINDING

    RETURN

    SELECT CASE

    WHEN (ASCII(UPPER(REPLACE(RIGHT(@op,1),'{','@'))) - 64) > 9 THEN

    (((CAST(LEFT(@op,LEN(@op) - 1) AS INT) * 10) + (ASCII(UPPER(REPLACE(RIGHT(@op,1),'}','I'))) - 73)) * -1) / 100.

    ELSE

    ((CAST(LEFT(@op,LEN(@op) - 1) AS INT) * 10) + (ASCII(UPPER(REPLACE(RIGHT(@op,1),'{','@'))) - 64)) / 100.

    END retMoney

    GO

    -- TEST

    SELECT AV, a.retMoney, b.retMoney

    FROM (VALUES

    ('0002137}','0004000{')

    ,('0002137J','0004000A')

    ,('0002137K','0004000B')

    ,('0002137L','0004000C')

    ,('0002137M','0004000D')

    ,('0002137N','0004000E')

    ,('0002137O','0004000F')

    ,('0002137P','0004000G')

    ,('0002137Q','0004000H')

    ,('0002137R','0004000I')

    ,('0002137{','0004000}')

    ,('0002137A','0004000J')

    ,('0002137B','0004000K')

    ,('0002137C','0004000L')

    ,('0002137D','0004000M')

    ,('0002137E','0004000N')

    ,('0002137F','0004000O')

    ,('0002137G','0004000P')

    ,('0002137H','0004000Q')

    ,('0002137I','0004000R')

    ) AS TEST_VALUES(AV, BV) -- Replace this with table

    CROSS APPLY ( SELECT * FROM itvfDeOverPunch(AV) ) a -- convert column av

    CROSS APPLY ( SELECT * FROM itvfDeOverPunch(BV) ) b -- convert column bv

  • SELECT

    NDCNumber,

    IngredientCost = CAST(CASE

    WHEN Trailer IN ('{','A','B','C','D','E','F','G','H','I')

    THEN LEFT(ingredientcost,LEN(ingredientcost)-1) + CAST(CHARINDEX('{','{ABCDEFGHI')-1 AS VARCHAR(1))

    WHEN Trailer IN ('J','K','L','M','N','O','P','Q','R','}')

    THEN '-' + LEFT(ingredientcost,LEN(ingredientcost)-1) + '1'

    END AS MONEY)/100

    FROM ...

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (8/12/2013)


    SELECT

    NDCNumber,

    IngredientCost = CAST(CASE

    WHEN Trailer IN ('{','A','B','C','D','E','F','G','H','I')

    THEN LEFT(ingredientcost,LEN(ingredientcost)-1) + CAST(CHARINDEX('{','{ABCDEFGHI')-1 AS VARCHAR(1))

    WHEN Trailer IN ('J','K','L','M','N','O','P','Q','R','}')

    THEN '-' + LEFT(ingredientcost,LEN(ingredientcost)-1) + '1'

    END AS MONEY)/100

    FROM ...

    I like the use of CHARINDEX:cool: but it should probably be getting the index of trailer.

    A question for the OP. When I was looking up an explanation of overpunch, the series }, J to K incremented from 0 to 9 and negated the number. Is the negation and setting to 1 really what you want to do for this series?

  • SELECT

    NDCNumber,

    IngredientCost = CAST(CASE

    WHEN Trailer IN ('{','A','B','C','D','E','F','G','H','I')

    THEN LEFT(ingredientcost,LEN(ingredientcost)-1)

    + CAST(CHARINDEX(Trailer,'{ABCDEFGHI')-1 AS VARCHAR(1))

    WHEN Trailer IN ('J','K','L','M','N','O','P','Q','R','}')

    THEN '-' + LEFT(ingredientcost,LEN(ingredientcost)-1) + '1'

    END AS MONEY)/100

    FROM ...

    CROSS APPLY (SELECT Trailer = RIGHT(ingredientcost,1)) x

    Thanks Micky


    [font="Arial"]Low-hanging fruit picker and defender of the moggies[/font]

    For better assistance in answering your questions, please read this[/url].


    Understanding and using APPLY, (I)[/url] and (II)[/url] Paul White[/url]

    Hidden RBAR: Triangular Joins[/url] / The "Numbers" or "Tally" Table: What it is and how it replaces a loop[/url] Jeff Moden[/url]

  • Yikes!

    That code should be following the same pattern as the positive numbers did, not to 1.

    It is a mistake.

    Blown away by the use of charindex()!

    I will try them all out.

    Thank you very much all the contributors for all the help and insight into the problem.

  • Is it too late for me to join the party?

    WITH SampleData (A, B) AS (

    SELECT *

    FROM (VALUES

    ('0002137}','0004000{')

    ,('0002137J','0004000A')

    ,('0002137K','0004000B')

    ,('0002137L','0004000C')

    ,('0002137M','0004000D')

    ,('0002137N','0004000E')

    ,('0002137O','0004000F')

    ,('0002137P','0004000G')

    ,('0002137Q','0004000H')

    ,('0002137R','0004000I')

    ,('0002137{','0004000}')

    ,('0002137A','0004000J')

    ,('0002137B','0004000K')

    ,('0002137C','0004000L')

    ,('0002137D','0004000M')

    ,('0002137E','0004000N')

    ,('0002137F','0004000O')

    ,('0002137G','0004000P')

    ,('0002137H','0004000Q')

    ,('0002137I','0004000R')

    ) a(A,B))

    SELECT OldA=A, A=STUFF(A, LEN(A), 1, (A1-1)%10)*POWER(-1, CASE WHEN A1 > 9 THEN 1 ELSE 0 END)

    ,OldB=B, B=STUFF(B, LEN(B), 1, (B1-1)%10)*POWER(-1, CASE WHEN B1 > 9 THEN 1 ELSE 0 END)

    FROM SampleData a

    CROSS APPLY (

    SELECT A1=CHARINDEX(RIGHT(A,1), '{ABCDEFGHIJKLMNOPQR}')

    ,B1=CHARINDEX(RIGHT(B,1), '{ABCDEFGHIJKLMNOPQR}')) b


    My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]

    My thought question: Have you ever been told that your query runs too fast?

    My advice:
    INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
    The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

    Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
    Since random numbers are too important to be left to chance, let's generate some![/url]
    Learn to understand recursive CTEs by example.[/url]
    [url url=http://www.sqlservercentral.com/articles/St

  • dwain.c (8/12/2013)


    Is it too late for me to join the party?

    Not at all and very nice ๐Ÿ™‚

    Another variation building on Chris's CHARINDEX and removing the CASE.

    WITH SampleData (A, B) AS (

    SELECT *

    FROM (VALUES

    ('0002137}','0004000{')

    ,('0002137J','0004000A')

    ,('0002137K','0004000B')

    ,('0002137L','0004000C')

    ,('0002137M','0004000D')

    ,('0002137N','0004000E')

    ,('0002137O','0004000F')

    ,('0002137P','0004000G')

    ,('0002137Q','0004000H')

    ,('0002137R','0004000I')

    ,('0002137{','0004000}')

    ,('0002137A','0004000J')

    ,('0002137B','0004000K')

    ,('0002137C','0004000L')

    ,('0002137D','0004000M')

    ,('0002137E','0004000N')

    ,('0002137F','0004000O')

    ,('0002137G','0004000P')

    ,('0002137H','0004000Q')

    ,('0002137I','0004000R')

    ) a(A,B))

    SELECT A,

    CAST(

    LEFT(A,LEN(A) - 1) + --Digit

    RIGHT(CAST(CHARINDEX(RIGHT(A,1),'}RQPONMLKJ{ABCDEFGHI') - 11 AS VARCHAR(3)), 1) --translated alpha

    AS MONEY) * SIGN(CHARINDEX(RIGHT(A,1),'}RQPONMLKJ {ABCDEFGHI') - 11) / 100 AS MoneyA,

    B,

    CAST(

    LEFT(B,LEN(B) - 1) + --Digit

    RIGHT(CAST(CHARINDEX(RIGHT(B,1),'}RQPONMLKJ{ABCDEFGHI') - 11 AS VARCHAR(3)), 1) --translated alpha

    AS MONEY) * SIGN(CHARINDEX(RIGHT(B,1),'}RQPONMLKJ {ABCDEFGHI') - 11) / 100 AS MoneyB

    FROM SampleData;

  • Holy good night!

    You guys are making me dizzy!

    What great ideas....thanks a ton

  • If you want maximum performance you could use something like this:

    create function fixop2(@a varchar(20)) returns table

    as

    return select

    case

    when right(@a,1) between 'A' and 'I' then +cast(left(@a, len(@a)-1) as int)*10 + (ascii(right(@a,1)) - ascii('A') + 1)

    when right(@a,1) = '{' then +cast(left(@a, len(@a)-1) as int)*10

    when right(@a,1) between 'J' and 'R' then -cast(left(@a, len(@a)-1) as int)*10 - (ascii(right(@a,1)) - ascii('J') + 1)

    when right(@a,1) = '}' then -cast(left(@a, len(@a)-1) as int)*10

    end as result

    go

    Queries involving CHARINDEX might look more elegant or clever but remember that each call to CHARINDEX performs a linear search in the string. This is much slower than simply subtracting ascii values.

    Many of the other solutions perform two CHARINDEX calls for each translated value.

    On my machine, my code is almost twice as fast as the solutions using CHARINDEX.

  • Stefan_G (8/13/2013)


    If you want maximum performance you could use something like this:

    create function fixop2(@a varchar(20)) returns table

    as

    return select

    case

    when right(@a,1) between 'A' and 'I' then +cast(left(@a, len(@a)-1) as int)*10 + (ascii(right(@a,1)) - ascii('A') + 1)

    when right(@a,1) = '{' then +cast(left(@a, len(@a)-1) as int)*10

    when right(@a,1) between 'J' and 'R' then -cast(left(@a, len(@a)-1) as int)*10 - (ascii(right(@a,1)) - ascii('J') + 1)

    when right(@a,1) = '}' then -cast(left(@a, len(@a)-1) as int)*10

    end as result

    go

    Queries involving CHARINDEX might look more elegant or clever but remember that each call to CHARINDEX performs a linear search in the string. This is much slower than simply subtracting ascii values.

    Many of the other solutions perform two CHARINDEX calls for each translated value.

    On my machine, my code is almost twice as fast as the solutions using CHARINDEX.

    It's two orders of magnitude out though, Stefan ๐Ÿ˜‰

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Here's a sample data generator. I've tested it using Stefan's code and mine and it seems ok. Anyone else up for a race? Package your code into a iTVF.

    SELECT TOP (1000000)

    Amount,

    IngredientCost

    INTO #SomeTestTable

    FROM sys.all_columns ac1

    CROSS JOIN sys.all_columns ac2

    CROSS JOIN sys.all_columns ac3

    CROSS APPLY (SELECT Amount = CHECKSUM(NEWID())%1000000) x

    CROSS APPLY (

    SELECT IngredientCost = CASE

    WHEN Amount >= 0 THEN LEFT(RIGHT('00000'+CAST(Amount AS VARCHAR(7)),7),6) + SUBSTRING('{ABCDEFGHI',1+RIGHT(AMOUNT,1),1)

    ELSE LEFT(RIGHT('00000'+CAST(ABS(Amount) AS VARCHAR(7)),7),6) + SUBSTRING('JKLMNOPQR}',1+RIGHT(AMOUNT,1),1)

    END

    ) y ;

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • ChrisM@Work (8/13/2013)


    It's two orders of magnitude out though, Stefan ๐Ÿ˜‰

    OK, So, the corrected code with conversion to money and divide by 100 looks like this:

    alter function fixop2(@a varchar(20)) returns table

    as

    return select

    cast(case

    when right(@a,1) between 'A' and 'I' THEN +cast(left(@a, len(@a)-1) as int)*10 + (ascii(right(@a,1)) - ascii('A') + 1)

    when right(@a,1) = '{' THEN +cast(left(@a, len(@a)-1) as int)*10

    when right(@a,1) between 'J' and 'R' THEN -cast(left(@a, len(@a)-1) as int)*10 - (ascii(right(@a,1)) - ascii('J') + 1)

    when right(@a,1) = '}' THEN -cast(left(@a, len(@a)-1) as int)*10

    end as money) / 100 as result

    Note that many of the other posted solutions do not handle conversion of negative values correctly (because of a bug in the OP)

    If you want to compare performance you should use solutions that give correct results.

    It is easy to be fast if you do not have to be correct ๐Ÿ˜‰

  • Stefan_G (8/13/2013)


    If you want maximum performance you could use something like this:

    create function fixop2(@a varchar(20)) returns table

    as

    return select

    case

    when right(@a,1) between 'A' and 'I' then +cast(left(@a, len(@a)-1) as int)*10 + (ascii(right(@a,1)) - ascii('A') + 1)

    when right(@a,1) = '{' then +cast(left(@a, len(@a)-1) as int)*10

    when right(@a,1) between 'J' and 'R' then -cast(left(@a, len(@a)-1) as int)*10 - (ascii(right(@a,1)) - ascii('J') + 1)

    when right(@a,1) = '}' then -cast(left(@a, len(@a)-1) as int)*10

    end as result

    go

    Queries involving CHARINDEX might look more elegant or clever but remember that each call to CHARINDEX performs a linear search in the string. This is much slower than simply subtracting ascii values.

    Many of the other solutions perform two CHARINDEX calls for each translated value.

    On my machine, my code is almost twice as fast as the solutions using CHARINDEX.

    On my machine, your code is three times faster - but the negative numbers are different to mine and to Dwain's.

    โ€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.โ€ - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • I apologize for the original error, and want to express my gratitude for the time and effort you all spent on my problem. It was a great lesson in so many explicit (charindex trick even though it weighed on performance, it was still ...I was going to say opportunistic, but elegant is more like it, and its substitute with the corrected negative result...it was breathtaking) and implicit ways (my own rush to post and overlooking the flaw in my original post)

    Thank you all again

Viewing 15 posts - 1 through 15 (of 18 total)

You must be logged in to reply to this topic. Login to reply