Cannot Select the max of an alphanumeric column

  • Hi. Well, I think this is my very first post to SCC...and it's kinda long - sorry about that...I was just trying to be as understandable as possible and to show what my own efforts were like.

    Normally I am able to figure out most of problems, usually because they aren't that tricky, but this one is confusing me to no end. I've spent the last several hours pouring over previous posts in SCC that are somewhat related (E.g. sorting alphanumeric, group by clause with max(), etc.), but still don't know enough to transfer the knowledge to my particular problem. I'm not a seasoned TSQL guy -- I am a software developer with enough knowledge to get by the simpler SQL problems.

    Here's my situation...(btw, this is specific to SQL Server 2000, but I suspect a solution will also have to work in 2005/2008 when this thing gets migrated.)

    I have a table that contains document records, and each doc has versions. If I query for a single doc I might get more than one record back if it has multiple versions. This is true for all the different docs.

    There is a varchar column called ver that keeps track of the version. It adheres to the following rules:

    1) ver is varchar(2) (I.e. it can have 1 or 2 chars;; null is not allowed)

    2) draft versions are given an alpha version starting with A.

    3) draft versions can be revised such that ver A --> ver B

    4) max alpha is ZZ due to 1) above

    5) once a doc is released for initial review it is given ver 00

    6) subsequent revisions get up-versioned in numerical order (I.e. 00 --> 01 --> 02 etc)

    7) max numerical ver, then, is 99

    8) the latest ver of a doc is based on numeric is newer than alpha, so 00 is newer than C and 01 is newer than 00

    I need a query that will return the row of doc containing the most recent version. Here is an example of some correct output...

    ---------------------------

    docIdprojIdtypeIdver

    ---------------------------

    1311E

    242102

    312231

    ProjId and TypeId are used as metadata to identify a document, where one proj can have multiple docs and docs associated with the same proj can be of multiple types. DocId is a surrogate key.

    IF object_id('Tempdb..#docs') IS NOT NULL BEGIN DROP TABLE #docs END

    CREATE TABLE #docs (docID int, projid varchar(10), typeid int, ver varchar(2))

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (10, 1, 1, 'A')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (11, 1, 1, 'D')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (12, 1, 1, 'C')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (13, 1, 1, 'E') --latest ver for 1,1

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (14, 1, 1, 'B')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (20, 2, 1, 'A')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (21, 2, 1, 'B')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (22, 2, 1, '00')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (23, 2, 1, '01')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (24, 2, 1, '02') -- latest ver for 2,1

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (30, 2, 2, '00')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (31, 2, 2, '31') -- latest ver for 2,2

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (32, 2, 2, '10')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (33, 2, 2, 'AA')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (34, 2, 2, '29')

    INSERT INTO #docs (docID, projid, typeid, ver) VALUES (35, 2, 2, 'H')

    Objective:

    Return the latest ver for each grouping of projid and typeid

    Correct results should be:

    ---------------------------

    docIDprojidtypeidver

    ---------------------------

    1311E

    242102

    312231

    Simply using Max(ver) doesn't work because the ascii ordering of alpha chars is > numeric, so I did this:

    Generate a new column that is calculated from the ver column and use it to order the ver column values. The new column is calculated

    by multiplying the ascii value of the first char of ver by 100 if numeric, 10 if double alpha, nothing if single alpha,

    then adding it to the actual ascii value of the second char.

    The following SELECT will basically get the correct max ver for each combination of projid and typeid,

    but I need to return the actual unchanged ver column in the SELECT

    and I cannot include it in the SELECT clause because it is unique for each combination of projid and typeid, thereby

    throwing askew the GROUP BY clause

    SELECT PROJID, TYPEID,

    MAX(case

    when len(ver)>1 and patindex('%[0-9]%',left(ver,1)) = 1 then -- ver is numeric (I.e. one of 00, 01, 02, etc)

    (ascii(left(ver,1)) * 100) + ascii(right(ver,1)) -- E.g. For '01' --> ascii('0')=48 * 10 = 480 + 49 = 529

    when len(ver) > 1 and patindex('%[A-Z]%',left(ver,1)) = 1 then -- ver is alpha (I.e. A, B, AA, etc)

    (ascii(left(ver,1)) * 10) + (ascii(right(ver,1))) -- E.g. For 'AA' --> ascii('A')=65 *10 = 650 + 65 = 715

    when len(ver) = 1 then -- Must be alpha

    ascii(left(ver,1)) -- E.g. For 'H' --> ascii('H')=72

    end) as calcVer

    FROM #DOCS

    GROUP BY PROJID, TYPEID

    ORDER BY calcVer desc

    I then tried to get the ver column by a self-join (I think), within #Docs.*, but I hit a TSQL error in the commented code immediately below:

    -- SELECT #DOCS.* FROM

    -- (SELECT PROJID, TYPEID,

    -- MAX(case

    -- when len(ver)>1 and patindex('%[0-9]%',left(ver,1)) = 1 then -- ver is numeric (I.e. one of 00, 01, 02, etc)

    -- (ascii(left(ver,1)) * 100) + ascii(right(ver,1)) -- E.g. For '01' --> ascii('0')=48 * 10 = 480 + 49 = 529

    -- when len(ver) > 1 and patindex('%[A-Z]%',left(ver,1)) = 1 then -- ver is alpha (I.e. A, B, AA, etc)

    -- (ascii(left(ver,1)) * 10) + (ascii(right(ver,1))) -- E.g. For 'AA' --> ascii('A')=65 *10 = 650 + 65 = 715

    -- when len(ver) = 1 then -- Must be alpha

    -- ascii(left(ver,1)) -- E.g. For 'H' --> ascii('H')=72

    -- end) as calcVer

    -- FROM #DOCS

    -- GROUP BY PROJID, TYPEID

    -- ORDER BY calcVer desc) AS maxDocs

    -- INNER JOIN #DOCS ON #DOCS.PROJID = maxDocs.PROJID

    -- AND #DOCS.TYPEID = maxDocs.TYPEID

    I found out that my maxDocs derived(?) table cannot have an ORDER BY clause:

    The ORDER BY clause is invalid in views, inline functions, derived tables, and subqueries, unless TOP is also specified.

    So....I'm stumped. I think I might be headed in the right direction, but I cannot figure out how to determine the max ver for a given group of docs, without changing the value of ver itself. Thus, I am going :crazy:

    I'm sure that at 3:37 AM I am not explaining this very well, but I hope it is close enough. If anyone can get me going in the proper direction I would really appreciate it!

    Regards,

    Dan

  • This should do the trick:

    SELECT docID, projid, typeid, ver

    FROM #DOCS AS A

    WHERE ver = (

    SELECT TOP 1 ver

    FROM #DOCS AS B

    WHERE A.projid = B.projid

    AND A.typeid = B.typeid

    ORDER BY CASE WHEN ISNUMERIC(ver) = 1 THEN 'ZZ' + ver ELSE ver END DESC

    )

    Hope this helps.

    Gianluca

    -- Gianluca Sartori

  • Gianluca Sartori (4/8/2010)


    This should do the trick:

    ORDER BY CASE WHEN ISNUMERIC(ver) = 1 THEN 'ZZ' + ver ELSE ver END DESC

    Hope this helps.

    Gianluca

    It works beautifully!!! A thousand thank-yous! And the key part is one line of code! Boy, I never knew about that trick. It's like its doing 2 things at once on a single column. Very cool. I applied it to my larger scenario in a sproc and it works great. You pure SQL guys sure know your stuff. Again, thank you.

    Dan

  • You're welcome.

    Glad I could help.

    Gianluca

    -- Gianluca Sartori

Viewing 4 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply