April 8, 2010 at 1:44 am
Hi. Well, I think this is my very first post to SCC...and it's kinda long - sorry about that...I was just trying to be as understandable as possible and to show what my own efforts were like.
Normally I am able to figure out most of problems, usually because they aren't that tricky, but this one is confusing me to no end. I've spent the last several hours pouring over previous posts in SCC that are somewhat related (E.g. sorting alphanumeric, group by clause with max(), etc.), but still don't know enough to transfer the knowledge to my particular problem. I'm not a seasoned TSQL guy -- I am a software developer with enough knowledge to get by the simpler SQL problems.
Here's my situation...(btw, this is specific to SQL Server 2000, but I suspect a solution will also have to work in 2005/2008 when this thing gets migrated.)
I have a table that contains document records, and each doc has versions. If I query for a single doc I might get more than one record back if it has multiple versions. This is true for all the different docs.
There is a varchar column called ver that keeps track of the version. It adheres to the following rules:
1) ver is varchar(2) (I.e. it can have 1 or 2 chars;; null is not allowed)
2) draft versions are given an alpha version starting with A.
3) draft versions can be revised such that ver A --> ver B
4) max alpha is ZZ due to 1) above
5) once a doc is released for initial review it is given ver 00
6) subsequent revisions get up-versioned in numerical order (I.e. 00 --> 01 --> 02 etc)
7) max numerical ver, then, is 99
8) the latest ver of a doc is based on numeric is newer than alpha, so 00 is newer than C and 01 is newer than 00
I need a query that will return the row of doc containing the most recent version. Here is an example of some correct output...
---------------------------
docIdprojIdtypeIdver
---------------------------
1311E
242102
312231
ProjId and TypeId are used as metadata to identify a document, where one proj can have multiple docs and docs associated with the same proj can be of multiple types. DocId is a surrogate key.
IF object_id('Tempdb..#docs') IS NOT NULL BEGIN DROP TABLE #docs END
CREATE TABLE #docs (docID int, projid varchar(10), typeid int, ver varchar(2))
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (10, 1, 1, 'A')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (11, 1, 1, 'D')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (12, 1, 1, 'C')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (13, 1, 1, 'E') --latest ver for 1,1
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (14, 1, 1, 'B')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (20, 2, 1, 'A')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (21, 2, 1, 'B')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (22, 2, 1, '00')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (23, 2, 1, '01')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (24, 2, 1, '02') -- latest ver for 2,1
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (30, 2, 2, '00')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (31, 2, 2, '31') -- latest ver for 2,2
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (32, 2, 2, '10')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (33, 2, 2, 'AA')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (34, 2, 2, '29')
INSERT INTO #docs (docID, projid, typeid, ver) VALUES (35, 2, 2, 'H')
Objective:
Return the latest ver for each grouping of projid and typeid
Correct results should be:
---------------------------
docIDprojidtypeidver
---------------------------
1311E
242102
312231
Simply using Max(ver) doesn't work because the ascii ordering of alpha chars is > numeric, so I did this:
Generate a new column that is calculated from the ver column and use it to order the ver column values. The new column is calculated
by multiplying the ascii value of the first char of ver by 100 if numeric, 10 if double alpha, nothing if single alpha,
then adding it to the actual ascii value of the second char.
The following SELECT will basically get the correct max ver for each combination of projid and typeid,
but I need to return the actual unchanged ver column in the SELECT
and I cannot include it in the SELECT clause because it is unique for each combination of projid and typeid, thereby
throwing askew the GROUP BY clause
SELECT PROJID, TYPEID,
MAX(case
when len(ver)>1 and patindex('%[0-9]%',left(ver,1)) = 1 then -- ver is numeric (I.e. one of 00, 01, 02, etc)
(ascii(left(ver,1)) * 100) + ascii(right(ver,1)) -- E.g. For '01' --> ascii('0')=48 * 10 = 480 + 49 = 529
when len(ver) > 1 and patindex('%[A-Z]%',left(ver,1)) = 1 then -- ver is alpha (I.e. A, B, AA, etc)
(ascii(left(ver,1)) * 10) + (ascii(right(ver,1))) -- E.g. For 'AA' --> ascii('A')=65 *10 = 650 + 65 = 715
when len(ver) = 1 then -- Must be alpha
ascii(left(ver,1)) -- E.g. For 'H' --> ascii('H')=72
end) as calcVer
FROM #DOCS
GROUP BY PROJID, TYPEID
ORDER BY calcVer desc
I then tried to get the ver column by a self-join (I think), within #Docs.*, but I hit a TSQL error in the commented code immediately below:
-- SELECT #DOCS.* FROM
-- (SELECT PROJID, TYPEID,
-- MAX(case
-- when len(ver)>1 and patindex('%[0-9]%',left(ver,1)) = 1 then -- ver is numeric (I.e. one of 00, 01, 02, etc)
-- (ascii(left(ver,1)) * 100) + ascii(right(ver,1)) -- E.g. For '01' --> ascii('0')=48 * 10 = 480 + 49 = 529
-- when len(ver) > 1 and patindex('%[A-Z]%',left(ver,1)) = 1 then -- ver is alpha (I.e. A, B, AA, etc)
-- (ascii(left(ver,1)) * 10) + (ascii(right(ver,1))) -- E.g. For 'AA' --> ascii('A')=65 *10 = 650 + 65 = 715
-- when len(ver) = 1 then -- Must be alpha
-- ascii(left(ver,1)) -- E.g. For 'H' --> ascii('H')=72
-- end) as calcVer
-- FROM #DOCS
-- GROUP BY PROJID, TYPEID
-- ORDER BY calcVer desc) AS maxDocs
-- INNER JOIN #DOCS ON #DOCS.PROJID = maxDocs.PROJID
-- AND #DOCS.TYPEID = maxDocs.TYPEID
I found out that my maxDocs derived(?) table cannot have an ORDER BY clause:
The ORDER BY clause is invalid in views, inline functions, derived tables, and subqueries, unless TOP is also specified.
So....I'm stumped. I think I might be headed in the right direction, but I cannot figure out how to determine the max ver for a given group of docs, without changing the value of ver itself. Thus, I am going :crazy:
I'm sure that at 3:37 AM I am not explaining this very well, but I hope it is close enough. If anyone can get me going in the proper direction I would really appreciate it!
Regards,
Dan
April 8, 2010 at 7:01 am
This should do the trick:
SELECT docID, projid, typeid, ver
FROM #DOCS AS A
WHERE ver = (
SELECT TOP 1 ver
FROM #DOCS AS B
WHERE A.projid = B.projid
AND A.typeid = B.typeid
ORDER BY CASE WHEN ISNUMERIC(ver) = 1 THEN 'ZZ' + ver ELSE ver END DESC
)
Hope this helps.
Gianluca
-- Gianluca Sartori
April 8, 2010 at 12:01 pm
Gianluca Sartori (4/8/2010)
This should do the trick:
ORDER BY CASE WHEN ISNUMERIC(ver) = 1 THEN 'ZZ' + ver ELSE ver END DESC
Hope this helps.
Gianluca
It works beautifully!!! A thousand thank-yous! And the key part is one line of code! Boy, I never knew about that trick. It's like its doing 2 things at once on a single column. Very cool. I applied it to my larger scenario in a sproc and it works great. You pure SQL guys sure know your stuff. Again, thank you.
Dan
April 9, 2010 at 1:03 am
You're welcome.
Glad I could help.
Gianluca
-- Gianluca Sartori
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply