June 17, 2013 at 4:58 am
I have an interesting little T-SQL problem that has me beaten.
I want to retrieve a small set of randomly selected rows from a table but also ensure that a particular column (AUTHORID) is unique (not duplicated) within that set.
SELECT TOP 6 WORKID, AUTHORID FROM dbo.LITERARYWORKS
ORDER BY CHECKSUM(NEWID())
works fine at getting a random selection, but of course does not prevent duplicates on AUTHORID. And I can't use DISTINCT without including WORKID and AUTHORID in the ORDER BY - and in any case the rows would always be unique if WORKID is included. I believe one way of achieving the effect of a 'DISTINCT' on a single column while returning other columns as well is by using GROUP BY, but I can't see how to use that in combination with ORDER BY.
Any ideas?
Jon
June 17, 2013 at 5:39 am
by introducing a row_number function that partitions by the two columns, you cna find unique rows; fromt here, it's just the same query you had before, but hitting a subquery isntead of the raw table:
SELECT
TOP 6
WORKID,
AUTHORID
FROM (SELECT
ROW_NUMBER() OVER(PARTITION BY WORKID, AUTHORID ORDER BY WORKID, AUTHORID) AS RW,
WORKID,
AUTHORID
FROM dbo.LITERARYWORKS) MyAlias
WHERE RW = 1--finds Uniqueness across WORKID, AUTHORID
ORDER BY
CHECKSUM(NEWID())
Lowell
June 17, 2013 at 5:54 am
Works a treat - many thanks!
June 17, 2013 at 6:31 am
A randomly-selected work from a number of randomly-selected authors:
SELECT
WORKID,
AUTHORID
FROM (
SELECT TOP 10
AUTHORID
FROM dbo.LITERARYWORKS
GROUP BY MASTNUM
ORDER BY NEWID()
) at
CROSS APPLY (
SELECT TOP 1
WORKID
FROM dbo.LITERARYWORKS
WHERE AUTHORID = at.AUTHORID
ORDER BY NEWID()
) MyAlias
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
June 18, 2013 at 1:21 am
Many thanks. I actually prefer this for my purposes as it will always return the required number of items (provided that there is at least that number of authors with at least 1 work each) whereas the other approach above is arguably more random but can return less than the required number of items if more than 1 work for the same author is initially selected.
Viewing 5 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply