Selecting Max and Min Rows in one SQL Statement

  • I have a table that has an BuildingID, UserID, DateID columns. There could be many rows for a BuildingID and Userid, but with different DateIDs.

    What I want is to get two rows per BuildingID/UserID match. 1 row with the MIN(DateID) and 1 row with the Max(DateID).

    I have a few concerns about the way it should be done. Right now the table I'm pulling from has around 16 million rows. I want to be able to run a insert query into a new table (pulling just the two rows per BuildingID/UserID match).

    I don't think it would be efficient to pull the Min DateID and the Max DateID as two separate queries. I think it would be best to write a query that can pull both rows once.

    I have thought about using the WITH common_table_expression to pull two "temp" tables and then select from them joining the two Min_Date table and Max_Date table, but I'm worried about temp storage and log file writing.

    Does anybody have any ideas?

  • You could probably use Ranking Functions but that does involve CTE, temp table or a table variable.

  • Without schema, I can't hand you actual code, but what's the concern doing something like the following:

    SELECT

    BldgID,

    UserID,

    Max(SomeDate) AS MaxDate,

    Min(SomeDAte) AS MinDate

    FROM

    -- Joins go here

    GROUP BY

    BldgID,

    UserID


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • I must mention that there are some other columns in the table that are specific to each row. I should have mentioned this right away but forgot. sorry 🙁

    So having min and max date in the same row wouldn't work.

    I would need something like this:

    bldgid userID dateid score1 score2

    1 1 123 90 99

    1 1 124 89 88

    1 1 125 99 99

    .

    .

    .

    I need a query that will pull the first row and the last due to the dateid being the min and the max...

    Hope this helps. Sorry for the confusion.

  • Dbloc (1/31/2012)


    I must mention that there are some other columns in the table that are specific to each row. I should have mentioned this right away but forgot. sorry 🙁

    So having min and max date in the same row wouldn't work.

    I would need something like this:

    bldgid userID dateid score1 score2

    1 1 123 90 99

    1 1 124 89 88

    1 1 125 99 99

    .

    .

    .

    I need a query that will pull the first row and the last due to the dateid being the min and the max...

    Hope this helps. Sorry for the confusion.

    It does, but it's harder to hand you an example of that. You're looking for the equivalent of 'last row' in a logging table. You'll want to use something similar to SELECT bid, uid, MIN() UNION Select bid, uid, MAX() as a subquery to join the main table back against. If you setup sample schema and data like you'll find in the first link in my signature you'll get tested code in return (usually).

    It's a bit difficult to explain otherwise if that snippet doesn't make sense.


    - Craig Farrell

    Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.

    For better assistance in answering your questions[/url] | Forum Netiquette
    For index/tuning help, follow these directions.[/url] |Tally Tables[/url]

    Twitter: @AnyWayDBA

  • Dbloc (1/31/2012)


    I must mention that there are some other columns in the table that are specific to each row. I should have mentioned this right away but forgot. sorry 🙁

    So having min and max date in the same row wouldn't work.

    I would need something like this:

    bldgid userID dateid score1 score2

    1 1 123 90 99

    1 1 124 89 88

    1 1 125 99 99

    .

    .

    .

    I need a query that will pull the first row and the last due to the dateid being the min and the max...

    Hope this helps. Sorry for the confusion.

    Assuming no duplicate dateids you could do something like

    SELECT * FROM MyTable

    WHERE dateid = (SELECT MAX(dateid) FROM MyTable)

    OR dateid = (SELECT MIN(dateid) FROM MyTable)

  • How about:

    SELECT c.* --Replace with the desired column list

    FROM

    /* Get list of all valid BuildingId, UserId combinations. I

    assume that these are separate tables that are linked with a foreign key.

    Not knowing which is the Primary and which is the foreign key, I just added the JOIN.

    You can probably replace that with a SELECT from the FK table. */

    (SELECT BuildingId, UserId FROM BuildingTable INNER JOIN UserTable ON JoinColumnA = JoinColumnB) a

    /* Using this list, get the MIN and MAX DateID of each valid BuildingId/UserId combination */

    CROSS APPLY (SELECT MAX(DateID) AS MaxDate, MIN(DateID) AS MinDate

    FROM myTable WHERE BuildingId = a.BuildingId AND UserID = a.UserId) b

    /* SELF JOIN your source table using this data to filter results. */

    INNER JOIN myTable c ON c.BuildingId = a.BuildingId AND c.UserID = a.UserID

    AND (c.DateID = MinDate OR c.DateID = MaxDate)

    On a similar table in my DB, I was able to get results in less than 1 second with cache, less than 10 seconds after a DBCC DROPCLEANBUFFERS statement.

    It was retrieving the first and last row for each "UserId/BuildingId" (Not UserId/BuildingId for me, but something similar) combination in a table of 23 million rows, with 6500 rows in the results.

  • @clubbavich - What you provided was very helpful. Actually what everybody was helpful. I will take what clubbavich provided, tweak it for what I need, but this is great!

    Thank you all for the help!!

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply