Get only the top 1 from each subset of data`

  • Okay, I should know this, but I am shaking my head in shame (I am really a programmer, not a SQL or DBA). Anyway, I have records for the year of client names, telephone numbers and appointment dates. Simple. What I need to get is a DISTINCT list of clients served for the year with their current telephone number. As some client have changed their telephone number, I am including apptDate and ordering by ClientName then apptDate DESC and what I want to do is to only pick that FIRST record for each client. I used TOP 1, but that is only returning 1 record.

    so the SQL is:

    SELECT DISTINCT ClientName, Telephone, apptdate FROM DailyCalls

    WHERE Telephone <>'Invalid Number'

    ORDER BY ClientName, apptdate desc

    Thanks for the help

  • Actually I think after some advice from our DBA here and playing around, I did this and it works

    Select * from (

    SELECT ClientName

    ,Telephone

    ,apptdate

    ,ROW_NUMBER() over (partition by ClientName order by apptDate desc) rn

    from DailyCalls

    where Telephone <>'Invalid Number'

    ) x

    where x.rn = 1

  • That looks workable. There is a solution with CROSS APPLY too. Look up Itzik Ben-Gan TOP rows per group.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

  • TheSQLGuru (12/5/2016)


    That looks workable. There is a solution with CROSS APPLY too. Look up Itzik Ben-Gan TOP rows per group.

    This probably doesn't meet the conditions for top rows per group. You need a table with a relatively small number of records and looking for the top records from another table with a large number of records per partition (high density). Given that the field is labeled ClientName rather than ClientID, it appears that there isn't a client table that could be used as the small table and getting DISTINCT ClientNames from the Telephone table may be expensive. Also, it's unlikely that the number of 'Invalid' records per client is large enough to overcome the cost of the nested loops. That is, it's likely that the data is low density.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • I don't think I said it would be better for the OP's scenario (for which we know too little). But he/she is learning and having another tool in the TSQL bag is a good thing. It will also show up for others who stumble across this via a search.

    Best,
    Kevin G. Boles
    SQL Server Consultant
    SQL MVP 2007-2012
    TheSQLGuru on googles mail service

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply