Top 1 by group query

  • I have been struggling with this one for ages and therefore hoping someone will take pity on me and point me in the right direction!

    Bit of background: We are a charity and therefore dealing with donations rather than sales. Thought that might help with some of my terminology.

    I have a table which shows (amongst other things) the person ('contact_number') to which a transaction relates and the date of the transaction. (financial_history aka fh)

    There is another table which holds details of the transactions themselves (including the reason, the 'amount' and the fund). (financial_history_details aka fhd)

    The two tables, when joined, share the batch_number and the transaction_number fields and must be joined on both to ensure the correct link.

    What I am trying to do is return the entire row from 'fhd' that gives the highest 'amount' for each 'contact_number' in 'fh' as well as returning the 'contact_number' to allow me to do other joins.

    The catch is that for an large number of people that have regular donations with us, their highest 'amount' appears multiple times on different dates. In this event I only want the most recent 'highest' donation.

    I have some code which is so very nearly there but I can either return the correct number of rows with not enough of the columns I need or too many rows with all the columns.

    If I've not given enough detail, please let me know and I'll supply more.

    Thanks in advance,

    Peter

  • Please provide existing code

    -----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    This thing is addressing problems that dont exist. Its solution-ism at its worst. We are dumbing down machines that are inherently superior. - Gilfoyle

  • Peter, i can somewhat visualize your existing system and i guess i have a code that might get u there, but as Henrico said, we need to see "physically" how your tables and data in the tables are.. so please provide:

    1. Table Structures - CREATE TABLE scripts

    2. Constraints or Indexes in the table, - CREATE scripts

    3. SAMPLE DATA - INSERT INTO TABLE (Columns) VALUES scripts

    4. Desired output - some visual representation of this.

    I tel you, you will amazed by the number of responses you get once you post the above-asked...

    😎

  • Thanks for offering to get back to me guys... Here's the first batch of information as requested.

    Create financial history table:

    CREATE TABLE [dbo].[financial_history](

    [batch_number] [int] NOT NULL,

    [transaction_number] [smallint] NOT NULL,

    [contact_number] [int] NOT NULL,

    [transaction_date] [datetime] NOT NULL,

    [transaction_type] [varchar](3) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [bank_details_number] [int] NULL,

    [amount] [decimal](9, 2) NOT NULL,

    [payment_method] [varchar](4) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [reference] [varchar](18) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [posted] [datetime] NULL,

    [address_number] [int] NOT NULL,

    [notes] [text] COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [status] [varchar](1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [currency_amount] [decimal](13, 2) NOT NULL,

    [transaction_origin] [varchar](10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL

    ) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

    Create financial history details table:

    CREATE TABLE [dbo].[financial_history_details](

    [batch_number] [int] NOT NULL,

    [transaction_number] [smallint] NOT NULL,

    [line_number] [smallint] NULL,

    [amount] [decimal](9, 2) NOT NULL,

    [product] [varchar](20) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [rate] [varchar](4) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [source] [varchar](10) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [quantity] [smallint] NULL,

    [vat_rate] [varchar](1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [vat_amount] [decimal](9, 2) NULL,

    [status] [varchar](1) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [sales_contact_number] [int] NULL,

    [invoice_payment] [varchar](1) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL,

    [distribution_code] [varchar](10) COLLATE SQL_Latin1_General_CP1_CI_AS NULL,

    [currency_amount] [decimal](13, 2) NOT NULL,

    [currency_vat_amount] [decimal](13, 2) NOT NULL

    ) ON [PRIMARY]

    Indexes:

    CREATE NONCLUSTERED INDEX [fhd1385bn4697tn6509ln4577] ON [dbo].[financial_history_details]

    (

    [batch_number] ASC,

    [transaction_number] ASC,

    [line_number] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [fhd1385p2244r1729] ON [dbo].[financial_history_details]

    (

    [product] ASC,

    [rate] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [fhd1385s2114] ON [dbo].[financial_history_details]

    (

    [source] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [fh5766bdn1013] ON [dbo].[financial_history]

    (

    [bank_details_number] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    CREATE UNIQUE NONCLUSTERED INDEX [fh5766bn4697tn6509] ON [dbo].[financial_history]

    (

    [batch_number] ASC,

    [transaction_number] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [fh5766cn5453] ON [dbo].[financial_history]

    (

    [contact_number] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

    CREATE NONCLUSTERED INDEX [fh5766td6485] ON [dbo].[financial_history]

    (

    [transaction_date] ASC

    )WITH (SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF) ON [PRIMARY]

  • psmith-695163 (5/14/2010)


    The catch is that for an large number of people that have regular donations with us, their highest 'amount' appears multiple times on different dates. In this event I only want the most recent 'highest' donation.

    take ROW_NUMBER approach http://msdn.microsoft.com/en-us/library/ms186734.aspx

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • Insert Data:

    INSERT INTO [care_baseline].[dbo].[financial_history]

    ([batch_number]

    ,[transaction_number]

    ,[contact_number]

    ,[transaction_date]

    ,[transaction_type]

    ,[bank_details_number]

    ,[amount]

    ,[payment_method]

    ,[reference]

    ,[posted]

    ,[address_number]

    ,[notes]

    ,[status]

    ,[currency_amount]

    ,[transaction_origin])

    Select 228050,2807,8,'1994-01-17','P',NULL,10.00,'CHQ',NULL,'2003-02-12',1006,NULL,NULL,10.00,NULL

    UNION ALL

    Select 228316,3810,8,'1994-09-29','P',NULL,5.00,'CHQ',NULL,'2003-02-12',1006,NULL,NULL,5.00,NULL

    UNION ALL

    Select 110232,4,8,'1999-11-30','P',NULL,8.00,'CHQ',NULL,'2003-02-12',1006,NULL,NULL,8.00,NULL

    UNION ALL

    Select 120752,1,8,'2002-11-01','P',NULL,10.00,'CC'NULL,'2003-02-12',1006,'Ad hoc donation',NULL,10.00,NULL

    UNION ALL

    Select 100098,9,8,'1996-08-28','P',NULL,10.00,'CHQ',NULL,'2003-02-12',1006,'Ad hoc donation',NULL,10.00,NULL

    INSERT INTO [care_baseline].[dbo].[financial_history_details]

    ([batch_number]

    ,[transaction_number]

    ,[line_number]

    ,[amount]

    ,[product]

    ,[rate]

    ,[source]

    ,[quantity]

    ,[vat_rate]

    ,[vat_amount]

    ,[status]

    ,[sales_contact_number]

    ,[invoice_payment]

    ,[distribution_code]

    ,[currency_amount]

    ,[currency_vat_amount])

    Select 110232,1,1,10.00,'DON',0,'X99013',1,'Z',0.00,NULL,NULL,'N',5000010,10.00,0.00

    UNION ALL

    Select 110232,4,1,8.00,'DON',0,'X99011',1,'Z',0.00,NULL,NULL,'N',5000015,8.00,0.00

    UNION ALL

    Select 110232,9,1,250.00,'DON',0,'X99011',1,'Z',0.00,NULL,NULL,'N',5000004,250.00,0.00

    UNION ALL

    Select 120752,1,1,10.00,'DON',0,'TDG3-03',1,'Z',0.00,NULL,NULL,'N',5000015,10.00,0.00

    UNION ALL

    Select 228050,1,1,8.21,'DEED',0,'DOCAPPEAL',1,'Z',0.00,NULL,NULL,'N',5000011,8.21,0.00

    UNION ALL

    Select 228050,4,1,2.99,'DEED',0,'DOCAPPEAL',1,'Z',0.00,NULL,NULL,'N',5000012,2.99,0.00

    UNION ALL

    Select 228050,9,1,5.64,'DEED',0,'DOCAPPEAL',1,'Z',0.00,NULL,NULL,'N',5000002,5.64,0.00

    UNION ALL

    Select 228050,2807,1,10.00,'DON',0,'XMAS93',1,'Z',0.00,NULL,NULL,'N',5000015,10.00,0.00

    UNION ALL

    Select 228050,3810,1,5.00,'FUND',0,'BRNCHDON',1,'Z',0.00,NULL,NULL,'N',5000003,5.00,0.00

    UNION ALL

    Select 228316,1,1,15.00,'DON',0,'XMAS95',1,'Z',0.00,NULL,NULL,'N',5000002,15.00,0.00

    UNION ALL

    Select 228316,4,1,20.00,'DON',0,'XMAS93',1,'Z',0.00,NULL,NULL,'N',5000004,20.00,0.00

    UNION ALL

    Select 228316,9,1,389.00,'FUND',0,'LNMTHN95',1,'Z',0.00,NULL,NULL,'N',5000003,389.00,0.00

    UNION ALL

    Select 228316,2807,1,2.82,'DEED',0,'DOCAPPEAL',1,'Z',0.00,NULL,NULL,'N',5000014,2.82,0.00

    UNION ALL

    Select 228316,3810,1,5.00,'DON',0,'SUMMAPP94',1,'Z',0.00,NULL,NULL,'N',5000015,5.00,0.00

  • @psmith

    4. Desired output - some visual representation of this.

    need this also

    -------Bhuvnesh----------
    I work only to learn Sql Server...though my company pays me for getting their stuff done;-)

  • This code gives me too many rows...

    SELECT fh.contact_number

    ,fhd.*

    FROM financial_history_details fhd

    left outer join financial_history fh

    on fhd.batch_number = fh.batch_number

    and fhd.transaction_number = fh.transaction_number

    WHERE fhd.amount =(SELECT MAX(fhd2.amount)

    FROM financial_history_details fhd2

    left outer join financial_history fh2

    on fhd2.batch_number = fh2.batch_number

    and fhd2.transaction_number = fh2.transaction_number

    WHERE fh.contact_number = fh2.contact_number)

    AND fhd.amount > 0

    order by fh.contact_number

    and this code gives the correct number of rows but not the whole row from fhd table...

    select max(fhd.amount) as max_don

    ,fh.contact_number

    from financial_history fh

    left outer join financial_history_details fhd

    on fhd.batch_number = fh.batch_number

    and fhd.transaction_number = fh.transaction_number

    group by fh.contact_number

    order by contact_number

  • I've just realised I copied and pasted some unnecessary data in one of the above posts which I don't want to confuse things.

    Basically, with the sample data supplied, I would like to see the following result...

    Contact_number,fhd.*

    8,120752,1,8,'2002-11-01','P',NULL,10.00,'CC',NULL,'2003-02-12',1006,'Ad hoc donation',NULL,10.00,NULL

    That is for contact number 8, the highest donation of £10 of which there are 3 so therefore the most recent of them (November 2002).

    Hopefully that all makes sense now.

    To be honest, I've tied myself up into such a knot with all this that I don't know which way is up at the moment!

    Thanks for checking in on this thread and thanks for your offers of help so far.

    Peter

  • I will get this right in the end, I'm so sorry for making life more difficult.

    The correct result set should be...

    fh.Contact_number, fh.transaction_date, fhd.*

    8,'2002-11-01',120752,1,1,10.00,'DON',0,'TDG3-03',1,'Z',0.00,NULL,NULL,N,5000015,10.00,0.00

    Thanks.

  • Hi Peter

    Have a look at the output from this SELECT. If the row containing 1 for ChosenRow is the row you want to keep, then rewrite the SELECT including only the columns you want.

    SELECT ChosenRow = ROW_NUMBER() OVER (PARTITION BY fh.contact_number ORDER BY fh.amount DESC, fh.transaction_date DESC),

    fh.*, '#' AS '#', fhd.*

    FROM #financial_history fh

    LEFT JOIN #financial_history_details fhd

    ON fhd.batch_number = fh.batch_number and fhd.transaction_number = fh.transaction_number

    WHERE fh.contact_number = 8

    “Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • So little code, such big results! Thank you all for your input, very valuable.

    I will be keeping this little trick up my sleve for future use.

    Peter

Viewing 12 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply