Picking latest transaction from 3 tables

  • We have 3 tables of customer transactions: donations, communications and sales, from which I want to find the earliest transaction. Generally these all have different fields though the following fields appear in all 3:

    id (primary key)

    transaction_date

    communication_id

    I need a neat way of getting the communication_id of the row with the earliest date, no matter which table it came from. Two of the tables have over a million records. Any two tables could have a transaction on the same day and each table could have more than one transaction on a day, but an arbitrary choice between these is acceptable.

    Any ideas?

     

  • Sorry, this should have been entitled: 'Picking earliest transaction ...', but then it's still early in the day in London...

  • ---DDL--

    if

    object_id('test_donations') is not null drop table test_donations

    if

    object_id('test_communications') is not null drop table test_communications

    if

    object_id('test_sales') is not null drop table test_sales

    create table test_donations (id int identity, transaction_date smalldatetime, communication_id int)

    create

    table test_communications (id int identity, transaction_date smalldatetime, communication_id int)

    create

    table test_sales (id int identity, transaction_date smalldatetime, communication_id int)

     

    ---Populate Test Data--

    insert test_donations

    select

    getdate()-1 , 15 union all

    select

    getdate()-2 , 14 union all

    select

    getdate()-3 , 13 union all

    select

    getdate()-4 , 12 union all

    select

    getdate()-520 , 11 ----Earliest date so query should return 11

    insert

    test_communications

    select

    getdate()-6 , 10 union all

    select

    getdate()-7 , 9 union all

    select

    getdate()-8 , 8 union all

    select

    getdate()-9 , 7 union all

    select

    getdate()-10 ,6

    insert test_sales

    select getdate()-11 , 5 union all

    select

    getdate()-12 , 4 union all

    select

    getdate()-13 , 3 union all

    select

    getdate()-14 , 2 union all

    select

    getdate()-15 , 1

    --query returning communication id with earliest transaction date--

    select

    top 1 communication_id

    from

    (

    select

    transaction_date, communication_id

    from

    test_donations

    union

    all

    select

    transaction_date, communication_id

    from

    test_communications

    union

    all

    select

    transaction_date, communication_id

    from

    test_sales

    )

    x

    order

    by transaction_date asc

    www.sql-library.com[/url]

  • Yes, that certainly works. Thanks.

    I now have to find the earliest communication_id for each customer: each table has a customer_id field. Will your code work with suitable modification? i.e. the output should contain two fields: customer_id and communication_id, one row per customer.

  •  

    select min(transaction_date), customer_id

    from

    (

    select

    transaction_date, communication_id, customer_id

    from

    test_donations

    union

    all

    select

    transaction_date, communication_id, customer_id

    from

    test_communications

    union

    all

    select

    transaction_date, communication_id, customer_id

    from

    test_sales

    )

    x

    group by customer_id

    www.sql-library.com[/url]

  • ... but I need to have the communication_id of the earliest transaction for each customer!

  • Then how does Customer_ID relate to these tables. You told us he only common field names are

    id (primary key)

    transaction_date

    communication_id

     

  • sorry, it IS a common field. I didn't originally think it was key, but in fact it is what makes the problem so tricky.

    I'm sure this can be done using a cursor and stepping through customer by customer, e.g. using Jules' code. But can it be done on a dataset with just SQL?

  • ---DDL--

    if

    object_id('test_donations') is not null drop table test_donations

    if

    object_id('test_communications') is not null drop table test_communications

    if

    object_id('test_sales') is not null drop table test_sales

    create

    table test_donations (id int identity, transaction_date smalldatetime, communication_id int, Customer_ID int )

    create

    table test_communications (id int identity, transaction_date smalldatetime, communication_id int, Customer_ID int)

    create

    table test_sales (id int identity, transaction_date smalldatetime, communication_id int, Customer_ID int)

    ---Populate Test Data--

    insert

    test_donations

    select

    getdate()-1 , 15,1 union all

    select

    getdate()-2 , 14,1 union all

    select

    getdate()-3 , 13,1 union all

    select

    getdate()-4 , 12,2 union all

    select

    getdate()-520 , 11,2 ----Earliest date so query should return 11

    insert

    test_communications

    select

    getdate()-6 , 10,3 union all

    select

    getdate()-123 , 9,3 union all

    select

    getdate()-8 , 8,3 union all

    select

    getdate()-9 , 7,4 union all

    select

    getdate()-10 ,6,4

    insert

    test_sales

    select

    getdate()-11 , 5,4 union all

    select

    getdate()-12 , 4,4 union all

    select

    getdate()-13 , 3,5 union all

    select

    getdate()-14 , 2,5 union all

    select

    getdate()-15 , 1,5

    --query returning communication id with earliest transaction date--

    select

    stuff

    (min(convert(varchar(20),transaction_date, 102)+ cast(communication_id as varchar(10))),1,10,'')EarliestCommunicationID,

    max

    (Customer_ID) CustomerID

    from

    (

    select

    transaction_date, communication_id, Customer_ID

    from

    test_donations

    union

    all

    select

    transaction_date, communication_id, Customer_ID

    from

    test_communications

    union

    all

    select

    transaction_date, communication_id, Customer_ID

    from

    test_sales

    )

    x

    group

    by Customer_ID

     

    www.sql-library.com[/url]

  • Version above only distinguishes between transaction on different days. Version below will do it down to milli seconds which is probably what you want.

     

    ---DDL--

    if

    object_id('test_donations') is not null drop table test_donations

    if

    object_id('test_communications') is not null drop table test_communications

    if

    object_id('test_sales') is not null drop table test_sales

    create

    table test_donations (id int identity, transaction_date smalldatetime, communication_id int, Customer_ID int )

    create

    table test_communications (id int identity, transaction_date smalldatetime, communication_id int, Customer_ID int)

    create

    table test_sales (id int identity, transaction_date smalldatetime, communication_id int, Customer_ID int)

    ---Populate Test Data--

    insert

    test_donations

    select

    getdate()-1 , 15,1 union all

    select

    getdate()-2 , 14,1 union all

    select

    getdate()-3 , 13,1 union all

    select

    getdate()-4 , 12,2 union all

    select

    getdate()-520 , 11,2 ----Earliest date so query should return 11

    insert

    test_communications

    select

    getdate()-6 , 10,3 union all

    select

    getdate()-123 , 9,3 union all

    select

    getdate()-8 , 8,3 union all

    select

    getdate()-9 , 7,4 union all

    select

    getdate()-10 ,6,4

    insert

    test_sales

    select

    getdate()-11 , 5,4 union all

    select

    getdate()-12 , 4,4 union all

    select

    getdate()-13 , 3,5 union all

    select

    getdate()-14 , 2,5 union all

    select

    getdate()-15 , 1,5

    --query returning communication id with earliest transaction date--

    select

    stuff

    (min(convert(varchar(20),transaction_date, 20)+ cast(communication_id as varchar(10))),1,19,'')EarliestCommunicationID,

    max

    (Customer_ID) CustomerID

    from

    (

    select

    transaction_date, communication_id, Customer_ID

    from

    test_donations

    union

    all

    select

    transaction_date, communication_id, Customer_ID

    from

    test_communications

    union

    all

    select

    transaction_date, communication_id, Customer_ID

    from

    test_sales

    )

    x

    group

    by Customer_ID

     

    www.sql-library.com[/url]

  • Thanks, I think this is the way to do it. I can also add in a code for the tables and so prioritise those in a certain order.

    I'll have to see how long it takes to run, though I was intending to run it overnight and refresh a table with the results.

    It's a pity the 'top 1' construct could be used because that was (more) neat!

    Anyway, thanks a lot.

  • 'so prioritise those in a certain order.' Not sure what you mean by this. This code will run fast. Much much faster than a cursor.

    www.sql-library.com[/url]

  • it takes about 50 seconds to run which isn't too bad.

    what I meant was that I can prioritise one table over another id tey both have entries for the same date and time (the time is often 00:00:00).

    Cheers.

Viewing 13 posts - 1 through 12 (of 12 total)

You must be logged in to reply to this topic. Login to reply