How to obtain first, but not empty value?

  • Hello,

    I have a table of travel booking details (over several millions of records). Some of the records may have passenger's name, but some - may not, e.g. if there is an extra charge for a greeting upon arrival. I need to summarise the costs of the booking as well as to provide a first passenger's name, travelling under this booking. In the example below I need to show VERITY/SMITH MRS. When I use FIRST_VALUE it obviously shows THOMAS/SMITH MR. How can I achieve customer's request, please? I have seen some solutions, where I shall JOIN the table to itself, but I wonder if this is a right way to go with millions of records, even if there is a correct index in place.

    Much appreciated!

    IF OBJECT_ID('tempdb..##invoice') IS NOT NULL
    DROP TABLE ##invoice;

    WITH invoice AS(
    SELECT 1 AS invoice_number,
       '' AS passenger_name,
         10     AS fare,
         1     AS commission,
         2     AS gst
    UNION ALL
    SELECT 1,
       'VERITY/SMITH MRS',
             100,
             8,
             20
    UNION ALL
    SELECT 1,
       'THOMAS/SMITH MR',
             98,
             7,
             10         
    UNION ALL
    SELECT 1,
       '',
             4,
             0,
             0
             )

    SELECT *
    INTO ##invoice
    FROM invoice;

    SELECT MIN(first_passenger) AS first_passenger,
       SUM (fare)    AS total_fare,
         SUM(commission)  AS total_comission,
         SUM(gst)             AS total_gst

    FROM (
      SELECT FIRST_VALUE(passenger_name) OVER (PARTITION BY invoice_number ORDER BY CASE passenger_name WHEN '' THEN 'Z' ELSE passenger_name END) AS first_passenger,
             fare,
             commission,
             gst,
             invoice_number
        FROM ##invoice) i
    GROUP BY i.invoice_number

  • SQL Server does not guarantee an order unless you specify an order.  Which field(s) are you using to specify your order?

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • How do you know that 'VERITY/SMITH MRS' should be the correct value returned?  There is nothing in the data that will tell you that. Just because 'VERITY/SMITH MRS' is entered first really doesn't mean anything as tables are unordered sets of data.  You apply order by using the ORDER BY, as you did in your query which is why 'THOMAS/SMITH MR' is the value returned.

  • I can also add a line to it, but then it will always return the first, empty record as from the example below, no?

    IF OBJECT_ID('tempdb..##invoice') IS NOT NULL
    DROP TABLE ##invoice;

    WITH invoice AS(
    SELECT 1 AS invoice_number,
       1 AS line,
       '' AS passenger_name,
         10     AS fare,
         1     AS commission,
         2     AS gst
    UNION ALL
    SELECT 1,
       2,
       'VERITY/SMITH MRS',
             100,
             8,
             20
    UNION ALL
    SELECT 1,3,
       'THOMAS/SMITH MR',
             98,
             7,
             10         
    UNION ALL
    SELECT 1,4,
       '',
             4,
             0,
             0
             )

    SELECT *
    INTO ##invoice
    FROM invoice;

    SELECT MIN(first_passenger) AS first_passenger,
       SUM (fare)    AS total_fare,
         SUM(commission)  AS total_comission,
         SUM(gst)             AS total_gst

    FROM (
      SELECT FIRST_VALUE(passenger_name) OVER (PARTITION BY invoice_number ORDER BY line, CASE passenger_name WHEN '' THEN 'Z' ELSE passenger_name END) AS first_passenger,
             fare,
             commission,
             gst,
             invoice_number
        FROM ##invoice) i
    GROUP BY i.invoice_number

  • It might just be me, but I think you are making this more complicated than it needs to be.

    The way I would do it would be to have your select something like:
    SELECT TOP 1 passenger_name, fare, commission, gst
    FROM ##invoice
    WHERE passenger_name not like ''
    ORDER BY line

    At least, by using your latest example bit of code.  Without that "line" value, I don't see how to determine the order.

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • bmg002 - Tuesday, July 18, 2017 2:35 PM

    It might just be me, but I think you are making this more complicated than it needs to be.

    The way I would do it would be to have your select something like:
    SELECT TOP 1 passenger_name, fare, commission, gst
    FROM ##invoice
    WHERE passenger_name not like ''
    ORDER BY line

    At least, by using your latest example bit of code.  Without that "line" value, I don't see how to determine the order.

    I need to sum up fare, commission and gst as well, using TOP 1 will not help, I think.

  • Ok, if you need to sum those up, you would want something more like:
    SELECT TOP 1 passenger_name, fare + commission + gst AS total_cost
    FROM ##invoice
    WHERE passenger_name not like ''
    ORDER BY line


    no?  Or do you mean you need to get the sum of all of the fares, commissions and gst's for a particular invoice number?  Something like:
    SELECT TOP 1 passenger_name,
    SUM(fare) OVER (PARTITION BY invoice_number),
    SUM(commission) OVER (PARTITION BY invoice_number),
    SUM(gst) OVER (PARTITION BY invoice_number)
    FROM ##invoice
    WHERE passenger_name not like ''
    ORDER BY line

    The above is all just my opinion on what you should do. 
    As with all advice you find on a random internet forum - you shouldn't blindly follow it.  Always test on a test server to see if there is negative side effects before making changes to live!
    I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

  • You can fix this by changing the ORDER BY in your FIRST_VALUE calculation.
    Your first order field should be whether there is a name or not, and then by the line number.

    SELECT MIN(first_passenger) AS first_passenger,
    SUM (fare) AS total_fare,
    SUM(commission) AS total_comission,
    SUM(gst) AS total_gst

    FROM (
    SELECT FIRST_VALUE(passenger_name) OVER (PARTITION BY invoice_number ORDER BY CASE passenger_name WHEN '' THEN 1 ELSE 0 END, line) AS first_passenger,
    fare,
    commission,
    gst,
    invoice_number
    FROM ##invoice) i
    GROUP BY i.invoice_number

    The following gives the expected results.

    Drew

    J. Drew Allen
    Business Intelligence Analyst
    Philadelphia, PA

  • how about something like this?
    WITH invoice_sums AS
    (SELECT invoice_number, passenger_name,
        SUM(fare) OVER (PARTITION BY invoice_number) AS total_fare,
        SUM(commission) OVER (PARTITION BY invoice_number) AS total_commission,
        SUM(gst) OVER (PARTITION BY invoice_number) AS total_gst,
        ROW_NUMBER() OVER (PARTITION BY invoice_number ORDER BY CASE WHEN passenger_name = '' THEN 2 ELSE 1 END, line) AS passenger
      FROM ##invoice)
    SELECT invoice_number, passenger_name, total_fare, total_commission, total_gst
      FROM invoice_sums
      WHERE passenger = 1;

  • I'm guessing what is desired is for each invoice number, return the sum of each of the fare, commission, and gst columns, along with the "first" non-blank passenger_name associated with that invoice.

    That's not so tricky, but we do need a way to determine what passenger_name should be "first"; if the arbitrary line column introduced earlier is an acceptable way to measure who is first, then one of the solutions above should be fine.

    Cheers!

  • Drew / Chris

    Thank you both. I see now the solution was more than obvious. Massive thanks!

Viewing 11 posts - 1 through 10 (of 10 total)

You must be logged in to reply to this topic. Login to reply