Issue with max() and aggregate, missunderstanding

  • frederico_fonseca

    SSChampion

    Points: 14173

    Sorry if my previous reply sounded a bit harsh.

    Had a chance to review your replies and now I think the following will do what you need.

    two options - both should yield same result but performance may vary. testing on your side required and if either/both are slow we will need actual execution explain plan files so we can have a look at them.

    Note that I have changed your table references to use alias - this makes your code clearer and is the approach you should follow when coding.

    I also removed the order by - this should not be used on the view and should instead be done as part of the select from the view you create e.g. select ... from myview mv order by mv.Priority desc, mv.PromisedDeliveryDate

    With a outer apply and order by - subquery will return the most recent record for a given DocumentNo/JobNo

    create view myview
    as
    select soor.Priority
    , soor.PromisedDeliveryDate
    , soor.DocumentNo
    , slca.CustomerAccountNumber
    , soor.CustomerDocumentNo
    , soor.RequestedDeliveryDate
    , soor.AnalysisCode3
    , soor.InvoiceCreditStatusID
    , slca.AccountIsOnHold
    , soor.CancelledStatusID
    , soor.DocumentStatusID
    , slca.CustomerAccountName
    , soor.DocumentTypeID
    , soor.DespatchReceiptStatusID
    , sc.Dept
    , sc.ScanDate
    from dbo.SOPOrderReturn soor
    inner join dbo.SLCustomerAccount slca
    on soor.CustomerID = slca.SLCustomerAccountID
    outer apply (select top 1 sc.Dept
    , sc.ScanDate
    from dbo.Scans sc
    where soor.DocumentNo = sc.JobNo
    order by sc.ScanDate desc
    ) sc
    where (soor.InvoiceCreditStatusID = 0)
    and (not (soor.DocumentStatusID = '2'))
    and (soor.DocumentTypeID = '0'
    or soor.DocumentTypeID = '2')
    and (not (soor.DespatchReceiptStatusID = 2))

    With a sub-query with a row_number - records are assigned a sequence number which is reset per each group based on "Partition By" clause ordered by ScanDate descending.

    For each "Partition By" set of columns there will be a unique number 1 which is the one we are interested e.g. the most recent scan for those columns (JobNo on this case)

    create view myview
    as
    select soor.Priority
    , soor.PromisedDeliveryDate
    , soor.DocumentNo
    , slca.CustomerAccountNumber
    , soor.CustomerDocumentNo
    , soor.RequestedDeliveryDate
    , soor.AnalysisCode3
    , soor.InvoiceCreditStatusID
    , slca.AccountIsOnHold
    , soor.CancelledStatusID
    , soor.DocumentStatusID
    , slca.CustomerAccountName
    , soor.DocumentTypeID
    , soor.DespatchReceiptStatusID
    , sc.Dept
    , sc.ScanDate
    from dbo.SOPOrderReturn soor
    inner join dbo.SLCustomerAccount slca
    on soor.CustomerID = slca.SLCustomerAccountID
    left outer join (select sc.JobNo
    , sc.Dept
    , sc.ScanDate
    , row_number() over (partition by sc.JobNo
    order by sc.ScanDate desc
    ) rownum
    from dbo.Scans sc
    ) sc
    on soor.DocumentNo = sc.JobNo
    and sc.rownum = 1
    where (soor.InvoiceCreditStatusID = 0)
    and (not (soor.DocumentStatusID = '2'))
    and (soor.DocumentTypeID = '0'
    or soor.DocumentTypeID = '2')
    and (not (soor.DespatchReceiptStatusID = 2))
  • insomnius

    Old Hand

    Points: 335

    Hi Frederico, I did type a nice reply here but it seems to have vanished...

    I just wanted to thank you for posting above, and I do appreciate that a simple mis-read post can make one thing look entirely like another. No worries from me, I am very appreciative of all help wherever or however it arrives!

    I understand why you have re-referenced/aliased the columns, it makes sense even to me, thank you for the pointer there. I have also removed the ORDER BY from anything I already have applied, again, adding ORDER to the SELECT after creating a view makes sense.

    Out of curiosity, what are: execution explain plan files ?

  • insomnius

    Old Hand

    Points: 335

    I had my Eureka moment!!! I think I understand how your first set of code works 🙂

  • jcelko212 32090

    SSCrazy Eights

    Points: 8855

    Could you please start posting talents and venting your own language? Now we have to rewrite everything. Why do you think that the proprietary identity table property could ever be a relational key? Remember that the key by definition is a subset of the columns of a table that uniquely identify a row. It has nothing to do with the physical insertion order on a physical disk on one machine. Identity is basically the old UNIX sequential file record number that Sybase used implement their first versions of the product so many decades ago.

    Also, while it's not technically wrong, why do you have varying length columns? They make printing paper forms or screens a pain in the butt. If you look at the ISO standards find very very very few of them are varying length varying length columns?

    CREATE TABLE Scans

    (job_nbr CHAR(10) NOT NULL

    CHECK (job_nbr LIKE '[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]'),

    dept_name VARCHAR(20) NOT NULL,

    scan_timestamp DATETIME2(0) DEFAULT CURRENT_TIMESTAMP NOT NULL,

    PRIMARY KEY (job_nbr, dept_name, scan_timestamp));

    INSERT INTO Scans

    VALUES

    ('0000115717', 'Office', '2019-08-24 22:51:00),

    ('0000115717', 'Office', '2019-08-24 22:52:00'),

    ('0000115717', 'CAD', '2019-08-25 00:13:44'),

    ('0000115717', 'Despatch', '2019-08-26 16:30:26'),

    ('0000115717', 'Despatch', '2019-08-26 16:31:29'),

    ('0000115847', 'CAD', '2019-08-25 10:44:14'),

    ('0000115847', 'Laser', '2019-08-25 13:09:47'),

    ('0000115847', 'Eyeletting', '2019-08-26 16:36:09'),

    ('0000115847', 'Spirals', '2019-08-26 16:40:05'),

    ('0000119378', 'Machining', '2019-08-25 14:57:54'),

    ('0000126563', 'Office', '2019-08-25 00:12:30'),

    ('0000127652', 'Inspection', '2019-08-22 16:06:49'),

    ('0000127754', 'Office', '2019-08-22 15:50:51'),

    ('0000127754', 'Office', '2019-08-22 15:58:35');

    WITH Last_Scan

    AS

    (SELECT job_nbr, dept_name,

    MAX(scan_timestamp) OVER (PARTITION BY job_nbr, dept_name) AS last_scan_timestamp

    FROM Scans)

    SELECT S.job_nbr, S.dept_name, S.scan_timestamp

    FROM Scans AS S, Last_Scan AS LS

    WHERE S.scan_timestamp = LS.last_scan_timestamp;

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • ScottPletcher

    SSC Guru

    Points: 98123

    I recommend one simplification to make your SQL coding and maintenance easier: use "*" in the inner query using ROW_NUMBER().  SQL will still only actually retrieve the data that is required by the outer query; you can confirm this by looking at the query plan.  The difference is that when you add a column to the outer query, you don't have to make any change to the inner query.

    left outer join (select *
    , row_number() over (partition by sc.JobNo
    order by sc.ScanDate desc
    ) rownum
    from dbo.Scans sc
    ) sc
    on soor.DocumentNo = sc.JobNo
    and sc.rownum = 1

    SQL DBA,SQL Server MVP(07, 08, 09) Prosecutor James Blackburn, in closing argument in the Fatal Vision murders trial: If in the future, you should cry a tear, cry one for them [the murder victims]. If in the future, you should say a prayer, say one for them. And if in the future, you should light a candle, light one for them.

Viewing 5 posts - 16 through 20 (of 20 total)

You must be logged in to reply to this topic. Login to reply