Comparing one record within a SELECT SUM()

Question

Post reply

Comparing one record within a SELECT SUM()

Pete Bishop

Default port

Points: 1493

July 25, 2025 at 1:06 pm

#4627013

If I had data of the form:

rn  value  processed
1   10.0   N
2   12.2   N
3   8.4    Y 
4   5.9    N
5   3.1    N

Would there be a simple way to select SUM([value]) but *also* set a flag if the [processed] flag = 'Y' on the [rn] = 1 row? For example:

SELECT * FROM [table1]
rn  value  processed
1   10.0   N -- Note the first record is N
2   12.2   N
3   8.4    Y 
4   5.9    N
5   3.1    N

SELECT SUM([value]) AS [value], {some magic code} FROM [table1]
value  is_processed
39.6   N


SELECT * FROM [table1]
rn  value  processed
1   10.0   Y -- Note the record is now Y
2   12.2   N
3   8.4    N 
4   5.9    N
5   3.1    N

SELECT SUM([value]) AS [value], {some magic code} FROM [table1]
value  is_processed
39.6   Y

I know I could put an embedded:

(SELECT [processed] FROM [table1] WHERE [rn] = 1)

but given a large table, this could be expensive.

Thanks.

This topic was modified 4 months ago by Pete Bishop.

Viewing 9 posts - 1 through 9 (of 9 total)

You must be logged in to reply to this topic. Login to reply

Mr. Brian Gale SSC-Insane Points: 24979 More actions · Answer 1

As far as I am aware, there is no way to update a record AND select in a single query, but you could do it all in one transaction or you could toss it into a stored procedure instead.

As for large table and that embedded SQL being expensive, if you have an index on the rn column, the lookup to see where rn=1 would be VERY cheap even for large tables.

The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Pete Bishop Default port Points: 1493 More actions · Answer 2

Hi Brian; thanks for the response.

Unfortunately, the [rn] is a pseudo-column from the ROW_NUMBER() window function so I'd have to store the output into a temporary table in order to create an index. Probably not worth it to "improve" performance 🙂

Ken McKelvey SSCoach Points: 18977 More actions · Answer 3

/* *** Test Data *** */
CREATE TABLE #t
(
    rn int NOT NULL
        PRIMARY KEY
    ,[value] decimal(5,2) NOT NULL
    ,processed char(1) NOT NULL
);
INSERT INTO #t
VALUES (1, 10.0, 'N')
    ,(2, 12.2, 'N')
    ,(3, 8.4, 'Y')
    ,(4, 5.9, 'N')
    ,(5, 3.1, 'N');
/* *** End Test Data *** */

SELECT rn, [value], processed
    ,FIRST_VALUE(processed) OVER (ORDER BY rn)
FROM #t
ORDER BY rn;

update #t set processed = 'Y' where rn = 1;

SELECT rn, [value], processed
    ,FIRST_VALUE(processed) OVER (ORDER BY rn)
FROM #t
ORDER BY rn;

Mr. Brian Gale SSC-Insane Points: 24979 More actions · Answer 4

Pete Bishop wrote:

Hi Brian; thanks for the response.
Unfortunately, the [rn] is a pseudo-column from the ROW_NUMBER() window function so I'd have to store the output into a temporary table in order to create an index. Probably not worth it to "improve" performance 🙂

If rn is a windowing function, I imagine you have some way to order your rows - just break out that logic. Even store it in a temporary variable. For example, if you have some key that identifies a row as unique and lets say for arguments sake it is a GUID. Your function to get the rn, you run that select statement except you store the GUID for where your logic produces rn of 1. Then your update happens on the row where the GUID matches.

IF you have no way of ordering your data except by this calculated rn column, you are going to have a hard time because SQL doesn't have any order to your data when it returns it unless explicitly asked for. If you don't put an order by clause in your query, the order of your resulting data is not guaranteed.

So whatever logic you are using to get rn=1, you should be able to produce similar logic to store that row identifier (primary key) into a variable or multiple variables if your PK uses multiple columns to ensure uniqueness.

The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.

Pete Bishop Default port Points: 1493 More actions · Answer 5

Hi Ken.

Thanks for this but I don't see how it includes the SUM([value]) in the SELECT statement.

Pete

Ken McKelvey SSCoach Points: 18977 More actions · Answer 6

Pete Bishop wrote:

Hi Ken.
Thanks for this but I don't see how it includes the SUM([value]) in the SELECT statement.
Pete

Your question is unclear so I showed you how to use FIRST_VALUE and left you to add the SUM.

Jeff Moden SSC Guru Points: 1004666 More actions · Answer 7

Pete Bishop wrote:

SELECT SUM([value]) AS [value], {some magic code} FROM [table1]
value  is_processed
39.6   Y

--===== Because 'Y' is greater than 'N', this works.
 SELECT  value     = SUM(value)
        ,processed = MAX(processed) --"Some Magic Code"
   FROM #table1
;

--Jeff Moden

RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
First step towards the paradigm shift of writing Set Based code:
________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

Change is inevitable... Change for the better is not.

Helpful Links:
How to post code problems
How to Post Performance Problems
Create a Tally Function (fnTally)

Mr. Brian Gale SSC-Insane Points: 24979 More actions · Answer 8

I'd also like to add if rn, is_processed, and processed are all frequently used, I'd remove the "magic code" and persist those values to disk; even if they are calculated columns. Persist them to disk and then you can index them and use them in a more useful manner.

Just my 2 cents, but that's what I'd do. If the data needs to be calculated out each time and is semi-complicated, persist it to disk and index it for better performance and to reduce the chance of typos when calculating it if you can.

The above is all just my opinion on what you should do.
As with all advice you find on a random internet forum - you shouldn't blindly follow it. Always test on a test server to see if there is negative side effects before making changes to live!
I recommend you NEVER run "random code" you found online on any system you care about UNLESS you understand and can verify the code OR you don't care if the code trashes your system.