
Revisiting SQL Server Window Functions: A Practical Walkthrough


Introduction

Sometimes we just want SQL to be a bit smarter. Maybe we need to figure out who’s performing best in a team, find someone’s very first sale, or compare today’s numbers with yesterday’s. We might want to group customers into spending tiers, spot sudden jumps or drops in sales, or keep a running total without writing complex loops. We could write long queries with subselects, joins, and temp tables… but there’s a much cleaner way.

The common thread in all these problems is that we are not just looking at one row — we're looking at it in relation to others. That's exactly what a certain family of SQL features, called window functions, is designed for. They let you keep every row intact while still showing you how it stacks up against its peers, what came before, what's coming next, or how values accumulate over time.

This isn’t just for advanced analytics — it’s something we’ll use in everyday reporting too. Imagine needing to pull the top three sales for each department, calculate the percentage change from the previous month, highlight the first order from each new customer, or split employees into performance bands for a bonus calculation. These are all situations where window functions give you clearer logic, shorter SQL, and results that are easier to verify.

In this article, we’ll work with one simple table for all examples so you can clearly see how each technique behaves. Each section starts with a small problem you might actually run into at work, followed by the query that solves it. By the end, you’ll know not just how each window function works, but when it’s the right tool for the job.

Build a Consistent Demo Table

To understand each window function in action, we’ll use a single sample table throughout this article. This will help you see how the functions behave across different data patterns—like duplicate values, time-based events, and groupings.

Let’s call the table SalesLog. It represents employees from different departments recording their daily sales over a few days. The structure is simple: a log ID (auto-incremented), employee name, department, the sale amount, and the date of the log.

Here’s the script to create and populate it:

DROP TABLE IF EXISTS dbo.SalesLog;
GO

CREATE TABLE dbo.SalesLog
(
    LogID INT IDENTITY(1,1) PRIMARY KEY,
    EmployeeName VARCHAR(100),
    Department VARCHAR(100),
    SaleAmount INT,
    LogDate DATE
);

INSERT INTO dbo.SalesLog (EmployeeName, Department, SaleAmount, LogDate)
VALUES
-- Electronics Department
('Alice',    'Electronics', 2000, '2024-01-01'),
('Bob',      'Electronics', 2000, '2024-01-01'),
('Arav',     'Electronics', 1500, '2024-01-02'),
('David',    'Electronics', 1000, '2024-01-02'),
('Charley',  'Electronics', 3000, '2024-01-01'),
('Bjorn',    'Electronics', 3000, '2024-01-02'),
('L Chan',   'Electronics', 1800, '2024-01-03'),
('DK Bret',  'Electronics', 1800, '2024-01-03'),
('Eve',      'Electronics', 1500, '2024-01-03'),
('Eley',     'Electronics', 1200, '2024-01-04'),
('Sam',      'Electronics', 1200, '2024-01-04'),
('Sandy',    'Electronics', 1000, '2024-01-04'),

-- Clothing Department
('Pankaj',   'Clothing',    1200, '2024-01-01'),
('Danice',   'Clothing',    1200, '2024-01-01'),
('Darvin',   'Clothing',    1500, '2024-01-01'),
('Faisal',   'Clothing',     900, '2024-01-02'),
('Nancy',    'Clothing',     900, '2024-01-02'),
('Mira',     'Clothing',    1500, '2024-01-02'),
('Marry',    'Clothing',    1000, '2024-01-03'),
('Nguan',    'Clothing',    1100, '2024-01-03'),
('Ernest',   'Clothing',    1300, '2024-01-03'),
('Jagdeesh', 'Clothing',    1000, '2024-01-04'),
('Sulabh',   'Clothing',     800, '2024-01-04'),
('Chandan',  'Clothing',    1300, '2024-01-04'),
('Ajay',     'Clothing',    1500, '2024-01-05');

We have set up our data with two departments, deliberately duplicated sale amounts (so we can see how ties are handled), and a variety of employee names across different dates.

As we work through each function, I will also point out how things might look in a real-world system. In our demo table, each employee appears only once per day. That keeps the examples clean and makes it easier to focus on the function itself. But in a real sales database, the same salesperson could log many sales in a single day or dozens over the month. In those cases, you would usually aggregate the data first — for example, total sales per employee per day — and then apply the window functions. That way you are ranking, comparing, or tracking trends on meaningful totals, not on every small transaction.
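For example, here's a minimal sketch of that "aggregate first, then window" pattern. Against our demo table it changes nothing (each employee already has at most one row per day), but against a real transaction log it would rank meaningful daily totals. It uses ROW_NUMBER(), which we'll meet properly in a moment:

WITH DailyTotals
AS (SELECT EmployeeName,
           Department,
           LogDate,
           SUM(SaleAmount) AS DailyTotal -- roll the log up to one row per employee per day
    FROM dbo.SalesLog
    GROUP BY EmployeeName, Department, LogDate)
SELECT *,
       ROW_NUMBER() OVER (PARTITION BY Department, LogDate ORDER BY DailyTotal DESC) AS DailyPosition
FROM DailyTotals;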

This setup gives us a nice balance: simple enough to follow, but varied enough to show exactly how each function behaves — whether we are ranking people, comparing performance over time, or calculating running totals. From here on, every example will use this same SalesLog table so we can stay focused on learning the function logic without constantly switching datasets.

Explore the Full Window Function Toolbox

With our SalesLog table set up, we can finally get to the fun part — seeing what these functions can actually do. If you’ve never used them before, the big idea is simple: a window function lets you do calculations while still keeping every individual row. You’re not collapsing data down the way you would with GROUP BY; you’re adding an extra layer of insight on top of the existing detail.

This is powerful because most real-world questions aren’t just “What’s the total?” or “What’s the average?” — they’re “How does this row compare to others in the same group?”, “What came right before this?” or “How is the total changing over time?”. That’s exactly the kind of question window functions are built for.

The magic happens inside the OVER() clause. That’s where you tell SQL how big your “window” should be and how it should be arranged. You can split the data into groups with PARTITION BY, decide the order inside each group with ORDER BY, and (when it matters) control exactly which rows are in view with a frame like ROWS BETWEEN …. Small changes here can completely change the answer you get, so we’ll call those out when they matter.
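To make that concrete, here's a small sketch showing how the same SUM() gives three different answers depending only on what we put inside OVER():

SELECT EmployeeName, Department, LogDate, SaleAmount,
       SUM(SaleAmount) OVER () AS GrandTotal, -- one window over all rows
       SUM(SaleAmount) OVER (PARTITION BY Department) AS DeptTotal, -- one window per department
       SUM(SaleAmount) OVER (PARTITION BY Department
                             ORDER BY LogDate
                             ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS RunningDeptTotal -- running total inside each department
FROM dbo.SalesLog;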

Over the next few sections, we’ll tackle one function at a time. You’ll see:

  • How to give every row a position inside its group
  • How to spot changes between one row and the next
  • How to pull out the very first or very last value in a group
  • How to break a set of rows into equal-sized segments
  • How to keep a running total or rolling average over time

Every example will be based on our same SalesLog dataset so you can focus on the function itself without adjusting to a new table every time. We’ll start each one with a real problem you might hit at work, then walk through how the function solves it, and finally explain why it works that way. By the time you’ve been through all of them, you’ll have a mental checklist of “Oh yeah, I can use this here” moments for your own queries.

ROW_NUMBER()

Imagine a school sports day with many races. ROW_NUMBER() is like handing out positions to the runners. ORDER BY RaceTime ASC ensures the fastest runner (lowest time) gets RowNum = 1. And PARTITION BY Department is like saying ‘don’t mix all races together, give positions separately for each race.’ So the Electronics department has its own 1st, 2nd, 3rd, and Clothing has its own too.

Let’s take one more real-world example. Imagine you’ve got a list of sales and you just want to give each row a number. Not a magic rank, not some complicated score, but literally just “this is row 1, this is row 2, this is row 3…” based on the order you care about. That’s what ROW_NUMBER() does. The cool part is, you get to tell SQL when to start counting over and how to sort before numbering. That’s what the OVER() bit is for.

  • PARTITION BY Department → “start counting fresh for every department.”
  • ORDER BY SaleAmount DESC → “give number 1 to the biggest sale, 2 to the next biggest, and so on.”

Example:

SELECT *,
    ROW_NUMBER() OVER(PARTITION BY Department ORDER BY SaleAmount DESC) AS RowNum
FROM dbo.SalesLog;

Output:

When we run the query with ROW_NUMBER() OVER(PARTITION BY Department ORDER BY SaleAmount DESC), SQL gives every row a running number inside its department. In the Clothing department, Darvin, Mira, and Ajay all have sales of 1500, so they take RowNum 1, 2, and 3 — but notice the order is arbitrary since SQL doesn’t know how to break ties unless we give it another column in the ORDER BY. After them, Chandan and Ernest both have 1300 and get RowNum 4 and 5, and the numbering continues downward. The same happens in Electronics: Charley and Bjorn both sold 3000, so they get RowNum 1 and 2, Alice and Bob both sold 2000 and take RowNum 3 and 4, and the rest follow in order. The key point is that every row gets a number, the count restarts for each department, and ties don’t share the same RowNum — they just get split up arbitrarily unless we add more sorting logic.
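If you need the numbering to be repeatable from run to run, the usual fix is to add a unique column as the final tiebreaker. A minimal sketch, using LogID since it's unique in our table:

SELECT *,
    ROW_NUMBER() OVER(PARTITION BY Department ORDER BY SaleAmount DESC, LogID) AS RowNum
FROM dbo.SalesLog;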

Use Case: say you want the “top 3 sales per department” — just wrap this in a subquery and filter where RowNum <= 3. Done. No loops, no messy joins.

Let’s elaborate on it in more detail. Here’s how ROW_NUMBER() makes it easy. First, we generate row numbers inside each department:

SELECT *,
       ROW_NUMBER() OVER (PARTITION BY Department ORDER BY SaleAmount DESC) AS RowNum
FROM dbo.SalesLog;

This gives every sale a RowNum, starting at 1 for the biggest sale in each department. Now we just wrap this in a subquery and filter where RowNum <= 3:

WITH RankedSales
AS (SELECT *,
           ROW_NUMBER() OVER (PARTITION BY Department ORDER BY SaleAmount DESC) AS RowNum
    FROM dbo.SalesLog)
SELECT Department,
       EmployeeName,
       SaleAmount,
       LogDate,
       RowNum
FROM RankedSales
WHERE RowNum <= 3
ORDER BY Department,
         RowNum;

For further details, refer to Microsoft Docs – ROW_NUMBER (Transact-SQL).

RANK()

RANK() is like ROW_NUMBER()’s more fair-minded cousin. It still gives positions, but if two people tie, they both get the same spot — and the next spot gets skipped.

Picture this: in a department, two salespeople both sold $2,000 — that’s the top sale. With RANK(), they both get position 1. But because those two people “took” the first place, the next salesperson jumps straight to position 3. That’s why you sometimes see gaps in the numbering.

SELECT *,
    RANK() OVER(PARTITION BY Department ORDER BY SaleAmount DESC) AS RankVal
FROM dbo.SalesLog;

Let’s break it down:

  • PARTITION BY Department → we restart ranking inside each department. So Clothing and Electronics each get their own rank list.
  • ORDER BY SaleAmount DESC → within each department, the largest SaleAmount gets rank 1, the next highest gets rank 2, and so on.
  • RANK() → if two rows have the same SaleAmount, they share the same rank. But the next rank is skipped to make room for the tie.

Output:

In the Clothing department, notice how Darvin, Mira, and Ajay all sold 1500. With RANK(), they all share the same rank = 1. Because three rows tied for first place, the next rank available is 4. That’s why Chandan and Ernest, who both sold 1300, are ranked 4 instead of 2. After them, Pankaj and Danice sold 1200, so they tie at rank 6. The numbering keeps skipping like this: Nguan (1100) gets 8, Jagdeesh and Marry (1000) both get 9, Faisal and Nancy (900) both get 11, and Sulabh (800) ends up at 13. You can see the “holes” in the sequence where ranks 2, 3, 5, 7, 10, and 12 are missing — that’s the hallmark of RANK().

In the Electronics department, the same pattern holds. Charley and Bjorn both sold 3000, so they tie at rank 1. Because two rows are ranked 1, the next rank jumps to 3. That’s why Alice and Bob (2000 sales) both show up as rank 3, and L Chan and DK Bret (1800 sales) tie at rank 5. Again, there’s no rank 2 or 4, because those numbers got skipped due to ties above.

The key point: RANK() respects ties by giving them the same rank, but it leaves gaps afterward. This makes sense in real-world scenarios like competitions, where if three players tie for gold, the next medal is bronze, not silver.

Here Darvin, Mira, and Ajay all receive first position because they made equal sales, and similarly Chandan and Ernest hold 4th position, but nobody holds 2nd, 3rd, or 5th. These kinds of holes appear in real-world ranking and analytics, and we will solve that issue with the DENSE_RANK() window function.

Use Case: when we want the positions to reflect actual standings. If two people truly share first place, they both deserve to be called “1st” — and whoever comes next is 3rd, not 2nd. Perfect for leaderboards, contests, or any “real competition” style ranking.
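One practical consequence worth a quick sketch: filtering on the rank keeps every tied row, so a “top 3” can return more than three people per department:

WITH RankedSales
AS (SELECT *,
           RANK() OVER (PARTITION BY Department ORDER BY SaleAmount DESC) AS RankVal
    FROM dbo.SalesLog)
SELECT Department,
       EmployeeName,
       SaleAmount,
       RankVal
FROM RankedSales
WHERE RankVal <= 3
ORDER BY Department,
         RankVal;

With our data, Clothing returns only the three employees tied at rank 1 (the next rank is already 4), while Electronics returns the two rank-1 and two rank-3 rows.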

For further details, refer to Microsoft Docs – RANK (Transact-SQL).

DENSE_RANK()

The real strength of DENSE_RANK() is that it makes categories out of your data without confusing gaps. Where ROW_NUMBER() forces every row into a unique slot, and RANK() leaves holes after ties, DENSE_RANK() keeps the sequence tight.

Why does that matter? Because many business questions aren’t about who is “first, second, third” like in a race — they’re about banding people into performance levels or tiers.

For example: In the Clothing department, Darvin, Mira, and Ajay all sold 1500. DENSE_RANK() groups them together as “Rank 1” — they’re the top performers. Chandan and Ernest, with 1300 sales, become “Rank 2” — the second tier. Pankaj and Danice, with 1200 sales, fall into “Rank 3” — the third tier.

Now you can easily answer questions like:

  • “How many employees are in the top 3 sales tiers this month?” → just count rows with DenseRankVal <= 3.
  • “Which departments have at least 5 people in tier 1 or 2?” → filter ranks and group by department.

This extra analytic value comes from the categorization — you’re no longer worried about gaps or arbitrary row numbers, you’re treating tied values as the same “band.”

So where RANK() is useful for competitions (like gold/silver/bronze medals), DENSE_RANK() is perfect for analytics and reporting where you care about tiers of performance instead of literal places.

The next query simply goes through the sales table and adds a new column that shows the rank of each row within its department. Because we used PARTITION BY Department, the ranking starts fresh for every department, so Electronics has its own ranks and Clothing has its own. Inside each department, the rows are sorted by the sales amount from highest to lowest, and that’s what decides who gets rank 1, 2, 3, and so on. The special thing about DENSE_RANK() is how it handles ties. If a few people sold the same amount, they all get the same rank, but the next rank continues right after without skipping. So if three employees all tie for first place, they’re all rank 1, and the next person is rank 2, not rank 4. It’s a neat way of grouping people into clear performance levels without leaving any confusing gaps.

SELECT *,
    DENSE_RANK() OVER(PARTITION BY Department ORDER BY SaleAmount DESC) AS DenseRankVal
FROM dbo.SalesLog;

Output:

In the Clothing department, Darvin, Mira, and Ajay all sold the same amount, 1500, so they each get rank 1. The next sales amount down is 1300, so Chandan and Ernest both get rank 2. After that, Pankaj and Danice with 1200 are ranked 3, Nguan with 1100 is ranked 4, Jagdeesh and Marry with 1000 are ranked 5, Faisal and Nancy with 900 are ranked 6, and Sulabh with 800 is ranked 7. The key thing here is that the numbers go 1, 2, 3, 4, 5, 6, 7 without skipping anything.

In the Electronics department, the same pattern happens. Charley and Bjorn sold 3000, so they’re both ranked 1. Alice and Bob sold 2000, so they’re both ranked 2. Then L Chan and DK Bret sold 1800, so they’re both ranked 3. Again, the numbers are tight and clean — 1, 2, 3.

So the point is: DENSE_RANK() groups people with the same result together, but it doesn’t leave empty gaps like RANK() does. It’s a very tidy way to create sales tiers or performance bands when you want everything to line up in order.

Even if two employees tie for the highest sale, the next rank is 2 instead of skipping to 3. That’s why DENSE_RANK() is great for keeping everyone happy — no one feels their position is being unfairly pushed down.

Use Case:

The real value of DENSE_RANK() shows up when you need to group people into tiers without gaps getting in the way. Think about a sales leaderboard: if three employees all sold the same highest amount, they each get rank 1. With RANK(), the next rank would jump to 4, which makes it harder to work with. With DENSE_RANK(), the next distinct sale amount is simply rank 2. That means you can easily say “these are all the rank 1 performers, these are the rank 2 performers,” and so on.

Now, it’s true that SQL doesn’t automatically “count how many people are in each rank.” But because DENSE_RANK() produces a clean sequence, you can just wrap it in another query and group by the rank value. For example:

WITH RankedSales
AS (SELECT EmployeeName,
           Department,
           SaleAmount,
           DENSE_RANK() OVER (PARTITION BY Department ORDER BY SaleAmount DESC) AS DenseRankVal
    FROM dbo.SalesLog)
SELECT Department,
       DenseRankVal,
       COUNT(*) AS PeopleInRank
FROM RankedSales
GROUP BY Department,
         DenseRankVal
ORDER BY Department,
         DenseRankVal;

Similarly, this DENSE_RANK() query can be wrapped as a subquery in many nested use cases like this one. This way, you not only assign ranks, but you also see how many employees are in each rank level — perfect for things like “how many people are in the top 3 tiers per department.” That’s the extra analytic value: DENSE_RANK() doesn’t just rank rows, it gives you a stable foundation to bucket and summarize results without confusing gaps.

For further details, refer to Microsoft Docs – DENSE_RANK (Transact-SQL).

NTILE(n)

The NTILE() function divides your result set into quantiles. A quantile is just a way of slicing ordered data into equal-sized groups. For example, if you split into 4 quantiles, you’re creating quartiles (four bands of 25% each). If you split into 10 quantiles, you’re creating deciles (ten bands of 10% each). With 100 quantiles, you get percentiles (a hundred bands of 1% each). So when you write NTILE(4), you’re really saying: “Divide my rows into 4 quantiles (quartiles), sorted by whatever column I choose.”
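As a quick sketch before the main example, deciles are just the same call with a different band count:

SELECT *,
    NTILE(10) OVER(ORDER BY SaleAmount DESC) AS Decile
FROM dbo.SalesLog;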

The next query takes all the rows from the SalesLog table and adds a new column called Quartile that shows which quartile each row falls into. Because we used PARTITION BY Department, the calculation starts fresh within each department, so Electronics and Clothing are split separately. Inside each department, the rows are ordered by SaleAmount from highest to lowest, and then NTILE(4) divides them into four equal-sized groups. That means the top 25% of sales in a department are labeled Quartile 1, the next 25% Quartile 2, then Quartile 3, and finally the lowest 25% Quartile 4. If the rows don’t divide perfectly, the earlier quartiles just get one extra row, but overall the effect is that every employee is placed into a performance band relative to their peers in the same department.

SELECT *,
    NTILE(4) OVER(PARTITION BY Department ORDER BY SaleAmount DESC) AS Quartile
FROM dbo.SalesLog;

Here:

  • PARTITION BY Department → Each department gets its own set of buckets.
  • ORDER BY SaleAmount DESC → The highest sales in each department go into Quartile 1, lowest into Quartile 4.

If a department has 10 rows, Quartile 1 gets the top 3, Quartile 2 the next 3, Quartile 3 the next 2, and Quartile 4 the bottom 2. The process repeats for every department.

Output:

Looking at this output, you can see how NTILE has split each department’s sales into four quartiles.

In the Clothing department, the highest sales — Darvin, Mira, and Ajay with 1500 — all land in Quartile 1, along with Chandan who sold 1300. That fills up the top band. Then Ernest (1300) plus Pankaj and Danice (1200) fall into Quartile 2. Nguan, Marry, and Jagdeesh take Quartile 3 with their slightly lower sales. Finally, the lowest sales — Faisal, Nancy, and Sulabh — are grouped into Quartile 4. So Clothing is neatly split from top performers down to bottom performers in four equal bands.

In the Electronics department, Charley and Bjorn with 3000 sales and Alice with 2000 make up Quartile 1, showing the top earners. Bob with 2000 drops into Quartile 2, along with L Chan and DK Bret who both sold 1800. The remaining six rows — the 1500, 1200, and 1000 sales — fill out Quartiles 3 and 4 with the lower performers.

The key takeaway is this: NTILE has divided each department into four groups of roughly equal size, starting with the highest sales in Quartile 1 and working down to the lowest sales in Quartile 4. It’s basically SQL’s way of saying, “Here’s the top 25%, here’s the next 25%, and so on.”

How NTILE() assigns groups

If the number of rows divides evenly into the number of groups, each group gets the same number of rows:

  • 4 rows, 4 groups: each group gets 1 row.

If the rows don’t divide evenly, the first few groups get one extra row:

  • 5 rows, 4 groups: Group 1 gets 2 rows; Groups 2, 3, and 4 get 1 row each.
  • 6 rows, 4 groups: Groups 1 and 2 get 2 rows each; Groups 3 and 4 get 1 row each.
  • 7 rows, 4 groups: Groups 1, 2, and 3 get 2 rows each; Group 4 gets 1 row.

This rule works no matter how many rows we have — NTILE always starts filling from Group 1, giving extra rows to the earlier groups until all rows are assigned. Useful in performance bands or customer segmentation.

Use case: Percentile or quartile grouping, like dividing customers into Top 25%, Middle 50%, Bottom 25%.
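Here is a minimal sketch of that segmentation, mapping the NTILE(4) buckets into labels with a CASE expression:

WITH Banded
AS (SELECT *,
           NTILE(4) OVER (PARTITION BY Department ORDER BY SaleAmount DESC) AS Quartile
    FROM dbo.SalesLog)
SELECT Department,
       EmployeeName,
       SaleAmount,
       CASE Quartile
           WHEN 1 THEN 'Top 25%'
           WHEN 4 THEN 'Bottom 25%'
           ELSE 'Middle 50%'
       END AS Segment
FROM Banded;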

For further details, refer to Microsoft Docs – NTILE (Transact-SQL).

LAG()

LAG() lets us peek at a previous row’s value without writing a self-join or subquery. We tell it which column to look at, how many rows back to go (default is 1), and what to return if there’s no earlier row (default is NULL).

For example, if we want to see how each day’s sales compare to the previous day for the same employee, we can use:

SELECT *,
    LAG(SaleAmount) OVER(PARTITION BY EmployeeName ORDER BY LogDate) AS PrevSale
FROM dbo.SalesLog;

This query is asking SQL to show not just what each employee sold on a given day, but also what they sold the previous time they had a sale. It does this by adding a new column called PrevSale. The LAG(SaleAmount) part is the trick — it simply looks one row back in time. Because we wrote PARTITION BY EmployeeName, SQL only looks back within the same employee’s history, never mixing people together. And since we also added ORDER BY LogDate, the rows are sorted by date for each person, which means the “previous row” is really the previous day’s sale (or the last sale they made if they skipped days). So when the results come back, you see today’s sale side by side with whatever they sold last time. If it’s their very first record — like Bob’s sale on January 1st — there’s no earlier day to look at, so the PrevSale shows NULL. But when Bob makes another sale on January 5th, SQL now knows his previous sale was 2000 on January 1st, and it fills that into the new column. In plain words, the query is giving you a timeline view where every row answers the question: “What did this person sell today, and what did they sell the day before?”

Output:

As you can see in the image below, every employee appears only once in the table, so when SQL lines up each person’s rows by date and tries to “look one step back,” there’s nothing before that first row. That’s why PrevSale is NULL for everyone—there is no previous day/previous sale for any employee yet.

So let’s register one more sale for employee 'Bob', on a later date:

INSERT INTO dbo.SalesLog (EmployeeName, Department, SaleAmount, LogDate)
VALUES
('Bob', 'Electronics', 1700, '2024-01-05');

Now Bob has two rows. On 2024-01-01, Bob’s PrevSale is NULL (it’s his first recorded sale). On 2024-01-05, PrevSale shows 2000 because LAG looks at Bob’s previous row in date order and finds his 2024-01-01 sale. Notice “previous” means the previous recorded sale for the same employee, not necessarily the calendar day right before—if someone skipped days, LAG still points to their last logged sale.

Use case: We use LAG() to track changes over time, calculate day-to-day differences, flag sudden drops, or find repeating patterns — all without extra joins.
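For instance, a sketch of the day-to-day difference idea (in our demo data only Bob has a second row, so only his 2024-01-05 row gets a non-NULL Delta):

SELECT EmployeeName, LogDate, SaleAmount,
       SaleAmount - LAG(SaleAmount) OVER(PARTITION BY EmployeeName ORDER BY LogDate) AS Delta -- change from the previous recorded sale
FROM dbo.SalesLog;

A negative Delta flags a drop from the previous sale; wrapping it in a CASE expression would turn that into an explicit warning column.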

For further details, refer to Microsoft Docs – LAG (Transact-SQL).

LEAD()

LEAD() is the forward-looking twin of LAG(). Instead of pulling data from an earlier row, it grabs a value from a later row — perfect for spotting what comes next without writing a self-join. We tell it which column to fetch, how many rows ahead to look (default is 1), and what to return if there’s no next row (default is NULL).

For example, suppose we want to see what each employee’s next day of sales looked like.

The next query adds a new column called NextSale that shows what the same employee sold on their next recorded date. The LEAD(SaleAmount) part is the key — it looks one row forward in the sequence. Because we wrote PARTITION BY EmployeeName, SQL keeps each employee separate, so Alice’s sales are only compared to Alice’s, Bob’s to Bob’s, and so on. Then ORDER BY LogDate makes sure the rows are arranged in time order, so “next” means the next date for that person. If an employee doesn’t have another sale after the current row, SQL just shows NULL in the NextSale column.

SELECT *,
    LEAD(SaleAmount) OVER(PARTITION BY EmployeeName ORDER BY LogDate) AS NextSale
FROM dbo.SalesLog;

Output:

Looking at this output, you can see how LEAD() pulls the next sale for the same employee. For most employees here, the NextSale column shows NULL because they only have one record in the table. There’s no “next day” to look forward to, so SQL leaves it empty.

The interesting part is Bob. He has two rows: one on 2024-01-01 for 2000, and another on 2024-01-05 for 1700. On his first row (2024-01-01), NextSale is 1700, because LEAD looked ahead and found his next recorded sale on Jan 5. On his second row (2024-01-05), there’s nothing after that, so NextSale is NULL.

So this output proves how LEAD works: it always looks forward within the same employee’s timeline, showing you what comes next. Everyone with only one sale has NULL, and Bob, with two sales, shows today’s sale side by side with his next one.

Use case: We use LEAD() for forecasting, detecting upcoming changes, or comparing today’s value to tomorrow’s. It’s especially handy in trend analysis, scheduling, and predictive reporting.
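As a small sketch, LEAD() also accepts an offset and a default, so you can look more than one row ahead and substitute a value (here 0) when there is no later row:

SELECT EmployeeName, LogDate, SaleAmount,
       LEAD(SaleAmount, 1, 0) OVER(PARTITION BY EmployeeName ORDER BY LogDate) AS NextSaleOrZero -- 0 instead of NULL on the last row
FROM dbo.SalesLog;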

For further details, refer to Microsoft Docs – LEAD (Transact-SQL).

FIRST_VALUE()

Think about tracking someone’s progress — you can’t say they improved unless you know where they began. That’s what FIRST_VALUE() does for us in SQL. It lets us look at a whole set of rows, figure out which one is first based on the order we choose, and then show that “starting” value alongside every row in that set.

The next query demonstrates FIRST_VALUE(). It takes each employee’s sales history, lines their rows up by date from oldest to newest (ORDER BY LogDate), and then adds a new column called FirstSale that repeats the very first sale amount for that employee on every row. The PARTITION BY EmployeeName part tells SQL to keep people separate—Alice’s rows are only compared with Alice’s, Bob’s with Bob’s, etc. Because of the default window frame (from the start up to the current row), FIRST_VALUE() picks the first row in that ordered list, which—since we sorted ascending by date—is the earliest sale. That’s why Bob’s FirstSale shows 2000 on both his January 1 and January 5 rows. If you wanted the “first” to mean the latest sale instead, you’d simply flip the sort to ORDER BY LogDate DESC. And one tiny gotcha: if two rows share the exact same date and you care which one counts as “first,” add a tiebreaker to the ORDER BY (e.g., ORDER BY LogDate, LogID).

SELECT *,
    FIRST_VALUE(SaleAmount) OVER(PARTITION BY EmployeeName ORDER BY LogDate) AS FirstSale
FROM dbo.SalesLog;

Output:

As we can see in the image below, even on Bob’s latest sale from 2024-01-05, FirstSale is still the oldest one, because ORDER BY LogDate sorts in ascending order by default. So no matter how many more sales he makes during January, FirstSale will stay at 2000.

Here, PARTITION BY EmployeeName means we start fresh for each employee, and ORDER BY LogDate makes sure “first” means earliest date.

Use case: Perfect when we want to measure change over time — like comparing every sale to the first one, seeing how far someone has come, or spotting when they dip below their starting point.
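A minimal sketch of that comparison, showing how far each sale sits from the employee's starting point (with LogID as the tiebreaker mentioned above):

SELECT EmployeeName, LogDate, SaleAmount,
       SaleAmount - FIRST_VALUE(SaleAmount) OVER(PARTITION BY EmployeeName ORDER BY LogDate, LogID) AS ChangeSinceFirst -- distance from the first sale
FROM dbo.SalesLog;

For Bob, the 2024-01-05 row shows -300, since his 1700 sale is 300 below his first sale of 2000.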

For further details, refer to Microsoft Docs – FIRST_VALUE (Transact-SQL).

LAST_VALUE()

This is the trickiest window function, so it deserves a bit more attention.

If FIRST_VALUE() helps us remember where we started, LAST_VALUE() tells us where we end up. It returns the last value in the window we define. The catch is the default window frame: when we add ORDER BY to a window, SQL Server’s default frame is RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW.

Before we move on to explain this window function, let’s understand a few terms in SQL:

  • CURRENT ROW → means “stop right here, don’t go further.” So the window frame ends on the row you are currently looking at. That’s why in the default LAST_VALUE, SQL only sees up to the current row, and the “last value” ends up being the same as the current row’s value.
  • UNBOUNDED PRECEDING → means “start from the very first row.”
  • UNBOUNDED FOLLOWING → means “go all the way to the very last row.”

Sample rows we’ll use

We’ll just use Bob’s rows, since he has two data points loaded.

SELECT EmployeeName, LogDate, SaleAmount
FROM dbo.SalesLog
WHERE EmployeeName = 'Bob'
ORDER BY LogDate;

The next query takes Bob’s sales rows and adds a new column called LastSale_Default. It uses the LAST_VALUE(SaleAmount) function, which is supposed to return the last sale amount within a defined window. The window is set by the OVER() clause:

  • PARTITION BY EmployeeName → do the calculation separately for each employee (here only Bob).
  • ORDER BY LogDate → sort Bob’s rows by date, so they go from oldest to newest.

The tricky part: since we didn’t specify a custom frame, SQL automatically uses the default frame: “everything from the very first row up to the current row.”

Because of that, when SQL calculates LAST_VALUE, the “last row” it sees is always just the current row you’re standing on. So instead of showing Bob’s true final sale on every row, the LastSale_Default column just copies the SaleAmount for that row.

SELECT EmployeeName, LogDate, SaleAmount,
       LAST_VALUE(SaleAmount) OVER (
           PARTITION BY EmployeeName
           ORDER BY LogDate
           -- default frame is: RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS LastSale_Default
FROM dbo.SalesLog
WHERE EmployeeName = 'Bob'
ORDER BY LogDate;

Output (LastSale_Default just mirrors the current row):

Looking at the image, you can see exactly how the default LAST_VALUE() behaves. Bob has two sales: 2000 on 2024-01-01 and 1700 on 2024-01-05. The LastSale_Default column is supposed to show the “last value” for him, but in the screenshot it just mirrors whatever is in the SaleAmount column. On the first row it shows 2000, and on the second row it shows 1700. This proves the trap — by default SQL only looks “up to the current row,” so the “last” row in that frame is simply the row you’re standing on, not Bob’s actual final sale.

Why: with the default RANGE … CURRENT ROW, the last row in frame = the current row.

Correct LAST_VALUE() (extend the frame to the true end)

To get the true final value in the partition on every row, we must extend the frame to the end using ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING. The change from the default is highlighted by the comment in the query below:

SELECT EmployeeName, LogDate, SaleAmount,
       LAST_VALUE(SaleAmount) OVER (
           PARTITION BY EmployeeName
           ORDER BY LogDate
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING -- changed from the default frame
       ) AS LastSale_Fixed
FROM dbo.SalesLog
WHERE EmployeeName = 'Bob'
ORDER BY LogDate;

Output (now we always see the real last value = 1700):

So unlike the default version where it just copied the current row, this one is consistent — both rows point to Bob’s true final sale of 1700. That’s the whole point of adding ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: it makes sure SQL always looks at the entire window and picks the actual last row, not just the row you’re standing on.

Side-by-side (both columns together)

This query is showing Bob’s sales with two different versions of LAST_VALUE side by side so we can clearly see the difference. The first one, LastSale_Default, uses the function without changing the window frame, so SQL just looks from the first row up to the current row. That makes the “last value” equal to whatever the current row’s SaleAmount is — on January 1 it shows 2000, on January 5 it shows 1700. The second one, LastSale_Fixed, extends the frame all the way to the end using ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING, so both rows now correctly show 1700, Bob’s true final sale.

SELECT EmployeeName, LogDate, SaleAmount,
       LAST_VALUE(SaleAmount) OVER (
           PARTITION BY EmployeeName
           ORDER BY LogDate
       ) AS LastSale_Default, -- default RANGE…CURRENT ROW
       LAST_VALUE(SaleAmount) OVER (
           PARTITION BY EmployeeName
           ORDER BY LogDate
           ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING
       ) AS LastSale_Fixed
FROM dbo.SalesLog
WHERE EmployeeName = 'Bob'
ORDER BY LogDate;

Output:

In the output you can actually see this difference play out. For Bob’s first row on 2024-01-01, LastSale_Default shows 2000 (just copying the current row) while LastSale_Fixed shows 1700, the true last sale. On the second row, 2024-01-05, both columns show 1700, but only because the current row happens to be the last one. This side-by-side view makes it obvious why the default behavior can be misleading and why adding the explicit frame gives the correct result across all rows.

Takeaway:

  • FIRST_VALUE() is usually fine with the default frame.
  • LAST_VALUE() needs ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING when we really mean “the final row in the entire partition.”
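One handy alternative: because FIRST_VALUE() behaves well with the default frame, you can often get the same "true last" value by flipping the sort order instead of spelling out a frame. A sketch against Bob's rows:

SELECT EmployeeName, LogDate, SaleAmount,
       FIRST_VALUE(SaleAmount) OVER(PARTITION BY EmployeeName ORDER BY LogDate DESC) AS LastSale_ViaFirstValue -- first row in descending date order is the latest sale
FROM dbo.SalesLog
WHERE EmployeeName = 'Bob'
ORDER BY LogDate;

Both rows show 1700, matching the fixed LAST_VALUE() version above.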

Use case: Great when we want to keep the latest value visible on every row — like showing the final sale of the month alongside each day’s sales, tracking the most recent status in a workflow, or highlighting how far we are from the end point. It’s particularly useful in performance summaries, end-of-period comparisons, and progress tracking.

For further details, refer: Microsoft Docs – LAST_VALUE (Transact-SQL)

Conclusion

We began with nothing more than a small sales table—just 25 rows, a handful of employees, and a couple of departments. On the surface, it looked like the kind of dataset you might use for a quick demo and forget about. But once we brought in SQL Server’s window functions, it became a playground for analysis. We ranked people based on their totals, handled ties gracefully (or tightly), compared each day’s numbers to the one before or after, pulled out starting and ending values, and tracked how sales accumulated over time—all without ever losing sight of the individual rows.

That’s the magic here: window functions let us zoom in and out at the same time. We can see every detail, yet still understand the bigger pattern. Aggregates collapse data, joins can get messy, and procedural loops take too long to write and debug. But with the right OVER() clause—especially when we control how rows are partitioned and what frame we’re looking at—we can get answers that are both clear and efficient.

Once you get the hang of them, these functions aren’t just a neat trick—they’re a habit. They make queries shorter, easier to follow, and often faster to run. So next time you’re about to build a temp table, chain a bunch of joins, or write a cursor, take a breath and think: “Can I do this with a window function?” More often than not, the answer will be yes—and the solution will be cleaner, smarter, and a lot more satisfying.
