Rank on repeating groups of bit field

Question

Rank on repeating groups of bit field

Jim Shaffer

Say Hey Kid

Points: 696
More actions
January 16, 2015 at 4:24 pm

#397885

Lets say I have some data where a bit value changes over time and I want to rank it by the repeating groups, how do I write the SQL so that I get the result below?
I want to sort by create date, and every time the bit changes value we start the rank over.
If I partition by the bit field, it will just group the entire data set by the bit field.
Bit CreateDate Rank
1 3/15/2014 1
1 3/14/2014 2
0 3/9/2014 1
0 3/8/2014 2
0 3/6/2014 3
1 3/4/2014 1
0 2/28/2014 1
0 2/20/2014 2
0 2/15/2014 3
0 2/10/2014 4

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply

Eirikur Eiriksson SSC Guru Points: 182880 More actions · Answer 1

Jim Shaffer (1/16/2015)
Lets say I have some data where a bit value changes over time and I want to rank it by the repeating groups, how do I write the SQL so that I get the result below?
I want to sort by create date, and every time the bit changes value we start the rank over.
If I partition by the bit field, it will just group the entire data set by the bit field.
Bit CreateDate Rank
1 3/15/2014 1
1 3/14/2014 2
0 3/9/2014 1
0 3/8/2014 2
0 3/6/2014 3
1 3/4/2014 1
0 2/28/2014 1
0 2/20/2014 2
0 2/15/2014 3
0 2/10/2014 4

Quick suggestion, use a running total of the Bit column in the Date order to mark the groups, here is an example.

😎

USE tempdb;

GO

SET NOCOUNT ON;

DECLARE @SAMPLE_DATA TABLE

(

XBit INT NOT NULL

,CreateDate DATE NOT NULL

);

INSERT INTO @SAMPLE_DATA (XBit,CreateDate)

VALUES

(1, '3/15/2014')

,(1, '3/14/2014')

,(0, '3/9/2014')

,(0, '3/8/2014')

,(0, '3/6/2014')

,(1, '3/4/2014')

,(0, '2/28/2014')

,(0, '2/20/2014')

,(0, '2/15/2014')

,(0, '2/10/2014')

;

;WITH BASE_DATA AS

(

SELECT

SD.CreateDate

,SD.XBit

,SUM(SD.XBit) OVER

(

PARTITION BY (SELECT NULL)

ORDER BY SD.CreateDate

ROWS BETWEEN UNBOUNDED PRECEDING

AND CURRENT ROW

) AS SMBIT

FROM @SAMPLE_DATA SD

)

SELECT

BD.XBit

,BD.CreateDate

,BD.SMBIT

,RANK() OVER

(

PARTITION BY BD.SMBIT

ORDER BY BD.CreateDate

) AS BD_RNK

FROM BASE_DATA BD;

Results

XBit CreateDate SMBIT BD_RNK

----- ----------- ------ -------

0 2014-02-10 0 1

0 2014-02-15 0 2

0 2014-02-20 0 3

0 2014-02-28 0 4

1 2014-03-04 1 1

0 2014-03-06 1 2

0 2014-03-08 1 3

0 2014-03-09 1 4

1 2014-03-14 2 1

1 2014-03-15 3 1

Dwain Camps SSC Guru Points: 86908 More actions · Answer 2

Another solution for those inclined to the cryptic:

SELECT XBit, CreateDate, b

,ROW_NUMBER() OVER

(

PARTITION BY b

ORDER BY CreateDate

)

FROM

(

SELECT XBit, CreateDate

,b=MAX(b) OVER

(

ORDER BY CreateDate

ROWS UNBOUNDED PRECEDING

)

FROM

(

SELECT XBit, CreateDate

,b=CAST(NULLIF(XBit, 0) AS BINARY(4)) + CAST(CreateDate AS BINARY(4))

FROM @SAMPLE_DATA

) a

) a;

For an explanation of how this works, I'll lead you to Mr. Itzik Ben-Gan's article, which provides a much better description than I ever could:

http://sqlmag.com/t-sql/last-non-null-puzzle

My mantra: No loops! No CURSORs! No RBAR! Hoo-uh![/I]
My thought question: Have you ever been told that your query runs too fast?

My advice:
INDEXing a poor-performing query is like putting sugar on cat food. Yeah, it probably tastes better but are you sure you want to eat it?
The path of least resistance can be a slippery slope. Take care that fixing your fixes of fixes doesn't snowball and end up costing you more than fixing the root cause would have in the first place.

Need to UNPIVOT? Why not CROSS APPLY VALUES instead?[/url]
Since random numbers are too important to be left to chance, let's generate some![/url]
Learn to understand recursive CTEs by example.[/url]
[url url=http://www.sqlservercentral.com/articles/St

ChrisM@Work SSC Guru Points: 186127 More actions · Answer 3

This might not be the most efficient way of solving the puzzle, but it does generate the correct results - which I always put before performance or code sophistication:

DECLARE @SAMPLE_DATA TABLE

(

XBit INT NOT NULL,

CreateDate DATE NOT NULL,

DesiredRank INT NOT NULL

);

INSERT INTO @SAMPLE_DATA (XBit,CreateDate, DesiredRank)

VALUES

(1, '3/15/2014', 1)

,(1, '3/14/2014', 2)

,(0, '3/9/2014', 1)

,(0, '3/8/2014', 2)

,(0, '3/6/2014', 3)

,(1, '3/4/2014', 1)

,(0, '2/28/2014', 1)

,(0, '2/20/2014', 2)

,(0, '2/15/2014', 3)

,(0, '2/10/2014', 4)

; WITH CTE AS (

SELECT *,

grp = ROW_NUMBER() OVER(ORDER BY CreateDate) - ROW_NUMBER() OVER(PARTITION BY XBit ORDER BY CreateDate)

FROM @SAMPLE_DATA

)

SELECT *,

NewRank = ROW_NUMBER() OVER(PARTITION BY grp ORDER BY CreateDate DESC)

FROM CTE

ORDER BY CreateDate DESC

^{“Write the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw}

For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden