Select integer ranges

  • Hi there,

    I have a simple table with 3 columns

    CREATE TABLE [dbo].[Ranges](

    [RangeId] int NOT NULL,

    [CountryCode] [varchar](3) NOT NULL,

    [CurrencyCode] [varchar](3) NOT NULL,

    ) ON [PRIMARY]

    and have the following sample data inserted

    INSERT INTO Ranges VALUES (1, 'USA', 'USD');

    INSERT INTO Ranges VALUES (2, 'USA', 'USD');

    INSERT INTO Ranges VALUES (3, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (4, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (5, 'USA', 'USD');

    INSERT INTO Ranges VALUES (6, 'USA', 'USD');

    INSERT INTO Ranges VALUES (7, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (8, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (9, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (10, 'USA', 'EUR');

    What I need is to be able to run a query that returns the following:

    MinRangeId, MaxRangeId, CountryCode, CurrencyCode

    1, 2, USA, USD

    3, 4, USA, EUR

    5, 6, USA, USD

    7, 10, USA, EUR

    Using MIN and MAX and grouping by CountryCode and CurrencyCode I'm getting:

    MinRangeId, MaxRangeId, CountryCode, CurrencyCode

    1, 6, USA, USD

    3, 10, USA, EUR

    There will be approximately 10 million rows in the real table but I'd like to get a proof-of-concept working first.

    Any ideas?

  • I see that your groups are repeating with different ranges, so I guess you need groups of all consecutive values with MIN and MAX, right?

    Can you post the query you are using?

    -- Gianluca Sartori

  • WITH CTE AS (

    SELECT RangeId,

    CountryCode,

    CurrencyCode,

    ROW_NUMBER() OVER(PARTITION BY CountryCode ORDER BY RangeId) -

    ROW_NUMBER() OVER(PARTITION BY CountryCode,CurrencyCode ORDER BY RangeId) AS rnDiff

    FROM Ranges)

    SELECT MIN(RangeId) AS MinRangeId,

    MAX(RangeId) AS MaxRangeId,

    CountryCode,CurrencyCode

    FROM CTE

    GROUP BY CountryCode,CurrencyCode,rnDiff

    ORDER BY MIN(RangeId)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Hi Mark,

    Your solution is fantastic ... really ingenius ... thank you.

    I had read about using OVER, but would not have thought of your approach in a long, long time.

  • bmoynihan (7/23/2009)


    Hi Mark,

    Your solution is fantastic ... really ingenius ... thank you.

    I had read about using OVER, but would not have thought of your approach in a long, long time.

    Thanks for the feedback

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • bmoynihan way to go for posting the CREATE TABLE and INSERT INTO statements! you are providing a great example for others to emulate! Thank you!

    i came late to the party, and Mark provided a great example; you know it helped him to have the info you provided!

    I'm grabbing this example and saving it in my snippets.

    Lowell


    --help us help you! If you post a question, make sure you include a CREATE TABLE... statement and INSERT INTO... statement into that table to give the volunteers here representative data. with your description of the problem, we can provide a tested, verifiable solution to your question! asking the question the right way gets you a tested answer the fastest way possible!

  • There is one issue, but we only encountered it after further testing and in fairness my examples didn't highlight this potential scenario.

    Mark's solution assumes that the RangeId values are in order and sequential, however, if for example the entries were:

    INSERT INTO Ranges VALUES (1, 'USA', 'USD');

    INSERT INTO Ranges VALUES (2, 'USA', 'USD');

    INSERT INTO Ranges VALUES (3, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (4, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (5, 'USA', 'USD');

    INSERT INTO Ranges VALUES (6, 'USA', 'USD');

    INSERT INTO Ranges VALUES (7, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (8, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (9, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (10, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (15, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (16, 'USA', 'EUR');

    Notice the gap in RangeId's of 11, 12, 13 and 14.

    What I'd like to return would be ...

    MinRangeId, MaxRangeId, CountryCode, CurrencyCode

    1, 2, USA, USD

    3, 4, USA, EUR

    5, 6, USA, USD

    7, 11, USA, EUR

    15, 16, USA, EUR

    Mark's solution gives the following:

    MinRangeId, MaxRangeId, CountryCode, CurrencyCode

    1, 2, USA, USD

    3, 4, USA, EUR

    5, 6, USA, USD

    7, 16, USA, EUR

    While perfectly correct in what it displays (min and max), it does not take into account 'gaps' in the RangeId ... any ideas?

  • Try this:

    Right Join it to a tally table and then remove groups for NULLs in an outer query.

    -- Gianluca Sartori

  • You can get a Tally table from here:

    http://www.sqlservercentral.com/articles/T-SQL/62867/

    And then you can go with this code:

    ;WITH CTE AS (

    SELECT n AS RangeId,

    CountryCode,

    CurrencyCode,

    n - ROW_NUMBER() OVER ( PARTITION BY CountryCode, CurrencyCode ORDER BY n ) AS rnDiff

    FROM (

    SELECT RangeId, ISNULL(CountryCode,'-') AS CountryCode, ISNULL(CurrencyCode,'-') as CurrencyCode, n

    FROM Ranges

    RIGHT JOIN Tally

    ON N = RangeId

    WHERE n <= (SELECT MAX(RangeId) FROM @Ranges)

    ) AS srcdata

    )

    SELECT *

    FROM (

    SELECT MIN(RangeId) AS MinRangeId,

    MAX(RangeId) AS MaxRangeId,

    CountryCode,

    CurrencyCode

    FROM CTE

    GROUP BY CountryCode,

    CurrencyCode,

    rnDiff

    ) AS srccte

    WHERE CountryCode '-'

    ORDER BY 1

    -- Gianluca Sartori

  • There is one issue, but we only encountered it after further testing and in fairness my examples didn't highlight this potential scenario.

    Mark's solution assumes that the RangeId values are in order and sequential, however, if for example the entries were:

    INSERT INTO Ranges VALUES (1, 'USA', 'USD');

    INSERT INTO Ranges VALUES (2, 'USA', 'USD');

    INSERT INTO Ranges VALUES (3, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (4, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (5, 'USA', 'USD');

    INSERT INTO Ranges VALUES (6, 'USA', 'USD');

    INSERT INTO Ranges VALUES (7, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (8, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (9, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (10, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (15, 'USA', 'EUR');

    INSERT INTO Ranges VALUES (16, 'USA', 'EUR');

    Notice the gap in RangeId's of 11, 12, 13 and 14.

    What I'd like to return would be ...

    MinRangeId, MaxRangeId, CountryCode, CurrencyCode

    1, 2, USA, USD

    3, 4, USA, EUR

    5, 6, USA, USD

    7, 11, USA, EUR

    15, 16, USA, EUR

    Mark's solution gives the following:

    MinRangeId, MaxRangeId, CountryCode, CurrencyCode

    1, 2, USA, USD

    3, 4, USA, EUR

    5, 6, USA, USD

    7, 16, USA, EUR

    While perfectly correct in what it displays (min and max), it does not take into account 'gaps' in the RangeId ... any ideas?

    WITH CTE AS (

    SELECT RangeId,

    CountryCode,

    CurrencyCode,

    RangeId-ROW_NUMBER() OVER(ORDER BY RangeId) AS rnDiff2,

    ROW_NUMBER() OVER(PARTITION BY CountryCode ORDER BY RangeId) -

    ROW_NUMBER() OVER(PARTITION BY CountryCode,CurrencyCode ORDER BY RangeId) AS rnDiff

    FROM Ranges)

    SELECT MIN(RangeId) AS MinRangeId,

    MAX(RangeId) AS MaxRangeId,

    CountryCode,CurrencyCode

    FROM CTE

    GROUP BY CountryCode,CurrencyCode,rnDiff,rnDiff2

    ORDER BY MIN(RangeId)

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • Great, Mark!

    Much cleaner and smoother than my suggestion!

    Kudos!

    -- Gianluca Sartori

  • Nice idea Mark.

    This may be a slightly simpler way to catch the gap in RangeId

    ;WITH CTE AS (

    SELECT RangeId,

    CountryCode,

    CurrencyCode,

    RangeId - --ROW_NUMBER() OVER(ORDER BY RangeId) -

    ROW_NUMBER() OVER(PARTITION BY CountryCode,CurrencyCode ORDER BY RangeId) AS rnDiff

    FROM Ranges)

    SELECT MIN(RangeId) AS MinRangeId,

    MAX(RangeId) AS MaxRangeId,

    CountryCode,CurrencyCode

    FROM CTE

    GROUP BY CountryCode,CurrencyCode,rnDiff

    ORDER BY MinRangeId

  • Ken McKelvey (7/23/2009)


    Nice idea Mark.

    This may be a slightly simpler way to catch the gap in RangeId

    ;WITH CTE AS (

    SELECT RangeId,

    CountryCode,

    CurrencyCode,

    RangeId - --ROW_NUMBER() OVER(ORDER BY RangeId) -

    ROW_NUMBER() OVER(PARTITION BY CountryCode,CurrencyCode ORDER BY RangeId) AS rnDiff

    FROM Ranges)

    SELECT MIN(RangeId) AS MinRangeId,

    MAX(RangeId) AS MaxRangeId,

    CountryCode,CurrencyCode

    FROM CTE

    GROUP BY CountryCode,CurrencyCode,rnDiff

    ORDER BY MinRangeId

    Yep, like it!

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537

Viewing 13 posts - 1 through 13 (of 13 total)

You must be logged in to reply to this topic. Login to reply