find group of time

  • hi all,

    i have the follwing table : 10 lines

    i want to show only row after 30 second (should be paramter :)) between them. ( as summery of previous records)

    on this record set:

    2020-08-26 00:00:00.000  --first line bring

    2020-08-26 00:00:20.000 -- 30 second not over avoid this record

    2020-08-26 00:00:25.000 -- 30 second not over avoid this record

    2020-08-26 00:00:40.000  --aftr 30 seconds bring.

    2020-08-26 00:01:00.000  -- 30 second not over avoid this record

    2020-08-26 00:01:20.000 --  aftr 30 seconds bring.

    2020-08-26 00:01:40.000 -- 30 second not over avoid this record

    2020-08-26 00:02:00.000 --aftr 30 seconds bring.

    i hope  i was clear 🙂

    thank you vary much

    sharon

     

     

     

  • For those are too lazy to copy paste for testing, added 2nd Colum for True False:

    create table #temp (
    Date1 datetime2
    ,Wanted bit)


    insert into #temp
    values (cast('2020-08-26 00:00:00.000' as Datetime2),1)
    ,(cast('2020-08-26 00:00:20.000' as Datetime2),0)
    ,(cast('2020-08-26 00:00:25.000' as Datetime2),0)
    ,(cast('2020-08-26 00:00:40.000' as Datetime2),1)
    ,(cast('2020-08-26 00:01:00.000' as Datetime2),0)
    ,(cast('2020-08-26 00:01:20.000' as Datetime2),1)
    ,(cast('2020-08-26 00:01:40.000' as Datetime2),0)
    ,(cast('2020-08-26 00:02:00.000' as Datetime2),1)

     

    I want to be the very best
    Like no one ever was

  • Recursive cte method, won't be very efficient

    with cte as (
    select top 1 Date1, cast(1 as bigint) as rn
    from #temp
    order by Date1

    union all

    select t.Date1, row_number() over(order by t.Date1) as rn
    from #temp t
    inner join cte c on t.Date1 > dateadd(second,30,c.Date1) and c.rn = 1
    )
    select Date1
    from cte
    where rn = 1
    order by Date1;

    ____________________________________________________

    Deja View - The strange feeling that somewhere, sometime you've optimised this query before

    How to get the best help on a forum

    http://www.sqlservercentral.com/articles/Best+Practices/61537
  • ktflash wrote:

    For those are too lazy to copy paste for testing, added 2nd Colum for True False: 

    You should speak to the OP about the "lazy" thing. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • sharon-472085 wrote:

    hi all,

    i have the follwing table : 10 lines

    i want to show only row after 30 second (should be paramter :)) between them. ( as summery of previous records)

    on this record set:

    2020-08-26 00:00:00.000  --first line bring

    2020-08-26 00:00:20.000 -- 30 second not over avoid this record

    2020-08-26 00:00:25.000 -- 30 second not over avoid this record

    2020-08-26 00:00:40.000  --aftr 30 seconds bring.

    2020-08-26 00:01:00.000  -- 30 second not over avoid this record

    2020-08-26 00:01:20.000 --  aftr 30 seconds bring.

    2020-08-26 00:01:40.000 -- 30 second not over avoid this record

    2020-08-26 00:02:00.000 --aftr 30 seconds bring.

    i hope  i was clear 🙂

    thank you vary much

    sharon

    Is there a PK for the table?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • How do you want the 30 second intervals defined?

    i.e. do you want them divided into clock-face 30 second intervals or for the 30 second interval to start at the time of the found row?

  • thank you vary much 🙂

    next time not lazy:)

  • thank you vary much 🙂

     

  • sharon-472085 wrote:

    thank you vary much 🙂

    next time not lazy:)

    😀

    Sorry... my reference was about creating readily consumable data.  You can see one way to do that in the article at the first link in my signature line below.  It also explains why it helps us help you.

    Shifting gears back to the original topic, if the Recursive CTE method proves to be a bit slow, post back... I have a bit of an unorthodox but very effective method that will blast through a million rows in about a second or two for something like this.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • tank you solve 🙂

  • sharon-472085 wrote:

    tank you solve 🙂

    I'm not quite sure how to read that... Does that mean that you already have a solution that solved the problem or that you need a faster solution and are asking what it is?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.
    "Change is inevitable... change for the better is not".
    "Dear Lord... I'm a DBA so please give me patience because, if you give me strength, I'm going to need bail money too!"

    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden wrote:

    Shifting gears back to the original topic, if the Recursive CTE method proves to be a bit slow, post back... I have a bit of an unorthodox but very effective method that will blast through a million rows in about a second or two for something like this.

    Sounds like you have a quirky update in  mind... If so, I'll hold off on posting my solution so that I don't burst any bubbles or step on any toes. 😀

     

  • Quick question for @sharon-472085... How do you want to handle rows that have duplicate date-time values?

    For example you have two rows that both have a value of '2020-08-26 00:00:40.000'? If you want to "keep the first and drop the second", do you have a means to determine which one id first and which one is second?

  • What the heck... I'm just going to post what I have... See what you think of this.

    USE tempdb;
    GO
    --====================================================================================================================
    -- Start by creating 1M rows of test data in tempdb
    --====================================================================================================================
    IF OBJECT_ID('tempdb.dbo.TestData', 'U') IS NOT NULL
    BEGIN DROP TABLE tempdb.dbo.TestData; END;
    GO

    CREATE TABLE tempdb.dbo.TestData (
    rid int NOT NULL PRIMARY KEY NONCLUSTERED,
    some_date_time datetime NOT NULL
    );
    GO

    WITH
    cte_n1 (n) AS (SELECT 1 FROM (VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) n (n)), -- 10
    cte_n2 (n) AS (SELECT 1 FROM cte_n1 a CROSS JOIN cte_n1 b), -- 100
    cte_n3 (n) AS (SELECT 1 FROM cte_n2 a CROSS JOIN cte_n1 b), -- 10,000
    cte_Tally (n) AS (
    SELECT TOP (1000000)
    ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM
    cte_n3 a CROSS JOIN cte_n3 b -- 100,000,000
    )
    INSERT tempdb.dbo.TestData (rid, some_date_time)
    SELECT
    rid = ROW_NUMBER() OVER (ORDER BY sdt.some_date_time),
    sdt.some_date_time
    FROM
    cte_Tally t
    CROSS APPLY ( VALUES (DATEADD(SECOND, ABS(CHECKSUM(NEWID())) % 30 + t.n, '2020-08-26')) ) sdt (some_date_time);

    --====================================================================================================================
    -- Working with the assumption that we dont want to blindly alter the data in our "real" table or mess with its indexes,
    -- we start by dumping the required columns into a #temp_table that we can manipulate without having a heart attack.
    --====================================================================================================================
    IF OBJECT_ID('tempdb..#quirky', 'U') IS NOT NULL
    BEGIN DROP TABLE #quirky; END;

    CREATE TABLE #quirky (
    rid int NOT NULL,
    some_date_time datetime NOT NULL,
    sdt_rid_code AS CONVERT(binary(8), some_date_time) + CONVERT(binary(4), rid) PERSISTED, -- << this is a binary concatination that combines the datetime column with the unique pk column.
    is_keeper bit NOT NULL DEFAULT(0) -- it will be necessary to break ties when multiple rows share the same datetime value.
    );

    INSERT #quirky(rid, some_date_time)
    SELECT td.rid, td.some_date_time FROM tempdb.dbo.TestData td;

    ALTER TABLE #quirky ADD PRIMARY KEY CLUSTERED (some_date_time, rid) WITH (FILLFACTOR = 100); -- Add a clustered PK. This is necessary to ensure the "quirky update" processes in the correct order.

    --====================================================================================================================
    -- This is the funky voodoo (quirky) magic happens... Read more about it in Jeffs excelent article... https://www.sqlservercentral.com/articles/solving-the-running-total-and-ordinal-rank-problems-rewritten
    -- It chews through the full 1M rows on ~2 secs on my test machine.
    --====================================================================================================================
    DECLARE @keep_code binary(12) = 0x000000000000000000000000;

    UPDATE q SET
    @keep_code = CASE WHEN q.some_date_time >= DATEADD(SECOND, 30, CONVERT(datetime, SUBSTRING(@keep_code, 1, 8))) THEN q.sdt_rid_code ELSE @keep_code END,
    q.is_keeper = CASE WHEN q.sdt_rid_code = @keep_code THEN 1 ELSE 0 END
    FROM
    #quirky q WITH (TABLOCKX)
    OPTION(MAXDOP 1);

    ----------------------------------------------------------------------------------------------------------------------
    SELECT * FROM #quirky q -- do a quick validation to see which rows are "keepers" and which ones are not, before deleting non-keepers.

    ----------------------------------------------------------------------------------------------------------------------
    DELETE #quirky WHERE is_keeper = 0; -- get rid of the non-keepers.

    ----------------------------------------------------------------------------------------------------------------------
    -- The #quirky table can now be used to filter the original dboTestData.
    SELECT
    td.rid,
    td.some_date_time
    FROM
    tempdb.dbo.TestData td
    WHERE
    EXISTS (SELECT 1 FROM #quirky q WHERE td.rid = q.rid)
    ORDER BY
    td.rid ASC;

    --DROP TABLE dbo.TestData;
    --DROP TABLE #quirky;

    • This reply was modified 4 months, 3 weeks ago by  Jason A. Long.
    • This reply was modified 4 months, 3 weeks ago by  Jason A. Long.

Viewing 14 posts - 1 through 14 (of 14 total)

You must be logged in to reply to this topic. Login to reply