Need help building a query.

  • Remember the old ways before CROSS APPLY and ROW_NUMBER?

    In days of old,

    When Knights were bold,

    And Windowing was not invented,

    He wrapped his code

    In a familiar mode

    And RBAR was prevented. 😀

    To wit...

    CREATE INDEX dodah ON auData (Emp_ID, dt_End, dt_Start)

    SELECT auID,

    Emp_ID,

    dt_Start,

    dt_End = (SELECT MIN(dt_End) FROM auData csq WHERE csq.Emp_ID = d.Emp_ID AND csq.dt_End > d.dt_Start),

    Emp_Cat

    FROM auData d

    WHERE Emp_ID > 0

    AND d.dt_Start >= CAST('1753' AS DATETIME)

    ORDER BY d.Emp_ID, d.dt_Start

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • CELKO (11/24/2010)


    Can I re-write your DDL to ISO-11179 rules, add keys. I hope you know why NUMERIC(s,p) is not like DECIMAL(s,p). T-SQL has a DATE data type now and can use row constructors in a VALUES clause. Is this what you meant?

    CREATE TABLE Timesheets

    (emp_id NUMERIC (1, 0) NOT NULL,

    employment_start_date DATE,

    -- PRIMARY KEY (emp_id, employment_start_date),

    employment_end_date DATE, -- null means current

    emp_cat INTEGER NOT NULL);

    GO

    INSERT INTO Timesheets

    VALUES(1, '2010-11-18', NULL, 1),

    (1, NULL, '2010-11-18', 1),

    (1, '2010-11-18', NULL, 1),

    (1, NULL, '2010-11-23', 1),

    (1, '2010-11-23', NULL, 90);

    Opps! You screwed up the the DDL, so that the natrual primary key cannot be declared. That makes for complicaetd code:

    WITH X

    AS

    (SELECT *,

    (ROW_NUMBER() OVER (PARTITION BY emp_id

    ORDER BY COALESCE (employment_start_date, employment_end_date))

    +1) /2 AS pairs

    FROM Timesheets)

    SELECT X.emp_id, MAX(employment_start_date) AS employment_start_date,

    MAX(employment_end_date) AS employment_end_date,

    MAX(emp_cat)

    FROM X

    GROUP BY emp_id, pairs;

    "WITH X"???? :blink: BWWAAA-HAAA-HAAAA-HAAAAA!!! ROFLMAO!!! And you have the nerve to bitch at people about their naming conventions? HOOOOOO-EEEEEE!!! :Whistling: You having Fortran flashbacks or what?

    And what's with this ROW_NUMBER garbage? When are you going to come out of the 50's and stop treating tables like mag tape with sequence numbers on each record? Please read a book on basic RDBMS, so you can stop writing COBOL and assembly language in SQL and start using SQL for RDBMS.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden (11/24/2010)


    In days of old,

    When Knights were bold,

    And Windowing was not invented,

    He wrapped his code

    In a familiar mode

    And RBAR was prevented.

    You ought to put this into your signature block.

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • {edit} wrong post.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Wayne, just because a table has an ordered clustered index, doesn't mean an identity column will honor that order. Usually it will, but you can't guarantee it.

    Jeff's solution works if there aren't any "skip steps", where you have two start and then one end, or anything like that, in date sequences. If you do have skips in the data (and timeclocks sometimes do), you'd need to use some variation on a cursor. In SQL 2000, that means either an actual cursor or a hidden cursor (quirky update). In SQL 2005 and beyond, you can use row_number() (which is another hidden cursor).

    But don't use identity columns for that. They will usually work, but you can't guarantee it, especially if the data in the table is volatile, or the server has more than one query or more than one CPU core/thread available to it.

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • GSquared (11/29/2010)


    Wayne, just because a table has an ordered clustered index, doesn't mean an identity column will honor that order. Usually it will, but you can't guarantee it.

    Jeff's solution works if there aren't any "skip steps", where you have two start and then one end, or anything like that, in date sequences. If you do have skips in the data (and timeclocks sometimes do), you'd need to use some variation on a cursor. In SQL 2000, that means either an actual cursor or a hidden cursor (quirky update). In SQL 2005 and beyond, you can use row_number() (which is another hidden cursor).

    But don't use identity columns for that. They will usually work, but you can't guarantee it, especially if the data in the table is volatile, or the server has more than one query or more than one CPU core/thread available to it.

    GSquared - I wasn't depending on the clustered index... I was depending on the ORDER BY of the SELECT statement that was inserting the data into the table. Since I specified an ORDER BY, the results should be inserted in the proper order, which will assign the identity column in the proper sequence. Won't the identity column work in this case?

    Wayne
    Microsoft Certified Master: SQL Server 2008
    Author - SQL Server T-SQL Recipes


    If you can't explain to another person how the code that you're copying from the internet works, then DON'T USE IT on a production system! After all, you will be the one supporting it!
    Links:
    For better assistance in answering your questions
    Performance Problems
    Common date/time routines
    Understanding and Using APPLY Part 1 & Part 2

  • GSquared (11/29/2010)


    Wayne, just because a table has an ordered clustered index, doesn't mean an identity column will honor that order. Usually it will, but you can't guarantee it.

    Jeff's solution works if there aren't any "skip steps", where you have two start and then one end, or anything like that, in date sequences. If you do have skips in the data (and timeclocks sometimes do), you'd need to use some variation on a cursor. In SQL 2000, that means either an actual cursor or a hidden cursor (quirky update). In SQL 2005 and beyond, you can use row_number() (which is another hidden cursor).

    But don't use identity columns for that. They will usually work, but you can't guarantee it, especially if the data in the table is volatile, or the server has more than one query or more than one CPU core/thread available to it.

    The only time I've ever seen it actually fail is when you use the IDENTITY function and either you've used an indeterminate source to sort on (sorting by GETDATE() or the like) or parallelism occurs. I'm pretty sure that everyone knows what MAXDOP 1 does to fix that.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • WayneS (11/29/2010)


    GSquared (11/29/2010)


    Wayne, just because a table has an ordered clustered index, doesn't mean an identity column will honor that order. Usually it will, but you can't guarantee it.

    Jeff's solution works if there aren't any "skip steps", where you have two start and then one end, or anything like that, in date sequences. If you do have skips in the data (and timeclocks sometimes do), you'd need to use some variation on a cursor. In SQL 2000, that means either an actual cursor or a hidden cursor (quirky update). In SQL 2005 and beyond, you can use row_number() (which is another hidden cursor).

    But don't use identity columns for that. They will usually work, but you can't guarantee it, especially if the data in the table is volatile, or the server has more than one query or more than one CPU core/thread available to it.

    GSquared - I wasn't depending on the clustered index... I was depending on the ORDER BY of the SELECT statement that was inserting the data into the table. Since I specified an ORDER BY, the results should be inserted in the proper order, which will assign the identity column in the proper sequence. Won't the identity column work in this case?

    Ah, I misread what you wrote there.

    Yes, that should work. Details here: http://support.microsoft.com/kb/273586

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • you could do a correlated sub query.

    select

    d1.auID,

    d1.Emp_ID,

    d1.dt_Start,

    (

    select top 1

    d2.dt_End

    from auData d2

    where d2.auID > d1.auID

    and d2.Emp_ID = d2.Emp_ID

    order by d2.auID

    ) as dt_end

    Emp_Cat

    )

    from auData d1

    where d1.dt_Start is not null

    This will select only records with a start date, and then do a sub query to find the next record with the same employee id. When querying your data is this hard it may mean that you have not designed your tables correctly. What about filling in your end dates as the data is received? Just find the most recent record for an employee and update the end date. Then each record would have a full time interval.

  • michaelwelcome (12/1/2010)


    you could do a correlated sub query.

    select

    d1.auID,

    d1.Emp_ID,

    d1.dt_Start,

    (

    select top 1

    d2.dt_End

    from auData d2

    where d2.auID > d1.auID

    and d2.Emp_ID = d2.Emp_ID

    order by d2.auID

    ) as dt_end

    Emp_Cat

    )

    from auData d1

    where d1.dt_Start is not null

    This will select only records with a start date, and then do a sub query to find the next record with the same employee id. When querying your data is this hard it may mean that you have not designed your tables correctly. What about filling in your end dates as the data is received? Just find the most recent record for an employee and update the end date. Then each record would have a full time interval.

    That's pretty much what I did... correlated subquery with the ability to find the "min" end date. As some of the other folks pointed out, neither of ours works exactly correctly if there are "skips" in the data where there are things like 2 end dates for one start date.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Hi,

    Try this query

    Select Distinct x.auID, x.Emp_ID, x.dt_Start, (Select top 1 dt_End From auData Where dt_End > x.dt_Start) as dt_End

    From

    (Select auID, Emp_ID, dt_Start, Emp_Cat From auData Where dt_Start is not null) x,

    (Select auID, Emp_ID, dt_End, Emp_Cat From auData Where dt_End is not null) y

  • sri76 (12/6/2010)


    Hi,

    Try this query

    Select Distinct x.auID, x.Emp_ID, x.dt_Start, (Select top 1 dt_End From auData Where dt_End > x.dt_Start) as dt_End

    From

    (Select auID, Emp_ID, dt_Start, Emp_Cat From auData Where dt_Start is not null) x,

    (Select auID, Emp_ID, dt_End, Emp_Cat From auData Where dt_End is not null) y

    Try it on, say, 10,000 rows and watch what SQL Server does with that partial cross join on the Execution Plan... 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 12 posts - 16 through 26 (of 26 total)

You must be logged in to reply to this topic. Login to reply