|
|
|
SSC-Dedicated
           
Group: General Forum Members
Last Login: Yesterday @ 9:57 PM
Points: 32,893,
Visits: 26,771
|
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Thursday, May 09, 2013 5:23 PM
Points: 52,
Visits: 188
|
|
Sorry, I'm dumb. But I completely don't understand how the "test data" dates at the start of the article make any sense without the accompanying comments. Would you mind clarifying please? If so, thanks!
Edit: It's probably not even clear why I'm confused. But I mean, how does anyone know if any of the dates is supposed to be a "start and end date" or just a "start date" or just an "end date" without the comments? I assume I'm missing something obvious here...
"If I had been drinking out of that toilet, I might have been killed." -Ace Ventura
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: 2 days ago @ 10:09 PM
Points: 5,658,
Visits: 6,100
|
|
Relax auto, you're not dumb. If you haven't had a need for it the test data doesn't make sense out of the gate. This comes up a lot, though, for people trying to deal with, say, employee work logging or information from a mechanical switch reporting status, and you want to find 'misses'.
Basically, you're looking for start/end dates for each contiguous grouping of the dates in the test data.
I have a feeling Jeff you're going to be fielding a few questions about the subtracting from the date, but a solid spackle walkthrough. Thanks as always!
- Craig Farrell
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette For index/tuning help, follow these directions. |Tally Tables Twitter: @AnyWayDBA
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Thursday, May 09, 2013 5:23 PM
Points: 52,
Visits: 188
|
|
But how can you tell which dates are "start", "end" or "start and end" without any additional information (like the comments)? I don't see any rhyme or reason here, aside from the dates being in chronological order.
"If I had been drinking out of that toilet, I might have been killed." -Ace Ventura
|
|
|
|
|
UDP Broadcaster
      
Group: General Forum Members
Last Login: Friday, May 10, 2013 3:15 AM
Points: 1,476,
Visits: 1,943
|
|
Nice article as a whole... a bit weak/confusing intro though. But thx to Craigs little comment it became much more clear what/why was being done.
/T
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Thursday, May 09, 2013 5:23 PM
Points: 52,
Visits: 188
|
|
Okay, I think this is starting to make sense now. Contiguous is defined here as "at least one date entry per day during a range". Okay. And now the rest of the article makes sense too. Cool.
Wow, now that I understand what's going on, this article is freaking awesome. Jeff rocks. (Craig, too.)
"If I had been drinking out of that toilet, I might have been killed." -Ace Ventura
|
|
|
|
|
SSCertifiable
       
Group: General Forum Members
Last Login: 2 days ago @ 10:09 PM
Points: 5,658,
Visits: 6,100
|
|
Alright, let's make the data a little more real world scenario.
You have a contractor who comes by and does work as needed. You're looking for a per problem average of how long it takes for them to fix whatever issue they were brought in for. You've got a log of said person's hours:
1/1/2000 8 1/2/2000 4 1/4/2000 8 1/5/2000 8 1/6/2000 8 1/9/2000 3 1/12/2000 8 1/13/2000 8 1/14/2000 6
So, you need to find a way to 'group' this data, to know which sections are continuous. Using the code and teh data above, there are four distinct groups: 1/1 - 1/2, 1/4 - 1/6, 1/9, and 1/12 - 1/14
Now that you have them grouped, you can group on the result of the code from the article, and SUM( hours). Now you have a per group sum, which you can then average afterwards.
This is also useful if you have non-contiguous ranges and you need to find gaps. For example, instead of grouping hours above, let's say my consultant was lax in remembering to file his paperwork. I can tell him he's missing information on 1/3, 1/7-1/8, and 1/10 - 1/12.
There's a number of applications to this, the above simply being a couple of examples.
- Craig Farrell
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions | Forum Netiquette For index/tuning help, follow these directions. |Tally Tables Twitter: @AnyWayDBA
|
|
|
|
|
Valued Member
      
Group: General Forum Members
Last Login: Thursday, May 09, 2013 5:23 PM
Points: 52,
Visits: 188
|
|
Awesome, got it. Thanks! I've taken up a good deal of your time today, Craig!
"If I had been drinking out of that toilet, I might have been killed." -Ace Ventura
|
|
|
|
|
Ten Centuries
      
Group: General Forum Members
Last Login: Wednesday, May 15, 2013 3:57 AM
Points: 1,163,
Visits: 2,600
|
|
|
|
|
|
SSC-Enthusiastic
      
Group: General Forum Members
Last Login: Wednesday, January 23, 2013 1:37 PM
Points: 148,
Visits: 162
|
|
Nice article.
When Jeff said, "even though the DateGroup looks like a date, it really doesn't mean anything," he was fibbing a bit, probably to keep unnecessary detail out of the article.
To see what the DateGroup means, we can watch what happens to it as we move through the test results from the first set of test data. It starts out as '2009-12-31', and it would keep that value on every row if there were no gaps in the dates. Each time we pass a gap, the number of days in the gap are added to the DateGroup. So the DateGroup means:
The day prior to the first day in our input data, adjusted by the number of days skipped along the way to our current row. Or, put another way:
The day prior to what the first day in our input data would be, if we shuffled all the dates forward to fill in the gaps up to the current row. OK, so Jeff was probably right to gloss over this, but hang on a minute. Now that we know what DateGroup is, we can use it to count the skipped days. It's just the number of days between our earliest date and DateGroup. Here I changed Jeff's query to group by DaysSkipped instead of DateGroup.
WITH cteGroupedDates AS (SELECT UniqueDate = SomeDate, DaysSkipped = DATEDIFF(dd, MinDate, DATEADD(dd, - ROW_NUMBER() OVER (ORDER BY SomeDate), SomeDate) + 1) FROM #MyHead CROSS JOIN (SELECT MIN(SomeDate) AS MinDate FROM #MyHead) m GROUP BY MinDate, SomeDate ) SELECT StartDate = MIN(UniqueDate), EndDate = MAX(UniqueDate), Days = DATEDIFF(dd,MIN(UniqueDate),MAX(UniqueDate))+1, DaysSkipped FROM cteGroupedDates GROUP BY DaysSkipped ORDER BY StartDate ;
This gives:
StartDate EndDate Days DaysSkipped ----------------------- ----------------------- ----------- ----------- 2010-01-01 00:00:00.000 2010-01-01 00:00:00.000 1 0 2010-01-03 00:00:00.000 2010-01-03 00:00:00.000 1 1 2010-01-05 00:00:00.000 2010-01-06 00:00:00.000 2 2 2010-01-10 00:00:00.000 2010-01-12 00:00:00.000 3 5
DaysSkipped gives us a running total of the number of days in the gaps, which might be useful in some circumstances.
|
|
|
|