Median Calculations

• I have created the following to calculate the median time frame that a sales opportunity spends in a particular sales step for a specified rep.  What I'd like to do now is return the median time frame for each sales rep in the system without having to enter each rep into the rep variable one at a time.

    Does anyone have an idea on how I can modify this code so that, regardless of how many sales reps we have in the system, the results returned will list each of those reps and their corresponding median time?

    IF OBJECT_ID('tempdb..##steps') IS NOT NULL

           DROP TABLE ##steps

     

    IF OBJECT_ID('tempdb..##medianApproach') IS NOT NULL

           DROP TABLE ##medianApproach

     

    DECLARE @rep AS char(20)                                                /*This is the rep owner of the opportunity that you want to run the analysis for.  This is where they are either a primary or shared owner.*/

                   

    SET @rep = 'coleman'

     

    /*Create a temporary table to hold the steps for the opportunities we want to evaluate.*/

     

    CREATE TABLE ##steps (opid int, mileid int, mile_code char(25), mile_status char(15), first_open_date datetime, close_date datetime, type varchar(20), rep char(20))

     

    INSERT into ##steps

     

    SELECT op_m.opid,

                    op_m.mileid,

                    op_m.mile_code,

                    op_m.mile_status,

                    op_m.first_open_date,

                    op_m.close_date,

                    op_o.type,

                    op_w.rep

    FROM op_milestones op_m JOIN op_opportunity op_o ON op_m.opid = op_o.opid

                    JOIN en_entity en_e ON op_o.enid = en_e.enid

                    JOIN vw_SizeRollUp vwS ON en_e.orgid = vwS.orgid

                    JOIN op_owner op_w ON op_o.opid = op_w.opid 

    WHERE op_m.mile_status = 'In Progress'

                    AND op_o.type <> 'Contract Renewal'

                    AND op_o.type <> 'Express Order'

                    AND op_w.rep = @rep 

                   

    /*Create temp table to input the values to figure the median value in days for Approach.*/

    CREATE TABLE ##medianApproach (Value INT, uid INT identity(1,1) )

     

    /*Insert the values into the temp table to then figure out the median.  */

     

    INSERT into ##medianApproach

                    SELECT DATEDIFF(dd, s.first_open_date, GETDATE())

                    FROM ##steps s

                    WHERE s.mile_code = 'Approach'

                    ORDER BY DATEDIFF(dd, s.first_open_date, GETDATE()) ASC

                   

    /*Figure out what the median is and set it to the declared variable.*/

    DECLARE @medianApproach AS INT

    SET @medianApproach =

    (SELECT

        AVG(Value)

    FROM

        ##medianApproach

    WHERE

        uid = (SELECT (MAX(uid) + 1) / 2 + (MAX(uid) + 1) % 2 FROM ##medianApproach)

        OR

        uid = (SELECT (MAX(uid) + 1) / 2 FROM ##medianApproach)

    )

    SELECT @medianApproach 

• First, I think the use of global (##) temp tables, as you have, will eventually burn you with a "table already exists" error.  Recommend you change them to local (#) temp tables.

    It would appear that if you added WHERE op_m.mile_code = 'Approach' to your first INSERT into the Steps table, there would be no need to construct the Median Approach table... you could do your final calc directly from the Steps table.  After that, it would be a simple thing to convert the temp table to a table variable and stick your fine code into a User Defined Function that you would pass the Rep# to.
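
    For example, here's a rough sketch of the first part of that idea, assuming a local temp table, the added mile_code filter, and a single rep (the en_entity/vw_SizeRollUp joins from the original are left out for brevity):

     --===== Hypothetical sketch: load #steps for one rep and one step, in DeltaT order,
          -- and take the median directly from it -- no second temp table needed
     IF OBJECT_ID('tempdb..#steps') IS NOT NULL
            DROP TABLE #steps

     CREATE TABLE #steps (uid INT IDENTITY(1,1), DeltaT INT)

     DECLARE @rep AS char(20)
     SET @rep = 'coleman'

     INSERT INTO #steps (DeltaT)
     SELECT DATEDIFF(dd, op_m.first_open_date, GETDATE())
       FROM op_milestones op_m
       JOIN op_opportunity op_o ON op_m.opid = op_o.opid
       JOIN op_owner       op_w ON op_o.opid = op_w.opid
      WHERE op_m.mile_status = 'In Progress'
        AND op_m.mile_code   = 'Approach'            --the added filter
        AND op_o.type <> 'Contract Renewal'
        AND op_o.type <> 'Express Order'
        AND op_w.rep = @rep
      ORDER BY DATEDIFF(dd, op_m.first_open_date, GETDATE())

     SELECT AVG(DeltaT) AS MedianApproach
       FROM #steps
      WHERE uid = (SELECT (MAX(uid) + 1) / 2 + (MAX(uid) + 1) % 2 FROM #steps)
         OR uid = (SELECT (MAX(uid) + 1) / 2 FROM #steps)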

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • I could add the WHERE to the first INSERT statement if all I had was Approach.  I actually have four other "steps" that I evaluate the median for as well.  (I didn't include all of that so I could keep the posting as short as I could.)

    Here is my other issue.  We are still running on SQL 7 so I don't think that I can use a table variable, correct?

    How would I use a UDF and pass the Rep to it?  What I'd like to do is have all of this run in a package for instance and just know to run it for all our reps in the system without any user intervention.

    Thanks for your help.

• I would solve each of the 4 separately... the ol' "peel one potato at a time" approach usually works the best on these types of things but, of course, that's your decision.

    You are correct, SQL Server 7 does not have table variables... it doesn't have UDF capabilities either.

    Understood on the "without any user intervention".  Because of the absence of UDF capabilities, you should probably turn your script into a stored procedure where you pass the Rep id... if you don't know how to do that (recommend you check CREATE PROCEDURE in Books Online first), then post back.
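
    A minimal skeleton of that idea might look like the following (the procedure name is made up, and the body is just a placeholder for your existing #steps/median script):

     --===== Hypothetical sketch only: wrap the existing script in a procedure
          -- that takes the rep as a parameter
     CREATE PROCEDURE dbo.usp_RepStepMedian
            @rep char(20)
     AS
     SET NOCOUNT ON

     -- ...the existing temp table and median logic goes here, using the @rep
     -- parameter everywhere the script currently relies on SET @rep = 'coleman'...

     SELECT @rep AS rep   --placeholder; return the rep and the calculated median here
     GO

     --===== A scheduled job (or a loop over the list of reps) can then call it per rep
     EXEC dbo.usp_RepStepMedian @rep = 'coleman'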

    To save time on any future posts, you may want to include the fact that you're using SQL Server 7 so people don't waste their time on a UDF recommendation.

    --Jeff Moden



  • I was doing some checks on your formulas... because of the integer math you are using everywhere, you're coming up with the Statistical Median when you have an even number of entries as opposed to the Mathematical Median.  For example, if you have 12 entries with values from 1 to 12, your calculations produce the number 6 (Statistical Median) rather than 6.5 (Mathematical Median).  Not sure which one you actually want...
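
    A quick way to see the difference with the two middle values from that example (6 and 7):

     --===== Integer math versus decimal math on the two middle values of 1 through 12
     SELECT (6 + 7) / 2   AS StatisticalMedian,   -- returns 6   (integer division)
            (6 + 7) / 2.0 AS MathematicalMedian   -- returns 6.5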

    Obviously, I'm working on this... believe I've found a way to at least double the performance of this calc, give you the ability to pass @Rep as a parameter, and the option to select whether you want to return the Statistical Median or the Mathematical Median with another simple parameter.  I just need a bit more time... 

    --Jeff Moden



• Oki-doki... change of plans... decided to go full-bore on this one (Man! This one was FUN!).  First, if you had SQL Server 2005 (you don't and I don't), we could use the awesome new power of CTEs and OVER as highlighted in the following URL (SQL Server Magazine article written by Itzik Ben-Gan)...

    http://www.sqlmag.com/Article/ArticleID/49827/sql_server_49827.html

    He does demo a way to do the same thing in SQL Server 7/2000 in his "Listing 3" (http://www.sqlmag.com/Files/09/49872/Listing_03.txt), but, as he points out, it's horribly slow (that makes SQL Server 2005 better, right? NOT!).  The reason is, he made the mistake of using aggregates for this instead of continuing the sort for a single row "page" as if doing paging in a GUI.  YOU do something similar in your code in that you use aggregates as well as having the overhead of moving data from one temp table to another for each rep.  I'm thinking that's going to be a bit slow... in fact, it takes 3.983 seconds on my box for your code to return just 1 Median on a million row table.  Aggregates are the problem...

    Your goal, from what I can see, is to do an "analysis" (Median I take it) for each Rep and because it takes so long, you want to be able to pass in individual Rep names and the Step to analyze...

    ... well, how 'bout this... since I don't have your data at hand, let's simulate what you load into the "Steps" temporary table... we'll use a MILLION entries, 17,576 Rep "Names", 4 steps (as you asked), and 2000 different values of "DeltaT" (you called it "Value" in your script)...  here's the code to build that simulation table... (takes about 55 seconds to build including the 2 indexes)...

    --===== If the temp table already exists, drop it

         IF OBJECT_ID('TempDB..#Steps') IS NOT NULL

            DROP TABLE #Steps

    --===== Simulate loading your real temp table with ...

         -- A MILLION rows, 17576 reps (3 character "names"), a DeltaT from 0 to 1999,

         -- and 4 "Steps" (Step 1, Step 2, etc)

     SELECT TOP 1000000 IDENTITY(INT,1,1) AS RowNum,

            CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))

          + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65))

          + CHAR(STR(RAND(CAST(NEWID() AS VARBINARY))*25+65)) AS Rep,

            CAST(RAND(CAST(NEWID() AS VARBINARY))*2000 AS INT) AS DeltaT,

            'Step '

           + CAST(CAST(RAND(CAST(NEWID() AS VARBINARY))*4 AS INT)+1 AS VARCHAR(25)) AS MileCode

       INTO #Steps

       FROM Master.dbo.SYSCOLUMNS sc1,

            Master.dbo.SYSCOLUMNS sc2

    --===== Every table, even temporary ones, should have a Primary Key...

         -- Makes all the difference in the world, performance wise, on this script.

      ALTER TABLE #Steps

        ADD PRIMARY KEY CLUSTERED (RowNum)

    --===== And, trust me, you want this index. It's a "covering" index for the

         -- ORDER BYs, the JOINs, and the data that will appear in the code that follows.

     CREATE INDEX IX_Steps ON #Steps

             (Rep,MileCode,DeltaT)

    Here comes the "magic"... I decided to return the Median for ALL the reps by name and by step... this little gem, unlike the aggregate method, is fast as hell.  It returns all the "analysis" rows in just 32 seconds.  Remember, 1 MILLION rows, 17,576 Reps, and 4 steps are simulated in the table (note that because this is all random data, some reps may be missing a step, but most have 4) for a total of about 70,200 Medians in just 32 seconds (or 1 Median analysis in only 0.0004558 seconds, a lot better than almost 4 seconds for 1)...

     SELECT DISTINCT

            m.Rep,

            m.MileCode,

            (

                (SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.Rep = m.Rep AND t.MileCode = m.MileCode

                     ORDER BY DeltaT ASC

                   ) lo

                 ORDER BY DeltaT DESC)

               +(SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.Rep = m.Rep AND t.MileCode = m.MileCode

                     ORDER BY DeltaT DESC

                   ) hi

                 ORDER BY DeltaT ASC)

            ) / 2 AS MEDIAN

       FROM #Steps m

     ORDER BY m.Rep,m.MileCode

    You can, of course, put a WHERE clause between the FROM and ORDER BY (any 3 letters will do for this test) and you'll be amazed at how fast it returns.  But, why not leave it out and do the analysis for ALL the reps at once?  It's fast enough...
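
    For example, here's the same query limited to a single simulated rep ('ABC' is just a guess at one of the random three-letter names):

      SELECT DISTINCT
             m.Rep,
             m.MileCode,
             (
                 (SELECT TOP 1 DeltaT FROM
                    (SELECT TOP 50 PERCENT DeltaT FROM #Steps t
                      WHERE t.Rep = m.Rep AND t.MileCode = m.MileCode
                      ORDER BY DeltaT ASC
                    ) lo
                  ORDER BY DeltaT DESC)
                +(SELECT TOP 1 DeltaT FROM
                    (SELECT TOP 50 PERCENT DeltaT FROM #Steps t
                      WHERE t.Rep = m.Rep AND t.MileCode = m.MileCode
                      ORDER BY DeltaT DESC
                    ) hi
                  ORDER BY DeltaT ASC)
             ) / 2 AS MEDIAN
        FROM #Steps m
       WHERE m.Rep = 'ABC'   --the added WHERE clause
       ORDER BY m.Rep,m.MileCode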

    Ok, for your code, the first thing I'd do is incorporate the DeltaT column into the #Steps table, like I did, and lose the global (##) temp table reference... like this...

         IF OBJECT_ID('TempDB..#Steps') IS NOT NULL

            DROP TABLE #Steps

     CREATE TABLE #Steps

            (

            RowNum          INT IDENTITY(1,1) PRIMARY KEY CLUSTERED, --Added this column

            opid            INT,

            mileid          INT,

            mile_code       VARCHAR(25), --Changed this data type

            mile_status     VARCHAR(15), --Changed this data type

            first_open_date DATETIME,

            DeltaT          INT,         --Added this column

            close_date      DATETIME,

            type            VARCHAR(20),

            rep             VARCHAR(20)  --Changed this data type

            )

     INSERT INTO #Steps

           (opid,

            mileid,

            mile_code,

            mile_status,

            first_open_date,

            DeltaT,

            close_date,

            type,

            rep)

     SELECT op_m.opid,

            op_m.mileid,

            op_m.mile_code,

            op_m.mile_status,

            op_m.first_open_date,

            DATEDIFF(dd, op_m.first_open_date, GETDATE()) AS DeltaT, --Added this calc

            op_m.close_date,

            op_o.type,

            op_w.rep

       FROM op_milestones  op_m

       JOIN op_opportunity op_o ON op_m.opid  = op_o.opid

       JOIN en_entity      en_e ON op_o.enid  = en_e.enid

       JOIN vw_SizeRollUp  vwS  ON en_e.orgid = vwS.orgid

       JOIN op_owner       op_w ON op_o.opid  = op_w.opid 

      WHERE op_m.mile_status =  'In Progress'

        AND op_o.type        <> 'Contract Renewal'

        AND op_o.type        <> 'Express Order'

       

    CREATE INDEX IX_Steps ON #Steps  --Don't forget the index!!

             (rep, mile_code, DeltaT)

       

    Then, just use the same code as I did to do the full analysis... (takes about 32 seconds on the test data)

     SELECT DISTINCT

            m.rep,

            m.mile_code,

            (

                (SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.rep = m.rep AND t.mile_code = m.mile_code

                     ORDER BY DeltaT ASC

                   ) lo

                 ORDER BY DeltaT DESC)

               +(SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.rep = m.rep AND t.mile_code = m.mile_code

                     ORDER BY DeltaT DESC

                   ) hi

                 ORDER BY DeltaT ASC)

            ) / 2 AS MEDIAN

       FROM #Steps m

     ORDER BY m.rep, m.mile_code

    If you want to see what the Median DeltaT is for each step across all reps... (takes about 18 seconds on the test data)

     SELECT DISTINCT

            m.mile_code,

            (

                (SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.mile_code = m.mile_code

                     ORDER BY DeltaT ASC

                   ) lo

                 ORDER BY DeltaT DESC)

               +(SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.mile_code = m.mile_code

                     ORDER BY DeltaT DESC

                   ) hi

                 ORDER BY DeltaT ASC)

            ) / 2 AS MEDIAN

       FROM #Steps m

     ORDER BY m.mile_code

    And, if you want the overall median... (takes about 7 seconds on the test data)

     SELECT (

                (SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     ORDER BY DeltaT ASC

                   ) lo

                 ORDER BY DeltaT DESC)

               +(SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     ORDER BY DeltaT DESC

                   ) hi

                 ORDER BY DeltaT ASC)

            ) / 2 AS OverallMEDIAN

    All of the code above produces the Statistical Median... if you want to change it to the Mathematical Median, just change the "/2" to "/2.0" and you're done.
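
    For instance, the overall version with that one change applied:

      SELECT (
                 (SELECT TOP 1 DeltaT FROM
                    (SELECT TOP 50 PERCENT DeltaT FROM #Steps t
                      ORDER BY DeltaT ASC
                    ) lo
                  ORDER BY DeltaT DESC)
                +(SELECT TOP 1 DeltaT FROM
                    (SELECT TOP 50 PERCENT DeltaT FROM #Steps t
                      ORDER BY DeltaT DESC
                    ) hi
                  ORDER BY DeltaT ASC)
             ) / 2.0 AS OverallMEDIAN   --2.0 instead of 2 returns the Mathematical Median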

    Now, if you still really, really want to do it one rep at a time, write back and we'll put this puppy in a stored proc...

    --Jeff Moden



• Jeff, I haven't read through all of your post so far, but I am anxious to dig into it.  Wanted to let you know I'll get to it since you have been so good on this with me.  I have plans for the evening and will be running kids around tomorrow, so maybe tomorrow night or Monday morning.

    Regarding your earlier post, I see what you are saying about the integer math.  I had used something similar to this for a past application and the Statistical Median was fine for that.  Now that I look at it, the Mathematical Median may work better for what I'm doing here.  I'll have to look at a few different things in what we're doing here.

    I have done a little in the past with Stored Procedures and have passed parameters to them.  What I wonder, however (and maybe I can't do this), is this: let's say I have 10 sales reps and I want this to run weekly to get their information.  Would I have to set up 10 different jobs to do that, each specific to a rep, or could I have the system cycle through and run it for each rep?  Maybe you've answered this in the above post.

    Good call on letting others know I'm using SQL 7.  I sometimes forget that because, as I'm trying to keep up with all the requests and new stuff from users on our production system, I am also in the process of setting up a new server with 2005.  (Can't seem to get enough time to work on that - can't wait to get there.)  I just have some head scratchers to figure out with executing SSIS packages through SQL Agent, and then I think I can pull the trigger.

    Anyway, thanks for your help.  I really appreciate it and I'll post back once I get a chance to dig into this and take it for a spin.

  • I think that once you read my "long" post with the sample code, you'll find that 10 reps on a single job will be the way to go...

    --Jeff Moden



  • Jeff,

    "WOW!"  This is pretty cool. Not only does it work great, but is giving me some more ideas on some other things I am working on.

    Sorry it took so long for me to respond to your prompt replies.  I walked into work Monday to a bunch of fires and just got everyone taken care of late this afternoon.

    A couple of quick questions.  You included a RowNum; is that just so you can get the clustered index?  I hadn't thought much about that because, although the opid isn't unique, it's pretty close.  (Some reps can be sharing a certain percentage of each opid.)  Makes sense, though, and it will help the performance.

    Also you added the DeltaT column.  That is really the same thing that I was calling Value, correct?

    We may have a need at some time to call this for an individual rep or group of reps out of our GUI system that users use.  If that is the case, I could just create a stored procedure out of the whole thing and then pass a parameter to an added WHERE clause, correct?

    Having seen what you are doing here, I'm going to try (hopefully sometime this week) to incorporate this with something else I'm already doing, which is taking a weekly snapshot of the data (on Sunday).  Currently, what I capture (per rep) is the number of new opids that were created in the last 7 days as well as the quantity in each step at that time.  It would be great to also include the median, and now I think I can.  I'll post the solution I have when I get it figured out, if you want.

    Anyway, thanks for all your help.  Cool approach and I can't thank you enough for the solution you provided as well as helping me to look at different and new ways to solve things.

    Curt

  • >"WOW!"  This is pretty cool. Not only does it work great, but is giving me some more ideas on some other things I am working on.

    That's always part of the idea.

    >Sorry it took so long for me to respond to your prompt replies.  I walked into work Monday to a bunch of fires and just got everyone taken care of late this afternoon.

    Sometimes it's no longer a matter of how many sticks you have in the fire; it's a matter of how many fires you have sticks in.

    >A couple of quick questions.  You included a RowNum; is that just so you can get the clustered index?  I hadn't thought much about that because, although the opid isn't unique, it's pretty close.  (Some reps can be sharing a certain percentage of each opid.)  Makes sense, though, and it will help the performance.

    Yes... I needed full control over the data and that provides it just as you said.

    >Also you added the DeltaT column.  That is really the same thing that I was calling Value, correct?

    Yes...

    >We may have a need at some time to call this for an individual rep or group of reps out of our GUI system that users use.  If that is the case, I could just create a stored procedure out of the whole thing and then pass a parameter to an added WHERE clause, correct?

    Now you're thinking... Exactly!!  Just remember that if you create a temp table in a sub-proc, you can't see it from the calling proc.  You could change the temp table to a sort of permanent scratchpad table that you could join to... have the proc run a bit after midnight to create the table of Medians and you can join to it until the cows come home.
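
    A rough sketch of that scratchpad idea (dbo.RepStepMedian is a made-up name, and the INSERT assumes the nightly proc has already loaded the #Steps table shown earlier):

     --===== Hypothetical sketch: a permanent "scratchpad" table of medians,
          -- rebuilt by a scheduled proc a bit after midnight
     IF OBJECT_ID('dbo.RepStepMedian') IS NOT NULL
        DROP TABLE dbo.RepStepMedian

     CREATE TABLE dbo.RepStepMedian
            (
            rep        VARCHAR(20),
            mile_code  VARCHAR(25),
            MedianDays INT,
            CalcDate   DATETIME DEFAULT GETDATE()
            )

     INSERT INTO dbo.RepStepMedian (rep, mile_code, MedianDays)
     SELECT DISTINCT
            m.rep,
            m.mile_code,
            (
                (SELECT TOP 1 DeltaT FROM
                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t
                     WHERE t.rep = m.rep AND t.mile_code = m.mile_code
                     ORDER BY DeltaT ASC
                   ) lo
                 ORDER BY DeltaT DESC)
               +(SELECT TOP 1 DeltaT FROM
                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t
                     WHERE t.rep = m.rep AND t.mile_code = m.mile_code
                     ORDER BY DeltaT DESC
                   ) hi
                 ORDER BY DeltaT ASC)
            ) / 2
       FROM #Steps m

     --===== The GUI (or any report) can then simply join to dbo.RepStepMedian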

    >Having seen what you are doing here, I'm going to try (hopefully sometime this week) to incorporate this with something else I'm already doing, which is taking a weekly snapshot of the data (on Sunday).  Currently, what I capture (per rep) is the number of new opids that were created in the last 7 days as well as the quantity in each step at that time.  It would be great to also include the median, and now I think I can.  I'll post the solution I have when I get it figured out, if you want.

    Be sure to look up "cross tab" in Books Online... it helps a lot in making such a report and, when done correctly, is nasty fast.

    >Anyway, thanks for all your help.  Cool approach and I can't thank you enough for the solution you provided as well as helping me to look at different and new ways to solve things.

    Great... thank you for the feedback... I love it when folks "think" and come up with new ways to solve things.  I do have to warn you, though... this type of "calculate across rows" can be a bit fickle if you don't do things just right.  ALWAYS check the results before you productionalize.  And, other uses include, but are not limited to, Running Totals, Grouped Running Totals, Running Counts, Grouped Running Counts, and a host of other things.  But, you must have full control of the data and the indexes (usually, a temp table helps).  Also, most folks will agree that this type of thing is better accomplished in an application... that is usually true except in the face of large volumes of data which can "clog the pipe", if you know what I mean.

    Anyway, thanks for the feedback... I really look forward to your new solutions!

    --Jeff Moden



  • You need to be careful about this, since it doesn't work in a predictable way.

    It used to (I think anyway), but it doesn't anymore. I thought it used to in 7.0, but after a quick test, I don't think even 7.0 will behave 'as expected'.

    The 'problem' is ordering, which is a requirement for a median calculation.

    SQL Server does *not* guarantee to abide by any ORDER BY except in the outermost scope.

    e.g.

     SELECT DISTINCT

            m.Rep,

            m.MileCode,

            (

                (SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.Rep = m.Rep AND t.MileCode = m.MileCode

                     ORDER BY DeltaT ASC   <== Not guaranteed to order by *

                   ) lo

                 ORDER BY DeltaT DESC)     <== Not guaranteed to order by

               +(SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.Rep = m.Rep AND t.MileCode = m.MileCode

                     ORDER BY DeltaT DESC  <== Not guaranteed to order by *

                   ) hi

                 ORDER BY DeltaT ASC)      <== Not guaranteed to order by

            ) / 2 AS MEDIAN

       FROM #Steps m

     ORDER BY m.Rep,m.MileCode   <== This is the ONLY guaranteed order by

    ...so, if it gets it right, it's by chance - unfortunately, we cannot rely on this method for median calculations.

    * Note - from a quick look at the plan when I tested on a 7.0 box, it seems the innermost sort (*) was actually performed; however, the 'flip-over' never happens, which leaves you with a median calculation of (<lowest> + <highest>) / 2 - not what we would like.

    /Kenneth

  • Kenneth,

    I don't believe this is an issue because the ORDER BY clause is invalid in derived tables, subqueries, etc. unless TOP is also specified, which is what we are doing here.  I've done a bunch of testing and compared it to what I had with the original code, and I am getting consistent results.  If you still think I'm missing something here, please let me know.

    Jeff,

    Now I'm having fun with this.  I've added this to some other things I was doing and am now returning a weekly snapshot of the data (to be run every Sunday) that returns the rep, mile_code, new opids created for each rep in the past 7 days, the number of records in each mile_code, and the associated median for it.  It is fast.  In the past, to get just the median calculation by step (which was 5 different calculations and 6 different temp tables), it was averaging 5.3745 seconds.  Now I'm getting all of this additional data for each rep all at once in an average of 2.674 seconds.  (Or, since we are dealing with Medians, 2.695 seconds.)

    Now, one thing I am trying to figure out, in case it comes up in the future: if I run just what you came up with, I get the rep, mile_code, and corresponding median.  If a rep doesn't have anything in one of the mile_codes, how can I still get that row to return, but with a 0 as the median?  I have a table with just the mile_codes in it and tried tying that in, but I can't seem to get it to work.

    Anyway, here is the code that gets me the data I was after.

    Thanks again for all your help.

    IF OBJECT_ID('tempdb..#Report') IS NOT NULL

                    DROP TABLE  #Report

     

    CREATE TABLE #Report

                    (       

                    RowNum      INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

                    rep              VARCHAR(20),

                    description    VARCHAR(20),

                    opportunities DECIMAL(5,1)

                    )

     

    INSERT into #Report

                    (rep,

                    description,

                    opportunities)

     

    SELECT op_w.rep AS [Rep],

                    op_o.curr_mile,

                    ISNULL(SUM(1*op_w.true_percentage), 0) AS [Opportunities]

    FROM op_opportunity op_o JOIN op_owner op_w ON op_o.opid = op_w.opid

    WHERE op_o.close_date IS NULL

                    AND op_o.type <> 'Contract Renewal'

                    AND op_o.type <> 'Express Order'

                    AND op_w.role = 'Sales'

    GROUP BY  op_w.rep, op_o.curr_mile, op_w.role

     

    CREATE INDEX IX_Report ON #Report

          (rep,  description , opportunities)

     

    IF OBJECT_ID('TempDB..#Steps') IS NOT NULL

                    DROP TABLE #Steps

     

    CREATE TABLE #Steps

                    (

          RowNum     INT IDENTITY(1,1) PRIMARY KEY CLUSTERED,

          mile_code   VARCHAR(25),

          DeltaT        INT,        

          rep             VARCHAR(20) 

          )

    INSERT INTO #Steps

                    (mile_code,

                    DeltaT,

                    rep)

    SELECT op_m.mile_code,

                    DATEDIFF(dd, op_m.first_open_date, GETDATE()) AS DeltaT,

                    op_w.rep

    FROM op_milestones  op_m JOIN op_opportunity op_o ON op_m.opid  = op_o.opid

                    JOIN op_owner       op_w ON op_o.opid  = op_w.opid 

    WHERE op_m.mile_status =  'In Progress'

                    AND op_o.type        <> 'Contract Renewal'

                    AND op_o.type        <> 'Express Order'

                    AND op_w.role = 'Sales'

       

    CREATE INDEX IX_Steps ON #Steps 

          (rep, mile_code, DeltaT)

     

    SELECT CONVERT(char(10), GETDATE(), 101) AS [Record_Date],

                    r.rep,

                    CAST(ISNULL(NewOppCount.NewOpps, 0) AS decimal(5,1)) AS [NewOpps],

                    ISNULL(SUM(CASE description WHEN 'Approach' THEN Opportunities END),0) AS [Approach],

                    ISNULL(SUM(CASE Medianrep.mile_code WHEN 'Approach' THEN Median END),0) AS [Approach Median],

                    ISNULL(SUM(CASE description WHEN 'Interview' THEN Opportunities END),0) AS [Interview],

                    ISNULL(SUM(CASE Medianrep.mile_code WHEN 'Interview' THEN Median END),0) AS [Interview Median],

                    ISNULL(SUM(CASE description WHEN 'Demonstrate' THEN Opportunities END),0) AS [Demonstrate],

                    ISNULL(SUM(CASE Medianrep.mile_code WHEN 'Demonstrate' THEN Median END),0) AS [Demonstrate Median],

                    ISNULL(SUM(CASE description WHEN 'Negotiate' THEN Opportunities END),0) AS [Negotiate],

                    ISNULL(SUM(CASE Medianrep.mile_code WHEN 'Negotiate' THEN Median END),0) AS [Negotiate Median],

                    ISNULL(SUM(CASE description WHEN 'Close' THEN Opportunities END),0) AS [Closestep],

                    ISNULL(SUM(CASE Medianrep.mile_code WHEN 'Close' THEN Median END),0) AS [Close Median]

    FROM #Report r

                    LEFT JOIN (

                                    SELECT op_w.rep, SUM(1*op_w.true_percentage)

                                    FROM op_opportunity op_o JOIN op_owner op_w ON op_o.opid = op_w.opid

                                    WHERE CAST(FLOOR(CAST(op_o.open_date AS float))AS datetime) >= DATEADD(dd, -7, CAST(FLOOR(CAST(getdate() AS float))AS datetime))

                                                    AND op_o.type <> 'Contract Renewal'

                                                    AND op_o.type <> 'Express Order'

                                    GROUP BY op_w.rep

                                    ) AS NewOppCount (rep, NewOpps)

                    ON NewOppCount.rep = r.rep

                     LEFT JOIN ( SELECT DISTINCT

            m.Rep,

                     m.Mile_Code,

            (

                (SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.Rep = m.Rep AND t.Mile_Code = m.Mile_Code

                     ORDER BY DeltaT ASC

                   ) lo

                 ORDER BY DeltaT DESC)

               +(SELECT TOP 1 DeltaT FROM

                   (SELECT TOP 50 PERCENT DeltaT FROM #Steps t

                     WHERE t.Rep = m.Rep AND t.Mile_Code = m.Mile_Code

                     ORDER BY DeltaT DESC

                   ) hi

                 ORDER BY DeltaT ASC)

            ) / 2 AS MEDIAN

       FROM #Steps m) AS MedianRep (rep, Mile_code, Median)

    ON Medianrep.rep = r.rep AND medianrep.mile_code = r.description

    GROUP BY r.Rep, NewOppCount.NewOpps

    ORDER BY r.rep 

     

• > I have a table with just the mile_codes in it and tried tying that in, but I can't seem to get it to work.

    Do an outer join to that table...
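
    Something along these lines, for instance (dbo.mile_codes and its mile_code column are guesses at your lookup table; swap in the real names):

     --===== Hypothetical sketch: build every rep/mile_code combination first, then
          -- LEFT JOIN the medians to it so that missing steps come back as 0
      SELECT r.rep,
             mc.mile_code,
             ISNULL(med.Median, 0) AS Median
        FROM (SELECT DISTINCT rep FROM #Steps) r
       CROSS JOIN dbo.mile_codes mc            --your lookup table of mile_codes
        LEFT JOIN (SELECT DISTINCT
                          m.rep,
                          m.mile_code,
                          (
                              (SELECT TOP 1 DeltaT FROM
                                 (SELECT TOP 50 PERCENT DeltaT FROM #Steps t
                                   WHERE t.rep = m.rep AND t.mile_code = m.mile_code
                                   ORDER BY DeltaT ASC
                                 ) lo
                               ORDER BY DeltaT DESC)
                             +(SELECT TOP 1 DeltaT FROM
                                 (SELECT TOP 50 PERCENT DeltaT FROM #Steps t
                                   WHERE t.rep = m.rep AND t.mile_code = m.mile_code
                                   ORDER BY DeltaT DESC
                                 ) hi
                               ORDER BY DeltaT ASC)
                          ) / 2 AS Median
                     FROM #Steps m) med
          ON med.rep = r.rep AND med.mile_code = mc.mile_code
       ORDER BY r.rep, mc.mile_code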

    Kenneth, you are correct about the ORDER BY guarantee... UNLESS you have total control of the data and the indexes as well as exclusive use of the table... hence the temp table or a limited working table.  Still, I've never had a failure with this method once the clustered index is established correctly, and it will withstand multiple disparate inserts, deletes, updates, etc.  I've also never had a failure with non-clustered indexes.  Maybe I've just been lucky for the last 4 years I've been using this method.

    Sorry for the brevity but just got an emergency call from work and gotta run.

    --Jeff Moden



  • Damn this stupid interface that suddenly throws away a post instead of posting it...

    Oh, well.. Here goes again then...

    Well, that was sort of my point.

    The thing is, if we have to rely on physical implementation details, then we admittedly need to 'tie the hands' of SQL Server, forcing it to behave in an unnatural way.  So what I'm saying is that if we try this, we're on our own entirely.  SQL Server itself makes no warranties that we'll always get what we expect from this.

    When I did my own quick test on a 7.0 box, it was on a column without an index, and it failed miserably.  (Unfortunately... I wish it did work.)

    So, again - my point is *caution* - you're on your own here.  Don't count on getting any help from SQL Server; it's more likely that it'll try to break it if it gets the chance.

    /Kenneth

     

     

     

• Um, you need to go back and read the code, Ken...  I'm not really relying on "physical" implementation at all, and you should never try this on a "column without an index".  Do notice the WITH (INDEX()) hints I used in the code... it relies on an index or two... not the physical order.  The clustered index certainly helps, especially where performance comes in, but this works without a clustered index, too.

    I already warned about it being a bit "fickle" (the code must be perfect for content and order) in my original post, but once you get one working, it's rock solid.

    --Jeff Moden


