Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Calculate Weighted Median Expand / Collapse
Author
Message
Posted Tuesday, March 2, 2010 1:31 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, April 17, 2010 9:39 AM
Points: 16, Visits: 73
Hi,
Could anyone please help me in writing a query to find the weighted Median for the following data

CREATE TABLE [dbo].[dr_temp](
[PK] [int] IDENTITY(1,1) NOT NULL,
[ID] [int] NOT NULL,
 [nvarchar](10) NOT NULL,
[Value] [int] NOT NULL,
[Volume] [int] NOT NULL
)

insert into dr_temp(ID,Code,Volume,Value) values(1,'120',40,30778)
insert into dr_temp(ID,Code,Volume,Value) values(1,'120',40,24860)
insert into dr_temp(ID,Code,Volume,Value) values(1,'120',40,82043)
insert into dr_temp(ID,Code,Volume,Value) values(1,'120',41,136116)
insert into dr_temp(ID,Code,Volume,Value) values(1,'120',42,106886)
insert into dr_temp(ID,Code,Volume,Value) values(1,'120',45,179646)

[code="vb"]
ID CODE Value Volume RunningTotal Average
1 120 40 30778 30778
1 120 40 24860 55638
1 120 40 82043 137681
1 120 41 136116 273797
1 120 42 106886 380683
1 120 45 179646 560329 280164.5

I'm expectiing the Value to be 45... How

Cumulative/2 value is 280164.5 which actually falls b/w 106886 and 380683 but 380683 is greater than 280164.5 so the value is 45.

Hope this helps.

Kindly help me out .

If this is possible in SSIS also please let me know.

For any questions do reply me.

Thanks
Post #875503
Posted Tuesday, March 2, 2010 1:54 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 9:40 AM
Points: 6,842, Visits: 13,372
Would you mind elaborate the business case?
Especially the part
falls b/w 106886 and 380683 but 380683 is greater than 280164.5 so the value is 45.

Why do you compare the median value to a single Value (106886) instead of the previous running toal (273797)?
Also, whydo you use the value of the next row that is larger than the row that already exceeded your median value? (I would expect the target value to be 42, not 45).
Finally, your sample data don't match your result set (value and volume are mixed).
Please clarify.




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #875520
Posted Tuesday, March 2, 2010 2:03 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, April 17, 2010 9:39 AM
Points: 16, Visits: 73
Hi,

Thanks for the reply

Here is the business rule

280164.5 falls in the range b/w 106886 and 380683

However the ending range i.e 380683 > than 280164.5, due to which we need to select the next column which matched the criteria.

However this might not be happening to othe Code's

Hope this helps.

And oops! Column Mismatched.

Thanks

Post #875524
Posted Tuesday, March 2, 2010 2:16 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 9:40 AM
Points: 6,842, Visits: 13,372
It still is not really clear:
280164.5 falls in the range b/w 106886 and 380683.

But it also falls in the range b/w 179646 and 560329. Why not use that one?




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #875530
Posted Tuesday, March 2, 2010 2:20 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, April 17, 2010 9:39 AM
Points: 16, Visits: 73
Hi,

Yep we can use the last one.

Can this done in a SQL Query?

Thanks
Post #875535
Posted Tuesday, March 2, 2010 2:35 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 9:40 AM
Points: 6,842, Visits: 13,372
ashok_raja (3/2/2010)
Hi,

Yep we can use the last one.

Can this done in a SQL Query?

Thanks


It depends.
Usually, a SQL query will rely on some logic requirement(s). If both results are possible, which one has to be used as a base to determin "the next row"?
Let's assume your next row would look like the following (ignore the wrong math. I didn't bother to calculate the new RunningTotal value).
ID      CODE    Value   Volume  RunningTotal  
1 120 46 379646 660329

If we would use the logic from your previous statements, the the result would be 45.
But if we'd also consider your last statement, the result would be either 45 or 46.
That's possible to do with SQL as well. But the question is: What is the correct answer? And why?




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #875548
Posted Tuesday, March 2, 2010 2:41 PM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, April 17, 2010 9:39 AM
Points: 16, Visits: 73
Hi,

Actually 46 will not be there as the next row will be for different ID
i.e
Insert into dr_temp(ID,Code,Value,Volume) Values(
2,'120',31,2311)

if present then Yes! it matches, I think Let's get the 1st match
B/w 106886 and 380683. Is this Possible.

Thanks
Post #875551
Posted Tuesday, March 2, 2010 2:53 PM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 9:40 AM
Points: 6,842, Visits: 13,372
ashok_raja (3/2/2010)
Hi,

Actually 46 will not be there as the next row will be for different ID
i.e
Insert into dr_temp(ID,Code,Value,Volume) Values(
2,'120',31,2311)

if present then Yes! it matches, I think Let's get the 1st match
B/w 106886 and 380683. Is this Possible.
Thanks

Yes.
Do you have a clustered index on your table and if yes, what columns are included?
Is there any correlation between the clustered index and the volume column? (eg. if I sort the table based on the clustered index column, will the values for the volume column per ID be in order as well?)




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #875558
Posted Wednesday, March 3, 2010 9:47 AM
Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Saturday, April 17, 2010 9:39 AM
Points: 16, Visits: 73
Hi
Apoligize for the confusion..

Here is what I want

ID CODE Value Volume RunningTotal Average
1 120 40 30778 30778
1 120 40 24860 55638
1 120 40 82043 137681
1 120 41 136116 273797
1 120 42 106886 380683
1 120 45 179646 560329 280164.5

Since 280164.5 falls b/w 273797 and 380683, the result set should be
1 120 42 106886 380683

Should be able to find median b/w the running total...

Hope this clears...

Thanks
Post #876115
Posted Wednesday, March 3, 2010 10:06 AM


SSCertifiable

SSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiableSSCertifiable

Group: General Forum Members
Last Login: Yesterday @ 9:40 AM
Points: 6,842, Visits: 13,372
ashok_raja (3/3/2010)
Hi
Apoligize for the confusion..

Here is what I want

ID CODE Value Volume RunningTotal Average
1 120 40 30778 30778
1 120 40 24860 55638
1 120 40 82043 137681
1 120 41 136116 273797
1 120 42 106886 380683
1 120 45 179646 560329 280164.5

Since 280164.5 falls b/w 273797 and 380683, the result set should be
1 120 42 106886 380683

Should be able to find median b/w the running total...
Hope this clears...

Thanks


With your reply, none of my questions in my previous post has been answered.
Hard to help...




Lutz
A pessimist is an optimist with experience.

How to get fast answers to your question
How to post performance related questions
Links for Tally Table , Cross Tabs and Dynamic Cross Tabs , Delimited Split Function
Post #876136
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse