Median MDX Calculation Performance Problems

  • Ok, I've been hacking at this for a while now using the median function or using some combination of tail/head to do the same thing and I keep running into this same issue. In these examples, I'm going to use the median function to demonstrate just because the code is easier to look at.

    So, I run this:

    WITH MEMBER [Measures].[MedianPID] AS

    MEDIAN (

    NONEMPTY(([Measures].[Process Duration], [Job Completed Date].[Job Completed Calendar Period].members))

    )

    SELECT [Measures].[MedianPID] ON 0

    , ([Job Completed Date].[Job Completed Calendar Period].[Month].&[2014]&[5],[Department].[Department].&[7]) on 1

    FROM [MyCube]

    I get the correct answer (which is 1039 in this case), but it's slow. Takes around 10 seconds to run. However, if I run this:

    WITH MEMBER [Measures].[MedianPID] AS

    MEDIAN (

    NONEMPTY(([Measures].[Process Duration], [Job Completed Date].[Job Completed Calendar Period].currentmember))

    )

    SELECT [Measures].[MedianPID] ON 0

    , ([Job Completed Date].[Job Completed Calendar Period].[Month].&[2014]&[5],[Department].[Department].&[7]) on 1

    FROM [MyCube]

    It comes back in a blink, but the answer is way off (114543265 in this case - which I think is the SUM of all the measure elements). The difference between the two.

    Slow, correct: [Job Completed Date].[Job Completed Calendar Period].members

    fast, incorrect: [Job Completed Date].[Job Completed Calendar Period].currentmember

    What I don't understand is that logically, I would think that currentmember is what I want in that set. I only want the process duration measures that are in the month of May, which is declared in the row axis ([Job Completed Date].[Job Completed Calendar Period].[Month].&[2014]&[5]). But that doesn't seem to be the case.

    What in blazes is happening here?

    Also, have a look at this blog post: http://erikasblog.datainspirations.com/2010/01/22/calculating-non-numerical-medians-in-mdx/

    That dude is doing exactly what I think I need to do, but I can't seem to translate that into what I need. Ultimately, I need better a performing median calculation in my cube and the median function syntax that gives me the correct answer is far too slow when I start adding more filters or hierarchies. Any help would be appreciated.

    Thanks in advance.

  • I'm pretty sure the .members is the correct syntax. The query will filter down which numbers it includes to those in may,the .members in the definition just says "give me all the members in your set", which is what you want the median of.

    I think the other one is probably doing a bunch of seperate medians at the lowest granularity (day?) of your date dimension, and then aggregating them to the month, or something similar.

    That's my first reaction anyway. By no means an expert myself.

    As for the speed of it, getting a median means ordering the set by the measure, which is really slow in MDX if you have a lot of records.

  • edit: Duplicate post

Viewing 3 posts - 1 through 2 (of 2 total)

You must be logged in to reply to this topic. Login to reply