Unknown Member & Null Processing

  • Hi

    Forgive me - this is probably a very basic question. I have followed Kimball methodology by including an unknown member in each dimension in the relational database. In the case of a time dimension I have an unknown member with key value -1. All time related fields, e.g. FullTime, Hour, Minute, etc are NULL in the database. I have a field called DayPart which has the value 'Unknown' where the key = -1.

    With the UnknownMember property for the dimension set to None, all attributes that have a NULL value for the -1 member display a blank member in the browser - except for DayPart which has 'Unknown' as one of its values. If I set the UnknownMember property to Visible and NullProcessing for all attribures that have a NULL value to Unknown Member, the blank member is replaced by Unknown - which appears much better, but then the DayPart attribute displays two Unknown members.

    If I change the UnknownMember dimension property to Hidden, all the attributes with NULL attributes for the -1 member don't show either blank or unknown members and the DayPart attribute is left with its original Unknown member. This appears nice but is it correct? BIDS gives a recommendation to only use the Visible or None options for the UnknownMember property and MSDN advises the Hidden should only ever be used with prototyping. I'm also guessing that when browsing the cube, any data assigned to the -1 member would not be accounted for unless the DayPart field is included as a row or column.

    What is the correct approach to this?

    I'm using SSAS 2008 R2.

    Thanks

    Ian

  • Sorry for the late post to this topic. Someone should really go through the open topics and try to get outstanding questions answered. I suspect you've moved on, but hopefully someone else may benefit.

    "UnknownMember" is a special processing capability of MSAS, that has some good features as well as some definiate drawbacks. You should thoroughly understand the options before committing to a production implimentation.

    In your example, you're mixing your own implementation (with key values of -1) and the MSAS implementation (key value = null) which is probably the source of your confusion. Pick one method an stick with it. In general, I'd say that manually handling unknowns is probably more robust (that means, characteristics never have a null value), but there are some nice aspects to the MSAS processing.

    On to "UnknownMember" handling:

    Having MSAS decide what to do about null values is a little tricky (NullProcessing).

    If you have it set to "Automatic", it's the same as "ZeroOrBlank". Since many (most) characteristics are strings, that means nulls get translated to an empty string and you end up with a key value of ".&[]". However, in most BI reporting tools, you end up with a zero length string for a description (i.e. "") and that can be devilishly hard to select in most circumstances. As well, you end up with an empty field on most report panes, which leads to confusion and misunderstanding.

    If you set it to "UnknownMember", it gets a little better. The key value generated becomes ".UNKNOWNMEMBER" and the description becomes "Unknown", which you can easily override using "UnknownMemberName" (at the dimension level). There's even MDX components to properly identify this construct, so that you can incorporate any special handling.

    What's more, MSAS has the ability to hide the unknown member, a very useful feature.

    Where does it break down? It seems that pieces of the MSAS parser hasn't taken this into account. Any time you include this type of attribute into a hierarchy, you'll end up generating logical keys that include constructs like:

    .UNKNOWNMEMBER.&[somekey]

    or even

    .UNKNOWNMEMBER.&[somekey].UNKNOWNMEMBER

    Some parts of the engine handle this gracefully. Other parts throw a syntax error on the "&" symbol. (Use "SQL Server Profiler" to trace the error. even though it's perfectly valid syntax. So be prepared for some operations and tools to fail.

    I contacted Microsoft on this issue a few years ago (with very heavy, complete documentation) and they ended up with the old saw "working as designed".

    Still very useful, but you've been warned.

  • Hi Dave,

    Is it a good way to use Unique Id field in Key Column and actual field in Name column to avoid duplicate issue.

    For example, I'm facing issue in LastName field (both Case and Blank values) in Employee.

    Option 1:

    Use Emp_Id as Key Column and LastName in Name Column.

    Option 2:

    Expand Key Column properties, and set Trimming, NullProcessing, and Collation to solve the problem.

    Plz let me know your thoughts.

    Thanks,

    Suresh

    Suresh

  • Wow - firstly, Dave - thanks for the reply to my question. I don't even remember asking it!

    Suresh - if you ever have an integer ID that can be used as a key for an attribute, you should use it. Not only will it solve your issue with duplicates, but it will also mean smaller and better performing indexes.

    Ian

  • Ian, Thanks for the reply.

    I used ID as Key for and it works. But question is, can I use ID for all attributes (First, Last Name, Address etc...) to avoid duplicate error??? I think it's not correct way. That's why I gave option 2 as well.

    Though we can fix those data issues in DB or using Named Query in DSV, I'm interested to know any performance issues while using Key Column properties.

    Suresh

Viewing 5 posts - 1 through 4 (of 4 total)

You must be logged in to reply to this topic. Login to reply