Build dimension as a VIEW

  • Hello,
    I have build a dimension as a VIEW. Definition of the view is below:
    CREATE VIEW dbo.PowerBI_DimClient(ClientId,Client)
    WITH SCHEMABINDING
    AS
    WITH CTE AS
    (
        SELECT    DISTINCT LTRIM(RTRIM(Client)) As Client
        FROM    dbo.RaportTechnologii
    )
    SELECT    ROW_NUMBER() OVER (ORDER BY Client) AS ClientId,
            Client
    FROM    CTE

    I am having doubts if this is a correct way of  defining  a dimension because i don't have any order by clause and i'm not sure whether each Client in a dimension will have the same ClientId each time the dimension is loaded. 
    Could you please give me any advice on the above dimension definition?
    Best regards

  • lukaszpiech - Wednesday, January 23, 2019 1:08 AM

    Hello,
    I have build a dimension as a VIEW. Definition of the view is below:
    CREATE VIEW dbo.PowerBI_DimClient(ClientId,Client)
    WITH SCHEMABINDING
    AS
    WITH CTE AS
    (
        SELECT    DISTINCT LTRIM(RTRIM(Client)) As Client
        FROM    dbo.RaportTechnologii
    )
    SELECT    ROW_NUMBER() OVER (ORDER BY Client) AS ClientId,
            Client
    FROM    CTE

    I am having doubts if this is a correct way of  defining  a dimension because i don't have any order by clause and i'm not sure whether each Client in a dimension will have the same ClientId each time the dimension is loaded. 
    Could you please give me any advice on the above dimension definition?
    Best regards

    If all the client names are different, that is, there is no duplication, you can hashbyte the client to produce a persistent identifier.
    😎
    Do you have a client identifier in the originating table?

  • Thank you for reply.
    Unfortunatelly, i don't have client identifier in originating table

  • lukaszpiech - Wednesday, January 23, 2019 5:30 AM

    Thank you for reply.
    Unfortunatelly, i don't have client identifier in originating table

    Can you add an intermediary table for persisting the identification values? Looks to me as the only option here.
    😎

    Enumeration with row_number will not be consistent and cannot be relayed on in this case.

  • Provide some context about how this view fits into the overall data warehouse design, and how it will be used. From a data warehousing perspective, assuming this discussion is in the context of a classical star schema design, using a view for as a replacement for a properly modeled slowly changing dimension table is a bad idea, especially if you can't guarantee a static primary key. 

    https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/dimension-table-structure/
    https://www.kimballgroup.com/data-warehouse-business-intelligence-resources/kimball-techniques/dimensional-modeling-techniques/type-2/

    Adding an ORDER BY clause to a view is also generally a bad idea, because it causes poor execution plans and performance.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Thanks for all replies.
    The view is part of a model used in PowerBI. I wanted to build a system that will containt main table with all necessary data. In next step i wanted to divide data into fact tabel and dimension tables assuming that all of those will be index views. Then i connect PowerBI to those views.
    Well, that was the initial plan 🙂 but now i think that instead of views i will use tables, what do you think about that approach?

  • In Kimball's 3rd edition book he recommends using views for the fact and dimension tables.  I had done this in a couple of instances.  As an example, all the inspectors for one cube were also technicians, but most technicians were not inspectors.  So I created a dimInspector view using the dimTechnican table as a base.  But the surrogate key for the view was from the table (and therefore not sequential, but that isn't important).  I did not try to create a new surrogate key on the fly.  In my most recent design I use views for all the fact and dimension tables.  But in each case the underlying surrogate key already exists.  I don't, and wouldn't, try to create them on the fly.  I'm not normally a fan of views in an OLTP environment.  There are only a few good use cases for them and they seem to me too often used when a stored procedure would have been better.  But I've become a big fan of views in an OLAP environment.  Just not in the way that I think you're suggesting.

  • Thanks for replying @ronkyle.
    Because i don't have underlying key in base table i've decided to stick with a table approach.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply