Building my first data warehouse. Advice welcomed :)

  • Hi all,

    I've been brought on board to help a company start to build their first business intelligence setup. I'm a developer by trade and have good knowledge of SQL and application design using relational tables, but this is my first step into designing a data warehouse.

    I have a few initial questions from the customer, and if I can do a good job of providing a report setup for these we'll expand from there. The questions I'm tasked to answer are:

    How many support tickets are being opened each day?

    How many support tickets have been closed each day?

    How many does each employee have?

    How many does each group of employees have?

    I've designed my first star schema to answer these questions, which is as follows.

    I would love any feedback/critique of this design.

    Thanks

    Alan

    P.S. As a side note, I'm planning on using SSIS to perform the ETL tasks rather than writing my own software to do the job. Does this seem like a sensible option? For the first part of the data load I'll be pulling information out of Dynamics CRM and Request Tracker.

  • What's the purpose of the Queue dimension table?

    SSIS is great for performing ETL, so yes, if you have the knowledge and you're comfortable developing packages then go ahead and use it. It also comes with components specifically designed for data warehousing such as the Slowly Changing Dimension component.

    Good luck!

    ---------------------------------------------------------

    It takes a minimal capacity for rational thought to see that the corporate 'free press' is a structurally irrational and biased, and extremely violent, system of elite propaganda.
    David Edwards - Media Lens

    Society has varying and conflicting interests; what is called objectivity is the disguise of one of these interests - that of neutrality. But neutrality is a fiction in an unneutral world. There are victims, there are executioners, and there are bystanders... and the 'objectivity' of the bystander calls for inaction while other heads fall.
    Howard Zinn

  • The tickets are entered into queues (e.g. Hosting, Hardware, Software), and I think these descriptions are subject to change.

    Thanks for the reply and the good luck 😉

  • Shouldn't there be a Ticket dimension table with attributes that describe the type of ticket?

    Not entirely sure the Queue dimension table is appropriate to be honest.

  • Good point. I wasn't sure whether it was best to have many different dimension tables for the different queue attributes or a single one. On reflection I think you're right and I should include a single attributes table for the ticket.

    Thanks again,

    Alan

  • I wonder if a snowflake schema might be a better way for you to model the ticket dimension, but without actually seeing the schema of your source data I can only guess.

    You can create a main ticket dimension table that holds all the attributes shared by all your ticket types. Then, for each distinct type of ticket (this replaces the queue dimension in your original design), you place its attributes in a new table and link it back to the ticket dimension table.

  • I've modified the schema slightly to incorporate your suggestions. What I'm struggling with now is how, using this schema, I'd count the tickets that are in an open state on each day (keeping in mind that an open ticket might not have any information associated with a particular day) without writing some horrible-looking SQL.

  • I wonder if a snowflake schema might be better for you to model the ticket dimension

    While anything is possible, the uses for a snowflake schema are uncommon. It should never be used simply to normalize in the manner of a transactional database. I recommend you start with a star schema.
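
    As a rough illustration of the star approach, the queue can simply be an attribute on a single, denormalized ticket dimension. This is only a sketch run against an in-memory SQLite database from Python; the table names, column names (including the hypothetical `priority` attribute) and sample rows are assumptions, since the real schema isn't shown in this thread.

```python
import sqlite3

# In-memory database, purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_ticket (
    ticket_key INTEGER PRIMARY KEY,  -- surrogate key
    ticket_id  TEXT NOT NULL,        -- business key from the source system (made up)
    queue_name TEXT NOT NULL,        -- e.g. Hosting, Hardware, Software
    priority   TEXT NOT NULL         -- hypothetical extra attribute
);
CREATE TABLE fact_ticket (
    ticket_key INTEGER NOT NULL REFERENCES dim_ticket(ticket_key)
);
INSERT INTO dim_ticket VALUES
    (1, 'RT-101', 'Hosting',  'high'),
    (2, 'RT-102', 'Hardware', 'low'),
    (3, 'RT-103', 'Hosting',  'low');
INSERT INTO fact_ticket VALUES (1), (2), (3);
""")

# One join from the fact table to the flat dimension is enough
# to slice ticket counts by queue.
tickets_per_queue = conn.execute("""
    SELECT t.queue_name, COUNT(*)
    FROM fact_ticket f
    JOIN dim_ticket t ON t.ticket_key = f.ticket_key
    GROUP BY t.queue_name
    ORDER BY t.queue_name
""").fetchall()

print(tickets_per_queue)
```

    If the queue descriptions change over time, that can be handled in the dimension itself (for example as a slowly changing dimension) without splitting it into further tables.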

  • I would add the group id that the employee is in to the dim user table as well. This will allow group counts.
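
    A minimal sketch of that idea, assuming a user dimension called `dim_user` that carries the group columns and a fact table with one row per ticket (all names and sample data here are made up for illustration, and SQLite is used only so the example runs on its own):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_user (
    user_key   INTEGER PRIMARY KEY,
    user_name  TEXT NOT NULL,
    group_id   INTEGER NOT NULL,   -- group stored directly on the user row
    group_name TEXT NOT NULL
);
CREATE TABLE fact_ticket (
    ticket_key INTEGER PRIMARY KEY,
    user_key   INTEGER NOT NULL REFERENCES dim_user(user_key)
);
INSERT INTO dim_user VALUES
    (1, 'alice', 10, 'Hosting'),
    (2, 'bob',   10, 'Hosting'),
    (3, 'carol', 20, 'Hardware');
INSERT INTO fact_ticket VALUES (1, 1), (2, 1), (3, 2), (4, 3);
""")

# Tickets per employee.
tickets_per_employee = conn.execute("""
    SELECT u.user_name, COUNT(*)
    FROM fact_ticket f
    JOIN dim_user u ON u.user_key = f.user_key
    GROUP BY u.user_name
    ORDER BY u.user_name
""").fetchall()

# Tickets per group: same join, grouped by the group columns instead.
tickets_per_group = conn.execute("""
    SELECT u.group_name, COUNT(*)
    FROM fact_ticket f
    JOIN dim_user u ON u.user_key = f.user_key
    GROUP BY u.group_id, u.group_name
    ORDER BY u.group_name
""").fetchall()

print(tickets_per_employee)
print(tickets_per_group)
```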

  • alan.hollis 1097 (1/14/2013)


    The questions I'm tasked to answer are:

    How many support tickets are being opened each day?

    How many support tickets have been closed each day?

    How many does each employee have?

    How many does each group of employees have?

    As the others have mentioned, you'll need a "Group" dimension to identify how many tickets a group has.

    As for your first two questions, you'll need to record the date on which each ticket enters each status. That's the only way you'll be able to figure out those two. You might even want to create a separate fact table to store the different statuses, and the dates of those statuses, that any given ticket goes through. Don't forget that the status of a ticket may regress.
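
    One way such a status-history fact table could look is sketched below, with one row per status change. The table name, columns and rows are assumptions made up for the example (run here against in-memory SQLite from Python); ticket 1 is closed and then reopened to show a regressing status.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE fact_ticket_status (
    ticket_id   INTEGER NOT NULL,
    status      TEXT    NOT NULL,  -- 'open' or 'closed' in this sketch
    status_date TEXT    NOT NULL   -- ISO date the status took effect
);
INSERT INTO fact_ticket_status VALUES
    (1, 'open',   '2013-01-14'),
    (1, 'closed', '2013-01-15'),
    (1, 'open',   '2013-01-16'),   -- regression: ticket 1 reopened
    (2, 'open',   '2013-01-14'),
    (2, 'closed', '2013-01-16');
""")

# Count 'open' and 'closed' events per day with conditional sums.
events_per_day = conn.execute("""
    SELECT status_date,
           SUM(CASE WHEN status = 'open'   THEN 1 ELSE 0 END) AS opened,
           SUM(CASE WHEN status = 'closed' THEN 1 ELSE 0 END) AS closed
    FROM fact_ticket_status
    GROUP BY status_date
    ORDER BY status_date
""").fetchall()

print(events_per_day)
```

    Note that in this sketch a reopened ticket counts as another "opened" event on the day it regresses; whether that is what the customer wants is a business decision.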

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.

  • No need to have a separate fact table to store all those dates.

    The ticket process is actually a perfect candidate for an accumulating snapshot fact table.

    If you are interested in a current status snapshot of tickets, then a standard accumulating snapshot is ideal. You add an extra field to the fact table for each of those dates (date dimension FKs) and fill them in as the information becomes available.
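
    To make that concrete, here is a small sketch of an accumulating snapshot with one row per ticket and two date keys. Everything in it is an assumption for illustration: the table and column names, the YYYYMMDD integer keys, the sample rows, and the use of NULL for "not closed yet" (many warehouses use a placeholder date key instead). It runs against in-memory SQLite from Python.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_date (
    date_key  INTEGER PRIMARY KEY,   -- YYYYMMDD, so keys sort chronologically
    full_date TEXT NOT NULL
);
CREATE TABLE fact_ticket (
    ticket_key      INTEGER PRIMARY KEY,
    opened_date_key INTEGER NOT NULL REFERENCES dim_date(date_key),
    closed_date_key INTEGER REFERENCES dim_date(date_key)  -- NULL until closed
);
INSERT INTO dim_date VALUES
    (20130114, '2013-01-14'),
    (20130115, '2013-01-15'),
    (20130116, '2013-01-16');
INSERT INTO fact_ticket VALUES
    (1, 20130114, 20130115),
    (2, 20130114, NULL),
    (3, 20130115, NULL);
""")

# Later load: ticket 3 gets closed, so its existing row is updated in place.
conn.execute(
    "UPDATE fact_ticket SET closed_date_key = ? WHERE ticket_key = ?",
    (20130116, 3),
)

def per_day(join_condition):
    """Count tickets per calendar day for the given join condition."""
    sql = f"""
        SELECT d.full_date, COUNT(f.ticket_key)
        FROM dim_date d
        LEFT JOIN fact_ticket f ON {join_condition}
        GROUP BY d.full_date
        ORDER BY d.full_date
    """
    return conn.execute(sql).fetchall()

opened_per_day = per_day("f.opened_date_key = d.date_key")
closed_per_day = per_day("f.closed_date_key = d.date_key")

# Still open at the end of each day: opened on or before that day
# and either not closed yet or closed on a later day.
open_per_day = per_day("""
    f.opened_date_key <= d.date_key
    AND (f.closed_date_key IS NULL OR f.closed_date_key > d.date_key)
""")

print(opened_per_day, closed_per_day, open_per_day)
```

    Driving the queries from the date dimension with a LEFT JOIN means days with no activity still show up with a count of zero, and the "open on each day" count needs no rows in the fact table for that particular day.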

    If the detailed history of each ticket change is required (as Jeff mentioned, tickets regressing in their status), then a timestamped accumulating snapshot might be more appropriate, but this is overkill IMO.
