Up early this morning, probably due to my body still being on Toronto time (EST).
As the first cup of coffee makes it way through my blood stream, I’m trying to set out some goals for today, my first ‘real’ day of this OLAP sprint experiment.
My target is to get through the first 200 pages of the Wrox book.
The first 200 pages should give me some idea of the fundamentals, including:
Data Sources and Data Views
I’m also hoping that this will help me to shape the agenda for the rest of the week, as it would be nice to know where I’m going.
As I type, the source code and samples are downloading from the Wrox site, and I’ve already restored a copy of my production OLTP database to my laptop, because I’d like to work with my data this week as I build and explore.
The poor-man’s version of a Keurig/Tassimo coffee maker that they’ve got here in the hotel is passable, but I may have to venture out and get something that doesn’t taste so much like puddle water.
The first fork in the road seems to be picking an OLAP model – Tabular or Multidimensional.
I’m going with Multidimensional.
As I stated in the outset, it’s not my intention to be thorough in my research and decision-making this week. So, this Microsoft White Paper, and Chapter 1 from my Wrox book have given me enough input to choose the Multidimensional model.
Loosely, the factors in my decision are:
The Multidimensional model has been around a lot longer (5 releases of SQL Server), and is described as the ‘Traditional’ model. Presumably this means it is more stable and mature. I’m also hoping there are more resources out there for me to learn it.
The Tabular model is intriguing to me mostly for its newness. The in-memory xVelocity (aka Vertipaq) engine sound like lots of fun, and I know that I’m ignoring an opportunity to build a quicker, simpler solution using the Tabular model, but I can’t help but feel like it is somehow a shortcut that will bite me later on. It also seems to involve a lot of Excel. With all due respect to Excel and its legion of followers, I didn’t come all the way to PASS in Seattle to play in Excel all day.
Like I said, I’m making snap decisions to get this built, so there we are – Multidimensional it is.
The 6am breakfast of coffee and Pringles has run their course, so my next decision will be where to get some food.
Spent the morning and afternoon reading up on the basic elements of a multidimensional model, including the data warehouse, facts, measures, dimensions, attributes, hierarchies, cubes, etc. The lingua franca of SQL Server Analysis Services is a bit of a headache, but I’m sure I’ll get used to thinking in those terms before too much longer. Two-dimensional life in OLTP was so much simpler: rows and columns!
I ‘modeled’ my first schema on a notepad at the pub across the street from the hotel, while taking down a cheeseburger. Although it looks more like a grocery list than a star or snowflake schema, it’s a start. From what I’ve read, I don’t understand why everyone wouldn’t want a snowflake schema, to take advantage of more complex hierarchies, but I’m in no position to be asking that question yet.
Poor Man’s Schema
Once back at the hotel, I created a data warehouse in the relational database engine in SSMS from a copy of our production OLTP database. I’m not worrying about using SSIS to load the data warehouse from the OLTP source yet – I’ll save that for later.
From there, I jumped over to SSDT, and in about 30 minutes, quickly knocked out the following:
Created an Analysis Services Multidimensional Project
Created a data source
Created a data source view
Created a cube
Created some dimensions
All of the wizards in SSDT made it very easy, and I had a ‘this is too good to be true’ feeling, but nonetheless couldn’t resist feeling a bit pleased with myself for having filled the design surface with so many shiny new things.
My fears were confirmed when I went to deploy the solution back to the server and received no less than 28 errors and 7 warnings:
That’s a lot of errors and warnings. So, I’m going to take a break, find some food, and dive back in later tonight when my confidence has recovered.
I went back to the drawing board with my dimensions, as they were the ones causing me all the trouble.
To be more specific, I took out the elephant gun and started them from scratch.
Because I’m working from an OLTP datasource, and not a data warehouse that might be a bit cleaner, I was getting into trouble. Trouble to the tune of missing values, duplicate values, and the like, in my dimension attributes.
So, I wrote some quick and dirty TSQL UPDATE statements against my base OLTP tables, and then wrote some views in order to accomplish a bit of horizontal and vertical filtering.
Back in SSDT, I defined some hierarchies, and processed the dimensions one at a time, so that I could review the errors and warnings as they came up, and not be crushed by 29 errors at once.
I was rewarded for my patience:
Those three words – “Deployment Completed Successfully” – were music to my ears.
I flipped over to the Cube Browser tab in my newly-minted Cube and started to experiment. The Cube Browser is where the whole OLAP/Analysis Services experience really started to come to life. It was exciting to see the hierarchies and measures in action – a visual confirmation that my data was starting to take its proper shape. I was able to drag and drop the Dimension attributes and Measures quickly around the interface and was giddy at how quickly the results were updated. The filtering was straight-forward and intuitive, and allowed me to quickly spot trends in the data.
I got through 200 pages of my Wrox book, and managed to get a very rudimentary cube up and running today, so I’ll sleep well tonight!
Follow along on this journey and read the other days of Simon's pointed learning journey.