SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Power BI first impressions

I have been using Power BI the past few weeks (see my video of all the features) after I was invited by Microsoft to test the Power BI for Office 365 Preview (register here).  I have to say I am extremely impressed.  It allows for a self-service reporting and ETL environment that can be a game-changer.  Power BI is an umbrella name for many technologies and tools that blend well together.  Excel is the “center of the universe” for all these tools, and works in combination with SharePoint Online.

Note that Office 365 is really just a different way of paying for Office 2013 (subscription based vs one-time fee), but you are allowed additional installations and get additional services (see comparison, Office 2013 or Office 365: What’s the Difference and Why So Confusing?, and What’s the Difference Between Office 365 and Office 2013?).

Microsoft describes it as: Power BI for Office 365 is a complete self-service Business Intelligence (BI) solution delivered through Excel and Office 365 providing you with data discovery, analysis, and visualization capabilities to identify deeper business insights from your data. The Power BI for Office 365 service is a cloud-based solution that reduces the barriers to deploying a business intelligence environment for sharing reports and accessing information.

The people who this benefits:

  • Report creator/user - such as a data analyst, data or BI consultant, or a mainstream Excel user – you’ll likely use Power BI’s self-service BI features in Excel often, and extensively.  Online services that support collaboration, such as distributing interactive reports and sharing workbooks, are also an important central workspace and information hub
  • Data stewards and IT professionals – such as a data scientist, a data administrator, or an IT professional – the online service features of Power BI enable you to provide specific and secure access to data resources.  Online service-related elements of certain self-service BI tools – such as enabling the identification, selection, and secure distribution of on-premise and public data feeds – are also significant.  Users can share and manage the data and queries they create in Excel through the Power BI for Office 365 service, and also track usage
  • Report consumers – anyone from an aspiring students to a CEO – you can collaborate, share securely, and interact with reports using Power BI for smarter, insightful, and more nimble decisions.  This includes using an online hub, built right into Office 365, specifically tailored for collaborative BI

In short, Power BI is an environment power business reporting in the cloud.  It lets you easily extract and transform data from both on-premises and cloud sources using Power Query.  Then, you can shape your data model, enrich it with business logic using Power Pivot, and create gorgeous reports with both Power Map and Power View.  Power BI sites allows you to share reports and data views via SharePoint Online, and Power BI Mobile allows those same reports to be seen on a mobile device.  Data on-premises can be easily refreshed in the cloud thanks to the Data Management Gateway.  Finally, Power BI Q&A allows you to query for data by entering a question and seeing the answer instantaneously.

Here is an example of how a business user would use these tools to generate a reporting solution:

Power Query: (download).  Still in beta.  Helps customers easily discover, combine and transform their data.  You can do an online search and import from many different public data sources such as data.gov, Windows Azure Marketplace (14 free sources), and Wikipedia (listed here), as well as import from many external data sources (listed here).  These external data sources include databases (SQL Server, Azure, etc), any website by entering a URL (it will automatically locate the tables), files (Excel, csv, etc), and other sources (SharePoint list, OData feed, Hadoop, etc).  You can share any query, where the following information (metadata) about the shared query automatically gets stored in the Power BI Data Catalog in the cloud (hosted in SharePoint Online):

  • Query name and description
  • Column names
  • Location of the data sources used by the query.  For example, URL, server name/database name, and computer name/file name depending upon the type of data source.  Once Power BI is aware of a data source and stores it in the Power BI Data Catalog, you can view and manage the data sources using the Manage Data portal to annotate them with friendly names, descriptions, and access URLs.  For more information, see Manage Data Source Information using the Manage Data Portal
  • Search access list for the query to determine which users/security groups can find and use this shared query
  • The URL for query documentation, if specified
  • A preview of the data referenced by the query, if data preview was selected for the query.
  • The query definition in the form of a Power Query script that specifies how to connect to the query’s data sources and transform the data. This script is created when you filter and shape data in your query

This information in the Power BI Data Catalog helps users in finding and using the shared queries.  When you use a shared query in Power Query, the query definition is executed in Excel to import the data referenced by the query into a new worksheet in Excel.  To find and consume shared queries using Online Search in Power Query, see Find and Use Shared Queries.  Note that when you click the “Online Search” button on the Power Query tab, another tab is made available called “Search” that has a lot of other features for scoping and filtering your query search.

You can also optionally index the data the query retrieves and schedule when the index runs [indexing the data not supported yet].

USER EXAMPLE: Open Excel, go to the Power Query tab and choose “Online search”.  Pull in “S&P 500 Component Stocks” from Wikipedia.  Assuming you have a local csv file with stock data (get it here), choose From File -> From CSV.  You can filter and shape the data as needed (add/remove columns, remove duplicates, change types, etc – see Guide to Power Query Context Menus).  When adding a column there is a powerful Power Query Formula Language, informally known as “M”, that you can use.  You can create your own queries that use M – see Create an advanced query.  Rename the query to something like “Daily S&P 500 data of 10/15/2013″.  Then you mash the two tables together via the Merge button (selecting the box “Only include matching rows”), using the Ticker fields, making sure to select all the fields in the second table.  Rename the query “S&P 500 stocks and daily data MERGE” and hit the “Load to data model”.

Power Pivot: Included with Excel 2013, you just need to enable it.  Power Pivot allows you to work with very large data sets and do things such as create calculations, aggregate sums, create relationships, and create hierarchies.

USER EXAMPLE: Go to the PowerPivot tab and choose “Manage”.  Add a hierarchy that includes “GICS Sector” and “GICS Sub Industry”.  Add a column “$ Day Change” that subtracts the close price from the open price and a “% Day Change” that divides the open price from the close price.  Hide from client tools the column “SEC filings”.  AutoSum “$ Day Change”.

Power View: Included with Excel 2013, you just need to enable it.  Power view is interactive data visualization experience that allows you to build very cool reports.  Data within an Excel sheet can be used as a data source (in which Excel creates a linked table in the Excel Data Model behind the scenes automatically or you can manually do it via the “Add to Data Model” button).  Optionally this data model can be edited in Power Pivot to create a more sophisticated data model.

USER EXAMPLE: Go to the Insert tab in Excel and choose Power View.  It will use the query you loaded into the data model to build a default report.  Drag fields onto the report and go to the Design tab to change the type of chart.  Use the Layout tab to show the data labels.  Demo a slicer by using GICS Sub Industry as the chart and GICS Sector as the slicer.  Demo the Map by using Address of Headquarters as the chart.

Power Map: (download).  Still in beta.  Gives users the ability to plot geographic and temporal data visually on Bing Maps, analyze that data in 3D, and create interactive tours to share with others.  These videos can be shared anywhere, including social media, PowerPoint slides, and Office 365.

USER EXAMPLE: Download the Power Map Sample Workbooks and play with them via the tab Insert -> Map -> Launch Power Map and hitting “Play Tour”.

Power BI Sites: Still in beta.  View the Power BI Provisioning Video and the provisioning guide. The Power BI site supersedes Power Gallery. It’s the hosting site for deploying Power Pivot models for up to 250 MB in size, allowing teams to share reports and data views.   Power BI sites are implemented as an app for Sharepoint; there’s a detailed explanation of what they are and how they work here.  Also not how they are different from the BI Center Site in Office 365: Compare a BI Center site to Power BI for Office 365 sites.

USER EXAMPLE: Go to your Power BI site and choose “add” for documents to upload a workbook you just created.  Click on the “Shared With” for the workbook to specify who can view/edit it and send them an email.  Uses can edit the workbook in Excel on their desktop or via Excel Web App in their browser.  Note: Power BI does not work in IE9 or older browsers.

Data Management:  There is a “Manage Data” link on the Power BI site that shows usage analytics for all shared queries, a list of the queries you shared, and the data sources used by those shared queries.

Power BI Admin Center: There is a client agent called the Data Management Gateway (download) that enables cloud access for on-premises data sources on your server and exposes data from these on-premises data sources as OData feeds for corporate-wide access via Power Query.

To do this, you download and install the client agent on your server (install instructions) and enter a gateway key (obtained on the Admin Center by creating a gateway) to register the gateway with the Admin Center (note it currently only supports SQL Server and SSIS packages as data sources.  In the meantime, as a workaround you can use linked servers or SSIS Complex Data Feeds to expose other sources).  Once it is installed, it establishes a communication channel with your Power BI service in the cloud (no firewall holes required), and essentially waits for requests.  Once it gets one, it acts as a broker between your cloud service and your on premises data.

You then create a new data source in the Admin Center (selecting the gateway and providing the connection string) which will enable data from the on-premise data source as a OData feed that can be accessed by users through Power Query.  When creating the data source you choose which tables and views are included as well as which users/groups are allowed to search for the OData feed in Power Query.  If your table/view contains any unsupported types, or does not have a primary key, or does not have at least one non-nullable column, the entire table/view will be unavailable for publishing (it will be greyed out).  A OData feed URL is created when the new data source is created, and that URL is specified in Excel when choosing to connect to a OData feed (but the URL is not needed if using Power Query Online Search).  See Create a Data Source and Enable OData Feed in the Power BI Admin Center.

As you can observe from the URL, this is a cloud service.  You should be able to connect to the service from anywhere, and it will connect through the gateway to serve up the data.  While this is great from a mobility standpoint, if you happen to be on premises, this would be quite inefficient, as the data would first need to be transferred to the endpoint in the cloud, and then back to the source network.

The good news is that the gateway is able to detect when you are accessing the feed locally, and it will redirect you to the source without sending the data up to the cloud and back.  The bad news for us preview users is that this is the only thing working at the moment.  Therefore, for the preview period at least, in order to access the OData feed, you must be on a local network.  Specifically, you must be able to resolve the server name defined in the connection string.

Metadata for the OData feed is automatically indexed in the Data Catalog in the cloud, so users don’t need to know the server name, connection info, etc.  You can also optionally index the data for the OData feed and schedule it [data indexes not supported yet].  See Index OData Feeds.

Users can then search in Power Query for the OData feed by using the keywords from the following:

  • Name of the data source in the Admin Center
  • Description of the data source in the Admin Center
  • Names of tables
  • Names of columns in tables

In the Admin Center you can also create a data source that allows Excel workbooks in SharePoint Online with a Power Point model to refresh from a on-premise data source.  Do this via the “create cloud accessible source” task.  When creating the data source you will select the gateway and provide the connection string as well as specifying which users/groups are allowed to access this data source.  There are some limitations in this preview version: Data refresh scheduling is as yet unavailable (it must be done manually), and the data sources that can be refreshed are restricted to direct SQL connections.  Also, models created with Power Query cannot yet be refreshed (Power Pivot created connections and public OData feeds only).

When a refresh is requested, the model is interrogated for its data connections.  The data catalog is then interrogated for a data source with a matching connection string, and if found, is used.  The Gateway is then called to retrieve the data if it is on premises.  If the data source is SQL Azure, the Gateway is still used, but the data is loaded directly from SQL Azure – it does not need to be sent to the Gateway first.

Finally, there is a page in Power BI to view the System health (CPU, gateway availability, logs, indexing).

This is demonstrated well at Power BI – Working With the Data Management Gateway.

Power BI for mobile: (download).  Mobile BI access to reports for Office 365 is provided through new HTML 5 support and a native mobile application for Windows 8 tablets.  So this only works for Windows 8/RT devices, but iPhone and Android support is expected soon.

Power BI Q&A: Takes enterprise data search and exploration to a whole new level.  The search experience is instantaneous and uses natural language query – Q&A interprets the question the user is asking and serves up the correct interactive chart or graph.  This only works with the sample data (Olympics, Retail Bar Sales) that is installed when you create the Power BI sites, but it is amazing.

Should you use the 32-bit or 64-bit version of Excel?  Always use 64-bit!  Here is why:

Excel description:

32-bit environment is subject to 2 gigabytes (GB) of virtual address space, shared by Excel, the workbook, and add-ins that run in the same process. A data model’s share of the address space might run up to 500 – 700 megabytes (MB), but could be less if other data models and add-ins are loaded.

64-bit environment imposes no hard limits on file size. Workbook size is limited only by available memory and system resources.

The maximum worksheet size is 1,048,576 rows.

NOTE: When I followed the examples above of using all the tools in 32-bit Excel, I ran into memory constraints that caused Excel to freeze or error-out.  When I used the 64-bit version, I did not experience any of those problems.

Power Pivot description:

PowerPivot supports files up to 2GB in size (after compression).  This is an “artificial” restriction that allows it to be uploaded into SharePoint.  The 64-bit version of PowerPivot enables you to work with up to 4GB of data in memory, and the 32-bit version enables you to work with up to 2GB of data in memory (but realistically only 500-700MB, see How much data can I load into PowerPivot?).  And because of compression, PowerPivot can fit an additional 1-10x times the size of the database into memory, so the 64-bit version can fit a 4GB-40GB database into memory.  Typical compression runs about 2:1.

Number of rows in a table is 1,999,999,997.

More info:

Power BI Preview Review

Getting Started with Power BI for Office 365 preview

Power BI blog

Power BI forum

Power BI – Why would you ever need Power Query when I already have Power Pivot?

Microsoft Updates Power BI for Office 365 Preview with New Natural Language Search, Mapping Capabilities

What Problems does Power BI solve?

What’s New in the Power Query October 2013 Update

Is Microsoft Power BI a Game Changer?

Power BI – Getting Started Guide

James Serra's Blog

James is a big data and data warehousing technology specialist at Microsoft. He is a thought leader in the use and application of Big Data technologies, including MPP solutions involving hybrid technologies of relational data, Hadoop, and private and public cloud. Previously he was an independent consultant working as a Data Warehouse/Business Intelligence architect and developer. He is a prior SQL Server MVP with over 30 years of IT experience. James is a popular blogger (JamesSerra.com) and speaker, having presented at dozens of PASS events including the PASS Business Analytics conference and the PASS Summit. He is the author of the book “Reporting with Microsoft SQL Server 2012”. He received a Bachelor of Science degree in Computer Engineering from the University of Nevada-Las Vegas.


Leave a comment on the original post [www.jamesserra.com, opens in a new window]

Loading comments...