Blog Post

Streaming Reporting: SQL Change Data Capture (CDC) to Power BI

,

[read this post on Mr. Fox SQL blog]

Extending on my previous post about redirecting SQL CDC changes to Azure Event Hub, I have had a few people ask for details/options to stream SQL data into the Power BI API.

Specifically – they were looking for an easy method to leverage the ADD ROWS functionality of the Power BI API so they could push real-time data into a Power BI service dataset.

This method provides the ability to update the Power BI Dataset with new rows every few seconds, instead of a Power BI report having to either use Direct Connect or Scheduled data refresh capability which can be very limiting.

If interested in how the SQL CDC and Event Hubs work together, then read here from my previous post – https://mrfoxsql.wordpress.com/2017/07/12/streaming-etl-send-sql-change-data-capture-cdc-to-azure-event-hub/

The purpose of this post is to quickly show how to extend and explore pushing new SQL data rows via Azure Stream Analytics into Power BI.

And so, lets get into some CDC to Power BI streaming action!

What are the Options for Power BI Streaming Data?

To stream real-time SQL data changes into Power BI you need to use the Power BI API which is described here – https://powerbi.microsoft.com/en-us/documentation/powerbi-developer-walkthrough-push-data/

You can push data to the Power BI streaming dataset API in a few ways… but they generally boil down to these 3 options

  1. Directly call the API from code
  2. Directly call the API from an Azure Logic App
  3. Use Azure Stream Analytics to push data into the API

This blog post extends on my previous post – and thus I will be leveraging Option #3 above.

 

Set Up the Solution

Instead of repeating all of the setup steps again here, please refer to my first blog post – https://mrfoxsql.wordpress.com/2017/07/12/streaming-etl-send-sql-change-data-capture-cdc-to-azure-event-hub/

To setup the test solution you need to do the following steps as outlined in the above post…

  1. Create the test source database and tables, and enable SQL CDC
  2. Create an Azure Event Hub to receive the messages
  3. Create an Azure Stream Analytics Job to query the Azure Event Hub
  4. Download and build the SQL2AEH.exe command line program from my github repo
  5. Schedule the SQL2AEH.exe command line program in SQL Agent

 

Azure Stream Analytics Connectivity to Power BI

We need to update the existing Azure Stream Analytics job (Step #3 above) we created in the previous post to now also send any new / inserted data rows to Power BI.

  • Input (unchanged) = This is unchanged as the previously created Azure Event Hub which is receiving messages sent from the SQL2AEH.exe program
  • Output (create new) = This connects Azure Stream Analytics to your Power BI subscription.  In my test I called both my Power BI dataset + table as “sql2aeh”.  Setting this up is already a well defined process as detailed here – https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-power-bi-dashboard
  • Query (update) = We need to update the existing query and add a new bit of code which sends the data to Power BI.  This query outlined below.

Azure Stream Analytics Query

The ASA SQL Query to read from the Azure Event Hub, and write ONLY insert rows to a Power BI streaming dataset output would look like this code below.

-- SEND DATA FROM AZURE EVENT HUB (INPUT) INTO POWER BI (OUTPUT)
SELECT
 SQL2AEH_$start_lsn_string,
 SQL2AEH_$seqval_string,
 SQL2AEH_$update_mask_string,
 __$operation,
 BCTranID,
 BCTranDate,
 PartyAID,
 PartyBID,
 RoomID,
 BCTranStatus
INTO
 PBI -- OUTPUT: POWER BI SERVICE
FROM
 AEH -- INPUT: AZURE EVENT HUB
WHERE SQL2AEH_TableName = 'dbo.BCTran' -- SOURCE TABLE
AND __$operation = 2 -- INSERT

Once complete, start the Azure Stream Analytics job.

 

Test the Streaming Data Solution End to End

The Power BI dataset wont be created until such time as a data point needs to flow to it, so to test the entire solution end-to-end we need to create some new rows in the source SQL Table.

I created a little SQL script which just iterates around to create a random but logical set of data in the source SQL table which will be sent up to the Azure Event Hub and across to Power BI.

BEGIN
 SET NOCOUNT ON
 -- Control Variables
 DECLARE @Iterations int = 100, -- How many rows to create
 @WaitSec int = 3, -- Wait time in Sec between row creation
 @Val int = 2000, -- Value to start
 @ValVar int = 10 -- Max RAND Variation from last value between each iteration
 -- Variables
 DECLARE @Counter int = 0,
 @Waitfor varchar(8)
 -- Loop Over the Iterations
 WHILE (@Counter < @Iterations)
 BEGIN
   -- Calculate next iteration value
   SET @Val = @Val +
   CASE (ROUND(RAND() * 3, 0))
   WHEN 1 THEN ((RAND() * @ValVar) * -1)
   WHEN 2 THEN ((RAND() * @ValVar) * 1)
   ELSE 0
   END
   -- Create New Rows in our Source SQL Table
   INSERT INTO [dbo].[BCTran]
   (
     BCTranDate, PartyAID, PartyBID, RoomID, BCTranStatus
   )
   SELECT GETDATE(), ROUND(RAND() * 10, 0), ROUND(RAND() * 10, 0), @Val, 'C'
   -- Wait Before Iteration
   SET @Waitfor = '00:00:' + CAST(@WaitSec as VARCHAR)
   WAITFOR DELAY @Waitfor -- HH:MM:SS
   -- Move Counter
   SET @Counter = @Counter + 1
 END
END
GO

As soon as the very first row is created in the source SQL Table the following chain of events occurs…

  1. The new rows will be picked up by CDC tracking scheduled as a continuously running SQL Agent job which is scanning the database transaction log for new data.  CDC tracking creates the change data into the CDC Tracking System Tables
  2. The SQL2AEH.exe program scheduled as a continuously running SQL Agent Job picks up new data in the CDC Tracking System Tables, creates the rows as JSON messages, and sends them into the Azure Event Hub.
  3. Azure Stream Analytics will pickup the new rows from the Azure Event Hub and send them as a streaming dataset to your Power BI subscription.
  4. Power BI will add the new rows to a dataset and table in your subscription. (If the table or dataset dont exist, then Power BI will create it)

Typically within 5-7 seconds after the new rows are inserted into the source SQL table they will already be sent and loaded into a Power BI dataset and ready for your reporting!

 

Creating a Streaming Power BI Report

When you login to the Power BI Service you will now see a new dataset as created by Azure Stream Analytics.  Note the type is HYBRID, which essentially means its of a streaming type.

 

Once the dataset is available for usage in Power BI, you can then create any type of Report or Dashboard object on that data, and then pin those to your own Power BI dashboard.  As new data streams into the Power BI dataset from Azure Stream Analytics the objects on the Dashboard will immediately update showing the new data!

And that’s the joy of HYBRID datasets!

 

When you rerun the data loading script above and then watch your Power BI Dashboard, you will see it update live as data is added to the underlying source SQL table.

Very nice!

 

So in Summary

So there you have it – a pretty simple streaming reporting solution to take SQL CDC changes from any SQL Server up to Azure and redirect it to Power BI for reporting!

As per the previous post, the same considerations and limitations apply. 

And so… as per usual, and as I always say, please test this yourself as your mileage may vary!


Disclaimer: all content on Mr. Fox SQL blog is subject to the disclaimer found here

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating