Azure Data Explorer

Steve Jones, 2019-04-05

It seems that every year brings another tool or platform on which we can manipulate data. I sometimes wonder if all the effort to find new ways to work with data would be better spent developing better frameworks and reference models for existing systems. I know this isn’t likely, as people often seem to prefer a specialized tool for certain work, especially when they are the ones who want to build one.

I ran across Azure Data Explorer (ADE) last year, but thought of it more as a tool than a platform. As it went GA (generally available) this year, I took a second look and realized this is actually a platform, one that may prove useful in any number of situations that deal with lots of streaming, time-series-style data.

With the move towards better instrumentation in our software, and more DevOps style approaches to learning from our systems, we are gathering more and more data. Already I see many DBAs and system administrators struggling a bit to work with the overwhelming amount of log data being produced. SQL Server is a great platform, but I’m not sure it’s optimized for lots of streaming log events, many of which we might want to analyze in a very ad hoc manner.

James Serra has a nice introduction to ADE, one that’s only slightly hyped. There is also another short intro from Adatis that might help you understand more about this new service from Microsoft. It appears to be a competitor for Logstash or Splunk, and one that has been used inside Microsoft for years. Now it’s a product, and one that might prove to be more cost-effective than hosting (and maintaining) a SQL Server database.

This also seems to be more of a staging area, where you can land some data and later transform it into other storage, either online in a SQL Server (or DW) system, or processed further and dropped into a data lake. With SQL Server 2019 looking to make heavy use of external tables and to query non-relational data formats, perhaps this is a pattern to consider if you ingest large amounts of streaming data.

I don’t know that ADE makes sense for us at SQLServerCentral, but I could see it being useful for some of the telemetry and analysis data at Redgate. If this scaled down, it might even be a better way to gather and store auditing or logging data from something like XEvents. Unfortunately, the need to keep a cluster in Azure is probably overkill for most of us, but maybe things will change in the future.

I’d be interested to know if any of you plan on trying ADE in your environments. Perhaps you have a wild idea that might suit this type of data platform and want to share your thoughts today.




