Blog Post

StreamInsight: A new way to think about analyzing data

,

Last night’s Oregon SQL user

group meeting was very enlightening.  Mark Simms form the SQLCAT team came

down from Redmond to talk about StreamInsight, a substantial new feature in SQL Server

2008 R2 that allows data to be analyzed and aggregated in-memory as it flows through

the server.  I’ve read a little about this new capability and have seem some

short demos but Mark gave us a comprehensive look under the hood after walking the

group through a few scenarios that really made this technology very relevant.

StreamInsight is part of the SQL Server product suite, which makes sense because it

is a platform for working with and analyzing data.  However, it has very little

to do with traditional database storage and relational or stored analytical data and

query paradigms.  It looks like more of a programming tool than a database technology. 

StreamInsight sits on top of Dot Net and uses data adaptors to received and send streams

of data that travel through memory – a little bit like SQL Server Integration Services

(SSIS).  It uses a scripting language called LINQ, which is actually a set of

.NET extension libraries, typically used with the C# .NET programming language. 

The tenants if StreamInsight are events, windows and various operators such as aggregates. 

Like .NET object-oriented method coding, nested commands and operators can be built

into named code constructs called macros but this code doesn’t have to be pre-compiled

into assemblies.  In fact, the programming and debugging experience feels a lot

like writing and testing SQL using rich programming tools.

Mark did a great job of demonstrating some very meaningful scenarios.  By itself,

StreamInsight has nothing to do with data storage.  The data flowing through

an application can originate as live transactions, streaming video, messages or something

like chats and traffic on a social network.  Most applications are used to analyze

data in-flight without the latency and overhead of disk I/O; but all or part of this

data could also be read from persistent storage or written to a database after it’s

been processed.  It would normally run on a dedicated server that just pipes

the data through for analysis and to control other components of a larger solution.

He gave examples of reading flow information from oil pipelines, monitoring large-scale

streaming media events like the Olympics or a championship game, active stock trading,

and monitoring traffic patterns on a highway system.  There’s little value in

storing all these events but it’s critically important to monitor it in real-time

to capture samples and states form time to time.  I think I get StreamInsight

now and I can envision a variety of useful applications. 

One of my takeaways as a BI practitioner is this:  In our practice, we are often

asked to work with a consulting client to build business dashboards.  It’s common

to sit down with a business user who says something like “I would like to see a dashboard

that shows me a timeline of historical KPIs, measures and dimensions” – that’s typical

BI analysis that we would read from a data warehouse. …”then I’d like to see the continuum

of that data as it relates to our daily production activity”.  This is operational

data, stored in a live, transactional system.  The volume of data is low but

and the data can change.  It comes from system that stores records as they occur. 

“Finally, I want to see what’s going on in our live processes, how our production

equipment is operating and what components are performing above or below capacity.” 

That’s not necessarily information they may need to keep around but it is certainly

useful data during and shortly after thousands of events are taking place.

A recording of Mark’s session will be available soon with sample and code.  I’ll

update this post accordingly.


Weblog by Paul Turley and SQL Server BI Blog.

Rate

You rated this post out of 5. Change rating

Share

Share

Rate

You rated this post out of 5. Change rating