SQLServerCentral Editorial

Unstructured Data


I remember when Windows 3.1 started to gain widespread deployment in businesses. With all its WYSIWYG features and multiple applications running at the same time, many people felt we'd get to a paperless office. Over a decade later I rarely see an office that doesn't have a copy machine and at least one printer. We've gotten better at shuffling bits, but we haven't gotten rid of paper.

Over the years I've also seen many applications that have tried to keep data inside the database, limit users' ability to work with it, and drive our workflow along lines predetermined by business analysts and developers. And what happens?

People suck data out into Excel and analyze it, even if they have to do it by hand. Word documents often get used in place of other reports, and PDFs abound throughout many companies when strict display and formatting needs arise. I've seen guesses that more data exists in unstructured sources, like Word and Excel, than in databases, and it might be true.

I caught an article about how EMC was struggling to manage its unstructured data. They bought a data access and auditing tool to try to ensure that data was being properly secured and that they could comply with various regulations like SOX.

Now I'm not necessarily thrilled with the need to micro-manage access controls. That can employ a few full-time people in any good-sized company. And the tools to manage file permissions aren't the best in the world. The tool that EMC bought sounds like a combination of SharePoint with a BI engine managing the permissions.

In fact, I think that's a good idea for a product. SQL Server 2008 is already moving in that direction with the FILESTREAM objects, and SharePoint is immensely popular. What if we could combine these two items and then store all documents in the database/filesystem combination?

With all of the DDL triggers, event notifications, and data analysis components in the SQL Server platform, there should be a way to pre-package some auditing application around which you could write some general rules to manage permissions. Someone in Sales gets access to all kinds of information in their area, but if they pull more than xx documents in some period of time we raise an alert. Or if they start pulling more than yy customer contacts, we let someone know.
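To make that kind of rule concrete, here is a minimal sketch in Python of a sliding-window check. It is not tied to any SQL Server feature. The class name `AccessAuditor`, its parameters, and the threshold values in the usage lines are all hypothetical, standing in for the "xx documents in some period of time" above. It also assumes each user's reads arrive in time order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta


class AccessAuditor:
    """Hypothetical rule: flag a user who reads more than `max_reads`
    documents within a rolling `window` of time."""

    def __init__(self, max_reads: int, window: timedelta):
        self.max_reads = max_reads
        self.window = window
        # user name -> timestamps of that user's recent reads, oldest first
        self._reads = defaultdict(deque)

    def record_read(self, user: str, when: datetime) -> bool:
        """Record one document read; return True if an alert should be raised."""
        recent = self._reads[user]
        recent.append(when)
        # Drop reads that have aged out of the rolling window.
        cutoff = when - self.window
        while recent and recent[0] <= cutoff:
            recent.popleft()
        return len(recent) > self.max_reads


# Illustrative values only: alert on more than 3 reads in one hour.
auditor = AccessAuditor(max_reads=3, window=timedelta(hours=1))
start = datetime(2008, 1, 1, 9, 0)
for minute in range(5):
    if auditor.record_read("sales_user", start + timedelta(minutes=minute)):
        print(f"ALERT: sales_user read too many documents by minute {minute}")
```

A real product would presumably feed a rule like this from the database's own audit events rather than from application calls, and a second counter of the same shape could cover the customer-contact case.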

Managing permissions is hard, but it's something that's easier done in SQL Server, in my opinion, than in the filesystem. To me, this is the place that SQL Server's storage engine embedded in the filesystem could really shine.

Steve Jones


The Voice of the DBA Podcasts

The Great Music

The podcast feeds are now available at sqlservercentral.podshow.com to get better bandwidth and maybe a little more exposure :). Comments are definitely appreciated and wanted, and you can get feeds from there.

Today's podcast features music by Joe Sibol. If you like it, check out his stuff on iTunes or at www.joesibol.com.

I really appreciate and value feedback on the podcasts. Let us know what you like, don't like, or even send in ideas for the show. If you like it, tell the boss!
