Pre-built or Ad Hoc

One of the advantages of NoSQL databases is that the schema and organization of data are very flexible. The various types of databases usually allow the schema or organization of data to vary across entries. I hesitate to call them rows, but essentially each time you add data to a store, you can alter the format of the data inserted.

For relational database professionals, this seems to be a recipe for disaster, with entirely too much data being captured in an unorganized fashion. At some point a user will want this data returned in a report format, which almost always seems to mean rows and columns of related data, even behind the scenes of the incredible visualizations that appear in modern dashboards.

Someone recently noted to me that their users don't want to write ad hoc queries or try to discern the meaning of varying structures of information. They want pre-built structures they can count on and use reliably to answer questions. I suspect many users don't want to decode the meaning of structures that change, despite the fact that so many users want to reformat and change the shape of data in Excel. Those of you who have to re-import some of these spreadsheets know just how unstructured a set of rows and columns can become.

I really think it is important that data structures be decided upon and ordered in a known way so that users can easily understand the meaning behind the data. However, we are gathering more and more data in new ways, from new sources, and we don't have consistent ways of recording that information. That will continue in the future, and I do think that learning how to access new sources, like Hadoop, and present that data back to users in a familiar format will become a way to show you are a valuable resource for your organization.
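
As a rough illustration of presenting loosely structured data in a familiar format, here is a minimal Python sketch. It assumes the entries have already been read out of some store as plain dictionaries. The field names, the sample values, and the to_rows helper are all hypothetical and chosen only for this example. The idea is simply to project every entry onto one agreed set of columns, leaving a gap where an entry lacks a field.

```python
# Hypothetical entries, as they might come back from a document-style store.
# Each one has a different shape; the field names are invented for illustration.
entries = [
    {"id": 1, "name": "Ann", "email": "ann@example.com"},
    {"id": 2, "name": "Bob", "phone": "555-0100", "tags": ["vip"]},
    {"id": 3, "email": "cy@example.com", "address": {"city": "Denver"}},
]

# The fixed set of columns the report promises its users (an assumption).
REPORT_COLUMNS = ["id", "name", "email", "phone"]


def to_rows(entries, columns):
    """Project each entry onto the same columns, using None where a field is absent."""
    return [tuple(entry.get(col) for col in columns) for entry in entries]


rows = to_rows(entries, REPORT_COLUMNS)
for row in rows:
    print(row)
```

Fields that fall outside the agreed columns, such as tags or address here, are silently dropped in this sketch. Whether to drop them, flatten them into extra columns, or keep them elsewhere is a design decision that depends on what the users of the report need.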