Over the last few years, I've read many articles and listened to quite a few talks that discuss the advantages of NoSQL databases. I'll admit that I'm often skeptical that the advantages of other datastores outweigh what you give up by leaving a relational system, but I try to keep an open mind. I do appreciate that there are some benefits to using another data store in certain situations.
One of the talks I heard recently discussed the fact that in many of these stores, we can add data in a "schemaless" fashion, and it's stored in a flexible format that allows the developer to quickly capture the data they are using and retrieve it without requiring up-front design work to build a particular format.
That had me pondering the question of whether or not there really are schemaless data structures. If a developer (or whatever SDK or framework they use) looks to persist some data, clearly there is a format of sorts, which means there is a schema. That schema might not be transferred or persisted in the data store, but there is some schema they expect, both on storage and retrieval. Whether this is JSON, XML, some proprietary structure, or something else, there's a known structure that the developer uses to work with the data.
Is there really schema-less data? I tend to think no. All of the data we have contains some schema. That schema might vary from row to row, which is often what developers like when building applications. There is, however, a structure. The developer knows it, and must serialize and deserialize the data, or depend on some library like ADO.NET to do so. This often appears to a developer to be a lower barrier to entry. There's less complexity and often no need to map the object-like structure of properties to some relational schema and make decisions on sizes.
That's not completely true, as the schema of the data still exists and must be persisted in the application. There is code that must handle the various values stored in some hierarchical fashion. If this changes over time, as values are added, then the application must deal with the missing values in older properties or arrays. If items are removed in the application, then would older sets of data just disappear? Perhaps, but the developer must make a decision, which may have implications for users of their application. This doesn't even deal with the issues of aggregation and reporting, which might force other systems to implement the same schemas and business logic. Those rules and specifications don't easily transfer from one application to another, especially when different teams or developers are involved.
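To make that concrete, here is a minimal sketch in Python of what "the schema lives in the application" tends to look like. Everything in it is hypothetical and only for illustration: the customer documents, the field names (email, emails, loyalty_tier), and the load_customer function are my own assumptions, not taken from any particular store or SDK.

```python
import json

# Two hypothetical "customer" documents, as a schemaless store might return them.
# The older one predates the "emails" list and the "loyalty_tier" field.
OLD_DOC = '{"id": 1, "name": "Ada", "email": "ada@example.com"}'
NEW_DOC = (
    '{"id": 2, "name": "Grace", '
    '"emails": ["grace@example.com", "g@example.org"], '
    '"loyalty_tier": "gold"}'
)


def load_customer(raw: str) -> dict:
    """Deserialize a document and normalize it to the shape the app expects today."""
    doc = json.loads(raw)

    # The implicit schema: these keys are required regardless of the document's age.
    for key in ("id", "name"):
        if key not in doc:
            raise ValueError(f"document is missing required field {key!r}")

    # Older documents stored a single "email"; newer ones store an "emails" list.
    if "emails" in doc:
        emails = list(doc["emails"])
    elif "email" in doc:
        emails = [doc["email"]]
    else:
        emails = []

    return {
        "id": doc["id"],
        "name": doc["name"],
        "emails": emails,
        # A field added later needs a default for older documents.
        "loyalty_tier": doc.get("loyalty_tier", "none"),
    }
```

The store never rejected either document, but the required keys, the renamed field, and the default value are all schema rules. They simply live in this function, and any reporting system or second application that reads the same data would need its own copy of them.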
There's always a schema, and the rules have to be implemented up front, or later on. Whether you use an RDBMS or a NoSQL store, you are going to be dealing with a schema. The question is whether you want to deal with it in a central location or in every application. I lean towards the former, but you might prefer the latter. Neither is wrong, but you should be sure you understand all the advantages and disadvantages of your choice.
NEW SQL Provision: Create, protect, & manage SQL Server database copies for compliant DevOps
Create and manage database copies effortlessly and keep compliance central to the process. With SQL Provision's virtual cloning technology, databases can be created in seconds using just MB of storage, enabling businesses to move faster. Sensitive data can be anonymized or replaced with realistic data to ensure data is protected as it moves between environments. Download your free trial
In recent years, the technology landscape has undergone dramatic changes, driven primarily by cloud computing and a continuously increasing level of attention dedicated to security, privacy, and compliance. One of the more significant initiatives that attempts to address these challenges is the General Data Protection Regulation (GDPR). In this article, we will explore how Azure SQL Database could help with addressing the GDPR requirements.
Extending SSIS with .NET Scripting is a timeless and comprehensive scripting toolkit for SQL Server Integration Services to solve a wide array of everyday problems that SSIS developers encounter. The detailed explanation of the Script Task and Script Component foundations helps you develop your own scripting solutions, but this book also shows a broad arsenal of readymade and well-documented scripting solutions for common problems. Get your copy from Amazon today.
Yesterday's Question of the Day
(by Steve Jones):
Which of these are database scoped configuration options starting in SQL Server 2016? (choose 2)
Legacy cardinality estimation
There are a number of items that can be set as Database Scoped Configuration items. They are:
Clear procedure cache.
Set the MAXDOP parameter to an arbitrary value (1, 2, ...) for the primary database based on what works best for that particular database, and set a different value (e.g. 0) for all secondary databases used (such as for reporting queries).
Set the query optimizer cardinality estimation model independent of the database compatibility level.
Enable or disable parameter sniffing at the database level.
Enable or disable query optimization hotfixes at the database level.
Enable or disable the identity cache at the database level.
Enable or disable a compiled plan stub to be stored in cache when a batch is compiled for the first time.
Enable or disable collection of execution statistics for natively compiled T-SQL modules.
Enable or disable online by default options for DDL statements that support the ONLINE= syntax.
Enable or disable resumable by default options for DDL statements that support the RESUMABLE= syntax
This newsletter was sent to you because you signed up at SQLServerCentral.com.