SQL Clone
SQLServerCentral is supported by Redgate
Log in  ::  Register  ::  Not logged in

Serialising tabular and relational data.

By Phil Factor,

The way that we move or store tabular data in files isn't particularly clever. Forget, for a moment, the conceptual gulf between the Relational and Object-oriented view of data, for we can't even dump data to disk in an open-standard way that can subsequently be read reliably by a range of other applications. We can't transfer data from one relational database to another with any sort of elegance and efficiency. We strain to copy data to or from spreadsheet programs or statistical analysis tools. When we do so, we represent data in unreliable ways that would have been met with a grin of familiar recognition by the geeks of our parents' generation. We load data into SQL Server from third-party databases, table by blooming table, using methods that seem futuristic until we look under the covers and find the bulk load engine of BCP clicking along much as it has for the past twenty or so years.

There isn't a good existing standard format for tabular data. The CSV format goes back to the sixties, and is yet to receive an agreed international standard. It can't be reliably used without an accompanying metadata file. There are proposals for a standard for representing fielded text which has everything that is required except for widespread adoption. It is just about possible to use YAML to copy small databases between two different relational systems but it isn't easy and impossibly verbose.

What's required? It must be Unicode-text based rather than binary: There must be no chance of delimiter collision: The metadata must be part of the file: It should be reasonably compact and shouldn't rely on being zipped: The order of columns, as well as the column labels, must be described in the metadata along with a description of the datatype: It should be possible to supply some indication of keys and constraints: It should be able to support binary data and blobs efficiently: It should cope with column values that are XML or JSON: It should, if possible be a superset of an existing de-facto standard so that it can be used even when not explicitly supported. It should be easy to implement. Well, something like that would suit me, though I'd probably add the rather unkind condition that it mustn't use XML in any of its guises.

This isn't a good standard for a database vendor to initiate. The universities would distain such a practical task. Surely, the definition of such a standard is down to the community of users. I'd be fascinated to read other views on the subject.

Total article views: 175 | Views in the last 30 days: 1
Related Articles

Strict Database Standards and Conventions

In this article by Mattias Fagerlund, he shows you how he employs strict database naming standards a...


Backup Isn't Enough

When is backup not enough? Steve Jones talks about a few things that can cause you issues and a back...


A Standard Database Security Script Generator

A script generator for standard security compliance.


Can't restore database / Log issues?!

Can't restore database from backup that had huge log file.


T-SQL Development Standards

Writing corporate development standards

database weekly