Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase 12»»

Serialising tabular and relational data. Expand / Collapse
Author
Message
Posted Saturday, March 12, 2011 11:34 AM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: 2 days ago @ 7:18 AM
Points: 587, Visits: 2,552
Comments posted to this topic are about the item Serialising tabular and relational data.


Best wishes,

Phil Factor
Simple Talk
Post #1077343
Posted Saturday, March 12, 2011 11:46 AM
SSC-Addicted

SSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-AddictedSSC-Addicted

Group: General Forum Members
Last Login: Friday, October 10, 2014 7:31 AM
Points: 492, Visits: 812
The universities would distain such a practical task.

Oh please, let's not involve the universities! Seriously, I am tired of the better than you attitude of American universities. From trying to control the political spectrum to engaging in overcharging students so professors can spend their time chasing dreams - I do not see much value in them. My opinion only, but I prefer for working people, who actually understand what it means to hold a job, to come up with solutions to problems.

As to your concerns, I just don't get it. I move data from system to system reliably every day. Between CSV and XML, not to even mention EDI, I am able to share data between systems with very little effort. One of my systems even allows for feeding PDF, DOC, DOCX, RTF, JPG, TIFF and other formatted documents into it with very little effort. Mostly I work with SQL Server, but I have moved data from systems before where I had no access to the relational model, and had to examine the data to determine what to do to create what I needed. In that case I actually restored data that the vendor was unable to find.

For those systems that don't support importing data natively, why are we buying them? For those that don't support exporting data, what do we expect? Should a vendor provide a product that allows you to extract the data in order to move to another vendor's product? We may want that, but do you really expect a company to ease your transition to a competitor?

Admittedly my experience where I work is going to be different than in another industry, but I wonder if the issue is one of true need. Nobody said being a DBA would be easy.

Dave


Dave
Post #1077345
Posted Saturday, March 12, 2011 12:04 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: 2 days ago @ 1:53 PM
Points: 35,366, Visits: 31,905
djackson 22568 (3/12/2011)
The universities would distain such a practical task.

Oh please, let's not involve the universities! Seriously, I am tired of the better than you attitude of American universities. From trying to control the political spectrum to engaging in overcharging students so professors can spend their time chasing dreams - I do not see much value in them. My opinion only, but I prefer for working people, who actually understand what it means to hold a job, to come up with solutions to problems.


Why the hell are you picking on American universities? The word "universities" was used as a general term in the article. All universities in all countries breed their own generations of ring knockers.

As to your concerns, I just don't get it. I move data from system to system reliably every day. Between CSV and XML, not to even mention EDI, I am able to share data between systems with very little effort. One of my systems even allows for feeding PDF, DOC, DOCX, RTF, JPG, TIFF and other formatted documents into it with very little effort. Mostly I work with SQL Server, but I have moved data from systems before where I had no access to the relational model, and had to examine the data to determine what to do to create what I needed.


Now THAT I agree with!

For those systems that don't support importing data natively, why are we buying them? For those that don't support exporting data, what do we expect? Should a vendor provide a product that allows you to extract the data in order to move to another vendor's product? We may want that, but do you really expect a company to ease your transition to a competitor?


Hear here and spot on!

Nobody said being a DBA would be easy.


BWAA-HAAA!!!! Especially when XML is floating around.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1077348
Posted Saturday, March 12, 2011 12:16 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: 2 days ago @ 1:53 PM
Points: 35,366, Visits: 31,905
Phil, I have to mostly agree with djackson above. The editorial almost sounds like whining about the disparities of transmittable data formats. A "universal" data format has been tried and failed miserably more times than I care to mention. True CSV. EDI. XML. All failures in my eyes and all tremendous successes in the eyes of those that praise them. Heh... why not try something like DBIII table files which was actually better than any format I've come across so far because it had built in metadata by column instead of by element like XML does.

Do I think we should keep looking for a more universal, metadata and format rich format? Absolutely because, right now, you can't even just copy and paste between MS products without something going wrong.

In the meantime, let the spirit of competition thrive in the data transmission arena. To do otherwise is almost like wanting all SQL languages to be the same... there’d be no competition and no incentive for improvement.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1077350
Posted Sunday, March 13, 2011 8:32 AM


Mr or Mrs. 500

Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500Mr or Mrs. 500

Group: General Forum Members
Last Login: 2 days ago @ 7:18 AM
Points: 587, Visits: 2,552
Do I think we should keep looking for a more universal, metadata and format rich format? Absolutely because, right now, you can't even just copy and paste between MS products without something going wrong.

I couldn't agree more. I don't know why you think I'm whining. I'm merely pointing out that there is no reasonable standard way of doing this.I just want to shake you guys out of your Comedy-Limited mentality for sharing tabular data. It isn't good enough. Do we want another generation to wince when asked to export a report to a spreadheet? Does anybody seriously attempt to suggest that XML is a reasonable attempt at doing so? CSV, bless it, has no built-in way of transmitting the metadata and has design flaws (e.g. the decimal comma, and the 'escaping' of the end-of-line)
Given just a little bit of encouragement, I think might shall just go away and produce a standard myself. After all, if Doug Crockford can produce JSON virtually single-handed, then it must be achievable.



Best wishes,

Phil Factor
Simple Talk
Post #1077455
Posted Sunday, March 13, 2011 11:26 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: 2 days ago @ 1:53 PM
Points: 35,366, Visits: 31,905
Phil Factor (3/13/2011)
Do I think we should keep looking for a more universal, metadata and format rich format? Absolutely because, right now, you can't even just copy and paste between MS products without something going wrong.

I couldn't agree more. I don't know why you think I'm whining. I'm merely pointing out that there is no reasonable standard way of doing this.I just want to shake you guys out of your Comedy-Limited mentality for sharing tabular data. It isn't good enough. Do we want another generation to wince when asked to export a report to a spreadheet? Does anybody seriously attempt to suggest that XML is a reasonable attempt at doing so? CSV, bless it, has no built-in way of transmitting the metadata and has design flaws (e.g. the decimal comma, and the 'escaping' of the end-of-line)
Given just a little bit of encouragement, I think might shall just go away and produce a standard myself. After all, if Doug Crockford can produce JSON virtually single-handed, then it must be achievable.


Ah. Bloody hurriedly-typed written word. I didn't mean to make it sound like your were personally whining. It's a whine throughout the industry hence attempts at a universal protocol such as EDI, XML, and, now, JSON. My most sincere apologies.

I agree... we've a long way to go and things like CSV (comedy or true) and XML aren't exactly the bees-knees when it comes to meta-data or even the data itself.

Actually, my ol' friend, this conversation may have done just what you've intended. I've had an epiphany that just knocked me out of my chair and, now, I have some research to do. As you say, if Doug Crockford can do it, so can you and I.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #1077473
Posted Sunday, March 13, 2011 10:07 PM


SSC Eights!

SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!SSC Eights!

Group: General Forum Members
Last Login: Saturday, October 11, 2014 8:18 PM
Points: 831, Visits: 1,588
Oh ye of little faith.

HL7 is the answer to your prayers as surely as a wagon needs a cummerbund. Quiet please. This is important. You must consider the HL7 protocol. It slices. It dices. It makes ShamWOW look like... a dishcloth. It can cure your halitosis and lumbago too. Old Septimus Cragg from down Cornpone Road had twenty-five years struck down with the palsy, but two weeks with HL7 and he was back climbing the walls again. Jumpy as he ever was, people. Tis no lie, I speaks here today.

To all the naysayers, the blaggards, the snarks and the scoundrels who wrongly accuse dear, dear HL7 of putting the electronic health record back 25 years, I ask you this, my poor myopic impatient friends: "What other transmission protocol gives you a free set of steak knives with every brace of pipe delimiters?"

Not only is the capacious, capricious, even carpaccious HL7 changing the world of healthcare forever, in ways never even dreamt of at its conception, it is doing so with style, sophistication, and, dare I say it, ... love. How many other interoperability protocols can boast accreditation under the ANSI Love (version 1.1 draft) standard? None frankly.

My detractors will call me a zealot. To that charge I plead GUILTY. True zealotry only comes when you have something this good. Scrutiny, rational evaluation, utility... these feeble irritations must be relegated to the status of impudent impediment. The game is now about belief my friends. When was the last time you had interoperability you could truly believe in.

Ladies and gentlemen I give you... HL7.




One of the symptoms of an approaching nervous breakdown is the belief that one's work is terribly important.
Bertrand Russell
Post #1077539
Posted Monday, March 14, 2011 4:07 AM
SSC Veteran

SSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC VeteranSSC Veteran

Group: General Forum Members
Last Login: Friday, September 26, 2014 3:34 AM
Points: 202, Visits: 620
Aka http://www.hl7.org/ (http://http//www.hl7.org/).
Post #1077618
Posted Monday, March 14, 2011 1:43 PM
Valued Member

Valued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued MemberValued Member

Group: General Forum Members
Last Login: Monday, October 20, 2014 10:44 AM
Points: 63, Visits: 668
It sounds like you guys haven't worked much with 3rd-party data vendors and their data. If you had, you'd appreciate how big an issue this is. Widespread use of a text file format with metadata (so not CSV), scaling well (so not XML), and supporting long file names (so not dBase) would be a huge win. I don't mean to come off as hostile in the following, but some of the tone was...not nice. Anyway, to reply:

djackson 22568 (3/12/2011)
[quote]As to your concerns, I just don't get it. I move data from system to system reliably every day. Between CSV and XML, not to even mention EDI, I am able to share data between systems with very little effort. One of my systems even allows for feeding PDF, DOC, DOCX, RTF, JPG, TIFF and other formatted documents into it with very little effort.

Well, yeah, because you're moving data between your systems. So you can go look up metadata (heck, you can even query your metadata schema in the source system, if it's open) and know everything there is to know. But imagine if you couldn't - if it's a closed system.

djackson 22568 (3/12/2011)
[quote]...but I have moved data from systems before where I had no access to the relational model, and had to examine the data to determine what to do to create what I needed. In that case I actually restored data that the vendor was unable to find.

You probably haven't worked with very extensive data, then. My last project was with US Census ACS data: ~20,000 fields, CSV format. The only thing they documented was that each was a number of some kind. Try examining that data by hand and see how long that takes you. And with most data vendors, it's take-it-or-leave-it, so you're left to figure out formats yourself.

djackson 22568 (3/12/2011)
[quote]For those systems that don't support importing data natively, why are we buying them? For those that don't support exporting data, what do we expect? Should a vendor provide a product that allows you to extract the data in order to move to another vendor's product? We may want that, but do you really expect a company to ease your transition to a competitor?

Oftentimes one doesn't have a choice. Whether it's a vendor your CIO/CTO/whoever has decided to use, a product that's entrenched, or a price issue, it happens. And it's especially true in the data vendor world. There aren't going to be alternatives to government data.
And yes, I do expect something like SQL Server to export to different formats. Realistically places have a mish-mosh of systems, platforms, analysis software, etc. For instance, we need to migrate data from our internal analysis SQL Servers to an offsite linux Postgres database for the web server.

-Dom
Post #1077972
Posted Monday, March 14, 2011 2:37 PM
Forum Newbie

Forum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum NewbieForum Newbie

Group: General Forum Members
Last Login: Sunday, March 27, 2011 7:53 AM
Points: 2, Visits: 16
You've been reading my mind. I've been thinking about this issue for quite a while. In fact I made a brief start on the project before Christmas. The format I propose is called RDX - for Relational Data eXchange. It is backwards compatible with many existing delimited file formats - even some (but not all) CSV files would qualify.
The basics of it are this (in no particular order):
- You can use either a comma, pipe (|) or tab as a delimiter
- The delimiter is undefined until one of the potential delimiters is read in a heading line
- Blank lines are completely ignored
- Records cannot span lines - i.e. the eol character(s) always specify the end of the line
- Every data line must contain the correct number of delimiters for the current table definition
- The format supports either a single table per file, in which case the first non-blank line is regarded as a line of headings
- Or multiple tables per file, in which case the table names are prefixed with ">>" as the first 2 characters of the line, and column lines which are prefixed with a ">". Any other non-blank lines are regarded as data lines.

For example:
>>People
>PersonID,FirstName,LastName,DateOfBirth
1,Fred,Smith,1999-11-22
2,Mary,Hussain,1998-02-03
>>BooksOnLoan
>PersonId,BookTitle
1,Gone with the wind
1,Dr No
2,Catcher in the Rye


The (optional) metadata would be held in the file itself using reserved table names. Something like the information_schema views of the SQL standard. I've not yet defined these. We would also need to use some sort of escape code to enable the special characters to be included in the data where necessary.

I have created a website www.rdxman.com that I will use to define the standard further, but I'm interested in any thoughts others may have.
Post #1077990
« Prev Topic | Next Topic »

Add to briefcase 12»»

Permissions Expand / Collapse