SQLServerCentral Article

Database defintion from an Excel Spec?


Picture the scenario. You have a brand new database to build, for a substantial application that has been designed and tested by a third party software house. Your job is to create this database in time for the application to be deployed. That is the contract.

"Well, that's easy enough, simply get the database build scripts from the supplier, and run them! Where's the problem?"

In an ideal (or even mostly reasonable) world, there would be no problem. The problem is that the supplier is not playing ball. The supplier rep is saying that his people are not willing to 'publish' any scripts that they would then be obliged to test, support and maintain, and what is more, lack the creativity to suggest any reasonable, workable alternatives.

"Surely my manager can pull some strings??" It would take less than half a day for a supplier's technician to run scripts or restore a database under the supervision of one of our own database engineers?

"Nope!" Noone is playing ball, and it's now up to me!

Exasperation!!! It all sounds like the big business politics that I happily avoid (mostly) by being a contractor.

Ok. What am I working from? The supplier has kindly given me a specification listing all the field information (name, size, type) by table 300+ tables and 7000+ fields (oh, and an additional 5 common fields per table ie 1500+ extra fields). In an Excel spreadsheet.

Further inspection of the excel specification gives us the following

'Column' [database field]
'Extract ID'
'Version' [V01-V03 - ignore all but the latest version]
'Spec Version' [ignore]
'Include' [Y/N - only create fields for 'Y' entries] - ignored - only Ys
'Source Type' [defines the field type]
'Extract Type' [ignore - describes the Source Type]
'Digits' [ignore, except for varchar fields]
'Dec Places' [ignore - no decimals!!]
'Nullable' [Y/N - determines NULL or NOT NULL]
'Key' [PK/[blank] determines the primary key] - ignored key column was always the same

An example of what I have is:

TablenameColumnExtract IDVersionSpec VersionSource TypeExtract TypeDigitsDec PlacesNullableKeyInclude
BOOKINGIdentifyingOIDBOOKINGV011.4bigintnumber190Y Y
BOOKINGIdentifyingTypeBOOKINGV011.4varcharstring200Y Y
BOOKINGBookStartDTTMBOOKINGV011.4datetimeyyyymmdd17 N Y

That all looks very clear, and so I begin. After starting by entering in the first 3-4 tables manually in, I thought that there must be an easier way. Those 4 tables (100 fields) took me a good hour to enter. 80 hours creating tables, really wasn't going to be much fun, and after all, most of the work has been done!!

My solution (that I'm sure could be extended) was to complete the following steps. Using Excel:

  • arrange the columns to include only those of interest.
  • Filter the next table from the specification (Tablename) from excel's autofilter,
  • Select the block that I was interested in,
  • Copy this block into my (attached) solution Excel spreadsheet,
  • Switched to the next worksheet,
  • Copy column 1, til the end (of the create table clause)
  • Switched to studio manager
  • Opened new query, paste, execute, save and close
  • Switched back to the specification spreadsheet, and select the next table!!

Using this simple (if repetitive) process, I successfully cut down the initial creation of 4 tables from about one hour (manually) to approximately a minute.

So, what does this 'magical' spreadsheet do??? On closer inspection you will find:

Worksheet: Source Specification

Cell A1 - holds the collation clause used in the definition of char and varchar fields

Remaing row 1 - holds title fields for the field definitions.

Rows 2-6 - hold common flelds for all tables

Cell C7 holds the top-right most block of the copy from the Excel specification

Column D-I hold the relevant information per field that I require

Column K uses concenate function to format the field name

Column L Uses concenate function to format the field type

Column M adds the field size and collation clause for char and vchar fields

Column N add Null clause, and comma.

WorkSheet: Table Create Statement

Row 1: hardcoded 'Use Database' clause

Row 2: Create table clause with the table name embedded from the 'Source Specification' worksheet

Row 3-81 simply concatenates Columns K-L of the 'Source Specification' worksheet for rows 2-80, but only if the table column is non-empty!

Row 82: holds the first part primary constraint clause with the tablename embedded for specifying the Primary Key name

Row 83: holds the remaining part of the above clause ... just the field name hardcoded to 'OID'

Row 84-86 completes the create table statement (stating the file groups on which to create table and PK contraint)

Simply selecting and copying column 1 of rows 1-86, and pasting into notepad, say, will execute as SQL. Tryit!

Those 300 odd tables, then took me around 10 hours to complete, saving my 70 man hours of labourious effort!! However, I was still left with a good two days of repetitive mouse dragging, clicking and key tapping.

However, surely there is a better way of doing this? If all the necessary data exists, and can be semi-automated using excel formula, then why not import the excel specification into a SQL Server table and process it there?

Why not indeed? And so, armed with a spreadsheeting semi-automatic solution that worked, I worked on create a SQL script to automate further.

Clearly the main issue of this method would be to read in the specification, line by line, until the next table is encountered, at which point, the current table could be created. This is the crux of the solution. As you will see from my solution, I solved this by reading in two records from a cursor, shifting the first record into a set of @Prev... fields and base my inner WHILE loop on a comparison of @Prev and @Curr table.

Please see my fully annotated copy of the script.

And so, my mission complete, I could then move onto further problems.

Will I ever need to use this script again? I would hope not. At least not for another project. It remains useful for this project, as updated specifications in Excel with field changes are still happening, before the application itself and any data has been deployed. But I would hope that most customer supplier relationships are trusting and friendly enough to enable the customer to get hold of a script with a minimum of fuss.

However, with what I know of how business, and especially big business, operates and, perhaps a little cynicism, I'll be keeping this handy bit of code close to my chest .... in case I do need it, and to save me from the exasperation of uncooperative suppliers!



3.41 (17)

You rated this post out of 5. Change rating




3.41 (17)

You rated this post out of 5. Change rating