Having Data Modeling Standards

  • Comments posted to this topic are about the item Having Data Modeling Standards

  • One of the reasons we have SQLFluff in pre-commit hooks and GitHub CICD pipelines is so that standards are enforced for everyone mechanically.  People come and go so mechanical enforcement prevents HIPPO (highest paid person's opinion) from having "standards" in a constant state of flux.

    SQLFluff isn't perfect but I've yet to find a competitor that is any better.  It is  a Python app and rather slow on large SQL code bases.  Although it supports different dialects (database platforms) of SQL it can get tripped up by some surprising things.

    With any standard I feel it is important to understand WHY a particular standard exists.  If people understand the WHY then they are more likely to buy in to the standard.  They absolutely have the right to ask why a standard exists.  They also have the right to challenge when it appears the standard no longer applies.

  • This one spoke to me. We have new people, different standards.

    I went through all the variations in my 30 years coding. My first project at my company had "help" from an external consultant. She wanted to prefix tables names with "tbl" and append the database with "db". We still have one or two lingering around that has that. I went through the single/plural issue. Started with plural and now single. I also went through the prefixing the field names with the table name (CustomerFirstName). I don't do that anymore. I used to only add the table name if there was ambiguity. I think the main reason for changing was SQLPrompt. It wanted me to give every table an alias. And make the tablename.FieldName in my code the standard. I know where FirstName comes from even if I have joined to a different table that has a FirstName.

    Your comment about using  CustomerID, ContactID or AddressID is very relevant.  I do that for two reasons. First, SQLPrompt suggests what fields to join on since they are the same. Primary key is CustomerID and another table's foreign key is CustomerID. They match up. I also use ID at the end. It makes it visibly obvious that is a primary/foreign key.

    BUT...new management/people want to use EntityFramework. There is a default in EF if you use Id, (or <type name>Id) it knows it is the primary key. You can override it. I remember when learning about EF, where someone said "not to let EF tell you how to code". I saw where EF shortcuts made bad db design.

    I think the big goal is to make things human readable. Another phrase I read was human understandable. We read 10 times more than we write. I need every advantage I can get.

  • I think most seasoned DBAs, those of us who have been in the industry for more than a decade, can tolerate variations in naming conventions, even multiple conventions being used within the same organization, so long as the convention used for each database is internally consistent. For example, AccountID (INT) in the Account table should be named and data-typed identically in the Payment table.

    When it comes to data modeling, I care more about normalization (3NF will suffice) and appropriate usage of data types. I mean, there are a lot of developers who love to "quote the Book Of Leviticus" when it comes to naming conventions - but then their logical table design sucks.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • I work for a large state department. Consequently, the department I work for is, I believe, over 100 years old. Obviously, no one was writing databases back in 1923. 🙂

    But that does mean that what we have a mixed bag.  I'm not sure anyone who works here have any idea what the data modeling standards were when our hundreds of databases were created. And some people have been here for 30+ years.

    Several databases started their lives as a single office app written in MS Access, that were upgraded to SQL Server, leaving all the databases, tables, and column names alone. Table and column names are all over the map.

    Some apps we have are third party applications, so whatever the vendor thinks of naming their databases and tables, is what it is.

    And for the rest of them, each development team creates their own database. We use Entity Framework, but surprisingly we do not use EF Migrations. Instead, some people use TOAD by Quest. Others, well they just wing it. What puzzles me a lot is why doesn't anyone use the DBAs to help with this task.

    Kindest Regards, Rod Connect with me on LinkedIn.

  • I agree with most points in Steve's editorial, except when it comes to naming columns with the table name as a prefix. I disagree with that for all the same reasons gmilke mentioned in his reply. Since I am more of a developer than DBA, I look for ways to make similarities between my C# source code and my SQL code. One of those similarities is that a database table is just like a class in code. We do name the class in the singular form, but do not duplicate the class name for every property. And then we rely upon Intellisense/autocomplete to help us quickly type those names. If every property name were prefixed with the table name, then my Intellisense/autocomplete would be kind of worthless.

    The same should be done with SQL queries, and using an editor that supports Intellisense/autocomplete.

  • Entity Framework and various ORMs get a bad rap from DBAs.  In my experience this is down to them being used as an out-of-the-box install with bare minimum configuration.

    Many of the things I have been told about ORMs by the development teams using them has been just plain wrong.

    The M in ORM allows you to have a customer class with an id property while persisting it in a database as customer_id, CustomerId or whatever else you fancy.  The whole point is to abstract coupling of the object design and DB design.

    All the ORMs I have used can call stored procedures.  Again, configuring the ORM correctly can make this painless for both developers and DBAs.

    Lazy loading is another favourite.  Configuring this incorrectly would mean that retrieving parent/child records would run a query per parent and then a query per child.  Configuring it correctly results in a more set based approach.

    I've seen various Twitter/X threads that comment that if you RTFM you are in the 1%.  I can't tell if that is a joke, an observation or both because it rings true.

    My advice to a DBA would be to learn just enough about the tools the development teams use to be able to build yourself a sandpit environment and start experimenting.  If you are on good terms with development teams they will probably help you do this, just as you would help them build a sandpit learning environment for your DBs.

  • I think standards are really important across all aspects of a database and the code written to access and update it.

    I do find it annoying when I have created a consistent table where everything is named with a capital letter and in the most logical way such as "IsActive" but then someone comes along to make an edit and adds something like "cmiddlename" when you would surely look at how the other fields are named and follow that logic.

    It is the same as with a really neat stored procedure when someone wants to add some fields so they add them all on the same line with incorrect indenting. It is just sloppy and hardly takes any more time to make it neat and consistent. At least you can see where the edits are that way though I suppose.

    Often I will generate code as a temporary table that needs to be inserted into a range of different tables and then I will prefix the table name onto each field then it will avoid conflicts and mean I can copy and paste the section out to form the next insert statement I need where I can do a find and replace to quickly remove the prefix again when referencing the field list for insertion.

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply