The Number that shouldn't be a number

  • Wow, I'm reading this today while laying in bed  with my Golden Retriever and drinking coffee while most of you folks are out there slaving away.  I just have to share my perspective on data types from a slightly different angle.  I got my start in IT development 50 years ago, in 1968, using 360 Assembler and COBOL, running systems for a bank - checking, savings, time deposits (CDs), mortgages, the whole range of financial apps, on a 360/25 system with 32K (32768 BYTES) of hard memory, with a 6K 'supervisor' (OS) and on disk drives so small that we had to mount and dismount removable disk packs at various pauses in the processing cycles. 

    Obviously, the two main concerns were how much space a data type required, and how much computing power was consumed doing conversions.  The OS did elementary memory paging for us, so programs consisted of a 'main' and numerous 'overlays' that were paged in and out from disk storage as needed.  Further, we did not have the plethora of built-in data conversion functions that are available today, so you wrote ( or plagiarized )  code to do logic for handling and converting data types.   And the calls to 'subroutines' were 'branch and return'  or 'perform x through x-exit'  (if you were lucky and the developer had used a common return point) and I don't recall that they even used parameters.  Each 'file' had it's own hard-storage for blocks and individual records, and all internal data items were in what COBOL called 'Working Storage', and data elements were  described in Cobol by use of a 'PICTURE' . 

    So the external presentation of data was far secondary to internal storage and manipulation constraints which were  not always easy decisions. 

    Like the old song says, 'Thanks for the Memories'.  And at the risk of being non-PC, Merry Christmas to you all on here.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • This article brings up a lot of frustrations we as DBA's see and it generally reminds me that there are a lot of highly paid consultants who really have no idea what they are doing. 
    My favorite examples are the data warehouse "experts" who think indexes only slow down his process, primary keys with 24 columns, and the use float data types for currency values.

  • jarick 15608 - Thursday, December 13, 2018 7:15 AM

    This article brings up a lot of frustrations we as DBA's see and it generally reminds me that there are a lot of highly paid consultants who really have no idea what they are doing. 
    My favorite examples are the data warehouse "experts" who think indexes only slow down his process, primary keys with 24 columns, and the use float data types for currency values.

    Jarick, you are so right about the highly paid consultants' who in the past came in on 'short term' assignments and to whom our superiors would unquestioningly listen, and who then left us with the poor design decisions.  Been there, done that.  At least in later years, the outsiders were more often developers and coders and we could impose OUR design decisions on them.  And we pretty much got to write data access and manipulation code in stored procedures that we provide to the application developers.  That was a huge improvement.

    Rick
    Disaster Recovery = Backup ( Backup ( Your Backup ) )

  • If you're debating whether something like phone number or zip code should be contained as a varchar versus an integer, then I'd rather err on the side of varchar. For example, there really is no space savings for integer based phone numbers. Most phone numbers, here in the US, are 10 digits long, and the BigInt datatype is still 8 bytes fixed. A variable length phone number also can accommodate formatting and extension codes.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • John Corkett - Thursday, December 13, 2018 1:38 AM

    Less serious, but still annoying is using datetimes when you only want to keep a record of the date. Or using datetime2(7) or time(7) when you are only interested in recording a time to the nearest second.

    John

    I had to import data from a 3rd party database that stored the date and time of each transaction in two separate datetime columns; one for the date part and one for the time part. Imagine my joy.

  • This topic amuses me, and I enjoy reading about the pet peeves of the rest of you. I really do feel your pain! 🙂 I'm labelled as a "Business Analyst" though from time to time I have been an occasional developer and an accidental DBA, so maybe I have a different viewpoint to many of you.It seems to me that many of the problems you/we are complaining about can be resolved easily providing they are spotted during "Development" (or Analysis or Design, whatever those words mean in your location). 

    A really good resource for doing this is Phil Factor's idea of SQL Smells It covers "The number that shouldn't be a number" and a good many others. I recommend it to anyone who will listen. It isn't heavy going to read or difficult apply.

    Tom Gillies LinkedIn Profilewww.DuhallowGreyGeek.com[/url]

  • Another issue is date/times contained in VARCHAR columns. 

    The ISO standard for formatting date strings is YYYYMMDD, and the following check constaint will not only require that the string be formatted correctly, but it will also require that the value be a valid calendar date.

    create table foo
    (
    foo_date varchar(30) not null
    constraint ck_foo_date_yyyymmdd
    check (foo_date = convert(char(8),cast(foo_date as date),112))
    );

    insert into foo (foo_date) values ('2011/02/28'); The INSERT statement conflicted with the CHECK constraint "ck_foo_date_yyyymmdd". Feb 29, 2011 is an invalid date. insert into foo (foo_date) values ('20110229'); Conversion failed when converting date and/or time from character string. However, 2012 is a leap year, so Feb 29, 2012 is valid. By leveraging the CAST() function within the check constraint, you don't have to mess with a table of valid dates to do this. insert into foo (foo_date) values ('20120229'); (1 row(s) affected)

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Tom Gillies - Thursday, December 13, 2018 7:36 AM

    This topic amuses me, and I enjoy reading about the pet peeves of the rest of you. I really do feel your pain! 🙂 I'm labelled as a "Business Analyst" though from time to time I have been an occasional developer and an accidental DBA, so maybe I have a different viewpoint to many of you.It seems to me that many of the problems you/we are complaining about can be resolved easily providing they are spotted during "Development" (or Analysis or Design, whatever those words mean in your location). 

    A really good resource for doing this is Phil Factor's idea of SQL Smells It covers "The number that shouldn't be a number" and a good many others. I recommend it to anyone who will listen. It isn't heavy going to read or difficult apply.

    I like the way red-gate has codified quite a few of these code smells into SQLPrompt as part of its automatic Code Analysis functionality, popping up a little warning notification in the margin in SSMS and underlining bits of your query...

    Thomas Rushton
    blog: https://thelonedba.wordpress.com

  • GilaMonster - Tuesday, February 3, 2015 4:41 AM

    ....  Time is a point in time, not a time span. ...

    Tell that to Microsoft, or at least the people who created the data objects in .Net.
    If you read a column with the Time data type in .Net it shows up as a TimeSpan data type.
    I ended up converting it to a string, concatenating it to a date and converting it to DateTime.

  • Worst data types: I've seen money values, that are usually < 100 EUR be stored in DECIMAL(38,17) columns (and many percentage columns with the same declaration too).

    God is real, unless declared integer.

  • Eric M Russell - Thursday, December 13, 2018 7:47 AM

    The ISO standard for formatting date strings is YYYYMMDD

    What ISO standard is that?
    ISO 8601 standard states:
    "ISO 8601 tackles this uncertainty by setting out an internationally agreed way to represent dates:

    YYYY-MM-DD

    For example, September 27, 2012 is represented as 2012-09-27."

    https://www.iso.org/iso-8601-date-and-time-format.html

  • andrew gothard - Tuesday, February 3, 2015 6:06 AM

    Aye. Just had one of those bought from a third party. And another one at the other extreme. Many, many indexes. As in 41 indexes on 20 column table, and 33 indexes on another 21 column table (no. not mistypes) - and the general pattern is repeated throughout the database. Vast majority on a single column. Ohhh, the number with names starting _dta gives much away. This is OLTP, in case you're wondering. There are There are, in total, 2650 indexes in the database. Of which during the testing, initial setup and training fewer than 5% have been hit.Still - at least the tables have PK's. Every. Single. One. A. GUID. Clustered! <- This bit's on topic!Apparently "The Box Is Slow". 10 seconds to add a record at times, with the db practically empty - certainly compared to the volumes it's going to have to take. So even on a massively over specced machine for what it SHOULD need, with a PLE of 42 days with only 80% of memory used and a buffer cache hit ratio of 100% - performance has been raised as a concern.I've pointed out this is never going to scale, at all, in that state. Reply came there non.

    Our 3rd party database is like this.  Single column, for the major columns.  And they're rewriting the app, and so have added the same index again with a prefix of the application initials (if that makes sense).  Without deleting the original.  So, lots of single column indexes that are duplicated.

  • Tom Gillies - Sunday, February 8, 2015 9:08 AM

    I agree with GPO, who asks "Question is why does this scenario play out over and over again?"I think one of the reasons, which GPO touches on, is an accumulation of small independent applications. Another, is where data gets migrated from one database manager to another, or _especially_ from files to a database manager (I wonder if that is how the "Male/Female" 2 integers came about? I've never come across that one!). Regarding "small applications", I've created a few in my time, and I've probably transgressed a few times too. The problem isn't so much the independent applications as when they grow, multiply and then try to join up with other applications. One answer is "data modelling" which I used to claim to do, but that answer never used to be acceptable. It was always "too much trouble" for "too little to show for it". I guess another question to add to GPO's is "Someone having got into the situation, how do we get them (and ourselves) out of it?" I don't expect any obvious answers, I've been struggling with that one for years!;-)

    IMO, the problem comes from "Wanted: Full Stack Developer".  Which usually means OO developer designing a database.  I talked to a company owner of our 3rd party app at a User's Group Conference and said it seemed like a lot of the issues sounded like they originated with OO developers taking the lead.  He agreed, and said dedicated database developers were hard to find (they are based in PA).  The company was bought out and all the founders forced out about 6 months later....

  • t.franz - Thursday, December 13, 2018 8:46 AM

    Worst data types: I've seen money values, that are usually < 100 EUR be stored in DECIMAL(38,17) columns (and many percentage columns with the same declaration too).

    Perhaps the data modeler is hedging against monetary hyper inflation.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Thursday, December 13, 2018 12:19 PM

    t.franz - Thursday, December 13, 2018 8:46 AM

    Worst data types: I've seen money values, that are usually < 100 EUR be stored in DECIMAL(38,17) columns (and many percentage columns with the same declaration too).

    Perhaps the data modeler is hedging against monetary hyper inflation.

    Why all the decimals then?

Viewing 15 posts - 91 through 105 (of 113 total)

You must be logged in to reply to this topic. Login to reply