How to study the layout of a database?

  • jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    I think it’s an excellent question 🙂, and I wish more people would ask it. The first obvious question is “what the heck is this thing supposed to do?†If you know that, then you can make some guesses things that should be there. For example, if this schema deals with vehicles, then you’d expect to see a VIN. If it deals with addresses, then you’d expect to see CASS standards applied to the addresses. If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    I also have a suggestion that while you’re inspecting the schema, you do what should have been done at the start and build at least a data dictionary. The thing that will drive you nuts is when the same data element has multiple names. Sometimes this guy is a “customer_idâ€, sometimes “cust_idâ€, sometimes “ry1234†or worse. I truly wish people would learn the basic ISO 11179 naming rules; it won’t prevent bad names, but it will at least minimize them. You

    The other thing that will drive you nuts is that the same data element under the various pseudonyms and aliases will be cast in different datatypes. Sometimes he’s CHAR(10), or maybe VARCHAR (50) or maybe he’s numeric. This gets be real cute problem when you try to use views and at various levels of nesting, the data element name changed.
    After that, I’d like to see at least a high-level ER diagram. There are various tools for this.

    Wow, this is gold! I had no idea these are some things I must look out for. So far my knowledge and practice is based entirely on sample databases, so I have not encountered any of the issues you described. Do you know of any resources that provides a "check-list" of what I need to look for to ensure the entire database tables and columns are consistent? If there isn't anything like that then I will make such a list on my own and share on this forum for other members to add to the list. Thanks so much for your advises!

  • you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    @JoeCelko, I'd be interested to see a reference which backs this up, if you have one.


  • jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations..

    So anybody using a digital calculator for money and currencies is committing a crime ?   😀:sick:

    Ben

  • It's entirely possible the DBA can't help much when it comes to understanding the logical model of the database, especially if it's part of a 3rd party vendor application, because the DBA simply had no role in developing it in the first place. Even the most popular CRM and accounting packages that organizations spend millions of dollars per year to license have database designs that are incomprehensible when it comes to naming conventions and normalization. You may even find that most of the tables in the database are either legacy or just there to support a feature set that isn't relevant to the version of the application you're running.

    But even if the logical database model is deficient in terms of self documenting, the SQL language itself is very declarative, so you can learn a lot about the form and function of a database model by examining the SQL queries contained in views and stored procedures. Also, if you want to know where something like the month end report gets it's data, you can run a SQL Profiler trace while the report is running and capture the SQL queries that way.

    "Do not seek to follow in the footsteps of the wise. Instead, seek what they sought." - Matsuo Basho

  • Eric M Russell - Tuesday, November 20, 2018 8:40 AM

    It's entirely possible the DBA can't help much when it comes to understanding the logical model of the database, especially if it's part of a 3rd party vendor application, because the DBA simply had no role in developing it in the first place. Even the most popular CRM and accounting packages that organizations spend millions of dollars per year to license have database designs that are incomprehensible when it comes to naming conventions and normalization. You may even find that most of the tables in the database are either legacy or just there to support a feature set that isn't relevant to the version of the application you're running.

    But even if the logical database model is deficient in terms of self documenting, the SQL language itself is very declarative, so you can learn a lot about the form and function of a database model by examining the SQL queries contained in views and stored procedures. Also, if you want to know where something like the month end report gets it's data, you can run a SQL Profiler trace while the report is running and capture the SQL queries that way.

    Hi Eric,

    Do you know what type of SQL queries I can run to help with this? I tried searching the Microsoft documentation page, but am not sure where I should look for this. Any resource or advice you may have will be highly appreciated

  • ben.brugman - Tuesday, November 20, 2018 5:26 AM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    So anybody using a digital calculator for money and currencies is committing a crime?   😀:sick:

    Rounding might actually be worth writing an article. Good idea! I haven't looked at the current rules in a while, so that would make me do some research. Actually, I've seen someone get in trouble for using a pocket calculator.

    He rounded every item in a long list either up or down (I camber which direction), instead of varying it from item to item. Then he did his summation. The errors accumulated in one direction and messed up the final results. Suddenly, what had been done on the mainframe didn't match his calculations.

    ToEven is also known as 'Banking Rules', it the default used in IEEE 754 floating-point standards, which is why it's the default in .NET.

    Conversely, AwayFromZero is also known as 'Commercial Rounding'. I don't know why it is the default of SQL Server. It's very widely known and SQL has been used in commercial environments.

    Euro conversion gets even weirder. If you go from one non-euro currency to another, you have to convert the first currency to euros, then convert the euros to the second currency. It's called triangulation. The problem is that transactions occur so fast that the rates can change during the calculations - ARRGH :sick:!

    I'm going to assume by now everybody seen the problems with the old Sybase MONEY datatypes. Basically, if you do multiplication and division, it rounds too soon and only carries things out to four decimal places.

    Please post DDL and follow ANSI/ISO standards when asking for help. 

  • jcelko212 32090 - Tuesday, November 20, 2018 10:28 AM

    ben.brugman - Tuesday, November 20, 2018 5:26 AM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    So anybody using a digital calculator for money and currencies is committing a crime?   😀:sick:

    Rounding might actually be worth writing an article. Good idea! I haven't looked at the current rules in a while, so that would make me do some research. Actually, I've seen someone get in trouble for using a pocket calculator.

    He rounded every item in a long list either up or down (I camber which direction), instead of varying it from item to item. Then he did his summation. The errors accumulated in one direction and messed up the final results. Suddenly, what had been done on the mainframe didn't match his calculations.

    ToEven is also known as 'Banking Rules', it the default used in IEEE 754 floating-point standards, which is why it's the default in .NET.

    Conversely, AwayFromZero is also known as 'Commercial Rounding'. I don't know why it is the default of SQL Server. It's very widely known and SQL has been used in commercial environments.

    Euro conversion gets even weirder. If you go from one non-euro currency to another, you have to convert the first currency to euros, then convert the euros to the second currency. It's called triangulation. The problem is that transactions occur so fast that the rates can change during the calculations - ARRGH :sick:!

    I'm going to assume by now everybody seen the problems with the old Sybase MONEY datatypes. Basically, if you do multiplication and division, it rounds too soon and only carries things out to four decimal places.

    That's great information and I absolutely agree about the problems with using the MONEY datatype in currency calculations but what does this have to do with the "floating-point" calculations you spoke of previously?  What do you use, instead?

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    kindly back this statement with links to the relevant EU legislation.

  • primitivefuture2006 - Tuesday, November 20, 2018 10:08 AM

    Eric M Russell - Tuesday, November 20, 2018 8:40 AM

    It's entirely possible the DBA can't help much when it comes to understanding the logical model of the database, especially if it's part of a 3rd party vendor application, because the DBA simply had no role in developing it in the first place. Even the most popular CRM and accounting packages that organizations spend millions of dollars per year to license have database designs that are incomprehensible when it comes to naming conventions and normalization. You may even find that most of the tables in the database are either legacy or just there to support a feature set that isn't relevant to the version of the application you're running.

    But even if the logical database model is deficient in terms of self documenting, the SQL language itself is very declarative, so you can learn a lot about the form and function of a database model by examining the SQL queries contained in views and stored procedures. Also, if you want to know where something like the month end report gets it's data, you can run a SQL Profiler trace while the report is running and capture the SQL queries that way.

    Hi Eric,

    Do you know what type of SQL queries I can run to help with this? I tried searching the Microsoft documentation page, but am not sure where I should look for this. Any resource or advice you may have will be highly appreciated

    Seeing what a view or stored procedure contains, code-wise, is actually fairly easy and doesn't require any software beyond Management Studio (SSMS)  Simply pick a view or stored procedure, right-click it, and choose the option "Script {Item} as -> Create To -> New Query window"  This will generate the code necessary to create the object in question, which you can then review at your leisure.

    So, with a view (which is basically a saved select query) you can see what tables it's joining together and on what column(s), which will give you some idea as to the relationship between those two tables.

    One suggestion I've not seen as well, if you want to be very paranoid about poking around in a "live" database, take a backup of it and restore it to a separate server, so that if you do break something in the database (oops, I dropped the logins table) you didn't break production.  (Note:  Check with your management, etc about this first, it may not be allowed!)

  • frederico_fonseca - Tuesday, November 20, 2018 12:43 PM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    kindly back this statement with links to the relevant EU legislation.

    I'm going to give Joe a pass on this one (lol like he cares whether I give him a pass or not). I think its generally agreed not to store money in floats, unless you restrict the number to having no decimal points, and even then, why not just use integers if you're going to count pennies, etc 

    The closest I found to an actual regulation was this bit on the web:

    https://opendata.stackexchange.com/questions/10346/what-specifications-are-out-there-for-the-precision-required-to-store-money

    I doubt that'll satisfy the folks who just want to catch Mr. Celko typing something wrong on the internet but at least the link chats about the subject.

  • patrickmcginnis59 10839 - Wednesday, November 21, 2018 6:47 AM

    frederico_fonseca - Tuesday, November 20, 2018 12:43 PM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    kindly back this statement with links to the relevant EU legislation.

    I'm going to give Joe a pass on this one (lol like he cares whether I give him a pass or not). I think its generally agreed not to store money in floats, unless you restrict the number to having no decimal points, and even then, why not just use integers if you're going to count pennies, etc 

    The closest I found to an actual regulation was this bit on the web:

    https://opendata.stackexchange.com/questions/10346/what-specifications-are-out-there-for-the-precision-required-to-store-money

    I doubt that'll satisfy the folks who just want to catch Mr. Celko typing something wrong on the internet but at least the link chats about the subject.

    Absolutely agreed on this one.  That's why I asked Joe what the specs actually were (a reference link or two from him would be great) and what he used in such cases.  Way too many people "format" the interim answers way too early in a series of calculations and then wonder why Granny with her 4 function calculator is able to catch "penny errors" on things like mortgage interest calculations, etc, .  Add to that the silent automatic severe reduction in scale that can occur with the Decimal and Numeric datatypes  (which some folks aren't aware of) in SQL Server and you can end up with some real problems. 
    https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-2017

    Of course, there are also problems with FLOAT datatypes, as well, and so it would be interesting and helpful if he let us know how he solves these types of problems, especially when it comes to meeting the requirements of EU and GAAP regulations.

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ben.brugman - Tuesday, November 20, 2018 5:26 AM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations..

    So anybody using a digital calculator for money and currencies is committing a crime ?   😀:sick:

    Ben

    Yes. Once you turn your calculator off, it loses the account balances. This goes against GAAP at least!

  • Jeff Moden - Wednesday, November 21, 2018 8:35 AM

    patrickmcginnis59 10839 - Wednesday, November 21, 2018 6:47 AM

    frederico_fonseca - Tuesday, November 20, 2018 12:43 PM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    kindly back this statement with links to the relevant EU legislation.

    I'm going to give Joe a pass on this one (lol like he cares whether I give him a pass or not). I think its generally agreed not to store money in floats, unless you restrict the number to having no decimal points, and even then, why not just use integers if you're going to count pennies, etc 

    The closest I found to an actual regulation was this bit on the web:

    https://opendata.stackexchange.com/questions/10346/what-specifications-are-out-there-for-the-precision-required-to-store-money

    I doubt that'll satisfy the folks who just want to catch Mr. Celko typing something wrong on the internet but at least the link chats about the subject.

    Absolutely agreed on this one.  That's why I asked Joe what the specs actually were (a reference link or two from him would be great) and what he used in such cases.  Way too many people "format" the interim answers way too early in a series of calculations and then wonder why Granny with her 4 function calculator is able to catch "penny errors" on things like mortgage interest calculations, etc, .  Add to that the silent automatic severe reduction in scale that can occur with the Decimal and Numeric datatypes  (which some folks aren't aware of) in SQL Server and you can end up with some real problems. 
    https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-2017

    Of course, there are also problems with FLOAT datatypes, as well, and so it would be interesting and helpful if he let us know how he solves these types of problems, especially when it comes to meeting the requirements of EU and GAAP regulations.

    Well I would think floats are out of the question for storing currency like Joe says. Obviously this assumes people have at least a bit of familiarity with numerical analysis topics when arranging for calculations but I wouldn't hold my breath on that.

  • patrickmcginnis59 10839 - Wednesday, November 21, 2018 11:35 AM

    Jeff Moden - Wednesday, November 21, 2018 8:35 AM

    patrickmcginnis59 10839 - Wednesday, November 21, 2018 6:47 AM

    frederico_fonseca - Tuesday, November 20, 2018 12:43 PM

    jcelko212 32090 - Sunday, November 18, 2018 3:39 PM

    If you deal with money and currencies, you know that anybody who storing currency data is floating-point is committing a crime under EU and GAAP regulations.

    kindly back this statement with links to the relevant EU legislation.

    I'm going to give Joe a pass on this one (lol like he cares whether I give him a pass or not). I think its generally agreed not to store money in floats, unless you restrict the number to having no decimal points, and even then, why not just use integers if you're going to count pennies, etc 

    The closest I found to an actual regulation was this bit on the web:

    https://opendata.stackexchange.com/questions/10346/what-specifications-are-out-there-for-the-precision-required-to-store-money

    I doubt that'll satisfy the folks who just want to catch Mr. Celko typing something wrong on the internet but at least the link chats about the subject.

    Absolutely agreed on this one.  That's why I asked Joe what the specs actually were (a reference link or two from him would be great) and what he used in such cases.  Way too many people "format" the interim answers way too early in a series of calculations and then wonder why Granny with her 4 function calculator is able to catch "penny errors" on things like mortgage interest calculations, etc, .  Add to that the silent automatic severe reduction in scale that can occur with the Decimal and Numeric datatypes  (which some folks aren't aware of) in SQL Server and you can end up with some real problems. 
    https://docs.microsoft.com/en-us/sql/t-sql/data-types/precision-scale-and-length-transact-sql?view=sql-server-2017

    Of course, there are also problems with FLOAT datatypes, as well, and so it would be interesting and helpful if he let us know how he solves these types of problems, especially when it comes to meeting the requirements of EU and GAAP regulations.

    Well I would think floats are out of the question for storing currency like Joe says. Obviously this assumes people have at least a bit of familiarity with numerical analysis topics when arranging for calculations but I wouldn't hold my breath on that.

    While I agree that there are issues with FLOATs vs currency, I would like to see the official spec where Joe says it's a "committing a crime under EU and GAAP regulations".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Jeff Moden - Wednesday, November 21, 2018 1:22 PM

    While I agree that there are issues with FLOATs vs currency, I would like to see the official spec where Joe says it's a "committing a crime under EU and GAAP regulations".

    I do support this.

    We should be aware of roundoff (and other problems) using dataformats see:
    Deadly Rond-Off Error
    (29 killed 'because' off Round-Off). **)
    Short description of Numerical Errors.
    ***)

    But, most people are guilty of using Spreadsheets for financial calculations, storage and desision making. Most Spreadsheats store numerical information in Floats. And for other formats there are problems as wel, for example 08:00 in the morning can not be represented exactly in a MS spreadsheet, it is stored in a float as 1/3 of the 'date' and as we know 1/3 can not be represented exactly in a binairy system. So most of us are 'guilty' :blush:

    And then money (or integer) types can not store financial information if fractions of financial costs have to be represented. For example a 1000 items cost a specific amount of money and the amount for a single item is also stored in some database. These things happen.

    So we should be aware of these types of problems. But there is no single solution.

    Ben

    **)
    The 29 killed in the 1991 Dhahran attack was attrributed to the roundoff error. But this was caused by operating the Patriot system outside its operating parameters and it is not clear if the patriot would have disabled the scud.
    ***)
    The roundoff error for the Ariane 5 failure. Before the accident there had been intensive studies which parameters should have which formats and risc calculations were done. But they where done for the Arianne 4 vehicle and some systems where not reassessed for the Arianne 5 and kept as they were. Regrettably this offending parameter was not reassessed and what was even worse for the Arianne 5 vehicle this was in an obsolete system, removing the system was considered a risc and therefore it remained. This resulted in the auto destruction of the Arianne 5, because the parameter went out of bounds.

Viewing 15 posts - 16 through 30 (of 31 total)

You must be logged in to reply to this topic. Login to reply