Database Development Made Easy?

  • Avi1 - Wednesday, August 30, 2017 12:49 PM

    I agree, i think this happens at most of workplace. I too hear that, we don't have time, we'll fix later

    I will second that! Sadly it applies in most areas from the original requirements being unclear to a lack of documentation within the code itself. Also developers/DBAs are often pressurised to do things that have not been fully thought through. Over a year ago I had to split a table purely for performance when what was needed was a RAM upgrade. This database is merged at irregular intervals - last time predated the change - with two others for data mining. The knock on effect on the "merging code" has probably cost more than the RAM upgrade would have!

    What staggered me at my last two companies was that four graduates that claimed to have done relational databases in various Computer Science degrees had minimal knowledge. Most did not understand normalisation and one even did not know what an index was. Two others that stood out as really knowing the subject came from the Universities of Hertfordshire and Manchester respectively. The variation in the quality of teaching/lecturing should not be so vast!

  • mjh 45389 - Thursday, August 31, 2017 6:16 AM

    What staggered me at my last two companies was that four graduates that claimed to have done relational databases in various Computer Science degrees had minimal knowledge.

    To be fair to the uni.s in question, they surely had normalisation and Codd's 12 rules on their curriculum.
    Whether the students were interested, were awake, were even *at* the lectures [1] is probably more relevant and worth asking said graduates.

    [1] I hated 9am lectures...

  • mjh 45389 - Thursday, August 31, 2017 6:16 AM

    What staggered me at my last two companies was that four graduates that claimed to have done relational databases in various Computer Science degrees had minimal knowledge. Most did not understand normalisation and one even did not know what an index was. Two others that stood out as really knowing the subject came from the Universities of Hertfordshire and Manchester respectively. The variation in the quality of teaching/lecturing should not be so vast!

    University/College is not and should not be job training if it's done right.

  • Sean Redmond - Thursday, August 31, 2017 7:25 AM

    mjh 45389 - Thursday, August 31, 2017 6:16 AM

    What staggered me at my last two companies was that four graduates that claimed to have done relational databases in various Computer Science degrees had minimal knowledge.

    To be fair to the uni.s in question, they surely had normalisation and Codd's 12 rules on their curriculum.
    Whether the students were interested, were awake, were even *at* the lectures [1] is probably more relevant and worth asking said graduates.

    [1] I hated 9am lectures...

    That's why I did not name them although two attended the same uni! If I had been interviewing they would probably not have been offered a job but in every case they were interviewed by managers with limited technical knowledge!

  • ZZartin - Wednesday, August 30, 2017 12:11 PM

    jasona.work - Wednesday, August 30, 2017 6:31 AM

    I'd suspect some of the things software developers see as "problems" when developing a database (such as the example in the editorial about dropping a column, then wondering why the data was gone when it was re-created,) is because software developers expect the database to behave just like the application does.

    Whooo!!! let's make everything an EAV then we'll never have to touch the database again and can add fields whenever we want!!!!

    Heh... we'll only need 1 or two indexes, never have to worry about some queries being slower than others, and will never have to worry about that pesky DRI stuff.  Yeah... let's do that! 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • ZZartin - Thursday, August 31, 2017 7:36 AM

    mjh 45389 - Thursday, August 31, 2017 6:16 AM

    What staggered me at my last two companies was that four graduates that claimed to have done relational databases in various Computer Science degrees had minimal knowledge. Most did not understand normalisation and one even did not know what an index was. Two others that stood out as really knowing the subject came from the Universities of Hertfordshire and Manchester respectively. The variation in the quality of teaching/lecturing should not be so vast!

    University/College is not and should not be job training if it's done right.

    I was questioning their understanding of relational databases not their ability to do a job.

  • I believe we should always have a DB server just for developers to do some fun coding and get understanding of different functionalities. In our development environment we have few databases specifically for that. This helps developers to grow as well and have more fun with SQL.

  • Steve Jones - SSC Editor - Wednesday, August 30, 2017 8:51 AM

    jasona.work - Wednesday, August 30, 2017 6:31 AM

    I'd suspect some of the things software developers see as "problems" when developing a database (such as the example in the editorial about dropping a column, then wondering why the data was gone when it was re-created,) is because software developers expect the database to behave just like the application does.

    This is a never ending source of frustration for many developers. The fundamentally don't get this.

    So very true.  I've found that projects that are very data dependent are more likely to succeed when a data person (such as a data architect or design oriented DBA) and a UI focused staff run the project.  When coders run the project, you tend to get a VCR no one can program.

    The more you are prepared, the less you need it.

  • Rants about sucky developers aside, SSDT actually is a fairly decent development tool. I use it pretty much daily. It integrates with source code control quite well, and I can "build" my database to catch various issues. Overall, it's quite helpful.

    Where it falls down in my opinion is database deployment, specifically incremental deployment after you've got a live database with data. I don't use SSDT for that, Finding diffs and generating scripts and source code control, sure, but if it comes to potentially destructive changes involving existing data...no way.

    I'm still trying to wrap my head around "devops" as it pertains to database deployment. I know RedGate has a lot of tools for that, but how do you deal with those potentially dangerous/destructive changes without manually coding? How, for example, can I automate the changing of a column datatype on a table with 200 million records? Script out an ALTER TABLE, run it and all done? No, I don't think so. A live database is stateful, and development against a live database has to maintain state. That can often be difficult.

    So, what tools are there that help in that regard? That's what database developers really need. The only tools I've found that's useful so far are experience and care.

  • dmbaker - Friday, September 1, 2017 7:47 AM

    I'm still trying to wrap my head around "devops" as it pertains to database deployment. I know RedGate has a lot of tools for that, but how do you deal with those potentially dangerous/destructive changes without manually coding? How, for example, can I automate the changing of a column datatype on a table with 200 million records? Script out an ALTER TABLE, run it and all done? No, I don't think so. A live database is stateful, and development against a live database has to maintain state. That can often be difficult.

    So, what tools are there that help in that regard? That's what database developers really need. The only tools I've found that's useful so far are experience and care.

    There's no magic to avoid coding, or avoid potential problem issues. The tools we have at Redgate speed up development by handling some tedious tasks or generating some code, but they don't avoid potential mistakes or problems. The rules for making changes in SQL Server don't change with any tools.

    DevOps is about trying to smooth the deployment process and get feedback to developers quicker. It's about sharing your knowledge and ensuring that information moves left and right along the development pipeline.

    It DOES NOT  mean your code gets better or your developers are smarter. You can't avoid manual coding these changes, and nor should you. The potential issues are complex and vary by environment. DevOps will smooth the way you deploy that code you write. You decide how to handle that change in your environment. (change type, or add new col/move data/drop old col later). Both SSDT and RG tools give you ways to ensure this is reliably deployed to downstream environments. SSDT would be pre/post, RG has scripting embedded.

    NO TOOL  makes this easy  and does it well. If they claim that, they're lying, and untrustworthy.
    Rules for  database changes and their impact do not change from Waterfall to Scrum to Lean, or in a DevOps world.

  • Steve Jones - SSC Editor - Friday, September 1, 2017 11:22 AM

    NO TOOL  makes this easy  and does it well. If they claim that, they're lying, and untrustworthy.
    Rules for  database changes and their impact do not change from Waterfall to Scrum to Lean, or in a DevOps world.

    Yeah, that was sort-of my point. Maintaining the statefulness of a database while you change it can be difficult. Automating deployment (a key component of DevOps) of non-database code is achievable and generally pretty straightforward. Automating database deployment, specifically with regard to having to maintain the state of the data, not so much -- there's still a lot of manual, one-off coding that one must do. But then, that would be expected of any system where you're having to maintain state, There's just no substitute for being careful, and having the experience to know what to do.

  • ZZartin - Thursday, August 31, 2017 7:36 AM

    mjh 45389 - Thursday, August 31, 2017 6:16 AM

    What staggered me at my last two companies was that four graduates that claimed to have done relational databases in various Computer Science degrees had minimal knowledge. Most did not understand normalisation and one even did not know what an index was. Two others that stood out as really knowing the subject came from the Universities of Hertfordshire and Manchester respectively. The variation in the quality of teaching/lecturing should not be so vast!

    University/College is not and should not be job training if it's done right.

    If someone claims to have learnt about relational databases as part of a Computer Science degree but doesn't know about normalisation or about indexes one can conclude one of two things: either they are lying (they didn't do the relational database part of the course) or the University's coverage of relational database theory is false - I can't see any other possibility. 

    And teaching about normalisation and teaching about indexes isn't training for a job, it's teaching a part of elementary computer science.  

    And sadly, there are plenty of Universities which are doing neither job training nor CS education - places where the students are taught to write C++ without any attempt to teach them about the various classes of programming language, about type theory, about error management (detection, containment, recovery, reporting, repair), about design, about modularity, about system/program structure/architecture, about orders of complexity, about well-known widely -used algorithms or about unit testing or system testing or acceptance testing or indeed any sort of testing; then this is disguised as computer science by including single short lectures on what OO progrmming is, what a relational database is, hardware architecture, and maybe half a dozen other randomly chosen topics over the three or four years of study, and this inevitably results in people with a CS degree with first class honours (summa con laude on the other side of the pond) who think they know it all but actually are capable only of churning out reams of bad C++ that's so badly structured that debugging it is close to impossible and once debugged its performance is terrible because the structure is a mess and the wrong algorithms are used everywhere; it may be possible to turn such a person into a competent developer or dba if they are willing to accept that they are not competent to do those jobs and must learn some real CS and get some real job training, but it will be hard work - in my experience it was much easier to train someone without  degree at all or with a degree in something not related to computing, e.g. in music or in foreign languges or in classics and philosophy.

    My attitude to someone with a degree in CS that they claim "includes relational database theory" who can't tell me what 1NF means is much like Jeff M's opinon of people with "10 years experience of SQL server" who don't know how to get the current date and time.

    Tom

  • Back a few years ago, we had very OO developers, who'd never seen a stored procedure that did any kind of multiple statements.  They viewed the DB as a big can of data. All work was done in the code.

  • miapjp - Friday, September 1, 2017 4:07 PM

    Back a few years ago, we had very OO developers, who'd never seen a stored procedure that did any kind of multiple statements.  They viewed the DB as a big can of data. All work was done in the code.

    How'd that work out for ya?  EBay does the same thing.  It might be why they actually do need 600 servers. 😉

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Heh... I have a saying about designing databases... "If you think it was easy, you did something wrong".  There are a lot of "standard" and "best practice" things that must be done for 90 or 95% of the tables but that other 5 or 10% can really make it or break it.

    The other issue that I have is that people frequently design "off the cuff" even when following some "standard" or "best practice" method.  They frequently don't actually follow a written standard and, when they do, it's frequently inadequate because it frequently doesn't contain such things as specific naming standards for the nuances of the data.

    For example, we have a database developed by a 3rd party vendor.  It's pretty obvious that either the single person designing it was a complete and totally incompetent idiot or it was designed by a half dozen "developers'.  Either of those two can be the only reason why an identifier for a batch number was different in every bloody table in which it appeared and frequently with different data types.

    Batch - NUMERIC(18,0)
    BatchID - BIGINT
    BatchNum - INT
    BatchNo - INT
    BatchNumber - NUMERIC(9,0)
    Batch_ID - I forget the datatype
    Batch_Num - I forget the datatype

    I forget the rest of the manifestations but you get the idea.  Designing a database is just a part of it.  You have to design the data, publish a standard, and do peer reviews to make sure that everyone follows the standard.

    The real fun starts when you find out things like all changes to certain tables must appear in an audit table.  I can just imagine the conversation that my predecessors had... "We'll conditionally audit every column in an EAV fashion and we'll use this handy CLR Trigger we found on the internet so that we don't actually have to write any trigger code.  We'll just copy and paste".  They didn't even think about the fact that the 2 logical trigger tables are out of scope for the CLR and must be fully materialized before a comparison can be done.  That's a part of the reason why it took 4 minutes to update just 10,000 rows on just 4 columns on a 138 column ("design was {easy}" :blink:) table.  To add insult to injury for singleton inserts, the "developers" would first poke in an "empty record" so that they could get the @@IDENTITY (remember, folks, there's a trigger on the table and that doesn't work right here) and then update the row with the data.  Once they figured out what they were doing wrong and used SCOPE_IDENTITY() instead, they still didn't figure out that the trigger was unnecessarily recording data because we weren't supposed to be auditing inserts (and, contrary to most opinion, it's never necessary to do so).

    Yeah... good "easy" design. :sick:  No software tool in the world will prevent such poor "design".

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

Viewing 15 posts - 16 through 30 (of 40 total)

You must be logged in to reply to this topic. Login to reply