Documenting Database Code: Structured Headers

  • Comments posted to this topic are about the item Documenting Database Code: Structured Headers

    Best wishes,
    Phil Factor

  • It's worth noting that pre-2008 the dependencies catalogue might not list all the dependencies.

    Dependencies were tracked by object id, so if you create the objects out of order, or drop and recreate a dependency then the system looses track of that dependency.

    http://msdn.microsoft.com/en-us/library/ms345449(SQL.90).aspx

    This has since changed in sql server 2008

    http://msdn.microsoft.com/en-us/library/ms345449.aspx

    * edited - fixed urls

  • I've always struggled with this. Red Gate's Dependency Tracker (grown out of a conversation that Andy and I had with Red Gate) really helps if you depend on these things.

    I think one of the reasons I can be a control freak as a DBA is to ensure that I know, or can find out, what all objects are for.

  • I think your idea is not only excellent, its vital - but your approach is a bit sterile and well, in my view, only partially useful.

    Structured headers would be fine and probably a good idea, but that does not really translate over time to "useful commenting". I have been at this game for three decades and I have learned how important code commenting is.

    The reason you comment your code is this: Someday, somewhere, you or someone else will be going back to your code and trying to figure out not just what you were doing, but what you were thinking when you did it. Useful comments can save a company many hours of re-analysis and since code of any kind is a company asset, it needs to be commented just as companies store records away for possible later retrieval with notes about the records.

    Many people have said its silly to comment things like stored procedures - a good coder can read them and know whats going on. Baloney! Suppose a new hire is trying to get their feet wet, comments speed that process. Suppose an analyst is tracking down an error, or some cascaded effect, comments make that process fruitful. Since comments don't cost anything or inhibit the code itself - the more, the better.

    In our organization, when we do code reviews, if a developer has not commented their code, its rejected and returned. I do not allow any uncommented code in our repository. As said, this is a company asset and I want to make sure that when that inevitable day comes, possibly long after I am gone, "old code" which may in fact be vital code, remains useful code because it is well commented for any new viewer of it to see, and understand.

    Commenting and documenting your code is as important an ability as coding itself, because the code you are commenting/documenting is a company asset, and who wants an asset that no one knows what its for, how it works, or why it was created.

    There's no such thing as dumb questions, only poorly thought-out answers...
  • Blandry, I miss that. Too many places I've worked in the last decade didn't require code reviews. It seemed once Y2K passed without a big issue, lots of companies just didn't worry about documentation (again).

  • I like the idea of an XML structure for commenting. Never thought of that myself, and haven't run into it as a practice.

    Where documentation standards have been set for me, as opposed to me setting them myself, they've been similar, but not standardized. It's been a comment block at the beginning, with the name of the author, the creation date, and something short and sweet about what it's for.

    Personally, I like comments in the body of the proc/function, that say what each section does, at the very least. Helps in debugging/enhancing later on. Not needed, of course, for something that's just a parameterized select or other single-statement CRUD procs.

    But that's where it breaks down, isn't it? "Oh, this is so simple it can't possibly need to be commented and have full, formal documentation!" Sure, makes sense. But what's the break-point where it needs documentation? Or is it a gradient from, "needs a few comments" to "needs full-on documentation", depending on some complexity judgement of the code?

    - Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
    Property of The Thread

    "Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon

  • i'm terrible with documentation. i put descriptions into procedures in a comment block up top and have additional brief descriptions for the business functionality of discrret sections within them. i don't think i've ever commented a table (never even occured to me) and have certainly never written up anything like a comprehensive document for overall database structure. I'd like to get in the habit, since on more than one occassion i've been asked to review code i or a co-worker wrote 2 years before to explain why it's suddenly "not working" (bollocks, of course).

    Unfortunately, client-side pressures and my boss being even worse about documentation/commenting than i am conspire to prevent such sensible actions. : /

  • We have gotten into the habit of formatted documentation in the header with standard stuff at the top describing who, what, when, etc. Then a revision section with numbered revisions, again with who, what, when... Then in the code we will reference additions or changes by the revision number with any additional comments required. You can get a sense of how the code evolved. This can, of course, get messy if there a lot of revisions.

  • I think Phil has got a good area here to investigate and put effort into.

    I add a comment header to all programmability objects like stored procedures and user defined functions. Although I keep a few good pieces of information there like who, when, a change log, basic purpose and so forth I have not devoted much thought to this area. Perhaps somebody has done some serious thinking and, through this topic, we all can walk away with something.

    The header is probably best if brief. The key pieces of information to best go there are probably fairly straightforward. One piece worth some consideration is whether a change log should go in the header--you know, like the VSS change log that can be set to automatically append to the header as a comment.

    Now Phil focuses on structured headers. The comments here have already drifted to code comments which is also an interesting and obviously related topic. Here I'd like to see the same XML comment subsystem used as is in use with C#: http://msdn.microsoft.com/en-us/magazine/cc302121.aspx. Why reinvent the wheel? I imagine that parsers could be extended fairly easily to account for the additional double dash that would go in front of the tripple foward slash to extract the comments such as "--///."

    I haven't seen much on this yet, but I suppose the topic of commenting T-SQL code that's rendered by mGrammar will come up--and through that, some standards will need to come out of it if they haven't already. Can anyone comment on this? Supposedly in the future, a growing percentage of T-SQL code will be produced through mGrammar and that whole Oslo-esque DSL (domain-specific language) family of solutions--and perhaps out of necessity these frameworks will drive the commenting solutions.

    Bill Nicolich: www.SQLFave.com.
    Daily tweet of what's new and interesting: AppendNow

  • Great idea, Phil. By the way VB .Net has the XML comments too. They show up in the object browser. It forms the minimum set of documentation. But the barest minimum. We had talked about a "librarian" feature that would scanning all our source code and pull this data into some type of searchable repository. We have found that each of us has written the same routine, each in our own way, without knowing that anyone else had already done it.

    Andy you are not alone. The thought that descriptive routine names and variable names are enough is pervasive and dangerous. All developers are lazy. (Especially me.) Working for a boss who is a developer is bad. You will share the same bad habits.

    ATBCharles Kincaid

  • I'm afraid I have to disagree. As a developer, I've seen too many source code files filled with revision history cruft to the point that a twenty line procedure is dwarfed by a hundred lines of comments regarding who changed what and why.

    Not only are these revision history comments distracting, they also produce a huge number of 'false positives' when you're comparing two versions of a script (e.g. with diff or windiff) to see what really changed.

    By all means put a few comments about the parameters to your stored procedure or the relevance of your table fields, but leave revision history where it belongs - in the checkin comments field of a proper revision control system.

    Revision control not only provides a place to store comments, it gives you direct access to previous versions of a script and more importantly gives you the ability to branch code, letting you craft immediate fixes then bring those fixes back into the main branch.

    It's time DBAs learned from the developers and started using revision control as a matter of course.

    Robb

  • I'd love to see some community standards, if only to see if I can't improve on my own methods.

    When I first started coding stored procedures, I was terrible about not sticking to a standard format or adding commenting. Three or four months later, when I had to go back to revise my earlier code, a new (self enforced) standard was created.

    I've changed and improved it a bit over time, of course.

    Because I am not doing any coding that will leave my company, I can leave as much commenting as I want...so I leave a ton.

    I have templates created to use as my jump off point that include a large /*...*/ bracket for the purpose of the stored procedure, tables it uses, other procedures it depends on, etc.

    I then have the code broken down in to semi logical sections with their own mini headers including the reason for each section, comments about known problems, etc. For particularly dense or non intuitive code, I'll add even more commenting along the way.

    I do have a Change Log at the end, but rather than keeping old code commented in it, I just note what changes were made in general in what section. I make backups of any stored procedure I am about to make major changes to, and leave it in an archive folder with a date attached. If I need to know specifically what the old code was, I can refer back there.

    When I have to go back now and review or update a set of stored procedures from a year back, I can quickly narrow down to the sections I need...and the poor guy I am slowly teaching is able to open up my stored procedures and figure out how everything works as well.

    Yes, it does sometimes mean a page filled with more green than black and blue, but it means vastly decreased maintenance or repair times, and leaves a (hopefully) easy to follow set of code for whoever ends up taking my place eventually.

    When I have (often mythical) free time, I follow that up with regular documentation that covers the procedure from a non coding point of view ("This set of stored procedures and tables are designed to manage inventory for reporting, EDI and web site use...")

    I also have some jobs running nightly that attempt to go through and parse out my custom tables and stored procedures to let me know what else is going to be affected when I make a change. Remembering to actually reference it, on the other hand, is an entirely different matter.

  • Information that you'd normally want to know, such as dependencies (which other database objects depend on a given object and which it, in turn, depends on) come for free.

    At best, this is only true for objects in the database. I think it is important to have a comment header and include dependencies (when not obvious) in terms of what front-end objects use the code. Some procs/functions/etc may be general and have wide use. In those cases, I'm not saying it is helpful to have an exhaustive list of every front-end bit that calls the database code. But in some cases, it is extremely helpful to know that if you change Proc X, then both screens Y *and* Z need to be updated too. This kind of documentation is the kind of documentation that would go in a header and has saved me on more than one occasion.

    Some posters here seem to think that documenting is an either-or proposition: either header or body. No, no. Do both. The type of comment that is relevant for the different locations is different. It is not repeating.

    Other posters here state that comments should be short. No, no, no, no, oh dear deity, no. Comments should be complete, fully meeting documenting goals (which I guess might legitimately differ from agency to agency). Sometimes complete documentation means a short phrase. Sometimes complete documentation means writing pages of comments that exceed lines of code. It's better to be wordy and complete than to skimp to fit some kind of arbitrary definition of "too many words for our standard". "Our poor DBAs can't read that much. It's like reading the manual or something. Are you kidding?"

    In the past couple of years, I've started documenting tables and fields in addition to code procedures. This has been really helpful both to myself, and I expect to my future replacement. To do this right, you have to understand how the data will be used in the front-end. I've seen other tables and fields commented in a way that was a complete waste of time as the words told nothing helpful. For that matter, all of these comments will be a better quality if the DBA truly understands how the data/results will be used and WHY.

    - JJ

    Documenting Manifesto: Who and when are intellectually interesting, but generally the least of your concerns. What is important, but easy and not sufficient. Where code is used and why brings you blissful completion.

  • JJ B,

    The problem with multi-page long comment headers (in general), and your suggestion of headers including "where" and "why" (in particular) is that they are more likely to develop inaccuracies over time than concise comments. The only thing worse than reading code with no comments - is reading code with comments that no longer reflect what the code actually does. If all I have to put in the header is my name, date of revision, and what the revision was, I will do it correctly every time. Even if I fail to, a code review will easily catch the omission. Short body-level comments that explain what's happening can quickly be updated as the code is changed. Long header comments require me read the entire thing and see if the tiny change I made in one small section might require me to update it. If some sentence in the middle of 15 paragraphs of comments is no longer valid because of the change I made, a code review will likely miss it too.

    The main issue I've seen with trying to include "where" and "why" is: Developer A creates a simple sp and some code that uses it, he then lists that code as a consumer. Some time later developer B writes another piece of code that uses that sp without requiring any changes to it. He never checked it out of source control so there's no review when checking it back in. Now when someone needs to make a change to the sp, the comment will have them believe it only affects the code written by dev A. Changes in the "why" (which may be several layers of abstaction removed from the db) are even less likely to be accurately tracked.

    Even if you're team is perfect, never forgets to comment a dependency, brushes and flosses their teeth after every meal, etc: it doesn't matter. If I'm the new guy coming on board, I'm not going to trust the header. I'm going to search the code in source control for the name of that sp and verify any dependencies myself. I think most people would. It doesn't take very long, and if I'm going to do that anyway, what's the point of including the extra info in the header?

    I agree with you about the need for commenting the tables and fields, but again I would leave the front end stuff out of it. I just use it to give a brief note about what should be stored there and comment on anything that might not be obvious.

  • The problem with multi-page long comment headers (in general), and your suggestion of headers including "where" and "why" (in particular) is that they are more likely to develop inaccuracies over time than concise comments. The only thing worse than reading code with no comments - is reading code with comments that no longer reflect what the code actually does.

    I've seen these arguments in every discussion about documentation, and I used to think they had some validity. Now, years later, I've found both of these statements to be 100% false. If your team can't seem to comment accurately on where and why, then you have a major (discipline? writing? understanding?) problem with your team. Fix it. It is not negotiable. Proper documentation is every bit as important to producing professional work as it is to work on getting rid of RBAR if your team just can't seem to get on board. Good documentation (like words that answer the question "why") can tell you things that the code itself does not. These are things that determine the accuracy of the code. Just because a query returns an answer does not mean that the query returns the right answer.

    Of course, it occasionally (rarely with good people) happens that documentation gets off. In those cases, I've never had an experience where reading wrong documentation was worse than having no documentation. The documentation helps a person go through code faster when doing maintenance. Of course, reading the documentation does not substitute for checking against the code. So, you read the documentation, go much faster through the code than you would without documentation, and you see a contradiction. Good. A disconnect between documentation and code can be helpful in a couple of ways, including helping you zero in on the problem at hand.

    Unlike say, naming conventions, I do not believe appropriate documentation is a matter of style or an expression of different work environments. I believe proper documentation is like the issue of using cursors. There's precious little to justify wide cursor use in SQL Server. There's precious little to justify lack of complete documentation.

Viewing 15 posts - 1 through 15 (of 33 total)

You must be logged in to reply to this topic. Login to reply