• Derek Dongray (2/13/2009)


    Regarding your problem of stored procedure comparison, Windiff is still available (just Google "windiff download", one of the places is Microsoft) so I'd have just used that (and regularly do!).

    ...

    ...

    Sometimes the solution is just to use a simpler tool like Windiff along with the mk.1 Eyeball as used by an experience SQL writer. 😀

    I agree with your admonition against reinventing the wheel, as Windiff does a great job. But my original intention was to provide quantitative matching results, not necessarily the visual representation of the matching. I imagined this as being part of a larger tool to compare databases (for example, between development, testing, and production versions of a database, or instances of independently maintained databases that began as identical copies), which would involve automatic comparisons of many objects.

    The more subtle differences you mentioned above make the problem even more interesting, and in my comparison of the two spds of 1000+ lines each, I found many instances of such. You're right, the complexity goes upwards from there. This sort of "fuzzy matching" is exactly what's involved in name/address deduplication in direct marketing. For example, how do we match the two addresses:

    Jon G. Smith

    100 E. Main St

    Baltimore, MD

    J. Smyth

    100 East Main Street

    Balt. Maryland

    These sorts of problems have been tackled (in fact, I've written parsing, scoring, and matching algorithms myself to handle it). I think the SQL parser would be of similar complexity. Using the current method of capturing lines of code from the spds, we would have to parse those lines into "words" based on white space, etc. Taking the example of the SELECT you made, we would have to standardize the JOIN sequence and data nicknames, compare the metadata of the results, etc. I'm not sure I would want to go in that direction, but it's possible.