
Implementing Fuzzy Sets in SQL Server, Part 5: The Mystery of the Missing Left Join

By Steve Bolton

…………Information on set operations like complements, intersections and unions is plentiful in the literature on fuzzy sets, which made the last three articles in this series of amateur self-tutorials easier to write in a certain sense. These topics are far more complex than with ordinary “crisp”… Read more

0 comments, 307 reads

Posted in Multidimensional Mayhem on 7 November 2016

Implementing Fuzzy Sets in SQL Server, Part 4: From Fuzzy Unions to Fuzzy Logic

By Steve Bolton

…………Fuzzy set relations carry an added layer of complexity not seen in ordinary “crisp” sets, due to the need to derive new grades for membership in the resultset from the scores in the original sets. As I explained two weeks ago in this series of amateur… Read more

0 comments, 168 reads

Posted in Multidimensional Mayhem on 10 October 2016

Implementing Fuzzy Sets in SQL Server, Part 3: Using Fuzzy Intersections as AND Statements

By Steve Bolton

…………Whenever we assign set membership grades to records on a continuous scale, we open up a whole new world of possibilities for measuring uncertainty and modeling different types of imprecision. Two articles ago in this series of amateur self-tutorials, we saw how a whole taxonomy of… Read more

0 comments, 175 reads

Posted in Multidimensional Mayhem on 6 September 2016

Implementing Fuzzy Sets in SQL Server, Part 2: Measuring Imprecision with Fuzzy Complements

By Steve Bolton

…………Taking the dive into fuzzy sets immediately elicits the obvious question: just how fuzzy is the data we’re operating on? As discussed in the first two installments of this amateur series of self-tutorials, the raison d’etre of fuzzy set theory is to model imprecise data that… Read more

0 comments, 416 reads

Posted in Multidimensional Mayhem on 2 August 2016

Implementing Fuzzy Sets in SQL Server, Part 1: Membership Functions and the Fuzzy Taxonomy

By Steve Bolton

…………In the first installment of this amateur self-tutorial series on applying fuzzy set theory to SQL Server databases, I discussed how neatly it dovetails with Behavior-Driven Development (BDD) principles and user stories. This is another compelling reason to take notice of fuzzy sets, beyond the advantages… Read more

0 comments, 737 reads

Posted in Multidimensional Mayhem on 6 July 2016

Implementing Fuzzy Sets in SQL Server, Part 0: The Buzz About Fuzz

By Steve Bolton

…………I originally planned to post a long-delayed series titled Information Measurement with SQL Server next, in which I’d like to cover scores of different metrics for quantifying the data our databases hold –  such as how random, chaotic or ordered it might be, or how much… Read more

0 comments, 1,971 reads

Posted in Multidimensional Mayhem on 13 June 2016

Goodness-of-Fit Testing with SQL Server Part 7.4: The Cramér–von Mises Criterion

By Steve Bolton

…………This last installment of this series of amateur tutorials features a goodness-of-fit metric that is closely related to the Anderson-Darling Test discussed in the last post, with one important caveat: I couldn’t find any published examples to verify my code against. Given that the code… Read more

0 comments, 221 reads

Posted in Multidimensional Mayhem on 31 May 2016

Goodness-of-Fit Testing with SQL Server Part 7.3: The Anderson-Darling Test

By Steve Bolton

…………As mentioned in previous installments of this series of amateur self-tutorials, goodness-of-fit tests can be differentiated in many ways, including by the data and content types of the inputs and the mathematical properties, data types and cardinality of the outputs, not to mention the performance impact… Read more

3 comments, 776 reads

Posted in Multidimensional Mayhem on 2 May 2016

Goodness-of-Fit Testing with SQL Server Part 7.2: The Lilliefors Test

By Steve Bolton

…………Since I’m teaching myself as I go in this series of self-tutorials, I often have only a vague idea of the challenges that will arise when trying to implement the next goodness-of-fit test with SQL Server. In retrospect, had I known that the Lilliefors Test was… Read more

4 comments, 1,221 reads

Posted in Multidimensional Mayhem on 15 April 2016

Goodness-of-Fit Testing with SQL Server Part 7.1: The Kolmogorov-Smirnov and Kuiper’s Tests

By Steve Bolton

…………“The names statisticians use for non-parametric analyses are misnomers too, in my opinion: Kruskal-Wallis tests and Kolmogorov-Smirnov statistics, for example. Good grief! These analyses are simple applications of parametric modeling that belie their intimidating exotic names.”[i]
                Apparently even experts like Will G. Hopkins, the author of… Read more

1 comment, 1,501 reads

Posted in Multidimensional Mayhem on 24 March 2016

Goodness-of-Fit Testing with SQL Server Part 6.2: The Ryan-Joiner Test

By Steve Bolton

…………In the last installment of this amateur series of self-tutorials, we saw how the Shapiro-Wilk Test will probably prove less useful to SQL Server users, despite the fact that it is one of the most popular goodness-of-fit tests among statisticians and researchers. Its impressive statistical power… Read more

3 comments, 350 reads

Posted in Multidimensional Mayhem on 12 March 2016

Goodness-of-Fit Testing with SQL Server Part 6.1: The Shapiro-Wilk Test

By Steve Bolton

…………Just as a good garage mechanic will fill his or her Craftsman with tools designed to fix specific problems, it is obviously wise for data miners to stockpile a wide range of algorithms, statistical tools, software packages and the like to deal with a wide variety… Read more

0 comments, 1,315 reads

Posted in Multidimensional Mayhem on 29 February 2016

Goodness-of-Fit Testing with SQL Server Part 5: The Chi-Squared Test

By Steve Bolton

…………As I’ve cautioned before, I’m writing this series of amateur self-tutorials in order to learn how to use SQL Server to perform goodness-of-fit testing on probability distributions and regression lines, not because I already know the topic well. Along the way, one of the things I’ve… Read more

0 comments, 501 reads

Posted in Multidimensional Mayhem on 13 February 2016

Goodness-of-Fit Testing with SQL Server Part 4.2: The Hosmer–Lemeshow Test with Logistic Regression

By Steve Bolton

…………The last installment of this amateur series of self-tutorials was the beginning of a short detour into using SQL Server to perform goodness-of-fit testing on regression lines, rather than on probability distributions. These are actually quite simple concepts; any college freshman ought to be able to… Read more

0 comments, 1,074 reads

Posted in Multidimensional Mayhem on 26 January 2016

Goodness-of-Fit Testing with SQL Server Part 4.1: R2, RMSE and Regression-Related Routines

By Steve Bolton

…………Throughout most of this series of amateur self-tutorials, the main topic has been and will continue to be using SQL Server to perform goodness-of-fit testing on probability distributions. Don’t let the long syllables (or the alliteration practice in the title) fool you, because the underlying concept… Read more

0 comments, 1,272 reads

Posted in Multidimensional Mayhem on 13 January 2016

Goodness-of-Fit Testing with SQL Server, part 3.2: D’Agostino’s K-Squared Test

By Steve Bolton

…………In the last edition of this amateur series of self-tutorials on goodness-of-fit testing with SQL Server, we discussed the Jarque-Bera Test, a measure that unfortunately doesn’t scale well on datasets of the size that DBAs are accustomed to using. The problem is not with the usefulness… Read more

2 comments, 1,741 reads

Posted in Multidimensional Mayhem on 21 December 2015

Goodness-of-Fit Testing with SQL Server, part 3.1: Skewness, Kurtosis and the Jarque-Bera Test

By Steve Bolton

…………In the last installment of this series of amateur self-tutorials on using SQL Server to identify probability distributions, we saw how devices like probability plots can provide simple visual confirmation of a dataset’s shape. I considered doing a quick detour into Q-Q plots, but decided against… Read more

0 comments, 793 reads

Posted in Multidimensional Mayhem on 2 December 2015

Goodness-of-Fit Testing with SQL Server, part 2.1: Implementing Probability Plots in Reporting Services

By Steve Bolton

…………In the first installment of this series of amateur self-tutorials, I explained how to implement the most basic goodness-of-fit tests in SQL Server. All of those produced simple numeric results that are trivial to calculate, but in terms of interpretability, you really can’t beat the straightforwardness… Read more

0 comments, 798 reads

Posted in Multidimensional Mayhem on 3 November 2015

Goodness-of-Fit Testing with SQL Server, part 1: The Simplest Methods

By Steve Bolton

…………In the last series of mistutorials I published in this amateur SQL Server blog, the outlier detection methods I explained were often of limited usefulness because of a chicken-and-egg problem: some of the tests could tell us that certain data points did not fit a particular… Read more

1 comment, 1,918 reads

Posted in Multidimensional Mayhem on 17 October 2015

Outlier Detection with SQL Server, part 8: A T-SQL Hack for Mahalanobis Distance

By Steve Bolton

…………Longer code and substantial performance limitations were the prices we paid in return for greater sophistication with Cook’s Distance, the topic of the last article in this series of amateur self-tutorials on identifying outliers with SQL Server. The same tradeoff was even more conspicuous in this… Read more

0 comments, 1,975 reads

Posted in Multidimensional Mayhem on 12 September 2015
