Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Outlier Detection with SQL Server, part 6.3: Visual Outlier Detection with Reporting Services Plots and SSDM Clustering

By Steve Bolton

…………When the goal is to illustrate how just how outlying an outlier may be, the efficiency with which scatter plots represent distances really can’t be beaten. It doesn’t take any training in mathematics to look at one and notice that a few data points are further… Read more

0 comments, 97 reads

Posted in Multidimensional Mayhem on 17 May 2015

Outlier Detection with SQL Server, part 6.2: Finding Outliers Visually with Reporting Services Box Plots

 By Steve Bolton

…………Throughout this series of amateur mistutorials in using SQL Server to identify outliers, we have repeatedly seen that the existing tried-and-true methods of detection long used for such purposes as hypothesis testing are actually poorly suited for finding aberrant values in large databases. The same problem… Read more

0 comments, 137 reads

Posted in Multidimensional Mayhem on 29 April 2015

Outlier Detection with SQL Server, part 6.1: Visual Outlier Detection with Reporting Services

 By Steve Bolton

…………Most of the previous articles in this self-tutorials on using SQL Server to find outliers required us to implement statistical formulas, in order to derive measures that required some explanation before they could be interpreted correctly. In this segment of the series, we’ll be discussing a… Read more

0 comments, 157 reads

Posted in Multidimensional Mayhem on 21 April 2015

Outlier Detection with SQL Server, part 5: Interquartile Range

By Steve Bolton

…………The last seven articles in this series of mistutorials on identifying outlying values in SQL Server database were clunkers, in the sense that the methods had many properties in common that made them inapplicable to the scenarios DBAs typically need them for. Chauvenet’s Criterion, Peirce’s Criterion,… Read more

0 comments, 178 reads

Posted in Multidimensional Mayhem on 30 March 2015

Outlier Detection with SQL Server, part 4: Peirce’s Criterion

By Steve Bolton

…………In the last couple of installments of this amateur series of self-tutorials on outlier identification with SQL Server, we dealt with detection methods that required recursive recomputation of the underlying aggregates. This week’s topic, Peirce’s Criterion, also flags outliers in an iterative manner, but doesn’t require… Read more

2 comments, 5,373 reads

Posted in Multidimensional Mayhem on 20 March 2015

Outlier Detection with SQL Server, part 3.6: Chauvenet’s Criterion

By Steve Bolton

…………This is the last of six articles I’ve segregated in this middle of my mistutorial series on identifying outlying values with SQL Server, because they turned out to be difficult to apply to the typical use cases DBAs encounter. After this detour we’ll get back on… Read more

0 comments, 245 reads

Posted in Multidimensional Mayhem on 28 February 2015

Outlier Detection with SQL Server, part 3.5: The Modified Thompson Tau Test

By Steve Bolton

…………Based on what little experience I’ve gained from writing this series on finding outliers in SQL Server databases, I expected the Modified Thompson Tau test to be a clunker. It marries the math underpinning one of the most ubiquitous means of outlier detection, Z-Scores, with the… Read more

0 comments, 6,147 reads

Posted in Multidimensional Mayhem on 14 February 2015

Outlier Detection with SQL Server, part 3.4: Dixon’s Q-Test

By Steve Bolton

…………In the last three installments of this amateur series of mistutorials on finding outliers using SQL Server, we delved into a subset of standard detection methods taken from the realm of statistical hypothesis testing. These are generally more difficult to apply to tables of thousands of… Read more

0 comments, 413 reads

Posted in Multidimensional Mayhem on 30 January 2015

Outlier Detection with SQL Server, part 3.3: The Limitations of the Tietjen-Moore Test

By Steve Bolton

…………The Tietjen-Moore test may have the coolest-soundest name of any of the outlier detection methods I’ll be surveying haphazardly in this amateur series of mistutorials, yet it suffers from some debilitating limitations that may render it among the least useful for SQL Server DBAs. It is… Read more

0 comments, 211 reads

Posted in Multidimensional Mayhem on 20 January 2015

Outlier Detection with SQL Server, part 3.2: GESD

By Steve Bolton

…………In the last edition of this amateur series of self-tutorials on finding outlying values in SQL Server columns, I mentioned that Grubbs’ Test has a number of limitations that sharply constrain its usefulness to DBAs. The Generalized Extreme Studentized Deviate Test (GESD) suffers from some of… Read more

0 comments, 5,318 reads

Posted in Multidimensional Mayhem on 17 December 2014

Outlier Detection with SQL Server, part 3.1: Grubbs’ Test


By Steve Bolton

…………In the last two installments of this series of amateur self-tutorials, I mentioned that the various means of detecting outliers with SQL Server might best be explained as a function of the uses cases, the context determined by the questions one chooses to ask of the… Read more

2 comments, 6,124 reads

Posted in Multidimensional Mayhem on 29 November 2014

Outlier Detection with SQL Server, part 2.2: Modified Z-Scores

By Steve Bolton

…………There are apparently many subtle variations on Z-Scores, a ubiquitous measure that is practically a cornerstone in the foundation of statistics. The popularity and ease of implementation of Z-Scores are what made me decide to tackle them early on in this series of amateur self-tutorials, on… Read more

0 comments, 466 reads

Posted in Multidimensional Mayhem on 13 November 2014

Outlier Detection with SQL Server, part 2.1: Z-Scores

By Steve Bolton

…………Using SQL Server to ferret out those aberrant data points we call outliers may call for some complex T-SQL, Multidimensional Expressions (MDX) or Common Language Runtime (CLR) code. Yet thankfully, the logic and math that underpin the standard means of outlier detection I’ll delve into in… Read more

2 comments, 721 reads

Posted in Multidimensional Mayhem on 28 October 2014

Outlier Detection with SQL Server, part 1: Finding Fraud and Folly with Benford’s Law

By Steve Bolton

…………My last blog series, A Rickety Stairway to SQL Server Data Mining, often epitomized a quip by University of Connecticut statistician Daniel T. Larose, to the effect that “data mining is easy to do badly.”[1] It is clear that today’s sophisticated mining algorithms can still… Read more

2 comments, 1,449 reads

Posted in Multidimensional Mayhem on 19 September 2014

Stay Tuned…for a SQL Server Tutorial Series Juggling Act

by Steve Bolton

…………If all goes according to plan, my blog will return in a few weeks with two brand new series, Using Other Data Mining Tools with SQL Server and Information Measurement with SQL Server. Yes, I will be attempting what amounts to a circus act among SQL… Read more

0 comments, 262 reads

Posted in Multidimensional Mayhem on 1 July 2014

A Rickety Stairway to SQL Server Data Mining, Part 15, The Grand Finale: Custom Data Mining Viewers


By Steve Bolton

…………As mentioned previously in this amateur self-tutorial series on the most neglected component of Microsoft’s leading database server software, SQL Server Data Mining (SSDM) can be extended through many means, such as Analysis Services stored procedures, CLR functionality, custom mining functions and plug-in algorithms. I had… Read more

0 comments, 1,193 reads

Posted in Multidimensional Mayhem on 11 February 2014

A Rickety Stairway to SQL Server Data Mining, Part 14.8: PMML Hell


By Steve Bolton

…………In A Rickety Stairway to SQL Server Data Mining, Part 14.3: Debugging and Deployment, we passed the apex of this series of amateur self-tutorials on SQL Server Data Mining (SSDM) and have seen the difficulty level and real-world usefulness of the material decline on a… Read more

0 comments, 556 reads

Posted in Multidimensional Mayhem on 15 January 2014

A Rickety Stairway to SQL Server Data Mining, Part 14.7: Additional Plugin Functionality


By Steve Bolton

…………In order to divide this segment of my amateur tutorial series on SQL Server Data Mining (SSDM) into digestible chunks, I deferred discussion of some functionality that can be implemented in custom algorithms. In the last installment I explained how to write custom data mining functions,… Read more

0 comments, 1,071 reads

Posted in Multidimensional Mayhem on 31 December 2013

A Rickety Stairway to SQL Server Data Mining, Part 14.6: Custom Mining Functions


by Steve Bolton

…………In the last installment of this amateur series of mistutorials on SQL Server Data Mining (SSDM), I explained how to implement the Predict method, which controls how the various out-of-the-box prediction functions included with SSDM are populated in custom algorithms. This limits users to returning the…

Read more

0 comments, 1,464 reads

Posted in Multidimensional Mayhem on 28 November 2013

A Rickety Stairway to SQL Server Data Mining, Part 14.5: The Predict Method

By Steve Bolton

…………In order to divide the Herculean task of describing custom algorithms into bite-sized chunks, I omitted discussion of some plug-in functionality from previous installments of this series of amateur tutorials on SQL Server Data Mining (SSDM). This week’s installment ought to be easily digestible, since it… Read more

2 comments, 1,634 reads

Posted in Multidimensional Mayhem on 30 October 2013

Older posts