### Introduction to Statistics

This article will give a brief overview of how statistics are generated, stored, and used in SQL Server.
### Statistics in SQL: Student’s t-test

Many undergraduates have misunderstood the name 'Student's' in the t-test to imply that it was designed as a simple test suitable for students. In fact, it was William Sealy Gosset, an Englishman publishing under the pseudonym Student, who developed the t-test and t distribution in 1908 as a way of making confident predictions from small samples of normally distributed variables. As Gosset's employer was Guinness, the brewer, Phil Factor takes a sober view of calculating it in SQL.
### Pattern Recognition via Principal Components Analysis

We'll look at using principal components analysis to help visualise your data and detect underlying structure or patterns.
### Statistics in SQL: The Mann–Whitney U test

Phil Factor shows how to use the Mann–Whitney U test in SQL to find out whether two samples come from the same distribution.
### Statistics in SQL: The Kruskal–Wallis Test

Before you report your conclusions about your data, have you checked whether your 'actionable' figures occurred by chance? The Kruskal–Wallis test is a safe way of determining whether samples come from the same population, because it is simple and doesn't rely on a normal distribution in the population. This gives you a measure of confidence that your results are 'significant'. Phil Factor explains how to do it.
### Machine Learning for Outlier Detection in R

A look at using clustering to detect outliers in R, extending univariate statistical tests to multivariate data.
### Scoring Outliers in Non-Normal Data with R

Using R to detect outliers is relatively easy, but most methods assume your data is normally distributed. How do you handle skewed datasets?
### Statistics in SQL: Simple Linear Regressions

Although linear regressions can get complicated, most jobs involving the plotting of a trendline are easy. Simple linear regression is handy for the SQL programmer in predicting a linear trend and giving a figure for the level of probability of the prediction. What is more, it is easy to do with the aggregation that is built into SQL.
### Statistics in SQL: Kendall’s Tau rank correlation

Statistical calculations in SQL are often perfectly easy to do. SQL was designed to be a natural fit for calculating correlation, regression and variance on large quantities of data. It just isn't always immediately obvious how. In the second of a series of articles, Phil Factor shows how calculating a non-parametric correlation via Kendall's Tau or Spearman's Rho can be stress-free.
### Scoring Outliers with R

What is normal? More to the point, what is abnormal? We will look at using R to score outliers in a typical monitoring dataset.
### What is normal? Finding outliers with R

How do you currently set alerting thresholds? What is normal? And more importantly, what is truly abnormal? We will explore these questions.
### Statistics in SQL: Pearson’s Correlation

Some people will assure you that you can't do any serious statistical calculations in SQL. In the first of a series of articles, Phil Factor aims to prove them wrong by explaining how easy it is to calculate Pearson's Product Moment Correlation.
