SQLServerCentral is supported by Redgate

Mala's Data Blog

My name is Malathi, a.k.a. Mala. I am a DBA turned BI/Data Science person, working with SQL Server since 6.5. I am also the founder of the Louisville SQL Server user group, organizer of 8 SQL Saturdays, Regional Mentor for the Northeast, and a 12-year PASS conference attendee. In my spare time I love to garden, travel, read, paint, and do yoga.

Understanding Relative Risk – with T-SQL

In this post we will explore a common statistical term – Relative Risk, otherwise called Risk Factor. Relative Risk is a term that is important to understand when you are doing comparative studies of two groups that are different in some specific way. The most common usage of this is… Read more

0 comments, 1,386 reads

Posted in Mala's Data Blog on 19 June 2017

Cochran-Mantel-Haenszel Method with T-SQL and R – Part I

This test is an extension of the Chi Square test I blogged about earlier. It is applied when we have to compare two groups over several levels and the comparison may involve a third variable.
Let us consider a cohort study as an example – we have two medications A and… Read more

0 comments, 143 reads

Posted in Mala's Data Blog on 12 June 2017

Dataset for Cochran-Mantel-Haenszel Test

Below is the script to create the table and dataset I used. This is just test data and not copied from anywhere.

USE [yourdb]
/****** Object: Table [dbo].[DrugResponse] Script Date: 6/12/2017 6:45:46 AM ******/
CREATE TABLE [dbo].[DrugResponse](…

Read more

0 comments, 129 reads

Posted in Mala's Data Blog on 12 June 2017

Fisher’s Exact Test – with T-SQL and R

This post is a long-overdue second part to the post on the Chi Square Test that I did a few months ago. This post addresses relationships between two categorical variables, but in cases where data is sparse, and the numbers (in any cell) are less than 5. The Chi Square… Read more

3 comments, 1,017 reads

Posted in Mala's Data Blog on 22 May 2017

SQL Saturday Louisville precon – interview Andy Leonard

This will be year #9 of SQL Saturdays in Louisville. Every year (starting with the 3rd or 4th), it has been a tradition to do ‘precons’ on Fridays. For those who don’t know – precons are day-long paid trainings by an expert in the subject, held on the Friday… Read more

0 comments, 165 reads

Posted in Mala's Data Blog on 8 May 2017

SQL Saturdays – down memory lane

A casual Twitter conversation with Karla Landrum and some other peeps led me down memory lane on older events. Our SQL Saturday at Louisville will be 9 years old this year. We were event #23, in 2009. SQL Saturdays started two years before, in 2007.

Our first event was held at… Read more

0 comments, 158 reads

Posted in Mala's Data Blog on 1 May 2017

The Birthday Problem – with T-SQL and R

When I was on SQLCruise recently – Buck Woody (b|t) made an interesting statement – that in a room of 23 people, there is over a 50% chance that two or more share a birthday. And sure enough, we did end up having more than… Read more

0 comments, 1,490 reads

Posted in Mala's Data Blog on 24 April 2017
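
The 23-person claim in the teaser above can be checked with a few lines of arithmetic. What follows is a minimal sketch in Python (not the T-SQL or R the post itself uses), assuming 365 equally likely birthdays and ignoring leap years; the function name is made up for illustration.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Probability that at least two of `people` share a birthday.

    Assumes every day is equally likely and ignores leap years.
    """
    p_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


print(round(shared_birthday_probability(22), 4))  # just under 0.5
print(round(shared_birthday_probability(23), 4))  # 0.5073
```

Computing the complement (nobody shares a birthday) keeps the calculation to one running product, and 23 is the smallest group size that pushes the result past 50%.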

Normal approximation to binomial distribution using T-SQL and R

In the previous post I demonstrated the use of the binomial formula to calculate probabilities of events occurring in certain situations. In this post I am going to explore the same situation with a bigger sample set. Let us assume, for example, that instead of 7 smokers we had 100 smokers. We… Read more

0 comments, 317 reads

Posted in Mala's Data Blog on 17 April 2017

The Binomial Formula with T-SQL and R

In a previous post I explained the basics of probability. In this post I will use some of those principles to see how to solve certain problems. I will pick a very simple problem that I found in a statistics textbook. Suppose I have 7 friends who are smokers. The… Read more

3 comments, 1,479 reads

Posted in Mala's Data Blog on 10 April 2017

Sampling Distribution and Central Limit Theorem

In this post I am going to explain (in highly simplified terms) two very important statistical concepts – the sampling distribution and the central limit theorem.

The sampling distribution is the distribution of means collected from random samples taken from a population. So, for example, if I have a population of life… Read more

0 comments, 218 reads

Posted in Mala's Data Blog on 3 April 2017
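
The definition in the teaser above (the distribution of means collected from random samples) can be simulated directly. This is a rough sketch in Python rather than the post's own tooling; the population is made-up uniform numbers, and the sample size, sample count, and seeds are arbitrary assumptions.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, seed=0):
    """Draw `num_samples` random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical population: 10,000 values spread uniformly between 0 and 100.
population_rng = random.Random(42)
population = [population_rng.uniform(0, 100) for _ in range(10_000)]

means = sample_means(population, sample_size=30, num_samples=2_000)

# The sample means cluster around the population mean...
print(statistics.mean(population), statistics.mean(means))
# ...with a much smaller spread than the raw population values.
print(statistics.stdev(population), statistics.stdev(means))
```

Even though the population here is flat rather than bell-shaped, a histogram of `means` would look roughly normal, which is the behaviour the central limit theorem describes.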

Basics of Probability

In this post I am going to introduce some of the basic principles of probability – and use them in other posts going forward. Quite a number of people would have learned these things in high school math and then forgotten them – I personally needed a refresher. These concepts are… Read more

0 comments, 2,270 reads

Posted in Mala's Data Blog on 20 March 2017

TSQL2sday – Daily database WTF

This month’s TSQL Tuesday is organized by Kennie T Pontoppidan (t) – the topic is ‘Daily Database WTF’ – or a horror story from the database world. As someone who has worked with databases for nearly two decades, there are several of these – I picked one of… Read more

0 comments, 219 reads

Posted in Mala's Data Blog on 12 March 2017

Generating Frequency Table

This week’s blog post is rather simple. One of the main characteristics of a data set involving classes, or discrete variables, is its frequencies. The number of times each data element or class is observed is called its frequency. A table that displays the discrete variable and number of times… Read more

0 comments, 2,230 reads

Posted in Mala's Data Blog on 6 March 2017
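
As a small illustration of the frequency definition in the teaser above, here is a sketch in Python (not the post's own code). The function name and the sample values are hypothetical and chosen only to show the counting.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count) pairs sorted by value."""
    counts = Counter(values)
    return sorted(counts.items())


# Hypothetical discrete observations, for illustration only.
observations = ["red", "blue", "red", "green", "blue", "red"]

for value, count in frequency_table(observations):
    print(f"{value:<6} {count}")
```

In T-SQL the same idea is usually expressed with `COUNT(*)` and `GROUP BY` on the discrete column.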

The Empirical Rule

I am resuming technical blogging after a gap of nearly a month. I will continue to blog about my relearning of statistics and basic concepts, and illustrate them to the best of my ability using R and T-SQL where appropriate.

For this week I have chosen a statistical concept called… Read more

0 comments, 1,786 reads

Posted in Mala's Data Blog on 27 February 2017

SQL Cruise 2017 – Western Caribbean – my experience

As some readers may know, I have been a regular attendee on SQL Cruises for 8 years now. SQLCruise is a training (and vacation, for some) event organized by Tim Ford (b|t) and Amy Ford (t) that happens twice a year. I went on the first one… Read more

0 comments, 366 reads

Posted in Mala's Data Blog on 19 February 2017

My Epic Life Quest

I have always maintained a private bucket list. I have not had the courage to actually put it down in writing – but this year I decided that it is time. My good friend Brent Ozar has been doing this for a few years now, and his list is my… Read more

0 comments, 1,615 reads

Posted in Mala's Data Blog on 2 January 2017

2016 – A Year to remember

2016 has undoubtedly been a landmark year in my life. To me it marked my first conscious entry into middle age. It was the first year that I really pondered some of the questions that people need to think of as they get older in life – with clarity that… Read more

0 comments, 391 reads

Posted in Mala's Data Blog on 31 December 2016

Multivariate Variable Analysis using R

So far I’ve worked on simple analytical techniques using one or two variables in a dataset. This article is a sort of a summary – about various techniques we can use for such datasets, depending on the type of variable in question. The techniques include – how to get summary… Read more

0 comments, 494 reads

Posted in Mala's Data Blog on 5 December 2016

Associative Analytics: Two sample T Test

In the previous post we looked at a one-sample T Test. A one-sample T Test helped us determine if a selected sample was indeed truly representative of the larger population. A two-sample T Test goes a step further – it helps us determine if both samples came from… Read more

0 comments, 1,181 reads

Posted in Mala's Data Blog on 21 November 2016

PASSion Award and what it means to me

2016 is going to be a special year in my life. There was an article on the Oscar awards a while ago – on reasons why the Oscars are the most watched awards ceremony around the world. No, it is not just because of movie stars. Everyone, secretly or publicly –… Read more

0 comments, 786 reads

Posted in Mala's Data Blog on 31 October 2016
