SQLServerCentral is supported by Redgate

Mala's Data Blog

My name is Malathi, a.k.a. Mala - I am a DBA turned BI/Data Science person, working with SQL Server since 6.5. I am also the founder of the Louisville SQL Server user group, organizer of 8 SQL Saturdays, regional mentor for the Northeast, and a 12-year PASS conference attendee. In my spare time I love to garden, travel, read, paint, and do yoga.

Fisher’s Exact Test – with T-SQL and R

This post is a long-overdue second part to the post on the Chi Square Test that I did a few months ago. It addresses relationships between two categorical variables in cases where data is sparse and the count in any cell is less than 5. The Chi Square… Read more

0 comments, 84 reads

Posted in Mala's Data Blog on 22 May 2017

SQL Saturday Louisville precon – interview with Andy Leonard

This will be year #9 of SQL Saturdays in Louisville. Every year (starting with the 3rd or 4th), it has been a tradition to do ‘precons’ on Fridays. For those who don’t know – precons are day-long paid trainings by an expert in the subject, held on the Friday… Read more

0 comments, 119 reads

Posted in Mala's Data Blog on 8 May 2017

SQL Saturdays – down memory lane

A casual Twitter conversation with Karla Landrum and some other peeps led me down memory lane on older events. Our SQL Saturday at Louisville will be 9 years old this year. We were event #23, in 2009. SQL Saturdays started two years before, in 2007.

Our first event was held at… Read more

0 comments, 104 reads

Posted in Mala's Data Blog on 1 May 2017

The Birthday Problem – with T-SQL and R

When I was on SQLCruise recently – Buck Woody (b|t) made an interesting statement – that in a room of 23 people, there is over a 50% chance that two or more share the same birthday. And sure enough, we did end up having more than… Read more

0 comments, 1,313 reads

Posted in Mala's Data Blog on 24 April 2017
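The 23-person claim in the excerpt above can be checked with a few lines of arithmetic. The post itself uses T-SQL and R; the sketch below is in Python instead, and it assumes 365 equally likely birthdays (no leap years, no seasonal effects). The function name is made up for illustration.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday.

    Assumes every one of `days` birthdays is equally likely.
    """
    # Probability that everyone has a distinct birthday:
    # (days/days) * ((days-1)/days) * ... for each person added.
    p_all_distinct = 1.0
    for already_taken in range(people):
        p_all_distinct *= (days - already_taken) / days
    # The complement is "at least one shared birthday".
    return 1.0 - p_all_distinct


# 23 people is the smallest group where the chance passes 50%.
print(round(shared_birthday_probability(22), 4))
print(round(shared_birthday_probability(23), 4))
```

Working through the complement (everyone distinct) keeps the calculation to a single running product, which is why the loop never needs factorials.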

Normal approximation to binomial distribution using T-SQL and R

In the previous post I demonstrated the use of the binomial formula to calculate probabilities of events occurring in certain situations. In this post I am going to explore the same situation with a bigger sample set. Let us assume, for example, that instead of 7 smokers we had 100 smokers. We… Read more

0 comments, 237 reads

Posted in Mala's Data Blog on 17 April 2017

The Binomial Formula with T-SQL and R

In a previous post I explained the basics of probability. In this post I will use some of those principles to see how to solve certain problems. I will pick a very simple problem that I found in a statistics textbook. Suppose I have 7 friends who are smokers. The… Read more

3 comments, 1,319 reads

Posted in Mala's Data Blog on 10 April 2017

Sampling Distribution and Central Limit Theorem

In this post I am going to explain (in highly simplified terms) two very important statistical concepts – the sampling distribution and the central limit theorem.

The sampling distribution is the distribution of means collected from random samples taken from a population. So, for example, if I have a population of life… Read more

0 comments, 162 reads

Posted in Mala's Data Blog on 3 April 2017
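To make the definition in the excerpt above concrete, here is a small simulation sketch in Python (not the T-SQL or R used in the posts). The population is entirely made up: 10,000 skewed values drawn from an exponential distribution. The sample size, number of samples, and function name are likewise assumptions chosen for illustration.

```python
import random
import statistics


def sample_means(population, sample_size, num_samples, seed=0):
    """Draw `num_samples` random samples and return the mean of each one."""
    rng = random.Random(seed)
    return [
        statistics.mean(rng.sample(population, sample_size))
        for _ in range(num_samples)
    ]


# Hypothetical, deliberately skewed population (exponential, mean near 70).
population_rng = random.Random(42)
population = [population_rng.expovariate(1 / 70) for _ in range(10_000)]

# The list of sample means is an empirical sampling distribution.
means = sample_means(population, sample_size=50, num_samples=2_000)

# The centre of the sampling distribution sits close to the population mean,
# and its spread is much narrower than the spread of the population.
print("population mean:     ", statistics.mean(population))
print("mean of sample means:", statistics.mean(means))
print("population std dev:  ", statistics.pstdev(population))
print("std dev of means:    ", statistics.stdev(means))
```

Even though the population here is skewed, a histogram of `means` would look roughly bell-shaped, which is the behaviour the central limit theorem describes.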

Basics of Probability

In this post I am going to introduce some of the basic principles of probability – and use them in other posts going forward. Quite a number of people would have learned these things in high school math and then forgotten – I personally needed a refresher. These concepts are… Read more

0 comments, 1,988 reads

Posted in Mala's Data Blog on 20 March 2017

TSQL2sday – Daily database WTF

This month’s TSQL Tuesday is organized by Kennie T Pontoppidan (t) – the topic is ‘Daily Database WTF’ – or a horror story from the database world. As someone who has worked with databases for nearly two decades, there are several of these – I picked one of… Read more

0 comments, 162 reads

Posted in Mala's Data Blog on 12 March 2017

Generating Frequency Table

This week’s blog post is rather simple. One of the main characteristics of a data set involving classes, or discrete variables, is its frequencies. The number of times each data element or class is observed is called its frequency. A table that displays the discrete variable and number of times… Read more

0 comments, 1,977 reads

Posted in Mala's Data Blog on 6 March 2017
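The frequency table described in the excerpt above amounts to counting how often each class appears. A minimal sketch in Python follows (the post itself works in T-SQL and R); the observations and the function name are invented for illustration.

```python
from collections import Counter


def frequency_table(values):
    """Return (value, count) pairs, most frequent first, ties in name order."""
    counts = Counter(values)
    return sorted(counts.items(), key=lambda pair: (-pair[1], str(pair[0])))


# Made-up observations of a discrete variable with three classes.
observations = ["A", "B", "A", "C", "B", "A"]

for value, count in frequency_table(observations):
    print(f"{value}\t{count}")
```

In T-SQL the same idea is usually a `GROUP BY` with `COUNT(*)`, and in R the built-in `table()` function does the counting.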

The Empirical Rule

I am resuming technical blogging after a gap of nearly a month. I will continue to blog my relearning of statistics and basic concepts, and illustrate them to the best of my ability using R and T-SQL where appropriate.

For this week I have chosen a statistical concept called… Read more

0 comments, 1,609 reads

Posted in Mala's Data Blog on 27 February 2017

SQL Cruise 2017 – Western Caribbean – my experience

As some readers may know, I have been a regular attendee on SQL Cruises for 8 years now. SQLCruise is a training (and vacation, for some) event organized by Tim Ford (b | t) and Amy Ford (t) that happens twice a year. I went on the first one… Read more

0 comments, 294 reads

Posted in Mala's Data Blog on 19 February 2017

My Epic Life Quest

I have always maintained a private bucket list. I have not had the courage to actually put it down in writing – but this year I decided that it is time. My good friend Brent Ozar has been doing this for a few years now, and his list is my… Read more

0 comments, 1,567 reads

Posted in Mala's Data Blog on 2 January 2017

2016 – A Year to remember

2016 has undoubtedly been a landmark year in my life. To me it marked my first conscious entry into middle age. It was the first year that I really pondered some of the questions that people need to think of as they get older in life – with clarity that… Read more

0 comments, 308 reads

Posted in Mala's Data Blog on 31 December 2016

Multivariate Variable Analysis using R

So far I’ve worked on simple analytical techniques using one or two variables in a dataset. This article is a sort of summary – about various techniques we can use for such datasets, depending on the type of variable in question. The techniques include – how to get summary… Read more

0 comments, 412 reads

Posted in Mala's Data Blog on 5 December 2016

Associative Analytics: Two sample T Test

In the previous post we looked at a one way T Test. A one way T Test helped us determine if a selected sample was indeed truly representative of the larger population. A two way T Test goes a step further – it helps us determine if both samples came from… Read more

0 comments, 1,111 reads

Posted in Mala's Data Blog on 21 November 2016

PASSion Award and what it means to me

2016 is going to be a special year in my life. There was an article on the Oscar awards a while ago – on reasons why the Oscars are the most watched awards ceremony around the world. No, it is not just because of movie stars. Everyone, secretly or publicly –… Read more

0 comments, 708 reads

Posted in Mala's Data Blog on 31 October 2016

Days 1, 2 and 3 of PASS Summit 2016

Today is Thursday, October 27th already. For some of us the summit begins Monday – with precons and PASS Volunteering related meetings on Tuesday. For most other attendees the first day was Wednesday.

I arrived in the afternoon on Sunday with six other friends from Louisville, including my good friend Chris… Read more

0 comments, 221 reads

Posted in Mala's Data Blog on 27 October 2016

Sending Trevor love…

As some of you may be aware – fellow SQL family member, PASS Director, SQL Server MVP, founder of SQL Cruise and a good friend to many of us – Tim Ford – has a young son Trevor Ford who was recently diagnosed with a rare allergic drug reaction called… Read more

0 comments, 239 reads

Posted in Mala's Data Blog on 26 October 2016

TSQL Tuesday #83 – The Stats update solution

TSQL Tuesday is a monthly blog party hosted by a different blogger every month – it was started by Adam Machanic. This week’s TSQL Tuesday is hosted by Andy Mallon – the topic is ‘We’re dealing with the same problem’. I have chosen to write about a common problem… Read more

0 comments, 205 reads

Posted in Mala's Data Blog on 11 October 2016
