Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 

Low Profile Architecture

By Igor Zaychik,

The goal of this article is to explain how to get unique transaction data from a data set containing multiple transactions per client per month. Taking into account the popularity of BI and analytical tools, I want to show that SQL and Excel can be considered a platform for complex reporting and data warehousing.

I plan to use Excel as a reporting tool (Excel 2010 is pretty good for this task) and SQL to get data from the SQL server database. Also, I’ll  use Excel’s drill down capability to see original records behind the counts on a pivot-table. So, I’m building a virtual data warehouse.

My first goal is to count unique transactions per client per month.

Let’s start from my records:

Client ID1, Transaction Code1, Transaction Date1, Comment

Client ID2, Transaction Code2, Transaction Date2, Comment

Client ID2, Transaction Code2, Transaction Date3, Comment

Here, the combination of Client ID + Transaction Code + Transaction Date comprises a primary key.

If Transaction Date2 and Transaction Date3 from my example always belongs to a different months Excel pivot table will do correct counts by months as shown in a pivot-table below:

Month 1

Month 2

Month 3

Transaction Code1

Counts of Client ID

Counts of Client ID

Counts of Client ID

Transaction Code2

Counts of Client ID

Counts of Client ID

Counts of Client ID

The SQL data source will be simple as well:

Select [Client ID], [Transaction Code], [Transaction Date] from DB
where [Transaction Code] in (…)

In the case of [Transaction Date2] and [Transaction Date3] belong to the same month, the pivot-table will calculate Client ID2 for Transaction Code2 in the same month. Because my level of granularity is a month I need to eliminate all records per month with the same Transaction Code and Client ID. To unduplicate my set I used transaction with maximum Transaction Date in a month per Client ID. In my initial solution, grouping by month was done in the pivot-table. I want to have same simple solution I had before – a pivot table with the same number of fields and names on a data source.

My plan was to use a combination of group by and having clauses. Unfortunately group by uses aggregation functions for fields not included in a grouping. In my example, the Comment field would lose its original value if it was placed in an aggregation function like max().

I started finding maximum Transaction Dates per client without using group by and having clauses.

The next SQL statement serves this purpose:

Select a.[Client ID], a.[Transaction Code], a.[Transaction Date] from DB a
where a.[Transaction Code] in (..)
and not exists
(
   Select * from DB b
   where b.[Client Id] = a.[Client ID] and b.[Transaction Code] in (…)

   -- This line filter a maximum transaction date per client
   and b.[Transaction Date] > a.[Transaction Date]
)

This solution looks up to the task and will eliminate records (say with Transaction Date2) from my result set. My further analysis showed that I missed a point on monthly granularity. If I have the same transactions (same Transaction Code) per client every month my script will return only the latest one. To preserve transactions on a monthly basis I added a comparison on month and (because reporting time frame can cross a year) a year comparison.

Select a.[Client ID], a.[Transaction Code], a.[Transaction Date] from DB a
where a.[Transaction Code] in (..)
and not exists
(
   Select * from DB b
   where b.[Client Id] = a.[Client ID] and b.[Transaction Code] in (…) 
  
   -- This line filter a maximum transaction date
   and b.[Transaction Date] > a.[Transaction Date]

   -- Next two lines group maximum values by year months
   and Year(b.[Transaction Date] = Year(a.[Transaction Date])
   and Month([b.[Transaction Date] = Month(a.[Transaction Date]) 
)

As a benefit of this method you may have any number of fields in the select list, (not just Comment) to present transactions details without restriction connected to using the aggregate functions. Those fields may appear on a drill-down list in Excel to explain details behind dry quantity numbers.

This method may be used as a base for building a simple data warehouse combining power of SQL and Excel which I call “low profile architecture”.

I appreciate any questions and comments to this article.

Total article views: 3714 | Views in the last 30 days: 17
 
Related Articles
FORUM

Excel Row count with AcvtiveX Script (DTS)

How do I get Excel worksheet rows count.

FORUM

Count Months (periods) on program

Need to counts periods and spells on a program

FORUM

Client App hangs up while server in Transaction

Client App hangs up while server in Transaction

BLOG

Exploring Excel 2013 as Microsoft’s Business Intelligence Client

A little over six months ago, I wrote an article on Magenic.com about Excel 2013’s Impact for BI Use...

FORUM

Average Number of Months Between Events?

Count Months with Zero Values

Tags
data warehouse    
excel    
reporting services (ssrs)    
t-sql    
 
Contribute

Join the most active online SQL Server Community

SQL knowledge, delivered daily, free:

Email address:  

You make SSC a better place

As a member of SQLServerCentral, you get free access to loads of fresh content: thousands of articles and SQL scripts, a library of free eBooks, a weekly database news roundup, a great Q & A platform… And it’s our huge, buzzing community of SQL Server Professionals that makes it such a success.

Join us!

Steve Jones
Editor, SQLServerCentral.com

Already a member? Jump in:

Email address:   Password:   Remember me: Forgotten your password?
Steve Jones