January 21, 2010 at 8:44 am
This one is really strange and unfortunately I can't give an example of the bug. I did recreate the scenario with temp tables, primary keys and unique indexes but it works. There may be an entirely other way to rewrite this which I welcome but here it goes. There is a table with a pool and company cross joined to a date table and I need to get the last rate equal to or less than the date from another table. At first I was using an inner join figuring there'd be at least one rate but lets face it, never trust the users :hehe: so I changed it to a left outer figuring if they didn't enter a rate for the month it would get last months rate. This will be averaged at the end but as I started building the code I wanted to make sure it was gettin the correct dates. The code that works in temp tables is the following
BEGIN
SET NOCOUNT ON
CREATE TABLE #pool_def(
pool_id CHAR(10),
company_id CHAR(10),
unique_id INT IDENTITY PRIMARY KEY,
CONSTRAINT unqPoolDef UNIQUE(pool_id, company_id)
)
INSERT INTO #pool_def VALUES('pool', 'subcmp1')
INSERT INTO #pool_def VALUES('pool', 'subcmp2')
CREATE TABLE #c_doy2k(
date_time DATETIME,
unique_id INT IDENTITY PRIMARY KEY,
CONSTRAINT unqDoy2k UNIQUE(date_time)
)
DECLARE @i INT
SET @i = 0
BEGIN TRANSACTION
WHILE @i < 31 BEGIN
INSERT INTO #c_doy2k
VALUES(DATEADD(d,@i,'20100101'))
SET @i = @i + 1
END
COMMIT TRANSACTION
CREATE TABLE #srate(
datex DATETIME,
pool_id CHAR(10),
company_id CHAR(10),
date2 DATETIME,
rate FLOAT,
unique_id INT IDENTITY PRIMARY KEY,
CONSTRAINT unqSrate UNIQUE(datex,pool_id, company_id),
CONSTRAINT unqSrate2 UNIQUE(pool_id, company_id,date2)
)
INSERT INTO #srate
SELECT '20100101', 'pool', 'subcmp1', '20100101', 1 UNION ALL
SELECT '20100105', 'pool', 'subcmp1', '20100105', 2 UNION ALL
SELECT '20100101', 'pool', 'subcmp2', '20100101', 3 UNION ALL
SELECT '20100105', 'pool', 'subcmp2', '20100105', 4
DECLARE
@sDate DATETIME,
@eDate DATETIME
SELECT
@sDate = '20100101',
@eDate = '20100131'
SELECT pd.pool_id, pd.company_id, doy.date_time, MAX(srate.datex) rateDate
FROM #pool_def pd
INNER JOIN #c_doy2k doy ON
doy.date_time BETWEEN @sDate AND @eDate
INNER JOIN #srate srate ON
srate.pool_id = pd.pool_id AND
srate.company_id = pd.company_id AND
srate.datex <= doy.date_time
WHERE pd.pool_id = 'pool'
GROUP BY pd.pool_id, pd.company_id, doy.date_time
ORDER BY pd.pool_id, pd.company_id, doy.date_time
SELECT pd.pool_id, pd.company_id, doy.date_time, MAX(srate.datex) rateDate
FROM #pool_def pd
INNER JOIN #c_doy2k doy ON
doy.date_time BETWEEN @sDate AND @eDate
LEFT OUTER JOIN #srate srate ON
srate.pool_id = pd.pool_id AND
srate.company_id = pd.company_id AND
srate.datex <= doy.date_time
WHERE pd.pool_id = 'pool'
GROUP BY pd.pool_id, pd.company_id, doy.date_time
ORDER BY pd.pool_id, pd.company_id, doy.date_time
DROP TABLE #c_doy2k
DROP TABLE #pool_def
DROP TABLE #srate
END
GO
from the example you can see that the result set is
POOL SUBCMP1 2010-01-01 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-02 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-03 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-04 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-05 00:00:00.0002010-01-05 00:00:00.000
POOL SUBCMP1 2010-01-06 00:00:00.0002010-01-05 00:00:00.000
and again with the left outer
POOL SUBCMP1 2010-01-01 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-02 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-03 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-04 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-05 00:00:00.0002010-01-05 00:00:00.000
POOL SUBCMP1 2010-01-06 00:00:00.0002010-01-05 00:00:00.000
Now the interesting part is that in the production tables which are bigger and uglier than what I pieced together I get two different result sets from the following
BEGIN
DECLARE
@sDate DATETIME,
@eDate DATETIME
SELECT
@sDate = '20100101',
@eDate = '20100131'
SELECT pd.pool_id, pd.company_id, doy.date_time, MAX(srate.datex) rateDate
FROM pool_def pd
INNER JOIN c_doy2k doy ON
doy.date_time BETWEEN @sDate AND @eDate
INNER JOIN srate srate ON
srate.pool_id = pd.pool_id AND
srate.company_id = pd.company_id AND
srate.datex <= doy.date_time
WHERE pd.pool_id = 'pool'
GROUP BY pd.pool_id, pd.company_id, doy.date_time
ORDER BY pd.pool_id, pd.company_id, doy.date_time
SELECT pd.pool_id, pd.company_id, doy.date_time, MAX(srate.datex) rateDate
FROM pool_def pd
INNER JOIN c_doy2k doy ON
doy.date_time BETWEEN @sDate AND @eDate
LEFT OUTER JOIN srate srate ON
srate.pool_id = pd.pool_id AND
srate.company_id = pd.company_id AND
srate.datex <= doy.date_time
WHERE pd.pool_id = 'pool'
GROUP BY pd.pool_id, pd.company_id, doy.date_time
ORDER BY pd.pool_id, pd.company_id, doy.date_time
END
GO
but this is actually what i get from production
POOL SUBCMP1 2010-01-01 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-02 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-03 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-04 00:00:00.0002010-01-01 00:00:00.000
POOL SUBCMP1 2010-01-05 00:00:00.0002010-01-05 00:00:00.000
POOL SUBCMP1 2010-01-06 00:00:00.0002010-01-05 00:00:00.000
with the left outer i get
POOL LEADCMP 2010-01-01 00:00:00.000NULL
POOL LEADCMP 2010-01-02 00:00:00.000NULL
POOL LEADCMP 2010-01-03 00:00:00.000NULL
POOL LEADCMP 2010-01-04 00:00:00.000NULL
POOL LEADCMP 2010-01-05 00:00:00.000NULL
POOL LEADCMP 2010-01-06 00:00:00.000NULL
The version of the server is as follows
Microsoft SQL Server 2005 - 9.00.4053.00 (Intel X86) May 26 2009 14:24:20 Copyright (c) 1988-2005 Microsoft Corporation Enterprise Edition on Windows NT 5.2 (Build 3790: Service Pack 2)
Any ideas whatsoever?
January 21, 2010 at 8:52 am
Aren't the differences what you'd expect?
With an inner join there must be a match in the second table, so the only rows that can appear in the resultset will have values for the srate.datex column
With a left outer join you're allowing there not to be a matching row at all in the srate table. When there isn't, the column values will appear as null, exactly what you're seeing in the second result set.
It means that you have rows in the pool_def/c_doy2k tables that has no match at all on the conditions that you specified in the srate table
Sorry, but I see nothing that's unexpected here.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 21, 2010 at 8:56 am
To reproduce that in your test example, change your insert into the first temp table as follows:
INSERT INTO #pool_def VALUES('pool', 'subcmp1')
INSERT INTO #pool_def VALUES('pool', 'subcmp2')
INSERT INTO #pool_def VALUES('pool', 'LEADCMP')
Because there are no rows in the #srate table that have the company_id = LEADCMP, when you do a left join, you'll see the rows with leadcmp appear, with null values from #srate.
Gail Shaw
Microsoft Certified Master: SQL Server, MVP, M.Sc (Comp Sci)
SQL In The Wild: Discussions on DB performance with occasional diversions into recoverability
January 21, 2010 at 8:58 am
GilaMonster (1/21/2010)
Aren't the differences what you'd expect?With an inner join there must be a match in the second table, so the only rows that can appear in the resultset will have values for the srate.datex column
With a left outer join you're allowing there not to be a matching row at all in the srate table. When there isn't, the column values will appear as null, exactly what you're seeing in the second result set.
It means that you have rows in the pool_def/c_doy2k tables that has no match at all on the conditions that you specified in the srate table
Sorry, but I see nothing that's unexpected here.
You are absolutely right, I didn't even notice the lead company, thank you. brain dead :w00t:
Viewing 4 posts - 1 through 4 (of 4 total)
You must be logged in to reply to this topic. Login to reply