January 4, 2010 at 3:27 pm
I have 2 tables with similar data in. One takes preference over the other and i would like to do some sort of union except statement to take the data out of table one if it exists, otherwise from table 2.
Table 1 + 2 schema
-----------------
Date
Source_ID
Some_Number
so what i need to do is select from table 1 and then also from table 2 if the date+source combination isnt in table 1
whats the best way to do this?
Thanks
January 4, 2010 at 3:45 pm
Please post table def and sample data as described in the first link in my signature.
January 5, 2010 at 6:57 am
If you ALWAYS want the data from table 1 if it exists I believe you can just do a UNION because UNION eliminates duplicates and I believe it will take the row from the first table in the UNION first, I can't guarantee it, but I'm pretty confident that will work.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
January 5, 2010 at 7:18 am
If UNION will be used, there wouldn't be any difference whether it took the data from table 1 or table 2: both have to be identical. How would you know what table has been used to get the data? And what would it matter?
short example:
DECLARE @t1 TABLE (col1 char(1))
DECLARE @t2 TABLE (col2 char(1))
INSERT INTO @t1
SELECT 'A' UNION ALL
SELECT 'B' UNION ALL
SELECT 'C' UNION ALL
SELECT 'D'
INSERT INTO @t2
SELECT 'a' UNION ALL
SELECT 'b' UNION ALL
SELECT 'E' UNION ALL
SELECT 'F'
SELECT col1 FROM @t1
UNION
SELECT col2 FROM @t2
SELECT col2 FROM @t2
UNION
SELECT col1 FROM @t1
Edit:
The output for both queries is identical. How could we tell that @t1 is used to get 'A' and 'B' or @t2?
I changed the data to be upper and lower case and the output is, well.., interesting: The data from BOTH tables are used to get "a/A" and "b/B".
Edit 2:
So, I basically proved myself to be wrong: as soon as we're talking about case sensitive character, we can tell wether it's taken from one table or the other. But, as far as I can see, we cannot predict which table will be used...
Does anybody have any kind of explanation for this behavior?
January 5, 2010 at 7:54 am
Well, 1 if the values are identical, does it really matter which table it comes from? 2 - I was assuming and apparently incorrectly that SQL Server will use the value from the first table referenced in the UNION. 3 - This does not work if there are columns that have different values and you want to limit it by a key column. Extending your example to add a second column and make the first column the PK:
DECLARE @t1 TABLE (col1 char(1) PRIMARY KEY, col2 CHAR(1))
DECLARE @t2 TABLE (col1 char(1) PRIMARY KEY, col2 CHAR(1))
INSERT INTO @t1
SELECT 'A', 'Z' UNION ALL
SELECT 'B', 'Y' UNION ALL
SELECT 'C', 'X' UNION ALL
SELECT 'D', 'W'
INSERT INTO @t2
SELECT 'a', 'w' UNION ALL
SELECT 'b', 'x' UNION ALL
SELECT 'E', 'Y' UNION ALL
SELECT 'F', 'Z'
SELECT col1, col2 FROM @t1
UNION
SELECT col1, col2 FROM @t2
SELECT col1, col2 FROM @t2
UNION
SELECT col1, col2 FROM @t1
These return all the rows. I think a FULL OUTER JOIN will work. Like this:
SELECT
ISNULL(T1.col1, T2.col1) AS col1,
CASE
WHEN T1.col1 IS NULL THEN T2.col2
ELSE T1.col2
END AS col2
FROM
@t1 AS T1 FULL OUTER JOIN
@t2 AS T2 ON
T1.col1 = T2.col1
A UNION with a LEFT OUTER JOIN in the second part of the UNION like this:
SELECT
col1,
col2
FROM
@t1
UNION
SELECT
T2.col1,
T2.col2
FROM
@t2 AS T2 LEFT JOIN
@t1 AS T1
ON T2.col1 = T1.col1
WHERE
T1.col1 IS NULL
The FULL OUTER JOIN performs better on the little test, but depending on indexes and amount of data the second one may work better.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
January 5, 2010 at 10:27 am
It seems like we need the OP to clarify what he's looking for...
January 5, 2010 at 1:38 pm
Thanks for the responses guys.. much appreciated. I think i didnt quite explain it enough.
the data in the tables is energy usage per hour. table A is generated from second data... the second table is historical data from a second source.
table a
CREATE TABLE [dbo].[Watts_Hour](
[Date] [datetime] NOT NULL,
[Source_Id] [uniqueidentifier] NOT NULL,
[Watts_Avg] [numeric](18, 10) NOT NULL,
[Watts_Min] [int] NOT NULL,
[Watts_Max] [int] NOT NULL,
[Total_Readings] [int] NOT NULL
) ON [PRIMARY]
table b
CREATE TABLE [dbo].[History_Watts_Hour](
[Date] [datetime] NOT NULL,
[Source_Id] [uniqueidentifier] NOT NULL,
[Watts_Avg] [numeric](18, 10) NOT NULL
) ON [PRIMARY]
now table a and b will have the same hours in but a different watts_avg value. Also, table a or b may be missing hours. Therefore i want to use table a as master and then table b as the fallback for missing hours, so that i get up to 24 hours in a day, but never over that.
i ended up doing a union of table a with a select from table b where source_id and date not in a. But is there a better way to do this? It seems messy.
Declare @lastDate datetime
set @lastdate = '2010-01-01' -- this date actually comes from another table
(select
Date,
Source_Id,
Watts_Avg,
Watts_Min,
Watts_Max,
Total_Readings
from Watts_Hour
where dbo.roundtime(Date,24) < dbo.roundtime(GetDate(),24)
AND dbo.roundtime(Date,24) > isnull(@lastdate,'01/01/1900')
)
union
(
select
Date,
Source_Id,
Watts_Avg,
Watts_Avg as Watts_Min,
Watts_Avg as Watts_Max,
0 as Total_Readings
from History_Watts_Hour hist
where dbo.roundtime(Date,24) < dbo.roundtime(GetDate(),24)
AND dbo.roundtime(Date,24) > isnull(@lastdate,'01/01/1900')
and not exists(select * from Watts_Hour wh where wh.date = hist.Date and wh.source_id = hist.Source_ID)
)
Thanks!
January 5, 2010 at 1:49 pm
I think you'd find the FULL OUTER JOIN method I mentioned earlier may perform better, although you are not going to get optimal performance from any solution as long as you have the function against a column in the WHERE clause.
What is the roundtime function rounding to? If you can replace that with a SARGable clause then you are more likely to get index seeks.
Jack Corbett
Consultant - Straight Path Solutions
Check out these links on how to get faster and more accurate answers:
Forum Etiquette: How to post data/code on a forum to get the best help
Need an Answer? Actually, No ... You Need a Question
January 5, 2010 at 1:53 pm
That's a slightly different requirement...
What I would do is to start with a calendar table holding days and hours.
Then I would use this table in a left join on Watts_Hour and History_Watts_Hour.
To get the data from Watts_Hour if available and History_Watts_Hour as a "second source" I would use COALESCE() function.
To get the description from above transformed into SQL I'd like to have some sample data to play with (I prefer to provide tested code...).
Regarding your current solution: I think at least because of your function dbo.roundtime() this code won't perform in a decent way... I'd rather work with dateadd/datediff than with a UDF....
January 5, 2010 at 3:09 pm
right heres some inserts...
insert into watts_hour (date, source_id, watts_avg, watts_min, watts_max, total_readings)
SELECT 'Jan 2 2010 9:00AM','394D9190-0196-4926-B952-6F72966FAD6A','1620.0000000000','1553','1701','48' UNION ALL
SELECT 'Jan 2 2010 10:00AM','394D9190-0196-4926-B952-6F72966FAD6A','1953.2500000000','1558','4532','518' UNION ALL
SELECT 'Jan 2 2010 11:00AM','394D9190-0196-4926-B952-6F72966FAD6A','1460.7500000000','440','2208','508' UNION ALL
SELECT 'Jan 2 2010 12:00PM','394D9190-0196-4926-B952-6F72966FAD6A','560.5000000000','401','1226','491' UNION ALL
SELECT 'Jan 2 2010 1:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1024.0000000000','514','4935','499' UNION ALL
SELECT 'Jan 2 2010 2:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1136.0000000000','495','2672','502' UNION ALL
SELECT 'Jan 2 2010 3:00PM','394D9190-0196-4926-B952-6F72966FAD6A','986.5000000000','547','2378','488' UNION ALL
SELECT 'Jan 2 2010 4:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1134.0000000000','665','3571','499' UNION ALL
SELECT 'Jan 2 2010 5:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1261.0000000000','1098','1468','509' UNION ALL
SELECT 'Jan 2 2010 6:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1447.2500000000','830','1969','499' UNION ALL
SELECT 'Jan 2 2010 7:00PM','394D9190-0196-4926-B952-6F72966FAD6A','955.5000000000','733','1158','483' UNION ALL
SELECT 'Jan 2 2010 8:00PM','394D9190-0196-4926-B952-6F72966FAD6A','902.7500000000','720','1438','403' UNION ALL
SELECT 'Jan 3 2010 12:00PM','394D9190-0196-4926-B952-6F72966FAD6A','985.0000000000','481','4391','472' UNION ALL
SELECT 'Jan 3 2010 1:00PM','394D9190-0196-4926-B952-6F72966FAD6A','676.2500000000','480','2928','494' UNION ALL
SELECT 'Jan 3 2010 2:00PM','394D9190-0196-4926-B952-6F72966FAD6A','717.5000000000','495','885','488' UNION ALL
SELECT 'Jan 3 2010 3:00PM','394D9190-0196-4926-B952-6F72966FAD6A','821.5000000000','760','978','473' UNION ALL
SELECT 'Jan 3 2010 4:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1277.0000000000','638','4456','482' UNION ALL
SELECT 'Jan 3 2010 5:00PM','394D9190-0196-4926-B952-6F72966FAD6A','991.5000000000','640','1681','490' UNION ALL
SELECT 'Jan 3 2010 6:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1602.2500000000','1212','1998','492' UNION ALL
SELECT 'Jan 3 2010 7:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1342.5000000000','1091','1466','506' UNION ALL
SELECT 'Jan 3 2010 8:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1375.5000000000','1122','1663','476' UNION ALL
SELECT 'Jan 3 2010 9:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1669.2500000000','1073','4880','503' UNION ALL
SELECT 'Jan 3 2010 10:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1426.7500000000','1099','2036','503' UNION ALL
SELECT 'Jan 3 2010 11:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1114.0000000000','1038','1172','67'
insert into history_watts_hour (date, source_id, watts_avg)
SELECT 'Jan 1 2010 1:00AM','394D9190-0196-4926-B952-6F72966FAD6A','313.0000000000' UNION ALL
SELECT 'Jan 1 2010 2:00AM','394D9190-0196-4926-B952-6F72966FAD6A','313.0000000000' UNION ALL
SELECT 'Jan 1 2010 5:00AM','394D9190-0196-4926-B952-6F72966FAD6A','633.0000000000' UNION ALL
SELECT 'Jan 1 2010 6:00AM','394D9190-0196-4926-B952-6F72966FAD6A','633.0000000000' UNION ALL
SELECT 'Jan 1 2010 9:00AM','394D9190-0196-4926-B952-6F72966FAD6A','578.0000000000' UNION ALL
SELECT 'Jan 1 2010 10:00AM','394D9190-0196-4926-B952-6F72966FAD6A','578.0000000000' UNION ALL
SELECT 'Jan 1 2010 1:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1029.0000000000' UNION ALL
SELECT 'Jan 1 2010 2:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1029.0000000000' UNION ALL
SELECT 'Jan 1 2010 5:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1202.0000000000' UNION ALL
SELECT 'Jan 1 2010 6:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1202.0000000000' UNION ALL
SELECT 'Jan 1 2010 7:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1002.5000000000' UNION ALL
SELECT 'Jan 1 2010 8:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1002.5000000000' UNION ALL
SELECT 'Jan 1 2010 9:00PM','394D9190-0196-4926-B952-6F72966FAD6A','257.5000000000' UNION ALL
SELECT 'Jan 1 2010 10:00PM','394D9190-0196-4926-B952-6F72966FAD6A','257.5000000000' UNION ALL
SELECT 'Jan 1 2010 11:00PM','394D9190-0196-4926-B952-6F72966FAD6A','267.0000000000' UNION ALL
SELECT 'Jan 2 2010 12:00AM','394D9190-0196-4926-B952-6F72966FAD6A','267.0000000000' UNION ALL
SELECT 'Jan 2 2010 1:00AM','394D9190-0196-4926-B952-6F72966FAD6A','277.0000000000' UNION ALL
SELECT 'Jan 2 2010 2:00AM','394D9190-0196-4926-B952-6F72966FAD6A','277.0000000000' UNION ALL
SELECT 'Jan 2 2010 3:00AM','394D9190-0196-4926-B952-6F72966FAD6A','356.0000000000' UNION ALL
SELECT 'Jan 2 2010 4:00AM','394D9190-0196-4926-B952-6F72966FAD6A','356.0000000000' UNION ALL
SELECT 'Jan 2 2010 5:00AM','394D9190-0196-4926-B952-6F72966FAD6A','640.0000000000' UNION ALL
SELECT 'Jan 2 2010 6:00AM','394D9190-0196-4926-B952-6F72966FAD6A','640.0000000000' UNION ALL
SELECT 'Jan 2 2010 7:00AM','394D9190-0196-4926-B952-6F72966FAD6A','1773.0000000000' UNION ALL
SELECT 'Jan 2 2010 8:00AM','394D9190-0196-4926-B952-6F72966FAD6A','1773.0000000000' UNION ALL
SELECT 'Jan 2 2010 9:00AM','394D9190-0196-4926-B952-6F72966FAD6A','972.0000000000' UNION ALL
SELECT 'Jan 2 2010 10:00AM','394D9190-0196-4926-B952-6F72966FAD6A','972.0000000000' UNION ALL
SELECT 'Jan 2 2010 11:00AM','394D9190-0196-4926-B952-6F72966FAD6A','1181.5000000000' UNION ALL
SELECT 'Jan 2 2010 12:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1181.5000000000' UNION ALL
SELECT 'Jan 2 2010 1:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1033.5000000000' UNION ALL
SELECT 'Jan 2 2010 2:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1033.5000000000' UNION ALL
SELECT 'Jan 2 2010 3:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1338.5000000000' UNION ALL
SELECT 'Jan 2 2010 4:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1338.5000000000' UNION ALL
SELECT 'Jan 2 2010 5:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1019.0000000000' UNION ALL
SELECT 'Jan 2 2010 6:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1019.0000000000' UNION ALL
SELECT 'Jan 2 2010 7:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1229.0000000000' UNION ALL
SELECT 'Jan 2 2010 8:00PM','394D9190-0196-4926-B952-6F72966FAD6A','1229.0000000000' UNION ALL
SELECT 'Jan 2 2010 9:00PM','394D9190-0196-4926-B952-6F72966FAD6A','466.5000000000' UNION ALL
SELECT 'Jan 2 2010 10:00PM','394D9190-0196-4926-B952-6F72966FAD6A','466.5000000000' UNION ALL
SELECT 'Jan 2 2010 11:00PM','394D9190-0196-4926-B952-6F72966FAD6A','247.5000000000' UNION ALL
SELECT 'Jan 3 2010 12:00AM','394D9190-0196-4926-B952-6F72966FAD6A','247.5000000000' UNION ALL
SELECT 'Jan 3 2010 1:00AM','394D9190-0196-4926-B952-6F72966FAD6A','252.0000000000' UNION ALL
SELECT 'Jan 3 2010 2:00AM','394D9190-0196-4926-B952-6F72966FAD6A','252.0000000000' UNION ALL
SELECT 'Jan 3 2010 3:00AM','394D9190-0196-4926-B952-6F72966FAD6A','397.5000000000' UNION ALL
SELECT 'Jan 3 2010 4:00AM','394D9190-0196-4926-B952-6F72966FAD6A','397.5000000000'
and using datediff dateadd
(select
Date,
Source_Id,
Watts_Avg,
Watts_Min,
Watts_Max,
Total_Readings
from Watts_Hour
where dbo.roundtime(Date,24) < DATEADD(hour, DATEDIFF(hour, 0, getdate()), 0)
AND dbo.roundtime(Date,24) > isnull(@lastdate,'01/01/1900')
)
union
(
select
Date,
Source_Id,
Watts_Avg,
Watts_Avg as Watts_Min,
Watts_Avg as Watts_Max,
0 as Total_Readings
from History_Watts_Hour hist
where dbo.roundtime(Date,24) < DATEADD(hour, DATEDIFF(hour, 0, getdate()), 0)
AND dbo.roundtime(Date,24) > isnull(@lastdate,'01/01/1900')
and not exists(select * from Watts_Hour wh where wh.date = hist.Date and wh.source_id = hist.Source_ID)
)
Hope that works... ill have a look into your suggestion as well to see if i can work it out.
Thanks soo much for your help!
January 5, 2010 at 3:12 pm
Hmm... i think i have solved it... presume the only way to get the hours is using a temp table?
I then cross join that temp table with all the source_ids to get every combination of hour and source id, then join the watts_hour and history_watts hour on the date and source_id
Declare @lastDate datetime
set @lastdate = '2010-01-01' -- this date actually comes from another table
-- build hours 0 - 23
DECLARE @temp_hours TABLE
(
hour datetime
)
declare @temp_date datetime
set @temp_date = @lastdate
WHILE @temp_date <= DATEADD(Hour, DATEDIFF(Hour, 0, getdate()), 0)
BEGIN
insert into @temp_hours (hour)
values(DATEADD(Hour, DATEDIFF(Hour, 0, @temp_date), 0))
set @temp_date = dateadd(hh,1, @temp_date)
END
select
TH.Hour,
S.Source_ID,
COALESCE(WH.Watts_Avg, HWH.Watts_Avg) as Watts_Avg,
COALESCE(WH.Watts_Min, HWH.Watts_Avg) as Watts_Min,
COALESCE(WH.Watts_Max, HWH.Watts_Avg) as Watts_Max,
COALESCE(WH.Total_Readings, null) as Total_Readings
from @temp_hours TH
cross join (select distinct Source_Id from Watts_Hour) S
left join Watts_Hour WH on WH.date = TH.hour and WH.Source_Id = s.Source_Id
left join History_Watts_Hour HWH on HWH.date = TH.hour and HWH.Source_Id = s.Source_Id
where WH.Watts_Avg is not null or HWH.Watts_Avg is not null
Is this what you think is the best approach?
Thanks
January 5, 2010 at 5:27 pm
Here's what I would use.
;with cteTally AS
(
SELECT row_number() OVER(order BY v1.number) AS N FROM master..spt_values v1
WHERE type ='P' AND number<365
),
cteCalendar AS
(
SELECT dateadd(dd,N-1,'20100101') AS day_
FROM cteTally
)
,cteHrPerDay AS
(
SELECT dateadd(hh,N-1,day_) AS date
FROM cteCalendar
CROSS JOIN cteTally WHERE n<25 -- expand each day from the calendar table to have 24hrs
)
SELECT c.date, coalesce(w.watts_avg,h.watts_avg) AS watts_avg
FROM cteHrPerDay c
LEFT OUTER JOIN Watts_Hour w ON c.date=w.Date
LEFT OUTER JOIN history_Watts_Hour h ON c.date=h.Date
WHERE coalesce(w.watts_avg,h.watts_avg) IS NOT NULL
ORDER BY c.date
I strongly recommend to have the cteTally as a permanent Tally or Numbers table as well as having a permanent calendar table (instead of building it on the fly like I did to demonstrate the concept). The concept as well as some sample code how to create it is described in the Tally table link in my signature.
Regarding your solution: I didn't have time to have a closer look at it. Promised for tomorrow (it's past 1 o'clock in the morning over here...)
January 6, 2010 at 2:26 am
After looking at your solution I found that I need to include the source_ID.
The basic concept of the join is pretty much the same you used. I just have a different way how to build the hour table.
I also move the source_ID subquery into the cte section. But it's still the same join concept...
You mentioned the value for @lastdate comes from another table. Having a calendar table like I recommended in my prev. post would allow to join this table as well.
;with cteTally AS
(
SELECT row_number() OVER(order BY v1.number) AS N FROM master..spt_values v1
WHERE type ='P' AND number<365
),
cteCalendar AS
(
SELECT dateadd(dd,N-1,'20100101') AS day_
FROM cteTally
)
,cteHrPerDay AS
(
SELECT dateadd(hh,N-1,day_) AS date
FROM cteCalendar
CROSS JOIN cteTally WHERE n<25 -- expand each day from the calendar table to have 24hrs
)
, cteSourceId AS
(
SELECT Source_Id FROM Watts_Hour GROUP BY Source_Id
)
SELECT c.date, s.source_id, coalesce(w.watts_avg,h.watts_avg) AS watts_avg
FROM cteHrPerDay c
CROSS JOIN cteSourceId s -- expand each hour per day to each sourceid
LEFT OUTER JOIN Watts_Hour w ON c.date=w.Date AND s.Source_Id = w.Source_Id
LEFT OUTER JOIN history_Watts_Hour h ON c.date=h.Date AND s.Source_Id = h.Source_Id
WHERE coalesce(w.watts_avg,h.watts_avg) IS NOT NULL
ORDER BY c.date
January 6, 2010 at 4:14 am
That looks excellent. Having a permanent calendar table makes absolute sense to me and your solution looks so much cleaner than mine.
I'll give it a try when I get home tonight.
Thanks so much for your help.
Do you recommend that i create a table with hours in from the earliest date i need to some date a few years down the line, or should i have a job that continues to add new rows to the table every month or so.... or does it not make much of a difference?
Thanks so much
January 6, 2010 at 5:38 am
sqlservercentral 53898 (1/6/2010)
That looks excellent. Having a permanent calendar table makes absolute sense to me and your solution looks so much cleaner than mine.I'll give it a try when I get home tonight.
Thanks so much for your help.
Do you recommend that i create a table with hours in from the earliest date i need to some date a few years down the line, or should i have a job that continues to add new rows to the table every month or so.... or does it not make much of a difference?
Thanks so much
Thank you for the feedback! 🙂
Regarding the hours table:
If this is a frequently used query I would use a permanent hours_per_day table as you described. Since each year you'd add less than 9000 rows I would include a check in one of my archiving procedures and if there are future data for less than a month I'd add another year.
I recommend against building a table with a few years down the line without any addtl. code that will take care of another expansion once you reach the point where your precreated data won't be enough anymore. Otherwise you're running the risk of false results all of a sudden...
Viewing 15 posts - 1 through 15 (of 15 total)
You must be logged in to reply to this topic. Login to reply