March 3, 2011 at 12:03 pm
Hi,
I'm stuck with a following task.
Here is my problem (simplified):
I have 3 columns in my table: 2008, 2009, 2010
Here is what the data looks like:
2008 2009 2010
2 2 2
3 3 5
5 5 NULL
I need to compare values in each row, and find those rows where values are the same. The tricky part is that if the value is NULL, it needs to be ignored.
So based on the example above, my query would return rows 1 and 3.
Does it make sense?
Thank you very much in advance.
March 3, 2011 at 12:12 pm
A nasty but successful solution:
WHERE
COALESCE( 2008, 2009, 2010) = COALESCE( 2009, 2008, 2010)
AND COALESCE( 2008, 2009, 2010) = COALESCE( 2010, 2009, 2008)
....
That Null is a problem. If you could guarantee one of the columns as not null, just one, this would be a lot cleaner.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
March 3, 2011 at 12:17 pm
CREATE TABLE #T (
ID INT IDENTITY
PRIMARY KEY,
[2008] INT,
[2009] INT,
[2010] INT) ;
INSERT INTO #T
([2008],
[2009],
[2010])
SELECT 2,
2,
2
UNION ALL
SELECT 3,
3,
5
UNION ALL
SELECT 5,
5,
NULL ;
SELECT *
FROM #T
WHERE ([2008] = [2009]
AND [2009] = [2010])
OR ([2008] = [2009]
AND [2010] IS NULL)
OR ([2008] IS NULL
AND [2009] = [2010])
OR ([2008] = [2010]
AND [2009] IS NULL) ;
There are more complex ways to get this done, which may be better if you have more columns than just three.
I have to ask, why columns for years (that's what they look like)? Why not rows? That would make much more sense in most relational databases.
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 3, 2011 at 12:26 pm
GSquared (3/3/2011)
CREATE TABLE #T (
ID INT IDENTITY
PRIMARY KEY,
[2008] INT,
[2009] INT,
[2010] INT) ;
INSERT INTO #T
([2008],
[2009],
[2010])
SELECT 2,
2,
2
UNION ALL
SELECT 3,
3,
5
UNION ALL
SELECT 5,
5,
NULL ;
SELECT *
FROM #T
WHERE ([2008] = [2009]
AND [2009] = [2010])
OR ([2008] = [2009]
AND [2010] IS NULL)
OR ([2008] IS NULL
AND [2009] = [2010])
OR ([2008] = [2010]
AND [2009] IS NULL) ;
There are more complex ways to get this done, which may be better if you have more columns than just three.
I have to ask, why columns for years (that's what they look like)? Why not rows? That would make much more sense in most relational databases.
Thanks for your help, guys! There are actually 25 columns, not 3.
Let me explain exactly what I'm working with, so you will see why I need such a strange task. I have a list of companies. The timeline is 25 years. Some companies existed all of these 25 years, some were created at any point during these 25 years, some ceased to exist during these 25 years. So the nulls are in the columns where company didn't exist. What I need to find are those companies, that change their names during their existence, no matter what the longevity of that existence was. Does this make more sense now?
March 3, 2011 at 12:30 pm
Makes sense :-); how many rows are there in the table ?
March 3, 2011 at 12:31 pm
lee rite (3/3/2011)
Thanks for your help, guys! There are actually 25 columns, not 3.Let me explain exactly what I'm working with, so you will see why I need such a strange task. I have a list of companies. The timeline is 25 years. Some companies existed all of these 25 years, some were created at any point during these 25 years, some ceased to exist during these 25 years. So the nulls are in the columns where company didn't exist. What I need to find are those companies, that change their names during their existence, no matter what the longevity of that existence was. Does this make more sense now?
I would sincerely follow Gus's advice and look into normalizing this data. An UnPivot would work well here, and would allow for you to use more common conventions in locating this data.
Never stop learning, even if it hurts. Ego bruises are practically mandatory as you learn unless you've never risked enough to make a mistake.
For better assistance in answering your questions[/url] | Forum Netiquette
For index/tuning help, follow these directions.[/url] |Tally Tables[/url]
Twitter: @AnyWayDBA
March 3, 2011 at 12:32 pm
About 5,000....
March 3, 2011 at 12:38 pm
Craig Farrell (3/3/2011)
lee rite (3/3/2011)
Thanks for your help, guys! There are actually 25 columns, not 3.Let me explain exactly what I'm working with, so you will see why I need such a strange task. I have a list of companies. The timeline is 25 years. Some companies existed all of these 25 years, some were created at any point during these 25 years, some ceased to exist during these 25 years. So the nulls are in the columns where company didn't exist. What I need to find are those companies, that change their names during their existence, no matter what the longevity of that existence was. Does this make more sense now?
I would sincerely follow Gus's advice and look into normalizing this data. An UnPivot would work well here, and would allow for you to use more common conventions in locating this data.
I actually created that table myself from the original table, thinking that this will somehow help me to find the solution. This was the structure of the original table that consists all data:
GUID, YEAR, CompanyName
111 2002 Company1
111 2003 Company1
111 2004 NULL
222 2002 Company2
222 2003 Company2
222 2004 Company2
333 2002 Company3
333 2003 Company3
333 2004 Company34
and so on...
I couldnt think of other ways to deal with it. Maybe you guys have some ideas? Thanks!
March 3, 2011 at 12:42 pm
Try this:
SELECT ID
FROM
( SELECT * FROM #T ) PIVOT_TABLE
UNPIVOT
(
Vals FOR ColNames IN ([2008],[2009],[2010])
) PIVOT_HANDLE
GROUP BY ID
HAVING COUNT( DISTINCT PIVOT_HANDLE.Vals) = 1
March 3, 2011 at 12:48 pm
lee rite (3/3/2011)
Craig Farrell (3/3/2011)
lee rite (3/3/2011)
Thanks for your help, guys! There are actually 25 columns, not 3.Let me explain exactly what I'm working with, so you will see why I need such a strange task. I have a list of companies. The timeline is 25 years. Some companies existed all of these 25 years, some were created at any point during these 25 years, some ceased to exist during these 25 years. So the nulls are in the columns where company didn't exist. What I need to find are those companies, that change their names during their existence, no matter what the longevity of that existence was. Does this make more sense now?
I would sincerely follow Gus's advice and look into normalizing this data. An UnPivot would work well here, and would allow for you to use more common conventions in locating this data.
I actually created that table myself from the original table, thinking that this will somehow help me to find the solution. This was the structure of the original table that consists all data:
GUID, YEAR, CompanyName
111 2002 Company1
111 2003 Company1
111 2004 NULL
222 2002 Company2
222 2003 Company2
222 2004 Company2
333 2002 Company3
333 2003 Company3
333 2004 Company34
and so on...
I couldnt think of other ways to deal with it. Maybe you guys have some ideas? Thanks!
If this is your data, then
SELECT GUID
FROM #T
GROUP BY GUID
HAVING COUNT( DISTINCT CompanyName) = 1
will do...
March 3, 2011 at 12:48 pm
ColdCoffee (3/3/2011)
Try this:
SELECT ID
FROM
( SELECT * FROM #T ) PIVOT_TABLE
UNPIVOT
(
Vals FOR ColNames IN ([2008],[2009],[2010])
) PIVOT_HANDLE
GROUP BY ID
HAVING COUNT( DISTINCT PIVOT_HANDLE.Vals) = 1
Wow! This is way over my head! Impressive! But I get following error when trying to run this:
Incorrect syntax near 'UNPIVOT'. You may need to set the compatibility level of the current database to a higher value to enable this feature. See help for the stored procedure sp_dbcmptlevel.
I have SQL Server 2005.
March 3, 2011 at 12:48 pm
lee rite (3/3/2011)
Craig Farrell (3/3/2011)
lee rite (3/3/2011)
Thanks for your help, guys! There are actually 25 columns, not 3.Let me explain exactly what I'm working with, so you will see why I need such a strange task. I have a list of companies. The timeline is 25 years. Some companies existed all of these 25 years, some were created at any point during these 25 years, some ceased to exist during these 25 years. So the nulls are in the columns where company didn't exist. What I need to find are those companies, that change their names during their existence, no matter what the longevity of that existence was. Does this make more sense now?
I would sincerely follow Gus's advice and look into normalizing this data. An UnPivot would work well here, and would allow for you to use more common conventions in locating this data.
I actually created that table myself from the original table, thinking that this will somehow help me to find the solution. This was the structure of the original table that consists all data:
GUID, YEAR, CompanyName
111 2002 Company1
111 2003 Company1
111 2004 NULL
222 2002 Company2
222 2003 Company2
222 2004 Company2
333 2002 Company3
333 2003 Company3
333 2004 Company34
and so on...
I couldnt think of other ways to deal with it. Maybe you guys have some ideas? Thanks!
Is the "GUID" column some sort of identifier for the company, or is it something else? I.e.: Did Company3 become Company34, since both have "GUID" 333, in your example?
Or is there some other way of identifying which prior-year rows are tied to which later-year rows, other than just CompanyName?
- Gus "GSquared", RSVP, OODA, MAP, NMVP, FAQ, SAT, SQL, DNA, RNA, UOI, IOU, AM, PM, AD, BC, BCE, USA, UN, CF, ROFL, LOL, ETC
Property of The Thread
"Nobody knows the age of the human race, but everyone agrees it's old enough to know better." - Anon
March 3, 2011 at 12:53 pm
GSquared (3/3/2011)
lee rite (3/3/2011)
Craig Farrell (3/3/2011)
lee rite (3/3/2011)
Thanks for your help, guys! There are actually 25 columns, not 3.Let me explain exactly what I'm working with, so you will see why I need such a strange task. I have a list of companies. The timeline is 25 years. Some companies existed all of these 25 years, some were created at any point during these 25 years, some ceased to exist during these 25 years. So the nulls are in the columns where company didn't exist. What I need to find are those companies, that change their names during their existence, no matter what the longevity of that existence was. Does this make more sense now?
I would sincerely follow Gus's advice and look into normalizing this data. An UnPivot would work well here, and would allow for you to use more common conventions in locating this data.
I actually created that table myself from the original table, thinking that this will somehow help me to find the solution. This was the structure of the original table that consists all data:
GUID, YEAR, CompanyName
111 2002 Company1
111 2003 Company1
111 2004 NULL
222 2002 Company2
222 2003 Company2
222 2004 Company2
333 2002 Company3
333 2003 Company3
333 2004 Company34
and so on...
I couldnt think of other ways to deal with it. Maybe you guys have some ideas? Thanks!
Is the "GUID" column some sort of identifier for the company, or is it something else? I.e.: Did Company3 become Company34, since both have "GUID" 333, in your example?
Or is there some other way of identifying which prior-year rows are tied to which later-year rows, other than just CompanyName?
GUID is the unique identifier for the company which lets me know that Company3 and Company34 IS in fact the same company, that changed its name.
March 3, 2011 at 12:55 pm
SELECT GUID
FROM #T
GROUP BY GUID
HAVING COUNT( DISTINCT CompanyName) = 1
March 3, 2011 at 12:59 pm
ColdCoffee (3/3/2011)
SELECT GUID
FROM #T
GROUP BY GUID
HAVING COUNT( DISTINCT CompanyName) = 1
Thanks ColdCoffee! I think you nailed it. Your query selects all companies that never changed their name. I think I can work with that!
Thank you very much, guys!
Viewing 15 posts - 1 through 15 (of 29 total)
You must be logged in to reply to this topic. Login to reply