December 5, 2008 at 4:34 am
Hi all,
I need all the help I can get for a SQL query I just can't figure out that's making my head spin for days...
Here comes an explanation, note that I have simplified the tables and fields names (and the number of them) to make it a little simpler.
I have 3 tables: PRODUCTS, CITIES and OTM (for one-to-many)
PRODUCTS contains prdID (numeric), prdName (varchar)
CITIES contains postcodeID (numeric), postcode (numeric), city (varchar), latitude (numeric), longitude (numeric)
OTM contains prdID (numeric), postcodeID (numeric)
I also have a SQL scalar function called getDistanceFromID that accepts 3 parameters: postcodeID, latitude, longitude.
What it does is query the CITIES table to retrieve the latitude and longitude of the city whose postcodeID is passed as parameter and calculate the distance between that city and the one whose latitude and longitude are passed as parameters.
To give you an idea, the following query
SELECT city, getDistanceFromID(555, latitude, longitude) AS [distance] FROM CITIES ORDER BY distance
would return all the cities contained in the CITIES table and the distance from the city whose postcodeID is 555, ordered by ascending distance.
This function works fine.
This is where my problem starts:
Each product can be purchased by a customer in different cities. That means that the OTM table contains one record for each product/city pair. If product X is available in 3 different cities, OTM will have 3 different records for that product, with the same prdID and a different postcodeID.
What I want is build a query that will retrieve ONLY one record for each product. This ain't a problem but the problem is that:
that record must return at least the followong fields: prdID, postcodeID, min_distance (where min_distance is the smallest distance between the customer's own city - which we know at all times - and the city the product in question can be found)
Believe me, I have tried lots of things.
- retrieve unique products (prdID) and the shortest distance -> no problem
- retrieve cities (postcodeID) and the shortest distance -> no problem
- retrieve unique products (prdID) AND the city it's in (postcodeID) AND the shortest distance -> [PULL HAIR]
I've tried all sorts of sub-queries, GROUP By clauses and aggregate function to no avail until now.
I hope I've been clear in the description of my problem and I really hope someone can help.
Thanks in advance,
Chris
December 5, 2008 at 5:15 am
Chris,
Do you mean this:
retrieve unique products (prdID) AND the city it's in (postcodeID) AND the shortest distance to a reference city
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 5, 2008 at 5:21 am
Chris Morris (12/5/2008)
Chris,Do you mean this:
retrieve unique products (prdID) AND the city it's in (postcodeID) AND the shortest distance to a reference city
Cheers
ChrisM
Hi ChrisM,
That's it exactly !!!
Chris
December 5, 2008 at 5:26 am
Ok lets confirm this:
Say your reference city is Boston, your requirement would be:
For each and every individual product (which has been sold), list the nearest city to Boston where it's been sold.
Is this correct?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 5, 2008 at 5:30 am
Chris Morris (12/5/2008)
Ok lets confirm this:Say your reference city is Boston, your requirement would be:
For each and every individual product (which has been sold), list the nearest city to Boston where it's been sold.
Is this correct?
It's correct.
Where the query is concerned, I actually need (at least) the prdID, the postcodeID AND the distance
Chris
December 5, 2008 at 5:34 am
chris (12/5/2008)
Chris Morris (12/5/2008)
Ok lets confirm this:Say your reference city is Boston, your requirement would be:
For each and every individual product (which has been sold), list the nearest city to Boston where it's been sold.
Is this correct?
It's correct.
Where the query is concerned, I actually need (at least) the prdID, the postcodeID AND the distance
Chris
Lovely stuff, thanks Chris. Last question - can you confirm that you're using SQL Server 2000 and not 2005?
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 5, 2008 at 5:35 am
To give you an idea, I tried this subquery to retrieve an individual product and the distance to the nearest location. It works great as is but as you can imagine, the GROUP BY clause prevents me from also retrieving the postcodeID that I need as well.
Note: the 555 value in the getDistance function is the customer's postcodeID; we know this value all along because it's stored in a cookie.
SELECT prdID, MIN(getDistance(555, POSTCODES.latitude, POSTCODES.longitude)) AS [Distance]
FROM OTM LEFT JOIN POSTCODES
ON OTM.postcodeID = POSTCODES.postcodeID
GROUP BY prdID
Chris
December 5, 2008 at 5:37 am
Chris Morris (12/5/2008)
Lovely stuff, thanks Chris. Last question - can you confirm that you're using SQL Server 2000 and not 2005?
Yes unfortunately.
I've heard of the OVER PARTITION option that's available in MSSQL 2005 but I do have to work with MSSQL 2000 for the time being :doze:
December 5, 2008 at 5:48 am
Yep your absolutely right, this type of query is far more elegant in 2005.
Here's a query which will work.
[font="Courier New"]DECLARE @Boston_prdID (numeric)
SET @Boston_prdID = 555
SELECT p.*, c.*
FROM PRODUCTS p
INNER JOIN OTM o ON o.prdID = o.prdID
INNER JOIN CITIES c ON c.postcodeID = o.postcodeID
INNER JOIN (
SELECT pd.prdID, MIN(dbo.getDistanceFromID(@Boston_prdID, cd.latitude, cd.longitude)) AS [DistanceToBoston]
FROM PRODUCTS pd
INNER JOIN OTM od ON od.prdID = od.prdID
INNER JOIN CITIES cd ON cd.postcodeID = od.postcodeID
GROUP BY pd.prdID
) d ON d.prdID = p.prdID AND d.[DistanceToBoston] = dbo.getDistanceFromID(@Boston_prdID, c.latitude, c.longitude)
[/font]
I hate it, hopefully someone will come up with something more elegant, but at least it will get you started.
Cheers
ChrisM
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 5, 2008 at 6:15 am
Chris Morris (12/5/2008)
Here's a query which will work.
[font="Courier New"]DECLARE @Boston_prdID (numeric)
SET @Boston_prdID = 555
SELECT p.*, c.*
FROM PRODUCTS p
INNER JOIN OTM o ON o.prdID = o.prdID
INNER JOIN CITIES c ON c.postcodeID = o.postcodeID
INNER JOIN (
SELECT pd.prdID, MIN(dbo.getDistanceFromID(@Boston_prdID, cd.latitude, cd.longitude)) AS [DistanceToBoston]
FROM PRODUCTS pd
INNER JOIN OTM od ON od.prdID = od.prdID
INNER JOIN CITIES cd ON cd.postcodeID = od.postcodeID
GROUP BY pd.prdID
) d ON d.prdID = p.prdID AND d.[DistanceToBoston] = dbo.getDistanceFromID(@Boston_prdID, c.latitude, c.longitude)
[/font]
Hi ChrisM,
I'm afraid there's a twist: I do indeed get unique products BUT they all map the same city, that of the first product returned :crying:
Chris
December 5, 2008 at 6:34 am
Hi Chris
There's an error in the JOIN, which is corrected here:
DECLARE @Boston_prdID (numeric)
SET @Boston_prdID = 555
SELECT p.*, c.*, dbo.getDistanceFromID(@Boston_prdID, c.latitude, c.longitude) AS [DistanceToBoston], d.DistanceToBoston AS [MinDistanceToBoston]
FROM PRODUCTS p
INNER JOIN OTM o ON o.prdID = p.prdID
INNER JOIN CITIES c ON c.postcodeID = o.postcodeID
LEFT JOIN (SELECT pd.prdID, MIN(dbo.getDistanceFromID(@Boston_prdID, cd.latitude, cd.longitude)) AS [DistanceToBoston]
FROM PRODUCTS pd
INNER JOIN OTM od ON od.prdID = pd.prdID
INNER JOIN CITIES cd ON cd.postcodeID = od.postcodeID
GROUP BY pd.prdID
) d ON d.prdID = p.prdID AND d.[DistanceToBoston] = dbo.getDistanceFromID(@Boston_prdID, c.latitude, c.longitude)
ORDER BY p.prdID, d.DistanceToBoston
I've used a LEFT JOIN here so you can see the closest city and other cities (rejected rows in the original) in the output.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 5, 2008 at 6:51 am
Chris Morris (12/5/2008)
I've used a LEFT JOIN here so you can see the closest city and other cities (rejected rows in the original) in the output.
Chris,
Just read the Best practices, thanks, though I'm not sure how I can make it easier for you to help me.
Please don't hesitate to tell me...
I'm afraid your second query has a problem as well.
Now I get results where all postcodeID's are used for all products 😉
I'll now try and post a script that will create test tables and their content to make it easier to understand and to test.
I really appreciate your help, Chris
Chris
December 5, 2008 at 6:58 am
I'm afraid your second query has a problem as well.
Now I get results where all postcodeID's are used for all products
That's the intention, sometimes a quick eyeball is all you need.
If you look at the two rightmost columns, one of them is always poulated, the other is only populated when that row has the minimum distance from city to reference city.
To restrict this query so that it provides the results you want, change the join to the derived table from a LEFT JOIN to an INNER JOIN like this:
DECLARE @Boston_prdID (numeric)
SET @Boston_prdID = 555
SELECT p.*, c.*, dbo.getDistanceFromID(@Boston_prdID, c.latitude, c.longitude) AS [DistanceToBoston], d.DistanceToBoston AS [MinDistanceToBoston]
FROM PRODUCTS p
INNER JOIN OTM o ON o.prdID = p.prdID
INNER JOIN CITIES c ON c.postcodeID = o.postcodeID
INNER JOIN (SELECT pd.prdID, MIN(dbo.getDistanceFromID(@Boston_prdID, cd.latitude, cd.longitude)) AS [DistanceToBoston]
FROM PRODUCTS pd
INNER JOIN OTM od ON od.prdID = pd.prdID
INNER JOIN CITIES cd ON cd.postcodeID = od.postcodeID
GROUP BY pd.prdID
) d ON d.prdID = p.prdID AND d.[DistanceToBoston] = dbo.getDistanceFromID(@Boston_prdID, c.latitude, c.longitude)
ORDER BY p.prdID, d.DistanceToBoston
Some sample data would be just grand!
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
December 5, 2008 at 7:30 am
Chris Morris (12/5/2008)
Some sample data would be just grand!
here you go 🙂
--creation of the getDistanceFromID scalar function
CREATE FUNCTION [dbo].[getDistanceFromID]
(
@localID numeric, @localX float, @localY float
)
RETURNS float
AS
BEGIN
DECLARE @i float
DECLARE @lX float, @lY float
SELECT @lX = lambertX, @lY = lambertY FROM CITIES WHERE postcodeID = @localID
SET @i = SQRT(((@lX - @localX)*(@lX - @localX)) + ((@lY - @localY)*(@lY - @localY)))
SET @i = ROUND(@i, 1)
RETURN @i
END
--creation of the PRODUCTS TABLE
DROP TABLE PRODUCTS
CREATE TABLE PRODUCTS(
[ID] [numeric](18, 0) IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
[prdID] [numeric](18, 0) NOT NULL,
[prdName] [varchar](50) NOT NULL
)
--creation of the CITIES TABLE
DROP TABLE CITIES
CREATE TABLE CITIES(
[ID] [numeric](18, 0) IDENTITY(1,1) NOT NULL PRIMARY KEY CLUSTERED,
[postcodeID] [numeric](18, 0) NOT NULL,
[postcode] [numeric](18, 0) NOT NULL,
[city] [varchar](50) NOT NULL,
[latitude] [numeric](18, 0) NOT NULL,
[longitude] [numeric](18, 0) NOT NULL
)
--creation of the OTM TABLE
DROP TABLE OTM
CREATE TABLE OTM(
[prdID] [numeric](18, 0) NOT NULL,
[postcodeID] [numeric](18, 0) NOT NULL
)
--populate PRODUCTS TABLE
INSERT INTO PRODUCTS VALUES(1, 'airplane')
INSERT INTO PRODUCTS VALUES(2, 'speedboat')
INSERT INTO PRODUCTS VALUES(3, 'bicycle')
--populate CITIES TABLE
INSERT INTO CITIES VALUES(1, 1040, 'Madrid', 159, 183)
INSERT INTO CITIES VALUES(2, 1800, 'Rome', 209, 159)
INSERT INTO CITIES VALUES(3, 4500, 'Madrid', 208, 133)
INSERT INTO CITIES VALUES(4, 6270, 'Firenze', 233, 88)
INSERT INTO CITIES VALUES(5, 9070, 'Brussels', 126, 181)
--populate OTM TABLE
INSERT INTO OTM VALUES(1, 2)
INSERT INTO OTM VALUES(1, 5)
INSERT INTO OTM VALUES(2, 1)
INSERT INTO OTM VALUES(3, 1)
INSERT INTO OTM VALUES(3, 3)
INSERT INTO OTM VALUES(3, 5)
December 5, 2008 at 7:47 am
SELECT @lX = lambertX, @lY = lambertY FROM CITIES WHERE postcodeID = @localID
The definition fior the cities table doesn't include lambertX and lambertY
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
Viewing 15 posts - 1 through 15 (of 19 total)
You must be logged in to reply to this topic. Login to reply