May 19, 2011 at 11:04 am
Hi All,
I'm trying to eliminate duplicate pricing from a price export file we get. I upload it into a staging table and then select new values into the main data table. Both tables have the same schema, the relevant parts are:
Export Table:
Transmission_ID = varchar(20)
Supplier= varchar(40)
Terminal = varchar(40)
Branded = varchar(1)
Product = varchar(40)
Eff_Date = varchar(8)
Eff_Time - varchar(5)
Price = decimal(10,6)
Move = decimal(10,6)
The data looks like this:
'5B6Y9460AH0F','Supplier1','B','Vstl C','ULS #2','20110518','1800','3.225600','0.033000'
'5B5LJ560AH03','Supplier1','B','Vstl C','ULS 2DY','20110517','1800','3.196600','-0.010000'
'5B6Y9460AH0F','Supplier1',''Vstl C','ULS 2DY','20110518','1800','3.229600','0.033000'
'5B5LDQ60AH03','Supplier2','U','Albn M','M89 E10%','20110517','1800','2.972999','-0.042500'
'5B6K1260AH07','Supplier2','U','Albn M','M89 E10%','20110518','1100','3.003000','0.030000'
'5B6YGWP0AH05','Supplier2','U','Albn M','M89 E10%','20110518','1800','2.990000','-0.013000'
'5B6XWXP0AH05','Supplier2','U','Albn M','M89 E10%','20110518','1800','2.980000','-0.023000'
'5B5LDQ60AH03','Supplier2','U','Albn M','P93 E10%','20110517','1800','3.123000','-0.042500'
'5B6K1260AH07','Supplier2','U','Albn M','P93 E10%','20110518','1100','3.153000','0.030000'
'5B6XWXP0AH05','Supplier2','U','Albn M','P93 E10%','20110518','1800','3.129999','-0.023000'
'5B6YGWP0AH05','Supplier2','U','Albn M','P93 E10%','20110518','1800','3.140000','-0.013000'
The Primary key for the main table is (Supplier, Terminal, Branded, Product, Eff_Date, Eff_Time)
You can see in the data that Supplier2 sent a second price change for the same date and time that breaks the primary key. I'm trying to just select the latest price (Max (Transmission_ID) )when this happens. But the final query does not act like I expect. I think it's because Transmission_ID is only unique for a batch of prices that were appended to the file (50 or so at a time).
This code gives me the correct subset of prices (about 2983 of 3166):
SELECT MAX(Transmission_ID) AS Transmission_ID
,[Supplier]
,[Terminal]
,[Product]
,[Branded]
,[Eff_Date]
,[Eff_Time]
FROM [DTN_EXPORT]
GROUP BY [Supplier]
,[Terminal]
,[Product]
,[Branded]
,[Eff_Date]
,[Eff_Time]
Then I try to map in the missing fields.
In this version I tried selecting into a temp table instead of using a sub query, but the result was the same, it returns 25411 records, because of the "Batch" nature of the Transmission_ID
SELECT MAX(Transmission_ID) AS Transmission_ID
,Supplier
,Terminal
,Product
,Branded
,Eff_Date
,Eff_Time INTO #EXPORT_TEMP
FROM dbo.DTN_EXPORT
GROUP BY Supplier
,Terminal
,Product
,Branded
,Eff_Date
,Eff_Time
SELECT E.* From [DTN_EXPORT] E
Inner Join #EXPORT_TEMP E1
On E1.Transmission_ID = e.Transmission_ID
AND E1.Supplier = E.Supplier
And E1.Terminal = E.Terminal
And E1.Branded = E.Branded
And E1.Eff_Date = E.Eff_Date
And E1.Eff_Time = E.Eff_Time
If I put a "DISTINCT' in, it removes the duplicates because of the TRANS_ID field but leaves in the duplicate prices from Supplier2 (They have distinct TRANS_ID)
Not sure how to get the results i'm after.
Thanks
Brent
May 19, 2011 at 12:03 pm
brent 82845 (5/19/2011)The Primary key for the main table is (Supplier, Terminal, Branded, Product, Eff_Date, Eff_Time)
You can see in the data that Supplier2 sent a second price change for the same date and time that breaks the primary key.
I think your real problem is you are fighting a bad design. It totally makes sense that a supplier might send a second entry for a product because it was either entered incorrectly or somebody changed their mind about the price on xxx date. Also, you really should use datetime fields instead of two separate varchar fields.
If you can provide some ddl and some sample data (in the form of insert statements) I am certain that many people around here are willing and able to help.
_______________________________________________________________
Need help? Help us help you.
Read the article at http://www.sqlservercentral.com/articles/Best+Practices/61537/ for best practices on asking questions.
Need to split a string? Try Jeff Modens splitter http://www.sqlservercentral.com/articles/Tally+Table/72993/.
Cross Tabs and Pivots, Part 1 – Converting Rows to Columns - http://www.sqlservercentral.com/articles/T-SQL/63681/
Cross Tabs and Pivots, Part 2 - Dynamic Cross Tabs - http://www.sqlservercentral.com/articles/Crosstab/65048/
Understanding and Using APPLY (Part 1) - http://www.sqlservercentral.com/articles/APPLY/69953/
Understanding and Using APPLY (Part 2) - http://www.sqlservercentral.com/articles/APPLY/69954/
Viewing 2 posts - 1 through 2 (of 2 total)
You must be logged in to reply to this topic. Login to reply