Duplicate records on a join.

  • Hi all, I'm having problems with the query below, this is a link to my previous post on the forum although this relates to a slightly different topic. http://www.sqlservercentral.com/Forums/Topic1557479-2799-1.aspx.

    I had a problem before of not been able to find the rows with 0 values. I've now managed this although it's brought up duplicate rows due to the discounts been different on the same Mfr_part_number. I tried using the max function on the isnull (Exhibit_Discount.Discount, 0.00) AS Discount instead but to no success. If anybody had any advice on what steps to take to resolve this I'd be very grateful, i think i maybe something to do with PK keys not been used in the set-up of the database.

    Use Sales_Builder

    Go

    SELECT DISTINCT

    GBPriceList.[Mfr_Part_Num],

    [Long_ description],

    [Short_ Description],

    --Exhibit_Unpivoted.Exhibit,

    WeightKGS,

    WarrantyCode,

    ProductLine,

    Serialized,

    isnull (ListPrice, 0) AS Listprice,

    Prod_Class,

    isnull (Exhibit_Discount.Discount, 0.00) AS Discount

    FROM GBPriceList,LongDescr,ShortDescr, Exhibit_Discount

    right Join Exhibit_Unpivoted

    on Exhibit_Discount.Exhibit = Exhibit_Unpivoted.Exhibit

    Where

    GBPriceList.[Mfr_Part_Num] = Exhibit_Unpivoted.Mfr_Part_Number

    AND

    GBPriceList.[Mfr_Part_Num] = LongDescr.[Mfr_ Part_ Num]

    AND

    GBPriceList.[Mfr_Part_Num] = ShortDescr.[Mfr_ Part_ Number]

  • Hi Grasshopper,

    this very much looks like a join problem, if you've looked in your data and found that there are multiple discounts for a single part you are on the right track. i would also question if the part numbers are returned multiple times in the Exhibit_unpivoted data. once there you need to find the most appropriate discount/Exhibit, that will depend entirely on your requirements. you have mentioned finding the max in the query, this wont resolve the relational issue in the join. however if you find the max discount per part/exhibit, slot that into a #temp table and join that to your existing query in place of the exhibit_discount table that might do the trick. obviously i cant see your data to test but ive had a go at the code below.

    I have also rearranged your query, the way you were joining your tables wasn't wrong but its best practice to join all your tables in the join section and not in the where, it also made it easier to read and troubleshoot.

    Use Sales_Builder

    Go

    select exhibit, max(discount) ---- will give you one discount per exhibit, always the biggest. if problem lays with the duplicate part numbers do same for Exhibit_Unpivoted

    into #temp

    from Exhibit_Discount

    group by exhibit

    SELECT DISTINCT

    GBPriceList.[Mfr_Part_Num],

    [Long_ description],

    [Short_ Description],

    --Exhibit_Unpivoted.Exhibit,

    WeightKGS,

    WarrantyCode,

    ProductLine,

    Serialized,

    isnull (ListPrice, 0) AS Listprice,

    Prod_Class,

    isnull (Exhibit_Discount.Discount, 0.00) AS Discount

    FROM #temp as t

    right Join Exhibit_Unpivoted

    on t.Exhibit = Exhibit_Unpivoted.Exhibit

    join GBPriceList

    on GBPriceList.[Mfr_Part_Num] = Exhibit_Unpivoted.Mfr_Part_Number

    join LongDescr

    on GBPriceList.[Mfr_Part_Num] = LongDescr.[Mfr_ Part_ Num]

    join ShortDescr

    on GBPriceList.[Mfr_Part_Num] = ShortDescr.[Mfr_ Part_ Number]

    drop table #temp

    Hope this helps some

  • Hi,

    Thanks for the reply, That does help quite a lot actually, I'll try implementing the changes today and see how the work out for the results. I've been looking into creating a temp table but was unsure how to really go about doing it. I'll reply later on after testing it :-D.

  • Hi again,

    I've looked into your query and it just doesn't seem to work with the max(discount) in the select part of the query for the #temp table. I looked further into what you said and there is multiple mfr part numbers in the exhibit_unpivoted data due to multiple exhibits. I'm a little unsure where to go from here with it to be honest, all i need is just one mfr_part_number to one discount which is the maximum for each product. I'll continue trying to use a temp table though.

  • looking at the code is the exhibit table used in the select? i could see one column sectioned out if thats not needed you might be best getting rid of the exhibit table all together, then change your temp table to look at the Exhibit_Unpivoted table to bring back one discount per part number and join on that.

    what error message are you getting with the MAX?

    J

  • I don't use the exhibit in the final query to get the data. It is still necessary for us to have our exhibits included though as the parts discount is based on the exhibit it has as the exhibit is referring to the exhibit discount cross checking the exhibits against one another to assign them the discount. Before there wasn't a join so it wasn't finding all the discounts but now the join does exist it's grabbing everything if i remove duplicate parts in excel i get the right amount of rows i'm expecting, around 390,000.

    The error message for the max was:

    Msg 1038, Level 15, State 5, Line 2

    An object or column name is missing or empty. For SELECT INTO statements, verify each column has a name. For other statements, look for empty alias names. Aliases defined as "" or [] are not allowed. Change the alias to a valid name.

    I figured it was due to the max in front of the discount though as it runs fine without it although it's been running for 20 minutes so far.

  • The error message is because i forgot to give the max column a name, each column in a temp table needs a name. replace it with "max(discount) as Discount " that should work.

    20 mins is way too long for a query bringing back that many rows. do you have access to a DBA? they might be able to look at the tables and assign appropriate index's to speed it up.

    if you can have a different discount for the same part at different exhibit, isnt it correct to supply the different discount rates and provide the duplicates in result set.

    from what I'm understanding the data is at a different grain than what you are trying to achieve.

    what is the spec for this query? what does the customer want to see?

    J

  • Unfortunately not, we're the one's who need to see it as the maximum discount for each product since we're purchasing the products. we want to see where we'll achieve the highest discount when purchasing each product. e.g single batch truck load etc

Viewing 8 posts - 1 through 7 (of 7 total)

You must be logged in to reply to this topic. Login to reply