Fill in Missing Data

  • Hello Everyone

    I hope that you all had a wonderful weekend.

    I am working with some data that is not so clean. I have some rows that are Very close to being a duplicate row. I need to take all the rows with the same SKU number and fill in the missing data for the Price, Cost or the SalePrice, what ever may be missing. The data is set like @DiamondsMissingData and I need the row to look like @DiamondsWithData table. I can remove the duplicate values once I get the missing data filled in. I am not sure how the best way to go about coding for this situation.

    DECLARE @DiamondsMissingData TABLE

    (

    SKU varchar(20)

    ,VendorNumber varchar(20)

    ,VendorSKU varchar(20)

    ,GradingReportType varchar(20)

    ,GradingReportNumber bigint

    ,Price money

    ,Cost money

    ,SalePrice money

    )

    DECLARE @DiamondsWithData TABLE

    (

    SKU varchar(20)

    ,VendorNumber varchar(20)

    ,VendorSKU varchar(20)

    ,GradingReportType varchar(20)

    ,GradingReportNumber bigint

    ,Price money

    ,Cost money

    ,SalePrice money

    )

    INSERT INTO @DiamondsMissingData

    SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,177940,NULL UNION ALL

    SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,NULL,27375500 UNION ALL

    SELECT '6157560941C','Crown','CR50135','GIA',6157560941,NULL,177940,27375500

    INSERT INTO @DiamondsWithData

    SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,177940,27375500

    SELECT * FROM @DiamondsMissingData

    SELECT * FROM @DiamondsWithData

    Thank You in advance for your assistance, time, and code sample.

    Andrew SQLDBA

  • It seems like you're wanting to copy the non-NULL values for any given SKU into the NULL values, correct? If so, this should do it:

    ;WITH NullFill(SKU,FillPrice,FillCost,FillSalePrice) AS(

    SELECT VendorSKU,MAX(Price)AS Price, MAX(Cost) AS Cost, MAX(SalePrice) AS SalePrice

    FROM @DiamondsMissingData

    GROUP BY VendorSKU

    )

    UPDATE @DiamondsMissingData

    SET Price = CASE WHEN Price IS NULL THEN FillPrice ELSE Price END,

    Cost = CASE WHEN Cost IS NULL THEN FillCost ELSE Cost END,

    SalePrice = CASE WHEN SalePrice IS NULL THEN FillSalePrice ELSE SalePrice END

    FROM @DiamondsMissingData A

    INNER JOIN NullFill B

    ON A.VendorSKU = B.SKU

    That should take the non-NULL value for each of the three potentially-NULL columns, and fill it in where the value is NULL, ignoring it if it's not NULL.

    However, this won't work if the same SKU can have multiple values for any of the three nullable columns; if there's a variation in price, for example, this will make all price values the same. In your provided data, this isn't a problem, but if it could happen in your real data, this will be problematic.

    If that's the case, a better solution can certainly be worked out; please provide an example of a case where that might happen, and adjustments will be made! 🙂

    - 😀

  • This will deduplicate the rows that have missing values. However, if the duplicate rows are the majority, I might use a different approach.

    INSERT INTO @DiamondsMissingData

    SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,177940,NULL UNION ALL

    SELECT '6157560941C','Crown','CR50135','GIA',6157560941,35588000,NULL,27375500 UNION ALL

    SELECT '6157560941C','Crown','CR50135','GIA',6157560941,NULL,177940,27375500

    INSERT INTO @DiamondsWithData

    SELECT SKU

    ,VendorNumber

    ,VendorSKU

    ,GradingReportType

    ,GradingReportNumber

    ,MAX( Price)

    ,MAX( Cost)

    ,MAX( SalePrice)

    FROM @DiamondsMissingData

    WHERE Price IS NULL

    OR Cost IS NULL

    OR SalePrice IS NULL

    GROUP BY SKU

    ,VendorNumber

    ,VendorSKU

    ,GradingReportType

    ,GradingReportNumber

    DELETE FROM @DiamondsMissingData

    WHERE Price IS NULL

    OR Cost IS NULL

    OR SalePrice IS NULL

    INSERT INTO @DiamondsMissingData

    SELECT *

    FROM @DiamondsWithData

    Luis C.
    General Disclaimer:
    Are you seriously taking the advice and code from someone from the internet without testing it? Do you at least understand it? Or can it easily kill your server?

    How to post data/code on a forum to get the best help: Option 1 / Option 2
  • Thank You to you both. That code worked very nicely. I had to change it just slightly, but you got me going down the correct path

    Thank you again

    Andrew SQLDBA

Viewing 4 posts - 1 through 3 (of 3 total)

You must be logged in to reply to this topic. Login to reply