SELECT DISTINCT returns random duplicate rows

  • Well, I have never seen anything like this one before! I have a stored procedure that accepts a number of comma delimited strings as parameters, (each string looks something like: 3,456,7,984,38,...) which I parse into integers and store in temp tables. I then link to those temp tables to query a larger table for values.

    The final select is a SELECT DISTINCT, which is in fact NOT selecting distinct values -- it is returning multiple rows randomly. I have run any number of tests, dumping the results into tables and comparing them, and they are never the same!

    The input parameters are the same, and the underlying tables and views are also static. If I run the sproc in rapid succession I never receive the same row count or the same values. Any thoughts? Many thanks. DLewis

  • Without some examples and possibly the offending code - it's probably going to be tough to see what the issue is. Can you post some specifics?

    Also - what build and version are you running?

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

  • DISTINCT applies to all columns in the SELECT clause.

    So it is possible your data really is DISTINCT but you do not realize it.

    Please post some sample DDL, DATA, and your query so we can give you better answers.

    Forum Etiquette: How to post data/code on a forum to get the best help

  • Something else is going on. All things being equal, a static set of data and a static set of code is going to return the same values. As others have pointed out, you'll need to post some structures and some sample data.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Hi Everyone: Thanks for the responses. I will post some example code, and what I think solved the problem in an hour or two.... I don't understand why the problem occurred, so I am hoping someone out there is a computer science type of person who understands 0's and 1's better than I do.

  • Hi Everyone:

    The following is my theory of what caused the problem:

    The issue seems to have been one field in the Products table that stores the case equivalent value of a package type (we are a brewery). It is defined as a float, and the values are either 1 (for most cases, so to speak), 1

    2.2222222222, 4.8, or 6.8888888889. The 4.8 value is not relevant for the problem query (it did not occur); the 2.2222 and 6.8889 are for 5 gallon kegs and 15.5 gallon 1/2 bbls. What was happening was that random rows would occur twice when I multiplied the case equivalent factor by the total 5 gallon or 15.5 gallon kegs sold for a particular brand. For example 27*2.22222222=59.99999999, but it would seem that sometimes the computer would return 60. I say sometimes, because when I ran the query in rapid succession into a table, and compared the tables to each other, different rows would be duplicated. Also, strangely enough, the values displayed in the results pane for the duplicate rows were the same. I am assuming that somewhere at some unseen level of precision the final digit was different. I say assuming, because when I changed the following two lines in the query:

    ,a.CE AS CE

    ,ISNULL((SELECT tt.Sales_CE FROM vwDepletions_Month tt WHERE t.DISTID=tt.DISTID AND t.PRODID=tt.PRODID AND tt.DATEID=@Dateid),0)

    to

    ,CONVERT(NUMERIC(4,2),a.CE) AS CE

    ,CONVERT(NUMERIC(12,2),ISNULL((SELECT tt.Sales_CE FROM vwDepletions_Month tt

    WHERE t.DISTID=tt.DISTID AND t.PRODID=tt.PRODID

    AND tt.DATEID=@Dateid),0))

    The issue disappeared! Explicitly coercing the data returned from a float to numeric eliminated the problem. For this reason I suspect it has something to do with the fact that computers do not store all numbers precisely as you enter them. I admit I don't know much about this, and I was hoping someone can shed some light on it. Attached is the text of the entire stored procedure. If anybody wants I can also post several result sets that have duplicate rows (in case you don't believe me!).

    thanks. David

  • It's not that computers don't store what you tell them to. In this case it's storing exactly what you told it to. A float is an approximate number. It's not precise, let alone exact. More detail from BOL:

    Approximate numeric data types do not store the exact values specified for many numbers; they store an extremely close approximation of the value. For many applications, the tiny difference between the specified value and the stored approximation is not noticeable. At times, though, the difference becomes noticeable. Because of the approximate nature of the float and real data types, do not use these data types when exact numeric behavior is required, such as in financial applications, in operations involving rounding, or in equality checks. Instead, use the integer, decimal, money, or smallmoney data types.

    Avoid using float or real columns in WHERE clause search conditions, especially the = and <> operators. It is best to limit float and real columns to > or < comparisons.

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • Good ole BOL. The answer usually is there, but only when I know what the problem is. Thanks for the explanation -- I will search and destroy all floats!

    D

  • Why? Try this and see why I ask... ask yourself "What is 1 divided by 3 times 3?" 😛

    DECLARE @a DECIMAL(3,0), @b-2 DECIMAL(18,0),@F FLOAT

    SET @a = 3

    SET @b-2 = 3

    SET @F = 3

    SELECT 1/@A*3.0, 1/@B*3.0,1/@F*3.0

    SELECT 1/@A*3, 1/@B*3,1/@F*3

    --Jeff Moden


    RBAR is pronounced "ree-bar" and is a "Modenism" for Row-By-Agonizing-Row.
    First step towards the paradigm shift of writing Set Based code:
    ________Stop thinking about what you want to do to a ROW... think, instead, of what you want to do to a COLUMN.

    Change is inevitable... Change for the better is not.


    Helpful Links:
    How to post code problems
    How to Post Performance Problems
    Create a Tally Function (fnTally)

  • Is that accuracy without precision or precision without accuracy?

    "The credit belongs to the man who is actually in the arena, whose face is marred by dust and sweat and blood"
    - Theodore Roosevelt

    Author of:
    SQL Server Execution Plans
    SQL Server Query Performance Tuning

  • OK, I tried it an see that the answer is not as obvious as I thought (...destroy all floats). But if there isn't a broad brush way to avoid the problem, what are the alternatives?

  • DavidL (10/30/2008)


    OK, I tried it an see that the answer is not as obvious as I thought (...destroy all floats). But if there isn't a broad brush way to avoid the problem, what are the alternatives?

    Unfortunately - the answer is something along the lines of "know your tools". Just be aware of the gotchas for each data type, and the potential rounding/uncertainty/scientific error each type might introduce, and pick the one that happens to be appropriate at the time.

    For example - knowing that a float is imprecise, you know that at very least the last digit of a float is "not to be trusted", so if you need a certain number of digts precision, be sure to store the result in something BIGGER than what you need, and truncate the extra digits out at the very last minute in whatever UI you are using to communicate the results.

    ----------------------------------------------------------------------------------
    Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply