bug report: select distinct isn't quite always

Question

Post reply

bug report: select distinct isn't quite always

mountcrumpit

SSChasing Mays

Points: 603
More actions
March 2, 2015 at 7:59 pm

#315873

This is an intermittent problem. it doesn't happen every time the particular ASP script runs.
It appears that the adodb recordset that's processed by the "while !EOF" loop for some reason, sometimes,
decides identical data is _not_ distinct and will duplicate entries -- not the entire file, but somewhere along the way (this last time at "HODA" rec#632)
it then continues to the last entry (total recs ~2300)
Here is an example of the governee recordset select:
SELECT distinct Claims.*, Clients.[Last] AS Last, Payors.Allowed
FROM Claims
INNER JOIN Clients ON Claims.ClientID = Clients.ClientID
INNER JOIN Payors ON Payors.TransactionID = Claims.TransactionID
INNER JOIN MCAProcs ON mcaProcs.code=Claims.ServiceID
where Payors.CarrierCode='CLIENT' and Claims.Status='SEEN'
and MCAProcs.noninvoiceable=0 and Clients.ClientCompletelyTermed=0
and cast(Claims.servicedate as datetime) >= cast('1/1/2014' as datetime)
order by Clients.last,Claims.TransactionID
All the tables are essentially static at run time. updates may be occurring but not anything that'd explain a massive burp like this.
I know sql2k is, well, last century. Does anyone know if this was ever an issue addressed by anyone? any service pack? Was it a type of issue ever acknowledged?
I'm as much curious as frustrated. thanks.

Viewing 12 posts - 1 through 11 (of 11 total)

You must be logged in to reply to this topic. Login to reply

spaghettidba SSC Guru Points: 105732 More actions · Answer 1

Hmmm, looks quite unlikely.

Are you sure you're getting suplicate values? How did you determine that?

-- Gianluca Sartori

mountcrumpit SSChasing Mays Points: 603 More actions · Answer 2

Hi,

I agree. I have nothing concrete on it being the culprit. Its a best analysis opinion after reviewing the script. I'm setting up the script to take that resulting recordset and preserve a timestamped copy of it in the database. Next time the user reports dups, i'll be able to review the data and confirm.

Michael L John One Orange Chip Points: 27038 More actions · Answer 3

What fields are you using to determine that there is a duplicate?

The query is using Claims.*. Are all of these fields needed?

Could another field, like a datetime, be causing the dupes?

Is you data like this:

Name User Created_Time

Value1 Mike 01/01/2015 00:00:00

Value1 Mike 01/01/2015 10:00:00

Change your query to this and see if you get dupes.

SELECT COUNT(*),

Claims.*, <-- Expand this to the actual fields, not *

Clients.[Last] AS Last,

Payors.Allowed

FROM Claims

INNER JOIN Clients ON Claims.ClientID = Clients.ClientID

INNER JOIN Payors ON Payors.TransactionID = Claims.TransactionID

INNER JOIN MCAProcs ON mcaProcs.code=Claims.ServiceID

where Payors.CarrierCode='CLIENT' and Claims.Status='SEEN'

and MCAProcs.noninvoiceable=0 and Clients.ClientCompletelyTermed=0

and cast(Claims.servicedate as datetime) >= cast('1/1/2014' as datetime)

GROUP BY Claims.*, <-- Expand this to the actual fields, not *

Clients.[Last] AS Last,

Payors.Allowed

HAVING COUNT(*) > 1

Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/

Alvin Ramard SSC-Forever Points: 41190 More actions · Answer 4

Have you looked at the possibility that some of the data might contain unprintable characters, like CR, LF, TAB, etc.

These can make the data look identical to the human eye, but not to a computer.

Alvin Ramard
Memphis PASS Chapter[/url]

All my SSC forum answers come with a money back guarantee. If you didn't like the answer then I'll gladly refund what you paid for it.

For best practices on asking questions, please read the following article: Forum Etiquette: How to post data/code on a forum to get the best help[/url]

mountcrumpit SSChasing Mays Points: 603 More actions · Answer 5

Ok, this is something I haven't done before.

Since the problem is intermittent, I can't run the ADODB recordset selection stmt twice. One set may or may not contain the dups.

I need to _copy_ the recordset as it was selected

I could loop through the recordset and insert each set entry one by one into the "backupTable"

but that seems a bit time consuming.

Can something like this be done?

' first get the recordset as usual'

Set rstClientDB = Server.CreateObject("ADODB.Recordset")

rstClientDB.CursorLocation = 3

rstClientDB.open "select * from clients" ,cnn, adOpenKeyset, adLockOptimistic

' before processing copy it back into the database under an assumed name to preserve it for analysis later as needed.'

rstCopyofClientDB.Cursorlocation = 3

rstCopyofClientDB.open "select * from rstClientDB into backupTable"

I thought maybe a filescripting object (ala fso.copyfile(source, to)) but that's not quite right either.

thanks, B

spaghettidba SSC Guru Points: 105732 More actions · Answer 6

If you want a static copy of the data, use adOpenStatic when opening the recordset.

Your issues might come from modifications done by concurrent sessions.

-- Gianluca Sartori

Matt Miller (4) SSC Guru Points: 124210 More actions · Answer 7

Is there any chance you might be running up against the halloween problem?

http://en.wikipedia.org/wiki/Halloween_Problem

----------------------------------------------------------------------------------
Your lack of planning does not constitute an emergency on my part...unless you're my manager...or a director and above...or a really loud-spoken end-user..All right - what was my emergency again?

mountcrumpit SSChasing Mays Points: 603 More actions · Answer 8

thanks folks for the ideas. Remember the issue is intermittent. I've run the script 3 or 4 times since yesterday and no dups. but i have bak from when the user reported it. that clearly shows the output, if the recordset has dups.

I did take to heart the idea of eliminating the claims.* , that is a pretty heavy table and only 5 columns are needed from it.

Much as I'd like to see a concrete solution (or at least a controlled demonstration of dup'ing) I'm hoping that a little shake-up of the recordset grab

the process is pretty straight forward.

grab the charges against a client (the suspected recordset)

loop via movenext

1)each iteration inserts the charge under consideration into another db table

2)each iteration creates a recordset of assigned credits offsetting the charge for the specific charge, these are also inserted into another db table

3)each iteration creates a recordset of unassigned credits for the client. these are also inserted into another db table

the insertion of unassigned credits occurs only once and is prevented from occuring more than once by a boolean script var

1) and 2) are the insertions that show duplication, 3) isn't dup'd because of the set boolean

(its accounting, not linear algebra;-) )

mountcrumpit SSChasing Mays Points: 603 More actions · Answer 9

I suspect, if its not a bug (which seems all but assured), then its got something to do with db (updates), but there ar no updates occurring in this script.

I will take the suggestion to adOpenStatic. just as another "shake up"

sstalker 94512 SSC Rookie Points: 35 More actions · Answer 10

Is this join correct?

INNER JOIN MCAProcs ON mcaProcs.code=Claims.ServiceID

EdVassie SSC Guru Points: 60445 More actions · Answer 11

claims.* , that is a pretty heavy table and only 5 columns are needed from it

This may have the answer to your problem. The SELECT DISTINCT will work on all of the columns in the table, but you are only showing 5 of these to the end user. It is therefore possible that if two records from Claims differ only by the value in a non-displayed column, then the data displayed for the columns shown will be identical.

The problem is not with the database, it is doing exactly what you asked it to do. The problem is with the query, you are not asking it to do what you want.

Original author: https://github.com/SQL-FineBuild/Common/wiki/ 1-click install and best practice configuration of SQL Server 2019, 2017 2016, 2014, 2012, 2008 R2, 2008 and 2005.

When I give food to the poor they call me a saint. When I ask why they are poor they call me a communist - Archbishop Hélder Câmara