Bypassing the keyword DISTINCT

  • Garadin (11/20/2008)


    Can you make this join 1 to 1 ? b.BRL_ID = d.BRL_ID

    Where did you read that DISTINCT had a high cost associated with it?

    Hi, Garadin,

    Nope, that would be to much luck,

    Although in the most situations (75-80%) there will only exists a 1 on 1 relation though for some other discplines that we have it will be 1 on many (many can be approx 1-25 here)

    Where i have read it.? dunno right away, somewxhere on the internet or some sql book, i read that much lately that i would be better of stop reading cause i begin to forget faster then i can read πŸ˜‰

    Wkr,

    Eddy

  • GilaMonster (11/20/2008)


    Sorts are expensive operations and distinct requires a sort, as do most other methods of removing duplicates from a resuuklt set.

    The best 'workaround' is to find out why there are duplicated (bad data, bad database design, bad query) and fix that. If you don't need distinct, don't use it.

    Gila,

    So what i have been reading or hearing about DISTINCT is idd correct, an expensive operation witch will be more visible with a huge amount of rows i guess.?

    Workaround as proposed:

    Bad data : would be possible considering I using the development database, will be testing in the testdatabase later there i have more access to real data but not much considering this is a new piece of the total application that will be run.

    Bad Database design : apart from the CAPS in table and field names (our developers are deaf @ 1 ear and they do not hear well with the other), i think our schema is pretty good,always room for improvements offcourse.

    Bad query: well thats the reason why i asked the question here at this forum πŸ˜€

    I could check with the group by but as you said that every kinda "sort" is expensive my guess would be that Group by falls under the same rules then, right.?

    What would leave me without a real workaround for the moment.

    Have another little question considering these same 2 tables also.

    I one would say,

    for each BASISREL ROW where BASISREL.ART_ID = 5 (example)

    for each BASISREL.BRL_ID found, check the DEELREL table for existence of a row with this DEELREL.BRL_ID and where DEELREL.ADR_ID_LAAPLAATS = 8956

    if not found : create a DEELREL row and put in the values of the BASISREL.BRL_ID and DEELREL.ADR_ID_LAADPLAATS

    otherwise continue the checkup with the next BASISREL.BRL_ID as found in the first resultset.

    I guess a tally table solution would make it possible to not use a cursor here

    But here im wondering whether it would be possible to handle all this in just one statement.?

    can it be done.?

    wkr,

    Eddy

  • eddy (11/21/2008)


    Have another little question considering these same 2 tables also.

    I one would say,

    for each BASISREL ROW where BASISREL.ART_ID = 5 (example)

    for each BASISREL.BRL_ID found, check the DEELREL table for existence of a row with this DEELREL.BRL_ID and where DEELREL.ADR_ID_LAAPLAATS = 8956

    if not found : create a DEELREL row and put in the values of the BASISREL.BRL_ID and DEELREL.ADR_ID_LAADPLAATS

    otherwise continue the checkup with the next BASISREL.BRL_ID as found in the first resultset.

    I guess a tally table solution would make it possible to not use a cursor here

    But here im wondering whether it would be possible to handle all this in just one statement.?

    can it be done.?

    wkr,

    Eddy

    Hi Eddy, this is a really useful exercise you've come up with here. Looking at this problem from a row-by-row point of view, it appears difficult. But SQL Server works with sets. From a set point of view it's simple: generate a result set from BASISREL which meets the criteria...

    SELECT b.BRL_ID, 8956 AS ADR_ID_LAAPLAATS

    FROM BASISREL b

    LEFT JOIN DEELREL d ON d.BRL_ID = b.BRL_ID AND d.ADR_ID_LAAPLAATS = 8956

    WHERE b.ART_ID = 5 AND d.BRL_ID IS NULL

    ...then use this as a source for an INSERT into DEELREL ...

    INSERT INTO DEELREL (BRL_ID, ADR_ID_LAADPLAATS)

    SELECT b.BRL_ID, 8956 AS ADR_ID_LAAPLAATS

    FROM BASISREL b

    LEFT JOIN DEELREL d ON d.BRL_ID = b.BRL_ID AND d.ADR_ID_LAAPLAATS = 8956

    WHERE b.ART_ID = 5 AND d.BRL_ID IS NULL

    The only caution here would be to ensure that you're not inserting the same BRL_ID more than once into DEELREL ...

    INSERT INTO DEELREL (BRL_ID, ADR_ID_LAADPLAATS)

    SELECT b.BRL_ID, 8956 AS ADR_ID_LAAPLAATS

    FROM BASISREL b

    LEFT JOIN DEELREL d ON d.BRL_ID = b.BRL_ID AND d.ADR_ID_LAAPLAATS = 8956

    WHERE b.ART_ID = 5 AND d.BRL_ID IS NULL

    GROUP BY b.BRL_ID

    And there you go. Sometimes it's actually easier to solve a problem when you look at it from a set-based perspective.

    Cheers

    ChrisM

    β€œWrite the query the simplest way. If through testing it becomes clear that the performance is inadequate, consider alternative query forms.” - Gail Shaw

    For fast, accurate and documented assistance in answering your questions, please read this article.
    Understanding and using APPLY, (I) and (II) Paul White
    Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden

  • Hi Eddy, this is a really useful exercise you've come up with here. Looking at this problem from a row-by-row point of view, it appears difficult. But SQL Server works with sets. From a set point of view it's simple: generate a result set from BASISREL which meets the criteria...

    SELECT b.BRL_ID, 8956 AS ADR_ID_LAAPLAATS

    FROM BASISREL b

    LEFT JOIN DEELREL d ON d.BRL_ID = b.BRL_ID AND d.ADR_ID_LAAPLAATS = 8956

    WHERE b.ART_ID = 5 AND d.BRL_ID IS NULL

    ...then use this as a source for an INSERT into DEELREL ...

    INSERT INTO DEELREL (BRL_ID, ADR_ID_LAADPLAATS)

    SELECT b.BRL_ID, 8956 AS ADR_ID_LAAPLAATS

    FROM BASISREL b

    LEFT JOIN DEELREL d ON d.BRL_ID = b.BRL_ID AND d.ADR_ID_LAAPLAATS = 8956

    WHERE b.ART_ID = 5 AND d.BRL_ID IS NULL

    The only caution here would be to ensure that you're not inserting the same BRL_ID more than once into DEELREL ...

    INSERT INTO DEELREL (BRL_ID, ADR_ID_LAADPLAATS)

    SELECT b.BRL_ID, 8956 AS ADR_ID_LAAPLAATS

    FROM BASISREL b

    LEFT JOIN DEELREL d ON d.BRL_ID = b.BRL_ID AND d.ADR_ID_LAAPLAATS = 8956

    WHERE b.ART_ID = 5 AND d.BRL_ID IS NULL

    GROUP BY b.BRL_ID

    And there you go. Sometimes it's actually easier to solve a problem when you look at it from a set-based perspective.

    Cheers

    ChrisM

    Hey Chris,

    That works like a charm,

    Was not hoping that it could be done but when i saw your post, i knew that with a little extention i could turn it in our full requirements, and gues what,

    It does what it have to do.

    Tnx a million times for your help especially for pointing out that idd looking at set based solutions can handle the tasks with much less code and much more speed.

    Have done a test where the routine has to go through approx. 1000 Basisrel and has to create several hundreds of DEELRELs,

    Took me a lot of time checking that there is no wrong data created in DEELREL but took sql server less then a second to complete his task

    Amazing! :w00t:

    Wkr,

    Eddy

Viewing 4 posts - 16 through 18 (of 18 total)

You must be logged in to reply to this topic. Login to reply