large self-join taking ages, anything I can do?

  • Ed B (3/14/2012)


    CraigIW (3/12/2012)


    Updt:

    Though it does seem to be a lot faster on my 32bit SQL 2008 development desktop than it is on the production server :/

    Did you say the production server was a VM?

    If so does that mean it cannot use parallelism?

    Yes it is. Running on a MS Hyper-V host.

    I don't know about the parallelism It has four cores visible within the VM. What restrictions are there if it's virtual?

    Craig

  • I don't know much about it, but at a previous job all the VMs were unable to use parallelism (at least that's how it seemed).

    I would look at the execution plans on both boxes and see if they are different.

    Another thought is can you run your matching logic against a distinct set of customer details? Assign a key to each one and maintain a record of edit distances. Then each day you might only have to compare the new customer details.

  • Eugene Elutin (3/12/2012)


    Looking into your UDF, it would definitely benefit from implementing it as CLR function.

    Also, you can try to split your query in two parts.

    First filtering out everything without checking string similarities, then apply your "fine" checks using your udf.

    This was the post that solved the problem for me. Bench tests of the CLR equivalent were 4x faster than the UDF. Splitting the query saved the rest of the time (probably 4x again).

    Overall, the query went from taking 20+ hours to 70-80 minutes. Great job Eugene!

    Craig

Viewing 3 posts - 16 through 17 (of 17 total)

You must be logged in to reply to this topic. Login to reply