Query really slow with eager spool

Question

Query really slow with eager spool

SSC-Addicted

Points: 470

January 10, 2007 at 8:00 am

Hi,

I have a query which become slower as the number of item in the temp table 
increase.

Here's my code:

CREATE table #regions (
 ipFloat float null,
 countryId int null,
)

insert into #regions (ipFloat) select distinct top 100 ipFloat from tbl_logs

-- This is where the code gets really slow!
update #regions set countryId = (select countryId from tbl_ip2countries 
where ipFloat between ipFrom and ipTo)

select * from #regions
drop table #regions

What it does is it takes some ip converted in a float number and look for 
its associated countryId in a table. The Ip2countries table has more than 2 
000 000 records. There's a clustered index on the primary key which is ipTo 
and ipFrom (both float).

If I take only 10 ip (select distinct top 10 ipFloat from tbl_logs), the 
query executes in less than 1 second.

If I take up to 100 ip, it takes more than 45 secondes. And in this case, I 
have a index spool (eager spool) that consumes 83%. By the way, in the 
execution plan, the arrow from the clustered index shows about 2 000 000 
actual number of rows. But the arrow after the eager spool shows an actual 
number of rows of more than 80 000 000! Is this normal? There's a filter 
after that reduces the number of rows to only 100...

If I take 1000 ip, it takes several minutes.

Is there a way to reduces the eager spool or to reduce the number of 80 000 
000 of actual rows? I don't understand what's going on here and why this is 
so slow for few hundreds of rows.

Any idea?

Thanks

Stephane

Viewing 15 posts - 1 through 15 (of 19 total)

You must be logged in to reply to this topic. Login to reply

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 1

Can you edit your post so that it's more readable... I'm not sure you'll get much attention from anyone with that question.

Stéphane SSC-Addicted Points: 470 More actions · Answer 2

Thanks! I didn't notice... I made a copy and paste from word?!

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 3

Ninja's_RGR'us

SSC Guru

Points: 294069

January 10, 2007 at 8:42 am

#682403

That's what I thaught. Thanx for the quick fix .

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 4

Can you show us the some sample data and the required output from the update query... I'm sure there's a way to avoid the spool altogether.

Stéphane SSC-Addicted Points: 470 More actions · Answer 5

I just did a new test using a cursor this time.

CREATE table #regions (

ipFloat float null,

countryId int null)

insert into #regions (ipFloat) select distinct top 500 ipFloat from tbl_logs

declare @i float, @id int

DECLARE cSE CURSOR READ_ONLY

FOR select ipFLoat from #regions

OPEN cSE

FETCH NEXT FROM cSE INTO @i

WHILE @@FETCH_STATUS = 0

BEGIN

select @id = countryId from tbl_ip2countries where @i between ipFrom and ipTo

update #regions set countryId = @id where ipFloat = @i

FETCH NEXT FROM cSE INTO @i

END

CLOSE cSE

DEALLOCATE cSE

select * from #regions

drop table #regions

The results are quite surprising! For 100 ips, it took only 6 seconds instead of minutes.

But I'm sure there's a way to get this done quickly whitout using a cursor. I just can't find it out.

Thanks for any idea,

Stephane

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 6

Stephane

Your syntax doesn't look quite right. I don't see how your update statement can work. How about this (not tested, obviously):

update r

set countryId = c.countryId

from #regions r join tbl_ip2countries c

on r.ipFloat = c.ipFloat

where r.ipFloat between c.ipFrom and c.ipTo

John

Stéphane SSC-Addicted Points: 470 More actions · Answer 7

The problem with that update is that I don't have the exact IP converted in float in tbl_ip2countries. I mean, I have only a range so Canada is assigned to 123.123.123.0 to 123.123.123.256. So I can't join on r.ipFloat = c.ipFloat.

Thanks!

Stéphane SSC-Addicted Points: 470 More actions · Answer 8

I have ip converted in float like this:

1362268423

3476946705

1179659265

1179726082

3234155789

3254497794

3489515534

3567080704

1390476300

1390490892

1390832915

1415938309

1456668932

3258127372

3569568532

3573210393

I have to check in a lookup table (containing more than 2M rows) set up like this:

ipFrom ipTo countryId

---------------------- ---------------------- -----------

0 33996343 95

33996344 33996351 219

33996352 50331647 95

50331648 50331903 220

50331904 50332159 220

50332160 50332671 220

50332672 50332927 220

50332928 50333695 220

50333696 50333951 220

50333952 50334719 220

So the #regions would look like this after:

ipFloat countryId

---------------------- -----------

1362268423 70

3476946705 38

1179659265 38

1179726082 38

3234155789 38

3254497794 70

3489515534 38

3567080704 199

1390476300 70

1390490892 70

thanks

Stephane

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 9

Stephane

I still don't understand. The sample data you gave shows ipFrom and ipTo having 8 digits, and ipFloat having 9. So how can ipFloat ever be between ipFrom and ipTo? Also, how do you convert an IP address into this format? For instance, to use the Canada example, what do 123.123.123.0 and 123.123.123.256 become? (By the way, the latter isn't a valid IP address. I assume this is a typo?)

John

Stéphane SSC-Addicted Points: 470 More actions · Answer 10

Hi,

Thanks for your fast answer!

The ip is converted this way: For example, If the IP address 161.132.13.1, then the IP number is 2709785857.

IP Number, X = 161 x (256*256*256) + 132 x (256*256) + 13 x (256) + 1 = 2709785857.

You can have more info here: http://www.ip2location.com/docs/IP2Location_IP_Country_Region_City_Specification.pdf

So it can have 8 or 9 digits.

Sorry for the 256, it's obviously 255!

Thanks

Stephane

John Mitchell-245523 SSC Guru Points: 148809 More actions · Answer 11

Stephane

OK, that clears up the IP conversion algorithm. I still don't think that your sample data from tbl_ip2countries and from #regions tally up with each other, but let's not worry about that for now.

Try this:

update r

set countryId = c.countryId

from #regions r join tbl_ip2countries c

on r.ipFloat between c.ipFrom and c.ipTo

Not sure how it would perform, but much better than a cursor, I should think. You could experiment with using >= and <= operators instead of the BETWEEN function to see if that goes any faster.

John

Stéphane SSC-Addicted Points: 470 More actions · Answer 12

hum...

That's really weird: For 10 ip, it takes 7-8 seconds. For 100, it takes about a minute which is slower than my problem at the beginning.

BUT!

If I try with 40 or 50 ip, it executes in a flash: 1 second!

The difference is in uses table spool (lazy spool) in the first case but only clustered index seek in the second case.

What the ...???

By the way, in the prod environment, it's going to be up to 1000 ip.

Stephane

Stéphane SSC-Addicted Points: 470 More actions · Answer 13

I really don't get it...

10-20-30 ip uses the lazy spool so it takes several seconds.

40-50-60 ip uses the clustered index seek so it executes in a flash.

70 and more also uses the lazy spool so it takes more than a minute.

Is there a way to force the use of the index instead of the costly table spool?

By the way, it's a clustered index SCAN with the table spool instead of a clustered index seek. What's the difference?

Thanks

Stephane

Ninja's_RGR'us SSC Guru Points: 294069 More actions · Answer 14

SCAN : reads all rows.

SEEK : binary search algorithm to find only the required row.

Could you reindex and / or update the statistics on the table.

If that doesn't help you could always use index and join hints (last resort).