May 12, 2020 at 7:20 pm
Michael L John wrote:Jeff Moden wrote:What percentage of the rows of each table are less than 12 months ago?
All of them. The system is only 9 months old.
Lynn, there are a lot of index changes for this database that are in various states of being deployed. There are 3 that will directly help this query, as well as quite a few other queries. These are in QA.
X, the re-write goes like this:
WITH CTE (Pull all the user ids who have logged in for last 12 months)
SELECT Client ID's WHERE Exists (select From CTE)
You're not getting the actual code, because I do not feel like obfuscating it, and the original discussion was not about how to tune a query, but more a vent about developers who don't think past their fingertips.
With the indexes, and the rewrite, there are 3 index seeks as opposed to scans, and in dev the execution time improvement went from ~40 seconds to less than 1 second.
Hey thats fine, but being accurate doesn't really need the exact code, I try to tell folks to act as if they're logging a service call and try to duplicate the issue. You'll notice that I'm not asking "what the code is for," because all I was interested in was why this snippet won't use indexes.
Hi Patrick. I don't know the exact code or tables, but I'd think 6 rows would qualify as a trivial plan. Just read the (likely 1) page and call it quits. If that's really the scenario, the QO should call that a "good enough" plan and skip the rest.
However, if Mike wrote this code, then never mind. Right, Mike? 😉
May 12, 2020 at 7:22 pm
Okay, so TableA and TableB in the derived table are those two source tables where the login info may exist. So why the distinct in the outer query when there are no duplicates because of the UNION in the derived table? I won't bother to throw out any ideas on the this as I am sure you may have some good ideas of your own.
Well in one trivial example I tried, the number of sorts are the same whether the "distinct" clause is used or not. I suspect it depends on the query complexity tho.
May 12, 2020 at 7:29 pm
I think the most obvious way to rewrite that query would be:
SELECT u.UserId
FROM UserTable u
WHERE EXISTS(SELECT *
from TableA a
where DateColumn > @StartDate
and a.UserId = u.UserId)
OR EXISTS(select *
from TableB b
where Success = 1
and DateColumn > @StartDate
and b.UserId = u.UserId)
which I'm sure would perform better.
May 12, 2020 at 7:30 pm
x wrote:Michael L John wrote:Jeff Moden wrote:What percentage of the rows of each table are less than 12 months ago?
All of them. The system is only 9 months old.
Lynn, there are a lot of index changes for this database that are in various states of being deployed. There are 3 that will directly help this query, as well as quite a few other queries. These are in QA.
X, the re-write goes like this:
WITH CTE (Pull all the user ids who have logged in for last 12 months)
SELECT Client ID's WHERE Exists (select From CTE)
You're not getting the actual code, because I do not feel like obfuscating it, and the original discussion was not about how to tune a query, but more a vent about developers who don't think past their fingertips.
With the indexes, and the rewrite, there are 3 index seeks as opposed to scans, and in dev the execution time improvement went from ~40 seconds to less than 1 second.
Hey thats fine, but being accurate doesn't really need the exact code, I try to tell folks to act as if they're logging a service call and try to duplicate the issue. You'll notice that I'm not asking "what the code is for," because all I was interested in was why this snippet won't use indexes.
Hi Patrick. I don't know the exact code or tables, but I'd think 6 rows would qualify as a trivial plan. Just read the (likely 1) page and call it quits. If that's really the scenario, the QO should call that a "good enough" plan and skip the rest.
However, if Mike wrote this code, then never mind. Right, Mike? 😉
I now suspect I have zero insight into the situation especially once Mike said he was "paraphrasing / exaggerating / venting" but thats what I get for poking my nose into someone elses thread LOL
However it was cool to see SQL ditch the extra sort when it sees a superfluous "distinct," so I learned something anyways!
May 12, 2020 at 7:33 pm
X, the re-write goes like this:
WITH CTE (Pull all the user ids who have logged in for last 12 months)
SELECT Client ID's WHERE Exists (select From CTE)
This would not work, it could not progress the service call because in your first post there was a union.
May 12, 2020 at 7:54 pm
X, the re-write goes like this:
WITH CTE (Pull all the user ids who have logged in for last 12 months)
SELECT Client ID's WHERE Exists (select From CTE)
This would not work, it could not progress the service call because in your first post there was a union.
He left out the details of the code in the CTE.
May 12, 2020 at 10:21 pm
I feel like I'm missing something. An inner join is necessarily an existence check, why not:
SELECT u.UserId
FROM UserTable u
JOIN TableA a ON a.UserId = u.UserId
JOIN TableB b ON b.UserId = u.UserId
WHERE a.DateColumn > @StartDate
AND b.Success = 1
AND b.DateColumn > @StartDate
-------------------------------------------------------------------------------------------------------------------------------------
Please follow Best Practices For Posting On Forums to receive quicker and higher quality responses
May 12, 2020 at 10:38 pm
I feel like I'm missing something. An inner join is necessarily an existence check, why not:
SELECT u.UserId
FROM UserTable u
JOIN TableA a ON a.UserId = u.UserId
JOIN TableB b ON b.UserId = u.UserId
WHERE a.DateColumn > @StartDate
AND b.Success = 1
AND b.DateColumn > @StartDate
more data would be consumed on average. "if exists" makes the determination after the first row fufills the predicate whereas just joining looks to read more from tablea and tableb than needed, also I get "cartesianish" vibes somehow.
also, you're requiring a match in both tablea and tableb but I'm a bit hazy whether that matches the requirements.
May 12, 2020 at 11:47 pm
I think the most obvious way to rewrite that query would be:
SELECT u.UserId
FROM UserTable u
WHERE EXISTS(SELECT *
from TableA a
where DateColumn > @StartDate
and a.UserId = u.UserId)
OR EXISTS(select *
from TableB b
where Success = 1
and DateColumn > @StartDate
and b.UserId = u.UserId)which I'm sure would perform better.
That was about equal to the CTE.
This runs once a month, and once the devs handed me the re-write, I re-wrote it as you did. The performance difference was negligible. I gave my blessing to their code, which was pretty much of a first.
Now, I have to make sure they don't write everything as a CTE!
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
May 12, 2020 at 11:54 pm
I feel like I'm missing something. An inner join is necessarily an existence check, why not:
SELECT u.UserId
FROM UserTable u
JOIN TableA a ON a.UserId = u.UserId
JOIN TableB b ON b.UserId = u.UserId
WHERE a.DateColumn > @StartDate
AND b.Success = 1
AND b.DateColumn > @StartDate
It's an either or situation for the two tables. The data can exist in one, the other, or both. This logic will eliminate rows
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
May 13, 2020 at 12:23 pm
jonathan.crawford wrote:I feel like I'm missing something. An inner join is necessarily an existence check, why not:
SELECT u.UserId
FROM UserTable u
JOIN TableA a ON a.UserId = u.UserId
JOIN TableB b ON b.UserId = u.UserId
WHERE a.DateColumn > @StartDate
AND b.Success = 1
AND b.DateColumn > @StartDatemore data would be consumed on average. "if exists" makes the determination after the first row fufills the predicate whereas just joining looks to read more from tablea and tableb than needed, also I get "cartesianish" vibes somehow.
also, you're requiring a match in both tablea and tableb but I'm a bit hazy whether that matches the requirements.
Agreed on both observations, especially the latter, which does make it so it must exist in both TableA and TableB, when it should be either.
--Jeff Moden
Change is inevitable... Change for the better is not.
May 13, 2020 at 2:36 pm
If this code was run many times a day, it may be worth spending more time on it. It is executed once a month.
But, it's not the code that can be improved by much. It's the implementation in the database schema and the actual business requirements that need to be looked at. That, plus teaching the developers WHY, is a far better use of my time.
The unfortunate thing is that this code, as well as a lot more like it, was developed by folks who were let go a few weeks ago. The development team lost half of their personnel. The folks left who are trying to make sense of these issues that are now appearing because the usage of the system has risen. These guys needed a "win", badly.
Is this the most efficient code? Maybe not. But their re-write is magnitudes better than the original. I beat these folks up daily. When I gave my blessing to it, it was a desperately needed positive in a sea of negatives that looks like it will not go away any time soon.
Michael L John
If you assassinate a DBA, would you pull a trigger?
To properly post on a forum:
http://www.sqlservercentral.com/articles/61537/
May 19, 2020 at 7:41 am
What's worse: a problem which starts spontaneously and for which nobody is able to track down the cause, or, the same problem spontaneously stopping over the weekend and nobody having changed anything?
How to post a question to get the most help http://www.sqlservercentral.com/articles/Best+Practices/61537
May 19, 2020 at 8:20 am
What's worse: a problem which starts spontaneously and for which nobody is able to track down the cause, or, the same problem spontaneously stopping over the weekend and nobody having changed anything?
We had a hardware problem which exhibited this pattern, not too long ago. The cause was aging SSD's and a failure of the server/drive diagnostics to see it.
For fast, accurate and documented assistance in answering your questions, please read this article.
Understanding and using APPLY, (I) and (II) Paul White
Hidden RBAR: Triangular Joins / The "Numbers" or "Tally" Table: What it is and how it replaces a loop Jeff Moden
May 19, 2020 at 5:28 pm
So yesterday we got a bit of good/bad news...
Good, in that the wife has been called back to work and goes back on Tuesday.
Bad, in that the wife got called back to work and will be making LESS per week than she was getting from unemployment, with the extra $600/wk that the gov tacked on.
But frankly, I'm glad she's going back and so is she, in part because having a 40-hr/wk job is a much less stressful source of income than unemployment (especially when you add in the "will she FIND a job when the lockdowns end or when the unemployment runs out?")
Viewing 15 posts - 64,876 through 64,890 (of 66,815 total)
You must be logged in to reply to this topic. Login to reply