Sorting triggered by data type

Question

Sorting triggered by data type

Viewing 4 posts - 31 through 33 (of 33 total)

You must be logged in to reply to this topic. Login to reply

Paul White SSC Guru Points: 150467 More actions · Answer 1

L' Eomot Inversé (12/16/2011)
SQL Kiwi (12/13/2011)
I suppose we could also wonder about the relevance of the title, whether there is any guarantee the join will produce any rows or not, and what is meant by 'non-integer' data.
Surely it is not relevant whether the join will produce any rows? We know that the table containing the sometimes integer field has rows, and the optimiser may choose to do those conversions even in the case where the join produces no rows. Your post made me think maybe I misunderstood this so I tested it and found that my 2008 R2 installation hits the error when there is 1 row in the table we know is non-empty and no rows in the other table.

In which case, the answer would be 'it depends', or from a more quantum point of view, all answers would be correct at the same time until someone ran a test 🙂

For example this does not produce the error (sort after join):

DECLARE @cc TABLE (rCube INTEGER NULL, Value VARCHAR NULL)

DECLARE @C TABLE (Id INTEGER PRIMARY KEY, IsInsuranceLength BIT NULL)

INSERT @cc

(rCube, Value)

VALUES

(1, '1'),

(1, 'X')

SELECT

cc.[Value]

FROM @cc AS cc

JOIN @C AS c ON cc.rCube=c.Id

ORDER BY

(

CASE

WHEN c.IsInsuranceLength=1 THEN CONVERT(INTEGER, cc.Value)

ELSE cc.Value

END

)

...but remove the PRIMARY KEY constraint, and the Sort moves before the join, and we get an error.

Notice that when the Sort is after the join, the Compute Scalar below the join which defines the conversion to INTEGER has two rows flow through it, but the result that would cause the error is not actually computed until the first operator that needs it i.e. the Sort.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

TomThomson SSC Guru Points: 104773 More actions · Answer 2

SQL Kiwi (12/16/2011)
L' Eomot Inversé (12/16/2011)
SQL Kiwi (12/13/2011)
I suppose we could also wonder about the relevance of the title, whether there is any guarantee the join will produce any rows or not, and what is meant by 'non-integer' data.
Surely it is not relevant whether the join will produce any rows? We know that the table containing the sometimes integer field has rows, and the optimiser may choose to do those conversions even in the case where the join produces no rows. Your post made me think maybe I misunderstood this so I tested it and found that my 2008 R2 installation hits the error when there is 1 row in the table we know is non-empty and no rows in the other table.
In which case, the answer would be 'it depends', or from a more quantum point of view, all answers would be correct at the same time until someone ran a test 🙂
For example this does not produce the error (sort after join):
DECLARE @cc TABLE (rCube INTEGER NULL, Value VARCHAR NULL)
DECLARE @C TABLE (Id INTEGER PRIMARY KEY, IsInsuranceLength BIT NULL)
INSERT @cc
(rCube, Value)
VALUES
(1, '1'),
(1, 'X')
SELECT
cc.[Value]
FROM @cc AS cc
JOIN @C AS c ON cc.rCube=c.Id
ORDER BY
(
CASE
WHEN c.IsInsuranceLength=1 THEN CONVERT(INTEGER, cc.Value)
ELSE cc.Value
END
)
...but remove the PRIMARY KEY constraint, and the Sort moves before the join, and we get an error.
Notice that when the Sort is after the join, the Compute Scalar below the join which defines the conversion to INTEGER has two rows flow through it, but the result that would cause the error is not actually computed until the first operator that needs it i.e. the Sort.

Yes, any unique constraint on that column makes it "work" (including those that allow null); as does any unique constraint on the ID column in the other table. Leave out both uniqueness constraints, and it ceases to work. But surely this is an absolute disaster for the concept that T-SQL is a language with a denotational (whether declarative or not) semantics (so that it is in no sense at all a declarative language)? There may be an operational semantics: a context dependent one, or course, there's no imaginable way there's a context-independent one, and the context dependence isn't just a type declaration dependence (those are usually ignored), it appears to be persistent if types are discarded, which is fantastically horrid, in fact utterly unacceptable. I really want to call it insanity instead of strangeness, but the MS fans may slaughter me if I do that. Never mind, let me say it: "This is INSANITY"!!! Despite any MS-fan opposition, let me state clearly that this is yet another example of the MS lunacy of saying something like "We don't give a **** about the semantics of T-SQL, our optimiser rules, OK?".

Tom

Paul White SSC Guru Points: 150467 More actions · Answer 3

L' Eomot Inversé (12/16/2011)
Yes, any unique constraint on that column makes it "work" (including those that allow null); as does any unique constraint on the ID column in the other table. Leave out both uniqueness constraints, and it ceases to work. But surely this is an absolute disaster for the concept that T-SQL is a language with a denotational (whether declarative or not) semantics (so that it is in no sense at all a declarative language)? There may be an operational semantics: a context dependent one, or course, there's no imaginable way there's a context-independent one, and the context dependence isn't just a type declaration dependence (those are usually ignored), it appears to be persistent if types are discarded, which is fantastically horrid, in fact utterly unacceptable. I really want to call it insanity instead of strangeness, but the MS fans may slaughter me if I do that. Never mind, let me say it: "This is INSANITY"!!! Despite any MS-fan opposition, let me state clearly that this is yet another example of the MS lunacy of saying something like "We don't give a **** about the semantics of T-SQL, our optimiser rules, OK?".

This is a thorny issue for sure. I'm not an expert on the semantics of standard SQL (whatever that is) by any means, by purely from a logical processing point of view, I understand that ORDER BY appears much later than JOIN. Now, does that mean that an expression required by the ORDER BY should also be deferred to that stage, or can they be computed earlier? More to the point here, is the optimizer wrong for pushing the physical sort required to honour the logical ORDER BY below the join? That is what causes the error, after all. A literal execution of the written query text would seem to me never to fail if the join eliminates rows that cause the conversion to fail.

My own personal stance on the broader question is that sometimes I would very much like to replace SQL with a strict, strongly-typed programming language that would require me to explicitly specify every operation and sequence of events in a query execution. On the other hand, I can quite appreciate that the subtleties of the mostly-relational transformations applied by the optimizer are probably a bit much for the general user that just wants to write a relatively simple expression of their requirement, and just let SQL Server figure out a reasonable way to generate those results. In most cases, I think the various freedoms afforded to the optimizer probably pay off overall; but who can say?

The other pragmatic point is that we have what we have, and it isn't going to change very much in the respects we are discussing. I have a strong feeling that this is a debate with strong views on both sides, but ultimately it is a rather academic one.

Paul White
SQLPerformance.com
SQLkiwi blog
@SQL_Kiwi

Dineshbabu Hall of Fame Points: 3220 More actions · Answer 4

I got it correct, B'coz I have already faced this issue(all expressions should be implicitly convertable to higher priority datatype) many times and gone for dynamic sql for sorting the result dynamically.

Nice one..

--
Dineshbabu
Desire to learn new things..