Click here to monitor SSC
SQLServerCentral is supported by Red Gate Software Ltd.
 
Log in  ::  Register  ::  Not logged in
 
 
 
        
Home       Members    Calendar    Who's On


Add to briefcase «««2829303132»»

There Must Be 15 Ways To Lose Your Cursors… Part 2: Just Put It in a S Expand / Collapse
Author
Message
Posted Friday, July 23, 2010 12:48 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 8:51 AM
Points: 8,440, Visits: 10,153
UMG Developer (7/23/2010)
I don't understand, that is a basic SELECT statement with no quirky update or aggregation, so since you have one record in the table you will get one record returned. (Well if the syntax errors are fixed first.) If you have 2 records in the table you would get two records back with a single column that contained a value of 3. (Unless one of the records contains a NULL in either column in which case the corresponding row will return NULL for the value of the C column.)

That of copurse was precisely my point - we know what theoutput from that will be so that passage in BoL that says that I can't know that the result has a single column containing "3" is nonsense, so whoever wrote that passager clearly wasn't thinking staright about what is guaranteed despite the optimzer's reordering of operations. The worry is that this sloppy thinking may at some point impact development in MS, resulting in some nasty bugs.

Maybe I should have used an even simpler example, like this: if that BoL passage were correct we could not be sure that the result set from "select 1 as c" has the value 1 in the single field c of its single row.


Tom
Post #958180
Posted Friday, July 23, 2010 1:24 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 8:51 AM
Points: 8,440, Visits: 10,153
Mike C (7/22/2010)
Let's walk through this briefly and see what steps would be required to implement the original query to perform it in logical order steps:

select
left( 'abc', T1.v )
from
T1
where
T1.v >= 0
;


Now let's say we have 10 million rows in T1 and 5 million of them match the where clause (T1.v >= 0). Here's what we need to do:

1. Scan the 10 million row table (look at every single row) and compare to 0.
2. Dump the 5 million rows that match the WHERE clause into a temp table or other intermediate storage.
3. Scan the 5 million rows that match the WHERE clause and perform the projection against them (left( 'abc', T1.v )).

We just scanned 15 million rows and stored 5 million rows in temporary storage to fulfill this query. Now let's look at one way we can optimize this:

1. Scan the 10 million row table.
2. Execute the WHERE clause and the projection at the same time on each row since both clauses are using the same columns.


I notice that you refer to a projection followed by a transformation (or mapping - people use both terms) as a projection; that's potentially dangerous thinking when discussing optimisation, since transformations (mappings) can be very expensive, while projections are fairly cheap.

In your "optimized" version we have 10 million invocations of left instead of 5 million. Seems to me to be too many! Surely the optimizer can handle transformations at the same time as restrictions (or would "select sum(T1.v*5) from T1 where T1.v>=0" create a table in interim storage to store the restriction's result in, and then scan that table to compute the multiplication and do the sum - if so it's inferior to an optimizer I saw a fresh young graduate throw together in a few months back in the 80s)? If it can do that in a single scan without a temporary table for intermediate results why should it not do the example using left that way too?

What I expect the system to do is:
scan the 10 million row table, executing the projection and the restriction [the where clause] on each row (you can't execute the where clause without doing the projection, since it examines c, the only column in the projection), and executing the transformation and passing the transformed row to the output if (only if) the row is included by the where clause.

No extra scans, no temporary tables, and no unneccessary transformations. And no violation of the declarative semantics of relational algebra.

Anyway, I expect the semantics to be such that the result of the select is not modified by rows which do not "qualify for the select" ("qualify for the select" is Microsoft's terminology, not mine). Maybe that's because I expect to see at least lip service paid to relational algebra.

I can't make a similar argument for the "quirky update", and I would prefer to see SQL add some concepts from functional algebra with good clear semantics to the select and update statements, and then T-SQL could adopt those to replace the current quirky update . But I don't see it happening.

My worry about the passage I quoted from the BoL documentation is twofold: (a) it's wrong (outrageously wrong); (b) it may indicate a tendency in Microsdoft to believe that clever tricks in the optimiser justify overriding the semantics of SQL, and if so that's very dangerous.


Tom
Post #958221
Posted Saturday, July 24, 2010 1:52 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, October 10, 2014 2:06 PM
Points: 89, Visits: 482
ganotedp (7/20/2010)
"Business Logic"?!? That's a highly, um, "elastic" concept. I'm with Alex Papadimoulis on the topic:
http://thedailywtf.com/Articles/The-Mythical-Business-Layer.aspx



With due respect to Alex, it is not as elastic as you think. A good barometer for whether a given piece of logic belongs in the database is whether it can be written without cursors and without special CLR routines. If it cannot, the odds are high that it does not belong in the database. A database, specifically a OLTP database, is best at storing data, enforcing data integrity and serving data. It is not designed for complicated calculations. It is not made to be a reporting tool (as opposed to simply serving data to a reporting tool). Just as Excel can be misused as a database, a database can be misused as a repository for complicated calculations. It is a testament to T-SQL that the language is sufficiently powerful to be misused for these purposes.
Post #958424
Posted Saturday, July 24, 2010 2:59 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 8:51 AM
Points: 8,440, Visits: 10,153
Thomas-282729 (7/24/2010)
[quote]With due respect to Alex, it is not as elastic as you think. A good barometer for whether a given piece of logic belongs in the database is whether it can be written without cursors and without special CLR routines. If it cannot, the odds are high that it does not belong in the database. A database, specifically a OLTP database, is best at storing data, enforcing data integrity and serving data. It is not designed for complicated calculations. It is not made to be a reporting tool (as opposed to simply serving data to a reporting tool). Just as Excel can be misused as a database, a database can be misused as a repository for complicated calculations. It is a testament to T-SQL that the language is sufficiently powerful to be misused for these purposes.

The relational calculus with propositional calculus and transactional control, which is what a language supporting the relational model is supposed to be, is Turing Complete as soon as you add the "while <boolean expression> <compound statement>" construct; so it is sufficiently powerful to be used for anything that is computable. With its while loops SQL in general (and T-SQL in particular) although it only pays lip service to the idea of supporting relational algebra with predicate calculus, is indeed Turing complete. That isn't at all remarkable - there have been very very few programming languages which weren't and it isn't any sort of testament to T-SQL.

The term "business logic" is sufficiently vague in meaning that it can include many things. Any reasonable use of the term includes things which clearly are properties of the business data model - for example check constraints and uniqueness constraints. Forbidding the database to implement business logic would forbid database designers from normalising their schemata: after all progressing from 2NF to 3NF is the process of ensuring that all simple functional dependencies are enforced by the schema and its keys and constraints, and functional dependencies are clearly about business logic. So I have to disagree strongly with the idea that there should be no business logic in the database.

edit: first version didn't mention while loop - I was being a bit sloppy.


Tom
Post #958434
Posted Saturday, July 24, 2010 5:20 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, October 10, 2014 2:06 PM
Points: 89, Visits: 482
Tom.Thomson (7/24/2010)
Thomas-282729 (7/24/2010)
[quote]With due respect to Alex, it is not as elastic as you think. A good barometer for whether a given piece of logic belongs in the database is whether it can be written without cursors and without special CLR routines. If it cannot, the odds are high that it does not belong in the database. A database, specifically a OLTP database, is best at storing data, enforcing data integrity and serving data. It is not designed for complicated calculations. It is not made to be a reporting tool (as opposed to simply serving data to a reporting tool). Just as Excel can be misused as a database, a database can be misused as a repository for complicated calculations. It is a testament to T-SQL that the language is sufficiently powerful to be misused for these purposes.

The relational calculus with propositional calculus, which is what a language supporting the relational model is supposed to be is Turing Complete; so it is sufficiently powerful to be used for anything that is computable. SQL in general (and T-SQL in particular) although it only pays lip service to the idea of supporting relational algebra with predicate calculus, is indeed Turing complete. That isn't at all remarkable - there have been very very few programming languages which weren't and it isn't any sort of testament to T-SQL.

The term "business logic" is sufficiently vague in meaning that it can include many things. Any reasonable use of the term includes things which clearly are properties of the business data model - for example check constraints and uniqueness constraints. Forbidding the database to implement business logic would forbid database designers from normalising their schemata: after all progressing from 2NF to 3NF is the process of ensuring that all simple functional dependencies are enforced by the schema and its keys and constraints, and functional dependencies are clearly about business logic. So I have to disagree strongly with the idea that there should be no business logic in the database.


Much like Alex's article, I think you are confusing data integrity with business logic which are both needed for any database driven solution. Check constraints and uniqueness constraints relate to the integrity of the data. Rules such as "in march, sales from reps with a certain sale volume threshold and for products of certain types should have an interest rate discount based the users credit score accounting for a series of values some of which come from services" are difficult to implement and properly test in a database.

Yes, the term "business logic" is vague. DRI in and of itself is arguably a form of business logic. Each set of tools for supporting business logic have their strengths and weaknesses. Databases are great at set-based logic. They are poor at complex calculations or calculations which require iterative solutions. For example, it is very difficult to create multi-threaded solutions in a database. They are simply not designed for that. Further, databases are substantially more expensive to scale out than middle-tier libraries or services.

I never said that absolutely no business logic should be in the database. Instead, I suggested that if you are required to implement an iterative solution in the database using cursors, it does not belong in the database. The database is not the end-all-be-all repository for business logic. There are some forms of logic that simply do not belong in the database. Sometimes the answer for how to do something in T-SQL is not to do it T-SQL.
Post #958443
Posted Sunday, July 25, 2010 12:58 PM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:14 PM
Points: 37,484, Visits: 34,352
Mike C (7/23/2010)
I'm in total agreement with you 99.9% of the time on just about every topic Jeff, and you know that well. The fact that people spend a "lot of time" tweaking indexes to improve performance is not the same as spending time tweaking indexes to "change the content of your end result"; at least not in my mind. The two concepts to me seem disconnected.

I'm not particularly interested in convincing you that indexes (or lack of them) should not change the content of your results. Or number of processors, etc. If you believe the content of your output should be determined by your indexes, index hints, etc. then go for it.

At any rate, I respect your opinion and since you feel I'm "bad mouthing folks", I'm outta here and done with this thread.

Out here.

Mike C


Man... didn't mean for it to come to that especially since you're absolutely correct... we do agree on 99.9% of everything. Sorry, Mike... my fault for being a bit over-zealous here because of the really terrible bad-mouthing I've taken in the past on the subject. Didn't mean to snap at one of the good guys.


--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #958509
Posted Wednesday, August 11, 2010 10:45 AM


Grasshopper

GrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopperGrasshopper

Group: General Forum Members
Last Login: Friday, April 17, 2015 12:48 PM
Points: 17, Visits: 155
SQL Developers definitely think differently! In my mind's eye, it's more of a 3D approach, while RBAR is pretty "flat".

BTW, the need for set-based thought processes isn't peculiar to SQL Server. Tom Kyte has a "mantra" that, though tailored for Oracle, can be modified to apply SQL Server development. His mantra:

o You should do it in a single SQL statement if at all possible.
o If you cannot do it in a single SQL Statement, then do it in PL/SQL.
o If you cannot do it in PL/SQL, try a Java Stored Procedure.
o If you cannot do it in Java, do it in a C external procedure.
o If you cannot do it in a C external routine, you might want to seriously
think about why it is you need to do it!
Post #967605
Posted Saturday, August 14, 2010 11:18 AM


SSC-Dedicated

SSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-DedicatedSSC-Dedicated

Group: General Forum Members
Last Login: Today @ 2:14 PM
Points: 37,484, Visits: 34,352
Heh... Tom is definitely one of the good guys.

Part of his last statement in his mantra is actually part of the first statement in mine...

"you might want to seriously think about why it is you need to do it!"



--Jeff Moden
"RBAR is pronounced "ree-bar" and is a "Modenism" for "Row-By-Agonizing-Row".

First step towards the paradigm shift of writing Set Based code:
Stop thinking about what you want to do to a row... think, instead, of what you want to do to a column."

(play on words) "Just because you CAN do something in T-SQL, doesn't mean you SHOULDN'T." --22 Aug 2013

Helpful Links:
How to post code problems
How to post performance problems
Post #969405
Posted Tuesday, December 13, 2011 8:35 PM


SSCrazy Eights

SSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy EightsSSCrazy Eights

Group: General Forum Members
Last Login: Today @ 8:51 AM
Points: 8,440, Visits: 10,153
Coming back to this after much delay, because I came across it as a reult of someone's reference to RBarry's article and I think it's an important issue.

Thomas-282729 (7/24/2010)
Much like Alex's article, I think you are confusing data integrity with business logic which are both needed for any database driven solution. Check constraints and uniqueness constraints relate to the integrity of the data.

No, they are clearly business logic. If I have a business rule that says "any salesman can be selling exactly one product class" that generates a uniqueness constraint on the two columns SalesPersonID and ProcuctClassId in whatever table tells me which is the product class sold by each salesman. If my business rules allows each salesman to sell products belonging to several product classes, such a constraint can not be imposed - the business rule that the constraint represents does not exist, so the constraint must not exist. You are proposing an utterly false distinction here - any question of "data integrity" boils down to a question of what states of the data are permitted by the business rules; we impose constraints, and normalise, in order to ensure that states not permitted by the business rules are unreachable, and that is the sole meaning of the phrase "data integrity": states which shoule be unreachable states (whose unreachability is determined by the business rules) must not be reachable. So data integrity rules are a particular subset of the business rules, not something separate from business rules, and declaring that the database must not implement business rules amounts to declaring that you are not permitted, in the database, to ensure data integrity - which is, in my view, absolute claptrap.

Rules such as "in march, sales from reps with a certain sale volume threshold and for products of certain types should have an interest rate discount based the users credit score accounting for a series of values some of which come from services" are difficult to implement and properly test in a database.

So there are some business rules that should probably not be in the database; how did you achieve the amazing leap from there to "there are no business rules that should be in the database"?

I never said that absolutely no business logic should be in the database. Instead, I suggested that if you are required to implement an iterative solution in the database using cursors, it does not belong in the database.[ The database is not the end-all-be-all repository for business logic. There are some forms of logic that simply do not belong in the database. Sometimes the answer for how to do something in T-SQL is not to do it T-SQL.

That seems to be a long step away from what you said earlier. It's something I agree with - I'm not interested for example in handling something in T-SQL that would be more appropiately handled in Fortran. But I still regard any statement that the database is not the place for business logic as nonsense, since there is oftrn a large amount of business logic (even if you want to call it something else, like "data integrity logic", and pretend it isn't business logic despite the very concept of data integrity being meaningless unless it is defined as banning states forbidden by business logic) that absolutely must be implemented in the database if we are to have clean bug-free systems.


Tom
Post #1221308
Posted Tuesday, December 13, 2011 10:56 PM
SSC Journeyman

SSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC JourneymanSSC Journeyman

Group: General Forum Members
Last Login: Friday, October 10, 2014 2:06 PM
Points: 89, Visits: 482
Let's step back for a second. The point I have made multiple times is that if you have to resort to T-SQL cursors, that is logic that almost assuredly does not belong in the database. The declarative features in SQL Server are such that there shouldn't be situations (i.e., they should be extraordinarily rare)where you must resort to cursors and have that logic in the database.

The only reason you consider the term "business logic" vague is that you are using it mean everything and the kitchen sink. As far as I can tell, by your reckoning, any constraint of any sort is business logic. I do not consider check constraints to be "business logic" even if it is a rule dictated by the business. Perhaps another term is needed to differentiate rules used to ensure the correctness of the data as opposed to rules related to facilitating of a business process. I'm calling the former data integrity and the later "business logic". There is no absolute demarcation between the two, however, a good point of reference is that if you must resort to using iterative solutions within a set based system (the database), the odds are good it does not belong there.

Thus, if one differentiates forms of business logic, between logic for data correctness and logic for business processes, the yes, the later does not belong in the database IMO. My statements throughout are consistent unless one extrapolates the term "business logic" to mean any decision point anywhere in the system no matter how small in which case the term itself is meaningless and I doubt that is the case with the other posters.
Post #1221340
« Prev Topic | Next Topic »

Add to briefcase «««2829303132»»

Permissions Expand / Collapse