Recent PostsRecent Posts Popular TopicsPopular Topics
 Home Search Members Calendar Who's On

 O(n) , O(n log n), O(n^2) Rate Topic Display Mode Topic Options
Author
 Message
 Posted Monday, June 08, 2009 5:46 AM
 SSCrazy Group: General Forum Members Last Login: Wednesday, March 19, 2014 1:43 AM Points: 2,020, Visits: 2,515
 Hi All,I have recently read the below article.http://www.sqlservercentral.com/blogs/philfactor/archive/2009/02/16/be-careful-with-string-concatenations-in-sql-server-with-big-strings.aspxThey used some concepts which i couldn't understand.1) O(n) means ?2) O(n^2) means ?3) O(n log n) means ?4) What is mean naive linear string concatenation?http://stackoverflow.com/questions/859562/whats-the-best-way-to-recursively-reverse-a-string-in-javaRBarryYoung said: It has long been known that naive linear string concatenation is an O(n^2) operation (more specifically, it's triangular). That simple fact is one of the big reasons that .NET has the StringBuilder class (which uses the strings->array->mass_concatenate trick that peter alludes to).As it happens, I spent quite a lot of time last month investigating this problem and potential solutions and let me just say: it's tough in SQL. Here are the problems:1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem, it just partially alleviates it at the lower end because some percentage(k) of appends can be extensions instead of creating a new string. Effectively it changes the {(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} where "k" is that percentage. It's better, but it's still O(n^2).2) The "array & mass-concatenate" trick is not available to T-SQL, not because SQL doesn't have arrays (tables serve the same purpose), but because AFAIK, there is no function that can take a variable "collection" of strings (table, array, whatever) and produce an output string.3) The "pre-allocate and Stuff" trick popular with mutable strings is not workable in T-SQL because the STUFF() function in T_SQL is NOT like the function of the same name in some general purpose languages: the T-SQL STUFF() is an RHS (right-hand side) function and NOT an LHS (left-hand side) function. AFAIK, there is no function in SQL that can (physically) write into a pre-existing string.That all said, I did eventually find a way to do it. Not in the O(n) time (linear) achievable in most general purpose languages, but I could get it down to O(n*Log(n)) time which is still a huge improvement. Unfortunately, I do not have it together in a presentable form and it would take me some time to do so (a day or so), but I can get it to you if you want.February 18, 2009 12:32 PM RBarryyoung,1) What do you mean by "buffer-extension"?2) Can you explain about the formula you used? karthik
Post #730611
 Posted Monday, June 08, 2009 5:53 AM
 SSC-Forever Group: General Forum Members Last Login: Today @ 4:42 PM Points: 41,516, Visits: 34,431
 karthikeyan (6/8/2009)1) O(n) means ?2) O(n^2) means ?3) O(n log n) means ?Surely you covered the Big O notation for algorithmic (time) complexity at university? It's a fundamental of Comp Sci theory.Try thesehttp://en.wikipedia.org/wiki/Computational_complexity_theoryhttp://en.wikipedia.org/wiki/Big_O_notation Gail ShawMicrosoft Certified Master: SQL Server 2008, MVPSQL In The Wild: Discussions on DB performance with occasional diversions into recoverabilityWe walk in the dark places no others will enterWe stand on the bridge and no one may pass
Post #730616
 Posted Monday, June 08, 2009 7:07 AM
 SSCrazy Group: General Forum Members Last Login: Wednesday, March 19, 2014 1:43 AM Points: 2,020, Visits: 2,515
 Yes Gail...I have studied...But i don't remember the functionality...I have reviewed the above mentioed URL, Since there are lot of formula's are there,it is confusing...If you or anybody gave the short notes about it, I will review it once again with some understanding....instead of looking it blindly...Apart from that, I have some more questions....1) What is mean by pseudo cursor?"pre-allocate and Stuff" trick popular with mutable strings 2) what this trick will do?3) What RHS (right-hand side) function denotes exactly?I believe the way it operates is from right side....ex: substring() right?4) What LHS (Left-hand side) function denotes exactly?I believe the way it operates is from left side....ex: right() right?OK, since I couldn't get a copy of Moby Dick, I made my own test data. Here is your test rig Phil, with my test data and my technique, followed by your original approach. Needless to say, my method is nearly linear, but the simple approach grows geometrically:5) what are all the method available apart from linear? What is mean by linear method?how to know whether my code use linear method? aaah....lot of doubts..... karthik
Post #730657
 Posted Monday, June 08, 2009 8:35 AM
 SSCrazy Eights Group: General Forum Members Last Login: Friday, March 28, 2014 2:25 PM Points: 9,902, Visits: 9,479
 karthikeyan (6/8/2009)They used some concepts which i couldn't understand.1) O(n) means ?2) O(n^2) means ?3) O(n log n) means ?4) What is mean naive linear string concatenation?For (1-3), see the Wikipedia articles that Gail referenced. Big-O notation is simple once you understand it, but a little confusing to explain so I'm deferring to the experts.For (4), "naive linear string concatenation" is just the normal way of concatentating many strings together by appending each one onto the end of an accumulated output string:`--NOTE: This is an example. I dont really use While loops in SQL.Declare @i As int Declare @A As varchar(MAX)Select @i = 1, @A = ''WHILE @i <= (Select Count(*) From MyStrings)BEGIN Select @A = @A + Select B From MyStrings Where ID = @iEND`The problem with this approach is that you have to re-copy @A every time through the loop and @A is getting larger every time. So even if our strings are all 1 character, first we have to copy 1 character, then 2 characters, then 3, then 4, 5, 6, 7, ... all the way out to N where N is total number of strings we want to concatenate together. This totals to ((N^2) + N)/2 which is complexity Order N^2 (or just O(N^2)).RBarryYoung said: ...1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem, it just partially alleviates it at the lower end because some percentage(k) of appends can be extensions instead of creating a new string. Effectively it changes the {(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} where "k" is that percentage. It's better, but it's still O(n^2).RBarryyoung,1) What do you mean by "buffer-extension"?The "buffer-extension trick" is referring to a low-level (Assembler or C) trick that takes advantage of that fact that in most OS's strings often have large gaps of unallocated memory after them. Thus if you want to add the string B\$ onto the end of A\$, then normally you would have to allocate a new buffer whose length is >= Len(A\$) + Len(B\$), then copy all of A\$ inot it, then copy B\$ in after A\$ and then re-point A\$ to the new buff (and deallocate the old). Besides all of the memory allocation overhead, this is an O(n) operation where n = Len(A\$)+Len(B\$).However, if A\$ just happens to have an unused portion of memory after it whose size is >= Len(B\$) then, instead you can just copy B\$ into that unused memory after A\$ and extend A\$'s length value to inlcude the appended characters.2) Can you explain about the formula you used?Which formula? I use a lot of them. -- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoungProactive Performance Solutions, Inc. "Performance is our middle name."
Post #730713
 Posted Monday, June 08, 2009 10:25 AM
 SSC-Forever Group: General Forum Members Last Login: Today @ 4:42 PM Points: 41,516, Visits: 34,431
 karthikeyan (6/8/2009)Yes Gail...I have studied...But i don't remember the functionality...I have reviewed the above mentioed URL, Since there are lot of formula's are there,it is confusing...Ok, explain what you do know of Big O notation, since you said you did study it, and what exactly is confusing you and we'll try and fill in from there. 3) What RHS (right-hand side) function denotes exactly?In what context?4) What LHS (Left-hand side) function denotes exactly?In what context?OK, since I couldn't get a copy of Moby Dick, I made my own test data. Here is your test rig Phil, with my test data and my technique, followed by your original approach. Needless to say, my method is nearly linear, but the simple approach grows geometrically:5) what are all the method available apart from linear? What is mean by linear method?how to know whether my code use linear method? Related to Big O notation. An algorithm is linear, or has linear time complexity if the time to process a number of inputs grows linearly with the number of inputs. It would be described as O(n)There's lots of complexities other than linear. Logarithmic, geometric, polynomial, exponential, etc. Gail ShawMicrosoft Certified Master: SQL Server 2008, MVPSQL In The Wild: Discussions on DB performance with occasional diversions into recoverabilityWe walk in the dark places no others will enterWe stand on the bridge and no one may pass
Post #730797
 Posted Monday, June 08, 2009 2:13 PM
 SSCrazy Eights Group: General Forum Members Last Login: Friday, March 28, 2014 2:25 PM Points: 9,902, Visits: 9,479
 karthikeyan (6/8/2009)1) What is mean by pseudo cursor?Pusedocursor is a name that some people (including me) use for the SQL Server (and Sybase too, I believe) trick of being able to assign both to and from variables for every row of a resultset. The typical example is the simple string concatenation trick that we often use:`Declare @str varchar(MAX)Set @str = ''Select @str = @str + ',' + [name] From sys.columnsPrint @str`"pre-allocate and Stuff" trick popular with mutable strings 2) what this trick will do? We used to call this "string mapping" back in the old days. Instead of reallocating the output string with every iteration, you estimate the likely size of the output string and pre-allocate the output string with that much blank space and use a LenPtr to keep track of how much of this buffer you have used. Then for each string you "MAP" or "STUFF" the current source string into the next available space after LenPtr and then update LenPtr to the new end. This is linear time ( O(n) ) because you only have to touch each input character once.(Heh. I was re-reading this and just realized that I got my "Left" and "Right" reversed! Yikes!)4) What LHS (Left-hand side) RHS (Right-hand side) function denotes exactly?LeftRight-hand Functions appear leftright of the assignment operator ("="):` Set @str = UPPER('Some text.')`These are normal functions, and AFAIK, in T-SQL, all functions are LRHS.3) What RLHS (rightleft-hand side) function denotes exactly?RightLeft-hand Side functions appear on the rightleft-hand side and they usually do "special" things having to do with addressing the output property or variable. AFAIK, T-SQL does not have any, but in some languages, STUFF is a RLHS:` STUFF(@str, offset, len) = 'foo'`This example would overwrite the output string (@str) with the input string starting at 'offset' for 'len' characters. The difference between this and the RHS STUFF() function in T-SQL is that the LHS version does not return anything, it actually does write over the characters of the @str variable. -- RBarryYoung, (302)375-0451 blog: MovingSQL.com, Twitter: @RBarryYoungProactive Performance Solutions, Inc. "Performance is our middle name."
Post #730960
 Posted Tuesday, June 09, 2009 4:12 AM
 SSCrazy Group: General Forum Members Last Login: Wednesday, March 19, 2014 1:43 AM Points: 2,020, Visits: 2,515
 RBarryYoung (6/8/2009)karthikeyan (6/8/2009)They used some concepts which i couldn't understand.1) O(n) means ?2) O(n^2) means ?3) O(n log n) means ?4) What is mean naive linear string concatenation?For (1-3), see the Wikipedia articles that Gail referenced. Big-O notation is simple once you understand it, but a little confusing to explain so I'm deferring to the experts.For (4), "naive linear string concatenation" is just the normal way of concatentating many strings together by appending each one onto the end of an accumulated output string:`--NOTE: This is an example. I dont really use While loops in SQL.Declare @i As int Declare @A As varchar(MAX)Select @i = 1, @A = ''WHILE @i <= (Select Count(*) From MyStrings)BEGIN Select @A = @A + Select B From MyStrings Where ID = @iEND`The problem with this approach is that you have to re-copy @A every time through the loop and @A is getting larger every time. So even if our strings are all 1 character, first we have to copy 1 character, then 2 characters, then 3, then 4, 5, 6, 7, ... all the way out to N where N is total number of strings we want to concatenate together. This totals to ((N^2) + N)/2 which is complexity Order N^2 (or just O(N^2)).RBarryYoung said: ...1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem, it just partially alleviates it at the lower end because some percentage(k) of appends can be extensions instead of creating a new string. Effectively it changes the {(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} where "k" is that percentage. It's better, but it's still O(n^2).RBarryyoung,1) What do you mean by "buffer-extension"?The "buffer-extension trick" is referring to a low-level (Assembler or C) trick that takes advantage of that fact that in most OS's strings often have large gaps of unallocated memory after them. Thus if you want to add the string B\$ onto the end of A\$, then normally you would have to allocate a new buffer whose length is >= Len(A\$) + Len(B\$), then copy all of A\$ inot it, then copy B\$ in after A\$ and then re-point A\$ to the new buff (and deallocate the old). Besides all of the memory allocation overhead, this is an O(n) operation where n = Len(A\$)+Len(B\$).However, if A\$ just happens to have an unused portion of memory after it whose size is >= Len(B\$) then, instead you can just copy B\$ into that unused memory after A\$ and extend A\$'s length value to inlcude the appended characters.2) Can you explain about the formula you used?Which formula? I use a lot of them.RBarryyoung and Gail,Big O notation means the steps to complete a problem. Right?Still i have lot of questions.1) what do you mean by O(n^2) problem? Pls don't mistake me if i am asking this question again.still I am not getting this formula clearly.2) Big O notation means the steps to complete a problem. we denote it as O(n). But what is the difference between O(n) and O(n^2) ?This totals to ((N^2) + N)/2 which is complexity Order N^2 (or just O(N^2)).I believe if i understand the above one, i can idenitify the difference between them. But the above one is confusing me...say for example i have row like below12345i would like to append the above values as 12345soDeclare @str varchar(5)select @str = ''select @str = @str + no from mystringstep 1: '' + '1'step 2: '1' + '2'step 3: '12' + '3'step 4: '123' + '4'step 5: '1234' + '5'so if i apply ((N^2) + N)/2 formulaN = 5((5^2) + 5)/2 ( select ((5^2) + 5)/2 gave 6, but i thiink it is wrong as per mathematics. since ^ is considered as POWER in maths. Right ? )((5^2) + 5)/2 equivalent to select (Power(5,2) + 5)/2. Right?5^2 = 5*5 = 2525+5 = 30/2 = 15. Right?2) Can you explain about the formula you used?Which formula? I use a lot of them. {(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} where "k" is that percentage. It's better, but it's still O(n^2).Can you explain it? karthik
Post #731289
 Posted Tuesday, June 09, 2009 4:22 AM
 SSCrazy Group: General Forum Members Last Login: Wednesday, March 19, 2014 1:43 AM Points: 2,020, Visits: 2,515
 1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem, If it supports the "buffer-extension" trick,then why it doesn't solve O(n^2) problem. again , i need to know exactly about O(n^2).can you explain it with example? so that i can remember it through my career.Does this issue solved in sql2008?aaah...i am confusing myself....i think.... karthik
Post #731295
 Posted Tuesday, June 09, 2009 4:24 AM
 SSC-Forever Group: General Forum Members Last Login: Today @ 4:42 PM Points: 41,516, Visits: 34,431
 karthikeyan (6/9/2009)Big O notation means the steps to complete a problem. Right?No. Not at all.Big O notation describes the time complexity of an algorithm. That is describes how the time required to solve an algorithm will increase as the number of inputs (n) increases.A linear algorithm (O(n)) is one where the time required to solve the algorithm increases linearly with the number of inputs. So if 1000 inputs takes 1 minute to solve, 2000 inputs will take 2 minutes to solve and 5000 inputs will take 5 minutes to solve.A quadratic algorithm (O(n2) is one where the time required to solve the algorithm increases quadratically with the number of inputs. So if 1000 inputs takes 1 minute to solve, 2000 inputs will take 4 minutes to solve and 5000 inputs will take 25 minutes to solve. (roughly)I'm curious. Where and when did you do your MCA? Where and when did you do your undergrad Comp Sci degree? Gail ShawMicrosoft Certified Master: SQL Server 2008, MVPSQL In The Wild: Discussions on DB performance with occasional diversions into recoverabilityWe walk in the dark places no others will enterWe stand on the bridge and no one may pass
Post #731297
 Posted Tuesday, June 09, 2009 4:24 AM
 SSCrazy Group: General Forum Members Last Login: Wednesday, March 19, 2014 1:43 AM Points: 2,020, Visits: 2,515
 RBarryYoung (6/8/2009)karthikeyan (6/8/2009)1) What is mean by pseudo cursor?Pusedocursor is a name that some people (including me) use for the SQL Server (and Sybase too, I believe) trick of being able to assign both to and from variables for every row of a resultset. The typical example is the simple string concatenation trick that we often use:`Declare @str varchar(MAX)Set @str = ''Select @str = @str + ',' + [name] From sys.columnsPrint @str`"pre-allocate and Stuff" trick popular with mutable strings 2) what this trick will do? We used to call this "string mapping" back in the old days. Instead of reallocating the output string with every iteration, you estimate the likely size of the output string and pre-allocate the output string with that much blank space and use a LenPtr to keep track of how much of this buffer you have used. Then for each string you "MAP" or "STUFF" the current source string into the next available space after LenPtr and then update LenPtr to the new end. This is linear time ( O(n) ) because you only have to touch each input character once.4) What LHS (Left-hand side) function denotes exactly?Left-hand Functions appear left of the assignment operator ("="):` Set @str = UPPER('Some text.')`These are normal functions, and AFAIK, in T-SQL, all functions are LHS.3) What RHS (right-hand side) function denotes exactly?Right-hand Side functions appear on the right-hand side and they usually do "special" things having to do with addressing the output property or variable. AFAIK, T-SQL does not have any, but in some languages, STUFF is a RHS:` STUFF(@str, offset, len) = 'foo'`This example would overwrite the output string (@str) with the input string starting at 'offset' for 'len' characters. The difference between this and th LHS STUFF() function in T-SQL is that the RHS version does not return anything, it actually does write over the characters of the @str variable.So we should avoid pseudo cursor. Right? 1) It will be treated as internal cursor. i.e it will perform looping internally.But comparing to external cursor, it will be very fasy. Right?2) is it a good habit to use pseudo cursor in sql programming? if not, what is the workaround for this? karthik
Post #731298

 Permissions