SQLServerCentral » SQL Server 7,2000 » T-SQL » O(n) , O(n log n), O(n^2)InstantForum 2016-2 FinalSQLServerCentralhttps://www.sqlservercentral.com/Forums/SQLServerCentralFri, 21 Jul 2017 10:48:21 GMT20O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost730611.aspxHi All,
I have recently read the below article.
http://www.sqlservercentral.com/blogs/philfactor/archive/2009/02/16/be-careful-with-string-concatenations-in-sql-server-with-big-strings.aspx
They used some concepts which i couldn't understand.
1) O(n) means ?
2) O(n^2) means ?
3) O(n log n) means ?
4) What is mean naive linear string concatenation?
http://stackoverflow.com/questions/859562/whats-the-best-way-to-recursively-reverse-a-string-in-java
[quote]RBarryYoung said:
It has long been known that naive linear string concatenation is an O(n^2) operation (more specifically, it's triangular). That simple fact is one of the big reasons that .NET has the StringBuilder class (which uses the strings->array->mass_concatenate trick that peter alludes to).
As it happens, I spent quite a lot of time last month investigating this problem and potential solutions and let me just say: it's tough in SQL. Here are the problems:
1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem, it just partially alleviates it at the lower end because some percentage(k) of appends can be extensions instead of creating a new string. Effectively it changes the [b]{(n)*(n+1)/2}[/b] cost of the naive implementation to [b]{n +(n)*(n-1)/(2*k)} [/b]where "k" is that percentage. It's better, but it's still O(n^2).
2) The "array & mass-concatenate" trick is not available to T-SQL, not because SQL doesn't have arrays (tables serve the same purpose), but because AFAIK, there is no function that can take a variable "collection" of strings (table, array, whatever) and produce an output string.
3) The "pre-allocate and Stuff" trick popular with mutable strings is not workable in T-SQL because the STUFF() function in T_SQL is NOT like the function of the same name in some general purpose languages: the T-SQL STUFF() is an RHS (right-hand side) function and NOT an LHS (left-hand side) function. AFAIK, there is no function in SQL that can (physically) write into a pre-existing string.
That all said, I did eventually find a way to do it. Not in the O(n) time (linear) achievable in most general purpose languages, but I could get it down to O(n*Log(n)) time which is still a huge improvement. Unfortunately, I do not have it together in a presentable form and it would take me some time to do so (a day or so), but I can get it to you if you want.
February 18, 2009 12:32 PM
[/quote]
RBarryyoung,
1) What do you mean by "buffer-extension"?
2) Can you explain about the formula you used?
Sat, 16 Mar 2013 12:18:09 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost1431933.aspx[quote][b]karthik M (3/16/2013)[/b][hr]How to identify whether a code fall under O(n) / O (n2) / O(n log n) by seeing the code ?[/quote]
[url]http://en.wikipedia.org/wiki/Analysis_of_algorithms[/url]
Should get you started. The list of references and further reading too.Sat, 16 Mar 2013 12:18:09 GMTGilaMonsterRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost1431914.aspx[quote][b]karthik M (3/16/2013)[/b][hr]Hello All,
I just refreshed all my posts.
This is a very old post (asked on 2009) :)
How to identify whether a code fall under O(n) / O (n2) / O(n log n) by seeing the code ?[/quote]
Most of us don't try and analyze our code to determine if it is O(n), O(n^2), or O(n log n). We test our code against expect data loads, and then unexpected loads (the million row test) and tune our code appropriately to achieve a scalable solution.
Sat, 16 Mar 2013 10:43:44 GMTLynn PettisRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost1431907.aspxHello All,
I just refreshed all my posts.
This is a very old post (asked on 2009) :)
How to identify whether a code fall under O(n) / O (n2) / O(n log n) by seeing the code ?Sat, 16 Mar 2013 09:59:54 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost1366200.aspx[quote][b]karthikeyan-444867 (9/28/2012)[/b][hr]Experts,
is there any solution available for "Mass concatenate" in SQL 2008 OR SQL 2012?
[/quote]
People have already referred to teh "for xml" method, which is O(n) so for larg enough n withh be much better than the other methods discussed (which are all O(n^2)).Sat, 29 Sep 2012 09:52:59 GMTTomThomsonRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost1366198.aspx[quote][b]GilaMonster (6/8/2009)[/b][hr][quote][b]karthikeyan (6/8/2009)[/b][hr]
1) O(n) means ?
2) O(n^2) means ?
3) O(n log n) means ?
[/quote]
Surely you covered the Big O notation for algorithmic (time) complexity at university? It's a fundamental of Comp Sci theory.
Try these
[url]http://en.wikipedia.org/wiki/Computational_complexity_theory[/url]
[url]http://en.wikipedia.org/wiki/Big_O_notation[/url]
[/quote]
It's not often I disagree with Gail, but here I have to say that those references probably are not useful.
The first is off into esoteric stuff way beyond the level of a simple primer, going so far as to talk about open problems, and surely anything that far beyond primer level is unlikely to be helpful to someone asking this question. The second gives a formal definition which makes it correct to say that the constant function C1(x) = 1 is O(exp(exp(exp(x)))), and the identity function I(x) = x is O(x^x), which certainly is not going to give someone who wants to know what we mean by it anything like the right idea.
So I'm going to offer a description for laymen:
suppose we have an algorithm A which given n rows of input (or count rows of output instead, if that is useful) will on average have to do W(n) units of work and use S(n) amount of store. Then we say W(n) is O(n) if (for big enough n) when n doubles then W(n) roughly doubles. We say W(n) is O(n^2) if (for big enough n) when n doubles W(n) goes up by roughly a factor of 4 (2 = 2^2) and when n goes up by a factor of 10 W(n) goes up roughly by a factor of 100. Work complexity W(n) is O(n) means the amount of work is roughly proportional to n, work complexity W(n) is O(n^2) means the amount of work is roughly proportional to n^2. The same for any other function in the brackets after the big O: if the work complexity is O(n log(n)) then for big enough the amount of work is roughly proportional to n*log(n), if the work complexity is O(2^n) each extra row roughly doubles the amount of work, and so on. The idea of Space complexity S(n) being O(g(n)) for some function g is much the same.
Obviously a formal mathematical definition is a bit different, but for practical purposes when working with database stuff the description above is what is actually needed. Of course you won't solve any of the currently unresolved mathematical issues in complexity theory with that description, but that's not what (most) database people want to do.
edit: woops, just noticed that this thread started a very long time ago, and my comments are on a very early post, not on today's post, so may no longer be of any interest.Sat, 29 Sep 2012 09:41:16 GMTTomThomsonRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost1366159.aspxExperts,
is there any solution available for "Mass concatenate" in SQL 2008 OR SQL 2012?
Fri, 28 Sep 2012 21:57:03 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost734415.aspx[quote][b]Jeff Moden (6/13/2009)[/b][hr]You're absolutely correct and preaching to the choir. I should have just said that it was O(n[sup]2[/sup]) and left it at that because that would have been the correct thing to do.
Still, a triangular join will beat a square join by a factor of two every time. :-P[/quote]
Heh. Thanks, now I have to spend the rest of the day coming up with a count-example? :-PSat, 13 Jun 2009 12:32:06 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost734400.aspx[quote][b]RBarryYoung (6/13/2009)[/b][hr][quote][b]Jeff Moden (6/12/2009)[/b][hr][quote][b]RBarryYoung (6/10/2009)[/b][hr][quote][b]Jeff Moden (6/10/2009)[/b][hr]If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N[sup]2[/sup]/2). That's much easier to compare to a cartesian join which is both N[sup]2[/sup] and O(n[sup]2[/sup]) to say that a triangular join is approximately half as bad as the square join.[/quote]
Actually, you strip away the constant factors (like 1/2) too, so that the complexity order is also O(n[sup]2[/sup]). Thus the complexity of all square and all triangular algorithms have the same order.[/quote]
Understood and I agree. The point I was really trying to make is that for all values this side of infinity that will still fit inside SQL Server, the /2 is important. :-P[/quote]
Yes, but the point is that you do not use constant factors like the /2 in big-O notation or the "complexity order" of something. This is because the purpose of complexity order analysis is to separate algorithmic efficiency (the converse of complexity), from implementation efficiency which is dependent on HW performance, SW efficiency, and the developer's code efficiency.
What this means is that an efficiently implemented Square algorithm can beat an inefficient Triangular algorithim, all the way up to infinity. But no matter how efficient the implementation of an O(n[sup]2[/sup]) algorithm like simple string concatenation with pseudocursors is, it will [b][i]always[/i][/b] eventually lose out to an algorithm of lower complexity order no matter how inefficient it is, like FOR XML string concatenation, which is a very inefficient O(n).
So what big-O notation tells us is that while I might be able to tune a Square algorithim to beat a Triangular one because they are both O(n[sup]2[/sup]), I will never be able to tune a Triangular algorithm (O(n[sup]2[/sup]) )to beat a Linear one (O(n)). Except at the low end where it's less important anyway.[/quote]
You're absolutely correct and preaching to the choir. I should have just said that it was O(n[sup]2[/sup]) and left it at that because that would have been the correct thing to do.
Still, a triangular join will beat a square join by a factor of two every time. :-PSat, 13 Jun 2009 11:58:51 GMTJeff ModenRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost734394.aspx[quote][b]Jeff Moden (6/12/2009)[/b][hr][quote][b]RBarryYoung (6/10/2009)[/b][hr][quote][b]Jeff Moden (6/10/2009)[/b][hr]If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N[sup]2[/sup]/2). That's much easier to compare to a cartesian join which is both N[sup]2[/sup] and O(n[sup]2[/sup]) to say that a triangular join is approximately half as bad as the square join.[/quote]
Actually, you strip away the constant factors (like 1/2) too, so that the complexity order is also O(n[sup]2[/sup]). Thus the complexity of all square and all triangular algorithms have the same order.[/quote]
Understood and I agree. The point I was really trying to make is that for all values this side of infinity that will still fit inside SQL Server, the /2 is important. :-P[/quote]
Yes, but the point is that you do not use constant factors like the /2 in big-O notation or the "complexity order" of something. This is because the purpose of complexity order analysis is to separate algorithmic efficiency (the converse of complexity), from implementation efficiency which is dependent on HW performance, SW efficiency, and the developer's code efficiency.
What this means is that an efficiently implemented Square algorithm can beat an inefficient Triangular algorithim, all the way up to infinity. But no matter how efficient the implementation of an O(n[sup]2[/sup]) algorithm like simple string concatenation with pseudocursors is, it will [b][i]always[/i][/b] eventually lose out to an algorithm of lower complexity order no matter how inefficient it is, like FOR XML string concatenation, which is a very inefficient O(n).
So what big-O notation tells us is that while I might be able to tune a Square algorithim to beat a Triangular one because they are both O(n[sup]2[/sup]), I will never be able to tune a Triangular algorithm (O(n[sup]2[/sup]) )to beat a Linear one (O(n)). Except at the low end where it's less important anyway.Sat, 13 Jun 2009 11:32:56 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost734219.aspx[quote][b]RBarryYoung (6/10/2009)[/b][hr][quote][b]Jeff Moden (6/10/2009)[/b][hr]If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N[sup]2[/sup]/2). That's much easier to compare to a cartesian join which is both N[sup]2[/sup] and O(n[sup]2[/sup]) to say that a triangular join is approximately half as bad as the square join.[/quote]
Actually, you strip away the constant factors (like 1/2) too, so that the complexity order is also O(n[sup]2[/sup]). Thus the complexity of all square and all triangular algorithms have the same order.[/quote]
Understood and I agree. The point I was really trying to make is that for all values this side of infinity that will still fit inside SQL Server, the /2 is important. :-PFri, 12 Jun 2009 19:49:59 GMTJeff ModenRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost732271.aspx[quote][b]Jeff Moden (6/10/2009)[/b][hr]If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N[sup]2[/sup]/2). That's much easier to compare to a cartesian join which is both N[sup]2[/sup] and O(n[sup]2[/sup]) to say that a triangular join is approximately half as bad as the square join.[/quote]
Actually, you strip away the constant factors (like 1/2) too, so that the complexity order is also O(n[sup]2[/sup]). Thus the complexity of all square and all triangular algorithms have the same order.Wed, 10 Jun 2009 07:35:17 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost732156.aspx[quote][b]RBarryYoung (6/9/2009)[/b][hr][quote][b]Jeff Moden (6/9/2009)[/b][hr]O(wtf[sup]2[/sup]) ;-)[/quote]
Heh. Hey, I'm not the one who asked the question, I'm just trying to answer them.[/quote]
Oh no... sorry, Barry. Wasn't directed at anyone in particular.
Shifting gears,
If someone asked me to define it so that a layman could understand it, I'd tell them the "O" is an abbreviation for "Order of Magnitude" and the the formula it contains is nothing more than the part of another formula that's going to have the biggest effect on things... that's it's just shorthand to make comparing some more complex formulas easier to compare as a swag (order of magnitude).
For example, the forumula for the number of rows that a triangular join will touch in SQL Server is ((N[sup]2[/sup]+N)/2)+N. If you strip away all the small stuff that won't impact the result much when the numbers get really big, you end up with O(N[sup]2[/sup]/2). That's much easier to compare to a cartesian join which is both N[sup]2[/sup] and O(n[sup]2[/sup]) to say that a triangular join is approximately half as bad as the square join.Wed, 10 Jun 2009 05:52:03 GMTJeff ModenRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost732022.aspx[quote][b]Jeff Moden (6/9/2009)[/b][hr]O(wtf[sup]2[/sup]) ;-)[/quote]
Heh. Hey, I'm not the one who asked the question, I'm just trying to answer them.Tue, 09 Jun 2009 22:32:21 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731979.aspxO(wtf[sup]2[/sup]) ;-)Tue, 09 Jun 2009 19:26:09 GMTJeff ModenRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731908.aspx[quote][b]karthikeyan (6/9/2009)[/b][hr][quote]{(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} [/quote]
how?
{(n)*(n+1)/2} = 15
{n +(n)*(n-1)/(2*k)} = 15
There is no difference.
If K= 0,
=5+(5)*(5-1)/(2*0)
"Divide By Zero error occured".[/quote]
Right. the problem is that you are quoting my comments on Phil Factor's blog and I could't edit my typos there. Here is a more correct formula:
[b] k*n + (1-k)*(n[sup]2[/sup] - n)/2 [/b]
Note that this is still only an approximation because we are waving away the difference between the number of strings and the possibly different lengths of those strings.
[quote]How should i determine 'k' value? on what basis i have to determine the value?[/quote] I doubt that you could. AFAIK, I am the only person who has ever claimed that K was anything other than 0 (though if I am correct, then there must be folks inside Microsoft who know it also).
[quote][quote]but I could get it down to O(n*Log(n)) time which is still a huge improvement[/quote]
Is it denotes FOR XML option?
or
Is it denotes Pseudo cursor method?
[/quote] Neither. The naive pseudocursor method is O(n[sup]2[/sup]). The FOR XML method is O(n). I was referring to the very complex modified pseudocursor method that I posted at Phil Factor's blog. It is O(n*log(n)).
Tue, 09 Jun 2009 15:38:47 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731898.aspx[quote][b]karthikeyan (6/9/2009)[/b][hr]
output of O(n) ?
output of O(n^2). It is 15.
[/quote]
They aren't formulas to have values plugged into and results calculated. They are descriptions of the time-complexity of algorithms.
To say that O(n^2) is 15 is a meaningless statement (plus 5^2 is not 15. It's 25)Tue, 09 Jun 2009 15:19:09 GMTGilaMonsterRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731897.aspxKarthik,
At this point I'd suggest stopping. Unless you are involved in theoretical research, you really don't need to figure out how to solve Big O mathematical formulas, just understand the O(n) algorithms are better than O(n[sup]2[/sup]) algorithms. Know when to recognisize them, for instance a nested loop using cursors versus a set-based cross join for example.
Plus, you still haven't answered Gail's original questions.
Tue, 09 Jun 2009 15:18:22 GMTLynn PettisRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731888.aspxNow I understood your point #2 and #3. But still #1 is not clear for me...
[quote]{(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} [/quote]
Again, if you could explain your formula's for my below example, i think it will be very useful for me as well as for others too.
Declare @str varchar(5)
select @str = ''
select @str = @str + no from mystring
step 1: '' + '1'
step 2: '1' + '2'
step 3: '12' + '3'
step 4: '123' + '4'
step 5: '1234' + '5'
output of O(n) ?
output of O(n^2). It is 15.
output of {(n)*(n+1)/2}. Again 5*(5+1)/2 = 5*3 = 15
output of {n +(n)*(n-1)/(2*k)}.
If K= 1,
=5+(5)*(5-1)/(2*1)
=5+5*2
=5+10
=15
[quote]{(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} [/quote]
how?
{(n)*(n+1)/2} = 15
{n +(n)*(n-1)/(2*k)} = 15
There is no difference.
If K= 0,
=5+(5)*(5-1)/(2*0)
"Divide By Zero error occured".
How should i determine 'k' value? on what basis i have to determine the value?
[quote]but I could get it down to O(n*Log(n)) time which is still a huge improvement[/quote]
Is it denotes FOR XML option?
or
Is it denotes Pseudo cursor method?
Can you explain it?
Tue, 09 Jun 2009 15:03:20 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731880.aspxOh wait, thats not from here is it? That's from my comment on Phil Factor's blog some months ago, I must have mis-copied the headers when using the QUOTE button here.Tue, 09 Jun 2009 14:54:28 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731871.aspx[quote][b]karthikeyan (6/9/2009)[/b][hr][quote][b]RBarryYoung (6/8/2009)[/b][hr]
3) The "pre-allocate and Stuff" trick popular with mutable strings is not workable in T-SQL because the STUFF() function in T_SQL is NOT like the function of the same name in some general purpose languages: the T-SQL STUFF() is an RHS (right-hand side) function and NOT an LHS (left-hand side) function. AFAIK, there is no function in SQL that can (physically) write into a pre-existing string.
[/quote]
[/quote]
Oops, I just realized that I have a typo here. I reversed LHS and RHS in this statement. I will go back and fix it...Tue, 09 Jun 2009 14:49:10 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731868.aspx[quote][b]karthikeyan (6/9/2009)[/b][hr][quote][b]RBarryYoung (6/8/2009)[/b][hr]
3) The "pre-allocate and Stuff" trick popular with mutable strings is not workable in T-SQL because the STUFF() function in T_SQL is NOT like the function of the same name in some general purpose languages: the T-SQL STUFF() is an RHS (right-hand side) function and NOT an LHS (left-hand side) function. AFAIK, there is no function in SQL that can (physically) write into a pre-existing string.
[/quote]
RBarryyoung,
The above two points are confusing me...
you mean to say TSQL STUFF() function can't write into a pre-existing string.
[/quote]
Correct. Except for the TEXT-mod commands (UPDATETEXT, etc.) SQL Server does not have mutable strings for in-memory values, which simply means that technically, you can never "modify" a string in-place, you always have to make a new copy of it. (note: do not try to apply this to on-disk values, like columns in tables, because the terms don't mean the same thing there).
Note that this is in apparent contradiction to the BOL entry for STUFF() which says: "Deletes a specified length of characters and inserts another set of characters at a specified starting point." However, STUFF() does nothing of the sort as demonstrated by this script:
[code]
Declare @a varchar(20), @b varchar(20), @c varchar(20)
Select @a='', @b='123456789', @c='xyz'
Select @a as [@a], @b as [@b], @c as [@c]
Set @a = STUFF(@b, 5, 3, @c)
Select @a as [@a], @b as [@b], @c as [@c]
[/code]
You will note that the @B string is unchanged by STUFF().
Now if BOL had prefaced that statement so that it read "[i]Returns a string by copying the source string and then[/i] Deletes a specified length of characters and inserts another set of characters at a specified starting point." it would be correct, but as it currently stands, BOL is wrong.
Tue, 09 Jun 2009 14:45:23 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731838.aspxHe's talking about memory allocation. I don't know how T-SQL handles variable buffers and such, but in C, you would actually rewrite the memory addresses (or at least, you can). That's different from what Stuff does, per his post. (Can't say for sure. Don't know enough about that level of the database engine.)Tue, 09 Jun 2009 13:58:28 GMTGSquaredRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731834.aspx[quote][b]RBarryYoung (6/8/2009)[/b][hr][quote][b]karthikeyan (6/8/2009)[/b][hr]
[quote]4) What LHS (Left-hand side) function denotes exactly?[/quote]
Left-hand Functions appear left of the assignment operator ("="):
[code] Set @str = UPPER('Some text.')[/code]
These are normal functions, and AFAIK, in T-SQL, all functions are LHS.
[quote]3) What RHS (right-hand side) function denotes exactly?[/quote]
Right-hand Side functions appear on the right-hand side and they usually do "special" things having to do with addressing the output property or variable. AFAIK, T-SQL does not have any, but in some languages, STUFF is a RHS:
[code] STUFF(@str, offset, len) = 'foo'[/code]
This example would overwrite the output string (@str) with the input string starting at 'offset' for 'len' characters. The difference between this and th LHS STUFF() function in T-SQL is that the RHS version does not return anything, it actually does write over the characters of the @str variable.
[/quote]
[quote]
3) The "pre-allocate and Stuff" trick popular with mutable strings is not workable in T-SQL because the STUFF() function in T_SQL is NOT like the function of the same name in some general purpose languages: the T-SQL STUFF() is an RHS (right-hand side) function and NOT an LHS (left-hand side) function. AFAIK, there is no function in SQL that can (physically) write into a pre-existing string.
[/quote]
RBarryyoung,
The above two points are confusing me...
you mean to say TSQL STUFF() function can't write into a pre-existing string.
Tue, 09 Jun 2009 13:54:07 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731832.aspxKarthik, '1'+'2' is still O(n^2). What's 1 squared?Tue, 09 Jun 2009 13:52:58 GMTGSquaredRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731817.aspx[quote]Declare @str varchar(5)
select @str = ''
select @str = @str + no from mystring
step 1: '' + '1'
step 2: '1' + '2'
step 3: '12' + '3'
step 4: '123' + '4'
step 5: '1234' + '5'
so if i apply ((N^2) + N)/2 formula
N = 5
...
5^2 = 5*5 = 25
25+5 = 30/2 = 15. Right?
[/quote]
[quote]O(n) operation where n = Len(A$)+Len(B$).
[/quote]
so '1' + '2' is an O(n) operation.
O(n^2) problem is
1) Time to concatenate the string
2) Space claimed by the variables
3) Resource taken to do the above two operation
Tue, 09 Jun 2009 13:27:30 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731693.aspx[quote]So we should avoid pseudo cursor. Right? [/quote]
I never said anything like this.
[quote]1) It will be treated as internal cursor. i.e it will perform looping internally.[/quote]AFAIK all set processing in SQL Server uses internal cursors. And as long as CPU-cores execute single streams of instructions, all servers, programs and OS's will have to loop internally.
[quote]But comparing to external cursor, it will be very fasy. Right?[/quote]
Yes.
[quote]2) is it a good habit to use pseudo cursor in sql programming? if not, what is the workaround for this?[/quote]It is a technique. It is not the most desirable technique, but neither is it entirely undesirable. And sometimes it is the best technique available for a problem.Tue, 09 Jun 2009 11:12:12 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731679.aspx[quote][b]karthikeyan (6/9/2009)[/b][hr][quote]1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem, [/quote]
If it supports the "buffer-extension" trick,then why it doesn't solve O(n^2) problem. again , i need to know exactly about O(n^2).[/quote]
Because, as I already explained, it can only be used part of the time.
[quote]can you explain it with example? so that i can remember it through my career.[/quote]
I already did this above, here it is again:
[quote]The "buffer-extension trick" is referring to a low-level (Assembler or C) trick that takes advantage of that fact that in most OS's strings often have large gaps of unallocated memory after them. Thus if you want to add the string B$ onto the end of A$, then normally you would have to allocate a new buffer whose length is >= Len(A$) + Len(B$), then copy all of A$ inot it, then copy B$ in after A$ and then re-point A$ to the new buff (and deallocate the old). Besides all of the memory allocation overhead, this is an O(n) operation where n = Len(A$)+Len(B$).
However, if A$ just happens to have an unused portion of memory after it whose size is >= Len(B$) then, instead you can just copy B$ into that unused memory after A$ and extend A$'s length value to inlcude the appended characters.
[/quote]Please read fully what we have taken the time to write and stop making us repeat ourselves.
[quote]Does this issue solved in sql2008?[/quote]The "buffer-extension trick" can never solve this problem. It is a trick that can only be used if you are not already using the memory that is immediately above your output string. There is no way that an OS or DB environment can ever guarantee this all of the time. The only true solution is to use a different algorithm.Tue, 09 Jun 2009 11:02:29 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731657.aspx[quote][b]karthikeyan (6/9/2009)[/b][hr]RBarryyoung and Gail,
Big O notation means the steps to complete a problem. Right?[/quote]
Sort of. Big-O notation categorizes an algorithm's complexity in terms of the dominant term (without any constant factors) of its execution run-time as a function of the length of its input data("N"), as N approaches infinity. It is a way of talking about the efficiency of an algorithm apart from the efficiency of any particular implementation of that algorithm.
But then you would know that if you had read the Wikipedia articles.
[quote]1) what do you mean by O(n^2) problem? Pls don't mistake me if i am asking this question again.[/quote]
You would know this if you had done what we asked. I will not respond to this question again.
EDIT: fixed formatting mistakes...
[quote]2) Big O notation means the steps to complete a problem. we denote it as O(n). But what is the difference between O(n) and O(n^2) ?[/quote]
This is incorrect. Please read the Wikipedia articles.
[quote]Declare @str varchar(5)
select @str = ''
select @str = @str + no from mystring
step 1: '' + '1'
step 2: '1' + '2'
step 3: '12' + '3'
step 4: '123' + '4'
step 5: '1234' + '5'
so if i apply ((N^2) + N)/2 formula
N = 5
...
5^2 = 5*5 = 25
25+5 = 30/2 = 15. Right?
[/quote]
Yes. Now count the total number of characters in quotes in your steps 1 through 5 above, what do you get?
[quote][quote][quote]2) Can you explain about the formula you used?[/quote]
Which formula? I use a lot of them. [/quote]
{(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} where "k" is that percentage. It's better, but it's still O(n^2).
Can you explain it?
[/quote]
It is explained by the text that preceeded it. It is a formula that calculates the execution time of string concatenation that has been improved by being able to use buffer-extension part of the time. "k" is a number between 0 and 1 that expresses how often the buffer-extension trick can be used.Tue, 09 Jun 2009 10:50:02 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731428.aspx[quote][b]Tom Brown (6/9/2009)[/b][hr]I may be totally wrong or this could be too simplistic, but thats what I understood from the Wiki article and various other researching. [/quote]
That's pretty much the core of it, yes.
There's a whole tonne of theory as to analysing algorithms to work out their complexity but I don't recall any of the details of that any more. It's way over most people's heads anyway and more for maths/CS researchers.
Tue, 09 Jun 2009 07:27:26 GMTGilaMonsterRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731410.aspx[quote][b]karthikeyan (6/9/2009)[/b][hr]Gail,
i expected this question from you. Since i studied these concepts just to pass the exams, beacuse that time i don't know the real time usage and it will be used in our programming skill, I forgot the concepts.
Once i got the job, then only i realized that all i studied will be applied in the work...anyway hereafter i never forgot this chapters.:)..Thanks for teaching me....Thank you Gail :)
[/quote]
I have an idea, how about actually answering Gail's questions about where and when you went to school for your MCA and BS degrees instead of just this fluff.
Tue, 09 Jun 2009 07:12:05 GMTLynn PettisRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731305.aspxHi
I'm sure Barry & Gail will give full answers, but this interpretation may help.
Don't think of O(n) / O(n^2) as proper mathematical formulas that can be solved. Think of them as a convenient shorthand description for a problem type.
So O(n) describes a linear problem - as n increases, the solution increases proportionally to n as in a straight line graph
O(n^2) is essentially a quadratic problem - as n increases, the solution increases proportionally to n-squared - or some function of n-squared.
O(log n) means the solution increase proportional to log n
I may be totally wrong or this could be too simplistic, but thats what I understood from the Wiki article and various other researching.
Tue, 09 Jun 2009 04:38:23 GMTTom BrownRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731302.aspxGail,
i expected this question from you. Since i studied these concepts just to pass the exams, beacuse that time i don't know the real time usage and it will be used in our programming skill, I forgot the concepts.
Once i got the job, then only i realized that all i studied will be applied in the work...anyway hereafter i never forgot this chapters.:)..Thanks for teaching me....Thank you Gail :)
Tue, 09 Jun 2009 04:31:43 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731298.aspx[quote][b]RBarryYoung (6/8/2009)[/b][hr][quote][b]karthikeyan (6/8/2009)[/b][hr]1) What is mean by pseudo cursor?[/quote]Pusedocursor is a name that some people (including me) use for the SQL Server (and Sybase too, I believe) trick of being able to assign both to and from variables for every row of a resultset. The typical example is the simple string concatenation trick that we often use:
[code]
Declare @str varchar(MAX)
Set @str = ''
Select @str = @str + ',' + [name] From sys.columns
Print @str
[/code]
[quote][quote]"pre-allocate and Stuff" trick popular with mutable strings [/quote]
2) what this trick will do?
[/quote] We used to call this "string mapping" back in the old days. Instead of reallocating the output string with every iteration, you estimate the likely size of the output string and pre-allocate the output string with that much blank space and use a LenPtr to keep track of how much of this buffer you have used. Then for each string you "MAP" or "STUFF" the current source string into the next available space after LenPtr and then update LenPtr to the new end. This is linear time ( O(n) ) because you only have to touch each input character once.
[quote]4) What LHS (Left-hand side) function denotes exactly?[/quote]
Left-hand Functions appear left of the assignment operator ("="):
[code] Set @str = UPPER('Some text.')[/code]
These are normal functions, and AFAIK, in T-SQL, all functions are LHS.
[quote]3) What RHS (right-hand side) function denotes exactly?[/quote]
Right-hand Side functions appear on the right-hand side and they usually do "special" things having to do with addressing the output property or variable. AFAIK, T-SQL does not have any, but in some languages, STUFF is a RHS:
[code] STUFF(@str, offset, len) = 'foo'[/code]
This example would overwrite the output string (@str) with the input string starting at 'offset' for 'len' characters. The difference between this and th LHS STUFF() function in T-SQL is that the RHS version does not return anything, it actually does write over the characters of the @str variable.
[/quote]
So we should avoid pseudo cursor. Right?
1) It will be treated as internal cursor. i.e it will perform looping internally.
But comparing to external cursor, it will be very fasy. Right?
2) is it a good habit to use pseudo cursor in sql programming? if not, what is the workaround for this?
Tue, 09 Jun 2009 04:24:56 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731297.aspx[quote][b]karthikeyan (6/9/2009)[/b][hr]Big O notation means the steps to complete a problem. Right?[/quote]
No. Not at all.
Big O notation describes the time complexity of an algorithm. That is describes how the time required to solve an algorithm will increase as the number of inputs (n) increases.
A linear algorithm (O(n)) is one where the time required to solve the algorithm increases linearly with the number of inputs. So if 1000 inputs takes 1 minute to solve, 2000 inputs will take 2 minutes to solve and 5000 inputs will take 5 minutes to solve.
A quadratic algorithm (O(n[sup]2[/sup]) is one where the time required to solve the algorithm increases quadratically with the number of inputs. So if 1000 inputs takes 1 minute to solve, 2000 inputs will take 4 minutes to solve and 5000 inputs will take 25 minutes to solve. (roughly)
I'm curious. Where and when did you do your MCA? Where and when did you do your undergrad Comp Sci degree?
Tue, 09 Jun 2009 04:24:51 GMTGilaMonsterRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731295.aspx[quote]1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem, [/quote]
If it supports the "buffer-extension" trick,then why it doesn't solve O(n^2) problem. again , i need to know exactly about O(n^2).
can you explain it with example? so that i can remember it through my career.
Does this issue solved in sql2008?
aaah...i am confusing myself....i think....
Tue, 09 Jun 2009 04:22:16 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost731289.aspx[quote][b]RBarryYoung (6/8/2009)[/b][hr][quote][b]karthikeyan (6/8/2009)[/b][hr]They used some concepts which i couldn't understand.
1) O(n) means ?
2) O(n^2) means ?
3) O(n log n) means ?
4) What is mean naive linear string concatenation?
[/quote]
[b]For (1-3), see the Wikipedia articles that Gail referenced. Big-O notation is simple once you understand it, but a little confusing to explain so I'm deferring to the experts.[/b]
For (4), "naive linear string concatenation" is just the normal way of concatentating many strings together by appending each one onto the end of an accumulated output string:
[code]
--NOTE: This is an example. I dont really use While loops in SQL.
Declare @i As int
Declare @A As varchar(MAX)
Select @i = 1, @A = ''
WHILE @i <= (Select Count(*) From MyStrings)
BEGIN
Select @A = @A + Select B From MyStrings Where ID = @i
END
[/code]
The problem with this approach is that you have to re-copy @A every time through the loop and @A is getting larger every time. So even if our strings are all 1 character, first we have to copy 1 character, then 2 characters, then 3, then 4, 5, 6, 7, ... all the way out to N where N is total number of strings we want to concatenate together.
[b]This totals to ((N^2) + N)/2 which is complexity Order N^2 (or just O(N^2)).[/b]
[quote][quote]RBarryYoung said: ...
1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the [b]O(n^2)[/b] problem, it just partially alleviates it at the lower end because some percentage(k) of appends can be extensions instead of creating a new string. Effectively it changes the [b]{(n)*(n+1)/2}[/b] cost of the naive implementation to [b]{n +(n)*(n-1)/(2*k)} [/b]where "k" is that percentage. It's better, but it's still O(n^2).
[/quote]
RBarryyoung,
1) What do you mean by "buffer-extension"?[/quote]
The "buffer-extension trick" is referring to a low-level (Assembler or C) trick that takes advantage of that fact that in most OS's strings often have large gaps of unallocated memory after them. Thus if you want to add the string B$ onto the end of A$, then normally you would have to allocate a new buffer whose length is >= Len(A$) + Len(B$), then copy all of A$ inot it, then copy B$ in after A$ and then re-point A$ to the new buff (and deallocate the old). Besides all of the memory allocation overhead, this is an O(n) operation where n = Len(A$)+Len(B$).
However, if A$ just happens to have an unused portion of memory after it whose size is >= Len(B$) then, instead you can just copy B$ into that unused memory after A$ and extend A$'s length value to inlcude the appended characters.
[quote]2) Can you explain about the formula you used?[/quote]
Which formula? I use a lot of them.[/quote]
RBarryyoung and Gail,
Big O notation means the steps to complete a problem. Right?
Still i have lot of questions.
1) what do you mean by O(n^2) problem? Pls don't mistake me if i am asking this question again.
still I am not getting this formula clearly.
2) Big O notation means the steps to complete a problem. we denote it as O(n). But what is the difference between O(n) and O(n^2) ?
[b]This totals to ((N^2) + N)/2 which is complexity Order N^2 (or just O(N^2)).[/b]
I believe if i understand the above one, i can idenitify the difference between them. But the above one is confusing me...
say for example i have row like below
1
2
3
4
5
i would like to append the above values as
12345
so
Declare @str varchar(5)
select @str = ''
select @str = @str + no from mystring
step 1: '' + '1'
step 2: '1' + '2'
step 3: '12' + '3'
step 4: '123' + '4'
step 5: '1234' + '5'
so if i apply ((N^2) + N)/2 formula
N = 5
((5^2) + 5)/2 ( select ((5^2) + 5)/2 gave 6, but i thiink it is wrong as per mathematics. since ^ is considered as POWER in maths. Right ? )
((5^2) + 5)/2 equivalent to select (Power(5,2) + 5)/2. Right?
5^2 = 5*5 = 25
25+5 = 30/2 = 15. Right?
[quote]2) Can you explain about the formula you used?
Which formula? I use a lot of them.
[/quote]
{(n)*(n+1)/2} cost of the naive implementation to {n +(n)*(n-1)/(2*k)} where "k" is that percentage. It's better, but it's still O(n^2).
Can you explain it?
Tue, 09 Jun 2009 04:12:52 GMTkarthik MRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost730960.aspx[quote][b]karthikeyan (6/8/2009)[/b][hr]1) What is mean by pseudo cursor?[/quote]Pusedocursor is a name that some people (including me) use for the SQL Server (and Sybase too, I believe) trick of being able to assign both to and from variables for every row of a resultset. The typical example is the simple string concatenation trick that we often use:
[code]
Declare @str varchar(MAX)
Set @str = ''
Select @str = @str + ',' + [name] From sys.columns
Print @str
[/code]
[quote][quote]"pre-allocate and Stuff" trick popular with mutable strings [/quote]
2) what this trick will do?
[/quote] We used to call this "string mapping" back in the old days. Instead of reallocating the output string with every iteration, you estimate the likely size of the output string and pre-allocate the output string with that much blank space and use a LenPtr to keep track of how much of this buffer you have used. Then for each string you "MAP" or "STUFF" the current source string into the next available space after LenPtr and then update LenPtr to the new end. This is linear time ( O(n) ) because you only have to touch each input character once.
[i](Heh. I was re-reading this and just realized that I got my "Left" and "Right" reversed! :blush: Yikes!)[/i]
[quote]4) What [strike]LHS (Left-hand side)[/strike] [i]RHS (Right-hand side)[/i] function denotes exactly?[/quote]
[strike]Left[/strike][i]Right[/i]-hand Functions appear [strike]left[/strike][i]right[/i] of the assignment operator ("="):
[code] Set @str = UPPER('Some text.')[/code]
These are normal functions, and AFAIK, in T-SQL, all functions are [strike]L[/strike][i]R[/i]HS.
[quote]3) What [strike]R[/strike]LHS ([strike]right[/strike][i]left[/i]-hand side) function denotes exactly?[/quote]
[strike]Right[/strike][i]Left[/i]-hand Side functions appear on the [strike]right[/strike][i]left[/i]-hand side and they usually do "special" things having to do with addressing the output property or variable. AFAIK, T-SQL does not have any, but in some languages, STUFF is a [strike]R[/strike][i]L[/i]HS:
[code] STUFF(@str, offset, len) = 'foo'[/code]
This example would overwrite the output string (@str) with the input string starting at 'offset' for 'len' characters. The difference between this and the RHS STUFF() function in T-SQL is that the LHS version does not return anything, it actually does write over the characters of the @str variable.Mon, 08 Jun 2009 14:13:52 GMTRBarryYoungRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost730797.aspx[quote][b]karthikeyan (6/8/2009)[/b][hr]Yes Gail...I have studied...But i don't remember the functionality...
I have reviewed the above mentioed URL, Since there are lot of formula's are there,it is confusing...[/quote]
Ok, explain what you do know of Big O notation, since you said you did study it, and what exactly is confusing you and we'll try and fill in from there.
[quote]3) What RHS (right-hand side) function denotes exactly?[/quote]
In what context?
[quote]4) What LHS (Left-hand side) function denotes exactly?[/quote]
In what context?
[quote]OK, since I couldn't get a copy of Moby Dick, I made my own test data. Here is your test rig Phil, with my test data and my technique, followed by your original approach. Needless to say, my method is nearly [b]linear[/b], but the simple approach grows geometrically:[/quote]
[quote]5) what are all the method available apart from linear? What is mean by linear method?
how to know whether my code use linear method? [/quote]
Related to Big O notation. An algorithm is linear, or has linear time complexity if the time to process a number of inputs grows linearly with the number of inputs. It would be described as O(n)
There's lots of complexities other than linear. Logarithmic, geometric, polynomial, exponential, etc.Mon, 08 Jun 2009 10:25:50 GMTGilaMonsterRE: O(n) , O(n log n), O(n^2)https://www.sqlservercentral.com/Forums/FindPost730713.aspx[quote][b]karthikeyan (6/8/2009)[/b][hr]They used some concepts which i couldn't understand.
1) O(n) means ?
2) O(n^2) means ?
3) O(n log n) means ?
4) What is mean naive linear string concatenation?
[/quote]
For (1-3), see the Wikipedia articles that Gail referenced. Big-O notation is simple once you understand it, but a little confusing to explain so I'm deferring to the experts.
For (4), "naive linear string concatenation" is just the normal way of concatentating many strings together by appending each one onto the end of an accumulated output string:
[code]
--NOTE: This is an example. I dont really use While loops in SQL.
Declare @i As int
Declare @A As varchar(MAX)
Select @i = 1, @A = ''
WHILE @i <= (Select Count(*) From MyStrings)
BEGIN
Select @A = @A + Select B From MyStrings Where ID = @i
END
[/code]
The problem with this approach is that you have to re-copy @A every time through the loop and @A is getting larger every time. So even if our strings are all 1 character, first we have to copy 1 character, then 2 characters, then 3, then 4, 5, 6, 7, ... all the way out to N where N is total number of strings we want to concatenate together.
This totals to ((N^2) + N)/2 which is complexity Order N^2 (or just O(N^2)).
[quote][quote]RBarryYoung said: ...
1) SQL Server apparently (based on my tests) already does the "buffer-extension" trick available to mutable strings. Unfortunately, the Extension trick does NOT solve the O(n^2) problem, it just partially alleviates it at the lower end because some percentage(k) of appends can be extensions instead of creating a new string. Effectively it changes the [b]{(n)*(n+1)/2}[/b] cost of the naive implementation to [b]{n +(n)*(n-1)/(2*k)} [/b]where "k" is that percentage. It's better, but it's still O(n^2).
[/quote]
RBarryyoung,
1) What do you mean by "buffer-extension"?[/quote]
The "buffer-extension trick" is referring to a low-level (Assembler or C) trick that takes advantage of that fact that in most OS's strings often have large gaps of unallocated memory after them. Thus if you want to add the string B$ onto the end of A$, then normally you would have to allocate a new buffer whose length is >= Len(A$) + Len(B$), then copy all of A$ inot it, then copy B$ in after A$ and then re-point A$ to the new buff (and deallocate the old). Besides all of the memory allocation overhead, this is an O(n) operation where n = Len(A$)+Len(B$).
However, if A$ just happens to have an unused portion of memory after it whose size is >= Len(B$) then, instead you can just copy B$ into that unused memory after A$ and extend A$'s length value to inlcude the appended characters.
[quote]2) Can you explain about the formula you used?[/quote]
Which formula? I use a lot of them.Mon, 08 Jun 2009 08:35:53 GMTRBarryYoung