Wednesday, November 12, 2008

String Vs. StringBuilder classes

One of the .NET performance tips that we developers keep hearing from all quarters is to use StringBuilder instead of String concatenation.

Now, let's see what actually happens behind the scenes. Are people right when they say StringBuilder is better? Does it have a performance edge? If so, when?

Scenario 1:
String Concatenation
// Requires: using System; using System.Diagnostics;
String sStringHolder = "";

// Time 10000 single-character concatenations.
DateTime dtStartTime = System.DateTime.Now;
for (int iCount = 0; iCount < 10000; iCount++)
{
    sStringHolder = sStringHolder + "1";
}//for (int iCount = 0; iCount < 10000; iCount++)
DateTime dtEndTime = System.DateTime.Now;

Trace.WriteLine("Time taken using the Concat operation >> 10000 iterations: ");
// TotalMilliseconds reports the whole interval; Milliseconds is only the 0-999 component.
Trace.WriteLine((dtEndTime - dtStartTime).TotalMilliseconds);

// Time 20000 more concatenations. sStringHolder is not reset,
// so this loop starts from a string that is already 10000 characters long.
dtStartTime = System.DateTime.Now;
for (int iCount = 0; iCount < 20000; iCount++)
{
    sStringHolder = sStringHolder + "1";
}//for (int iCount = 0; iCount < 20000; iCount++)
dtEndTime = System.DateTime.Now;

Trace.WriteLine("Time taken using the Concat operation >> 20000 iterations: ");
Trace.WriteLine((dtEndTime - dtStartTime).TotalMilliseconds);
The first loop (10000 iterations) took 46 milliseconds, while the second loop, with double the iteration count (20000) and starting from the 10000-character string left behind by the first, took 375 milliseconds. That is roughly 8 times as long.
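
A side note on measurement: as far as I know, DateTime.Now is only updated every 10 to 15 milliseconds or so on many systems, so short runs are measured rather coarsely. Below is a minimal sketch of the same kind of measurement using System.Diagnostics.Stopwatch. The names TimingSketch, MeasureMilliseconds and ConcatOnes are made up for this illustration and are not part of the original test.

```csharp
using System;
using System.Diagnostics;

public static class TimingSketch
{
    // Runs the given work once and returns the elapsed wall-clock time in milliseconds.
    public static double MeasureMilliseconds(Action work)
    {
        Stopwatch watch = Stopwatch.StartNew();
        work();
        watch.Stop();
        return watch.Elapsed.TotalMilliseconds;
    }

    // Builds a string of 'count' characters by repeated concatenation (hypothetical helper).
    public static string ConcatOnes(int count)
    {
        string holder = "";
        for (int i = 0; i < count; i++)
        {
            holder = holder + "1";
        }
        return holder;
    }

    public static void Main()
    {
        double elapsed = MeasureMilliseconds(() => ConcatOnes(10000));
        Console.WriteLine("Concat, 10000 iterations (ms): " + elapsed);
    }
}
```

The numbers will differ from machine to machine, so treat them as relative indications only.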

So, what really goes on inside? Why are the results so drastically different?
sStringHolder = sStringHolder + "1";
Strings are immutable, i.e., the value cannot be modified once it is created. So, in this case, three String objects are involved: the old sStringHolder, the literal "1", and a newly allocated string that holds the concatenated result. When the concat is performed, the entire contents of the old sStringHolder are copied into the new string and the old one is discarded. Because the amount copied grows with every iteration, the loop becomes more and more expensive as the string gets longer.
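
To make this concrete, here is a small sketch (the class and method names are hypothetical, chosen only for this illustration). It checks that a concatenation yields a new object and leaves the original untouched, and it counts how many characters the two loops in Scenario 1 end up writing, under the simplifying assumption that each concat fills a brand new string of the new length.

```csharp
using System;

public static class ImmutabilityDemo
{
    // True if concatenation produced a new object and left the original unchanged.
    public static bool ConcatCreatesNewObject()
    {
        string before = "abc";
        string after = before + "1";
        return !object.ReferenceEquals(before, after) && before == "abc";
    }

    // Total characters written when appending one character at a time,
    // 'appends' times, to a string that starts at 'startLength' characters.
    public static long CharsCopied(int startLength, int appends)
    {
        long total = 0;
        for (int i = 0; i < appends; i++)
        {
            total += startLength + i + 1; // length of the newly allocated string
        }
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(ConcatCreatesNewObject());   // True
        Console.WriteLine(CharsCopied(0, 10000));      // 50005000
        Console.WriteLine(CharsCopied(10000, 20000));  // 400010000
    }
}
```

Under that assumption the second loop writes about 8 times as many characters as the first, which is in line with the 46 versus 375 milliseconds measured above.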

Scenario 2:
StringBuilder Goodness
// Requires: using System.Text;
// Start with room for 50 characters; the builder grows its buffer as needed.
StringBuilder sbStringHolder = new StringBuilder(50);

dtStartTime = System.DateTime.Now;
for (int iCount = 0; iCount < 50000; iCount++)
{
    sbStringHolder.AppendFormat("{0}", "Hello");
}//for (int iCount = 0; iCount < 50000; iCount++)
dtEndTime = System.DateTime.Now;

Trace.WriteLine("Time taken using StringBuilder class:AppendFormat >> 50000 iterations: ");
// TotalMilliseconds reports the whole interval; Milliseconds is only the 0-999 component.
Trace.WriteLine((dtEndTime - dtStartTime).TotalMilliseconds);
That was 15 milliseconds for a 50000-iteration loop! StringBuilder certainly has an edge here.


StringBuilder has this advantage because string allocations and copies do not have to happen as frequently as with String concatenation. That is where the majority of the savings comes from. But if the StringBuilder's buffer is only just big enough for the string it holds, it will have to grow on every Append, which is as good (or bad) as String concatenation.

However, it cannot be taken as a rule of thumb that StringBuilder is always more efficient than String concatenation. Rico Mariani explains four different cases.
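
A rough way to see how seldom the buffer has to grow is to watch the Capacity property while appending. This is only a sketch: CapacityDemo and CountBufferGrowths are names invented for this example, and the exact growth policy depends on the runtime version, so no particular output is claimed.

```csharp
using System;
using System.Text;

public static class CapacityDemo
{
    // Appends one character at a time and counts how often Capacity changes.
    public static int CountBufferGrowths(int initialCapacity, int appends)
    {
        StringBuilder sb = new StringBuilder(initialCapacity);
        int growths = 0;
        int lastCapacity = sb.Capacity;
        for (int i = 0; i < appends; i++)
        {
            sb.Append('1');
            if (sb.Capacity != lastCapacity)
            {
                growths++;
                lastCapacity = sb.Capacity;
            }
        }
        return growths;
    }

    public static void Main()
    {
        // 10000 appends, but the buffer should grow only a handful of times.
        Console.WriteLine("Buffer growths: " + CountBufferGrowths(16, 10000));
    }
}
```

Compare that handful of growths with plain concatenation, where every one of the 10000 appends allocates a new string.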

Big string and small appends:
=> It's quite likely that the appends will fit in the slop, so they're fast; this is the best case. (The buffer size doubles when the string no longer fits, so on average the slop is half the current string length.) If there are lots of small appends to a big string, you win the most using StringBuilder.

Big string and big appends:
=> While the string is comparable in size to (or smaller than) the appends, StringBuilder won't save you much. If this continues to the point where the appends are small compared to the accumulated string, you're in the good case.

Small string, big appends:
=> Bad case. StringBuilder will just slow you down until enough slop has built up to hold those appends. You move to "big string, big appends" as you append, and finally to "big string, small appends" if/when the buffer becomes colossal.

Small string, small appends:
=> Could be OK if you had a good idea how big your string was going to get and pre-allocated enough so that you have sufficient slop for the appends. You might be able to do better if you just concatenated all the small appends together in one operation.

Otherwise put, if the operation that is to be performed is something like this:
x = f(x) + f(y) + f(z) + f(a) - String Concat is better

If the operation will be something to this effect:
x += f(x);
x += f(y);
x += f(z);
x += f(a) - StringBuilder will suit better in this scenario
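
The two shapes can be sketched as follows. ConcatPatterns, JoinOnce and JoinMany are hypothetical names used only for illustration. The first method relies on the C# compiler turning a single + expression over strings into one String.Concat call; the second pre-sizes a StringBuilder, which assumes the total length can be computed up front.

```csharp
using System;
using System.Text;

public static class ConcatPatterns
{
    // One expression: all four pieces are combined in a single concatenation.
    public static string JoinOnce(string a, string b, string c, string d)
    {
        return a + b + c + d;
    }

    // Many separate appends: use a StringBuilder sized for the final string
    // so that no buffer growth is needed along the way.
    public static string JoinMany(string[] parts)
    {
        int totalLength = 0;
        foreach (string part in parts)
        {
            totalLength += part.Length;
        }

        StringBuilder sb = new StringBuilder(totalLength);
        foreach (string part in parts)
        {
            sb.Append(part);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(JoinOnce("a", "b", "c", "d"));                 // abcd
        Console.WriteLine(JoinMany(new string[] { "x", "yy", "zzz" }));  // xyyzzz
    }
}
```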
