How fast is StringBuilder?
Subject
As we know, StringBuilder builds strings faster than the + operator or the String.Concat() method when many pieces are appended one after another. A StringBuilder created with a specified capacity also performs better than one with the default capacity.
But how fast is StringBuilder?
I created a simple application to measure the performance of different approaches to string concatenation. I will build a text in several ways: String.Concat(), StringBuilder with the default capacity (16 characters), and StringBuilder with a specified capacity.
Theory
Using String concatenation
Strings are immutable objects, which means there is no way to change an already existing string instance. Each time we perform an operation such as Trim(), Substring() or Replace() that produces a different value, a new object is created on the heap.
For example, let's imagine that we have an array of strings and would like to concatenate them.
string[] myNameArray = new string[5] { "A", "R", "M", "E", "N" };
string text = null;
foreach (string s in myNameArray)
{
    text = String.Concat(text, s);
}
Below you can see an illustration of what exactly happens in memory during each iteration.
Iteration 1:
After the first call, the "text" variable references a String object ("A") on the heap.
Iteration 2:
By concatenating the second member ("R") to the already existing string, we actually create a new String object ("AR") on the heap, and a reference to the newly created object is stored in the "text" variable. The old object ("A") becomes obsolete from the point of view of our loop and, once nothing else references it, can be reclaimed by the Garbage Collector.
Iteration 3:
The same happens with the next member of the array, the string "M". Again a new String object ("ARM") is created on the heap and a reference to the new object is stored in "text". The previous object ("AR") loses its last reference and becomes eligible for garbage collection.
Iteration 4:
The same behavior as in the previous iteration. A new object "ARME" is created on the heap and "text" starts referencing it. As the "ARM" object is no longer referenced by anything, it will be reclaimed by the Garbage Collector.
Iteration 5:
And finally the last iteration. Again a new object ("ARMEN") is created and the previous one becomes eligible for garbage collection.
So, the conclusion is that to prepare a simple string of 5 characters we went through 5 String objects, each with its own memory on the heap, and 4 of them immediately became candidates for the Garbage Collector.
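To make the walkthrough above a bit more tangible, here is a minimal sketch that counts how often the "text" variable ends up referencing a different object after String.Concat. The class and method names (ConcatDemo, CountReferenceChanges) are made up for this illustration and are not part of the measured application.

```csharp
using System;

public static class ConcatDemo
{
    // Counts the iterations in which "text" references a different object
    // after the String.Concat call than it did before the call.
    public static int CountReferenceChanges(string[] parts)
    {
        string text = null;
        int changes = 0;
        foreach (string s in parts)
        {
            string before = text;
            text = String.Concat(text, s);
            if (!object.ReferenceEquals(before, text))
            {
                changes++;
            }
        }
        return changes;
    }

    public static void Main()
    {
        string[] myNameArray = { "A", "R", "M", "E", "N" };
        Console.WriteLine(CountReferenceChanges(myNameArray));

        // Immutability: Replace returns a new string, the original is untouched.
        string original = "ARMEN";
        string replaced = original.Replace("A", "a");
        Console.WriteLine(original);
        Console.WriteLine(replaced);
    }
}
```

For the five-element array the reference changes in every one of the five iterations, which is exactly the churn described above.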
When I was young, my Mom used to say to me, "I am afraid to ask you to clean anything in the kitchen, because you will create more mess around you" :D String concatenation works pretty much the same way.
StringBuilder with default capacity
OK, let's now build a string by using StringBuilder. By default, StringBuilder has a capacity of 16 characters. It means that it starts with memory allocated for 16 characters, which is perfect for us, as we have to build a string of only 5 characters:
StringBuilder textSBuilder = new StringBuilder();
Again, we iterate through the array and append each member to the StringBuilder:
foreach (string s in myNameArray)
{
    textSBuilder.Append(s);
}
The illustration below shows that all characters (strings) are placed into the existing StringBuilder object, which "textSBuilder" references. No new object is created on the heap and there is nothing for the Garbage Collector to clean up. So this approach should perform much faster than plain string concatenation.
Well, this is good, but what happens if the string becomes bigger than the capacity of the StringBuilder? The answer is easy: at the moment the StringBuilder reaches its capacity and needs more space, it allocates bigger internal storage, typically doubling the capacity (in our case 2 x 16 = 32 characters). In the simplest model, all characters of the current buffer are copied to the new one and the old buffer becomes garbage; newer versions of .NET link an additional chunk to the existing one instead. Either way, "textSBuilder" keeps referencing the same StringBuilder object; only its internal storage changes.
So, let's try to perform the same operation as before, but repeat it 4 times. This means we need a string of 4 x 5 ("A", "R", "M", "E", "N") = 20 characters.
for (int i = 0; i < 4; i++)
{
    foreach (string s in myNameArray)
    {
        textSBuilder.Append(s);
    }
}
As you can see in the illustration below, when the StringBuilder fills its capacity and needs more space for new characters, it allocates new internal storage with double the previous capacity (2 x 16 = 32). In the simple model shown here, all chars from the original buffer are copied to the new one, and the original buffer becomes obsolete and is reclaimed by the Garbage Collector. The "textSBuilder" variable itself keeps referencing the same StringBuilder object.
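The growth can be observed directly through the Capacity property. The following is only a sketch: the helper name ObserveCapacities is invented here, and apart from the default capacity of 16 the exact growth steps are an implementation detail of the runtime, so the code prints whatever it sees and assumes no particular values.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CapacityGrowthDemo
{
    // Appends "count" single characters and returns every distinct Capacity
    // value observed, in order, starting with the initial one.
    public static List<int> ObserveCapacities(StringBuilder builder, int count)
    {
        var capacities = new List<int> { builder.Capacity };
        for (int i = 0; i < count; i++)
        {
            builder.Append('x');
            if (builder.Capacity != capacities[capacities.Count - 1])
            {
                capacities.Add(builder.Capacity);
            }
        }
        return capacities;
    }

    public static void Main()
    {
        var textSBuilder = new StringBuilder();   // default capacity
        List<int> seen = ObserveCapacities(textSBuilder, 20);

        // Prints the capacity steps, separated by arrows.
        Console.WriteLine(string.Join(" -> ", seen));
        Console.WriteLine(textSBuilder.Length);
    }
}
```

Note that the same textSBuilder instance is used from start to finish; whatever is reallocated lives inside it.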
StringBuilder with Specified Capacity
To make StringBuilder perform better, it is recommended to specify the capacity of the StringBuilder object at the moment of instantiation. The point is that if we know how many characters the string will contain, we can pass this number as the capacity, and the StringBuilder will allocate storage of the right size just once.
Let's perform the same operation as before, but with a specified capacity.
StringBuilder textSBuilder = new StringBuilder(20);
The illustration shows that we created just one object. No need for new object creation, no new memory allocation, no work for the Garbage Collector (at least for now).
So, when the final length is known in advance, this is the best way to build a string.
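A quick way to check this is to compare the Capacity before and after the appends. The sketch below builds the same 20-character string and reports whether the capacity had to change; PresizedBuilderDemo and CapacityChanged are hypothetical names used only for this example.

```csharp
using System;
using System.Text;

public static class PresizedBuilderDemo
{
    // Builds the 20-character string in a builder created with the given
    // capacity and reports whether Capacity changed while appending.
    public static bool CapacityChanged(int initialCapacity, out string result)
    {
        string[] myNameArray = { "A", "R", "M", "E", "N" };
        var textSBuilder = new StringBuilder(initialCapacity);
        int capacityAtStart = textSBuilder.Capacity;

        for (int i = 0; i < 4; i++)
        {
            foreach (string s in myNameArray)
            {
                textSBuilder.Append(s);
            }
        }

        result = textSBuilder.ToString();
        return textSBuilder.Capacity != capacityAtStart;
    }

    public static void Main()
    {
        string text;
        // Capacity 20 is enough for 4 x 5 characters, so no growth is expected.
        Console.WriteLine(CapacityChanged(20, out text));
        // Capacity 16 is too small, so the builder has to grow.
        Console.WriteLine(CapacityChanged(16, out text));
        Console.WriteLine(text);
    }
}
```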
Speedometer
Now it is time for a real test :)
As I mentioned before, I created a small console application that uses all those approaches, to see the result in "real numbers".
There are several methods which basically do the same thing: loop a specified number of times and build a string by concatenating the existing string with a new char (or text). The difference is that they use different approaches: String.Concat(), StringBuilder(), StringBuilder(capacity). There is also a checkpoint interval at which the application writes out a report.
Here is an example of one of those methods.
/// <summary>
/// Concatenation of strings by using the String.Concat method.
/// </summary>
private static void StringConcatenation()
{
    String text = null;
    // Reset the properties related to the checkpoint.
    // Write out the start time.
    OperationReportPreset("String.Concat()");
    // Perform the concatenation the specified number of times.
    for (int i = 0; i < OPERATION_COUNT; i++)
    {
        // Concatenate two strings by using the
        // String.Concat(string, string) static method.
        text = String.Concat(text, textForConcatenation);
        // Write out the checkpoint result.
        CheckpointReport();
        // Each concatenation creates a new object on the heap.
        numberOfCreatedObjects = i + 1;
    }
    // Write out the operation end time.
    OperationReportFinalizer();
}
The other methods are pretty much the same. The difference is that they use StringBuilder.Append() instead of plain concatenation.
Note: You can download the whole project at the end of this blog post. There is a "Source Code" link.
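For readers who do not want to download the project, a comparison of this kind could look roughly like the sketch below. It is not the application used for the numbers in this post: it replaces the checkpoint reporting with a plain Stopwatch, and all names (ConcatBenchmarkSketch, WithConcat, WithDefaultBuilder, WithSizedBuilder, Measure) as well as the operation count of 20000 are assumptions chosen for illustration.

```csharp
using System;
using System.Diagnostics;
using System.Text;

public static class ConcatBenchmarkSketch
{
    // Approach 1: String.Concat in a loop.
    public static string WithConcat(int operationCount, string piece)
    {
        string text = null;
        for (int i = 0; i < operationCount; i++)
        {
            text = String.Concat(text, piece);
        }
        return text;
    }

    // Approach 2: StringBuilder with the default capacity.
    public static string WithDefaultBuilder(int operationCount, string piece)
    {
        var builder = new StringBuilder();
        for (int i = 0; i < operationCount; i++)
        {
            builder.Append(piece);
        }
        return builder.ToString();
    }

    // Approach 3: StringBuilder sized for the final text up front.
    public static string WithSizedBuilder(int operationCount, string piece)
    {
        var builder = new StringBuilder(operationCount * piece.Length);
        for (int i = 0; i < operationCount; i++)
        {
            builder.Append(piece);
        }
        return builder.ToString();
    }

    // Runs the given build function once and returns the elapsed time.
    public static TimeSpan Measure(Func<string> build)
    {
        Stopwatch watch = Stopwatch.StartNew();
        build();
        watch.Stop();
        return watch.Elapsed;
    }

    public static void Main()
    {
        const int operationCount = 20000;   // assumed value, kept small on purpose
        const string piece = "a";           // assumed one-character text

        Console.WriteLine("String.Concat():         " + Measure(() => WithConcat(operationCount, piece)));
        Console.WriteLine("StringBuilder():         " + Measure(() => WithDefaultBuilder(operationCount, piece)));
        Console.WriteLine("StringBuilder(capacity): " + Measure(() => WithSizedBuilder(operationCount, piece)));
    }
}
```

A single run like this is noisy (JIT warm-up, garbage collections), so treat the printed times as a rough indication rather than a precise measurement.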
Execution
After execution I got this output:
Execution with String.Concat() took more than 2 minutes.
It is interesting to observe the dynamics of object creation. The first 20000 concatenations took 1 second, the next 20000 almost 3 seconds, and another 20000 more than 6 seconds. The last 20000 operations took 25 seconds. This means that the bigger the string becomes, the more time each iteration needs, because every call has to allocate memory for a new String object on the heap and copy the whole existing text into it.
BTW, after all iterations there will be 200000 objects created on the heap, of which 199999 are unused and obsolete.
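A back-of-the-envelope calculation shows why the later checkpoints are so much slower. The sketch below uses a simplified model in which the i-th concatenation writes a brand-new string of i x pieceLength characters; a piece length of one character is an assumption, and the names CopyCostEstimate and CharsWrittenInRange are invented for this example.

```csharp
using System;

public static class CopyCostEstimate
{
    // Characters written by operations first..last (1-based, inclusive) when
    // operation i produces a new string of i * pieceLength characters.
    public static long CharsWrittenInRange(int first, int last, int pieceLength)
    {
        long total = 0;
        for (int i = first; i <= last; i++)
        {
            total += (long)i * pieceLength;
        }
        return total;
    }

    public static void Main()
    {
        long firstBlock = CharsWrittenInRange(1, 20000, 1);
        long lastBlock = CharsWrittenInRange(180001, 200000, 1);
        long everything = CharsWrittenInRange(1, 200000, 1);

        Console.WriteLine(firstBlock);   // 200010000
        Console.WriteLine(lastBlock);    // 3800010000
        Console.WriteLine(everything);   // 20000100000
    }
}
```

In this model the last 20000 operations write about 19 times as many characters as the first 20000, which is the same order of magnitude as the measured slowdown from 1 second to 25 seconds.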
Let's take a look at how StringBuilder with the default (unspecified) capacity performed:
The same work was done in less than 1 second, almost instantly. Each time the StringBuilder reaches its capacity, new storage of double the size is allocated. Only 15 such objects were created during the whole run, of which "only" 14 are obsolete.
What about StringBuilder with a specified capacity:
Well, it performed with the same result, under 1 second, but only 1 object was created on the heap.
Report
| Approach | Number of operations | Total duration | Unused objects created for GC |
|---|---|---|---|
| String.Concat() | 200000 | 2 min 22 sec. | 199999 |
| StringBuilder with default capacity | 200000 | less than 1 sec. | 14 |
| StringBuilder with specified capacity | 200000 | less than 1 sec. | 0 |
StringBuilder with the default capacity performs at the same speed as StringBuilder with a specified capacity. But of course we did not count the time the Garbage Collector needs to clean unused objects out of memory. And of course the size of those objects is also very important. "StringBuilder with default capacity" created 14 obsolete objects that have to be removed from memory, and together they take around half a megabyte.
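The figures of 14 obsolete objects and roughly half a megabyte can be reproduced with a simple doubling model. The sketch assumes that the capacity starts at 16 and doubles each time, that 200000 characters are needed in total (one character per operation), and that a char takes 2 bytes; DoublingModel and Estimate are hypothetical names.

```csharp
using System;

public static class DoublingModel
{
    // Starts at initialCapacity and doubles until requiredChars fit.
    // buffers: how many buffers were allocated in total.
    // obsoleteChars: combined size of all buffers that were outgrown.
    public static void Estimate(int initialCapacity, int requiredChars,
                                out int buffers, out long obsoleteChars)
    {
        buffers = 1;
        obsoleteChars = 0;
        long capacity = initialCapacity;
        while (capacity < requiredChars)
        {
            obsoleteChars += capacity;   // the outgrown buffer becomes garbage
            capacity *= 2;
            buffers++;
        }
    }

    public static void Main()
    {
        int buffers;
        long obsoleteChars;
        Estimate(16, 200000, out buffers, out obsoleteChars);

        Console.WriteLine(buffers);                       // 15
        Console.WriteLine(buffers - 1);                   // 14
        Console.WriteLine(obsoleteChars * sizeof(char));  // 524256
    }
}
```

524256 bytes is about 512 KB, which matches the "around half a megabyte" estimate.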
My Advice
My advice is to avoid using String.Concat() or the + operator when building a string from many pieces, as in the loops above. Use StringBuilder with a specified capacity instead. It is a much faster and cheaper approach.