Saturday 11 August 2012

Java String Concatenation and Performance


Java offers four common ways to concatenate strings:



  • Concatenation Operator (+)
  • String concat method - concat(String str)
  • StringBuffer append method - append(String str)
  • StringBuilder append method - append(String str)

Let us check which one is most efficient for string concatenation.

Concatenation Operator (+)

The quick and dirty way to concatenate strings in Java is to use the concatenation operator (+). This yields reasonable performance when you combine a small, fixed number of strings, say two or three. But if you concatenate n strings in a loop, every iteration copies the whole result built so far, so the total work grows roughly with the square of n. Because String is immutable, each (+) produces a new String object, which makes (+) the worst choice for a large number of concatenation operations.
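To make the growth concrete, here is a rough sketch that counts how many characters end up in the intermediate strings when a result is grown one character at a time. The class and method names are hypothetical, and the model is a simplification: it counts only the characters of each newly built String and ignores any extra copying inside the temporary buffer.

```java
// Hypothetical helper, not part of the JDK or the original post.
class ConcatCost {

    // When a string is grown with result = result + "*", the i-th step
    // produces a new String of length i. Summing those lengths gives
    // 1 + 2 + ... + n = n * (n + 1) / 2 characters written in total.
    static long charsCopiedByRepeatedPlus(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i; // the new String built in step i holds i characters
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(charsCopiedByRepeatedPlus(10));    // 55
        System.out.println(charsCopiedByRepeatedPlus(50000)); // 1250025000
    }
}
```

Appending into a single buffer writes each character about once, so the same 50000 appends cost on the order of 50000 character writes rather than over a billion.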

Up to JDK 1.4 the compiler implements the + operator with a StringBuffer; from JDK 1.5 onward it uses a StringBuilder. After the concatenation, the resulting StringBuffer or StringBuilder is converted to a String. That is, internally the statement str3 = str1 + str2 is executed as:


 str3 = new StringBuffer().append(str1).append(str2).toString();

When + is used for concatenation, several steps are involved in executing the statement str = str + "*":
  1. A StringBuffer (or StringBuilder) object is created.
  2. The contents of str are copied into the newly created buffer.
  3. The "*" is appended to the buffer (the concatenation).
  4. The result is converted back to a String object.
  5. The str reference is made to point at that new String.
  6. The old String that str previously referenced is left unchanged; once nothing refers to it any more, it becomes eligible for garbage collection.

When concatenating constant strings it is perfectly safe to use the "+" operator. Constant concatenation means that all of the substrings making up the final string are known at compile time, for example string literals. In that case we should use the "+" operator, because the compiler performs the concatenation at compile time, with no run-time penalty.
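One way to observe compile-time concatenation is to compare references rather than contents. The sketch below uses hypothetical class and method names, and it uses == on strings deliberately: it relies on the language rule that constant string expressions are interned, while a concatenation performed at run time produces a newly created String.

```java
// Hypothetical demo class; == is used on purpose to compare references.
class ConstantFoldingDemo {

    // Both operands are literals, so the compiler joins them and the
    // result is the same interned object as the literal "Hello, World".
    static boolean literalsAreFolded() {
        String folded = "Hello, " + "World";
        return folded == "Hello, World";
    }

    // greeting is not final, so the expression is not a compile-time
    // constant and the concatenation happens at run time.
    static boolean variablesAreFolded() {
        String greeting = "Hello, ";
        String built = greeting + "World";
        return built == "Hello, World";
    }

    public static void main(String[] args) {
        System.out.println(literalsAreFolded());  // true
        System.out.println(variablesAreFolded()); // false
    }
}
```

In ordinary code, string contents should still be compared with equals(); the reference comparison here only serves to show where the concatenation happens.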


StringBuilder and StringBuffer


When we use dynamic strings, the compiler cannot precompute the concatenation result for us; instead, it generates code that uses a StringBuilder. This is not too bad if we perform the operation only once or twice, but if we loop over such code again and again it has a dramatic effect on performance. Consider the following example:

String result = "";
for (int t = 0; t < 10000; ++t) {
    result = result + "*";
}

The compiler will generate something similar to this:

String result = "";
for (int t = 0; t < 10000; ++t) {
    // A new StringBuilder and a new String are created on every iteration
    result = new StringBuilder(result).append("*").toString();
}

Obviously this is not the most efficient way to get our task done. The code generated by the compiler instantiates too many StringBuilder objects, invokes too many methods, and instantiates too many String objects. Using StringBuffer/StringBuilder we can do it more efficiently.


StringBuilder sb = new StringBuilder();
for (int t = 0; t < 10000; ++t) {
    sb.append("*");
}
String result = sb.toString();

So what is the difference between StringBuilder and StringBuffer? StringBuffer is a synchronized class: all of its methods are synchronized, so it is the one to use in a multithreaded environment, when more than one thread accesses the same StringBuffer instance. String concatenation is usually done by a single thread, and in that scenario StringBuilder should be used, because it avoids the synchronization overhead.
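As an illustration of the multithreaded case, the following sketch lets two threads append to one shared StringBuffer. The class and method names are hypothetical, and the lambda syntax assumes Java 8 or later. Because append is synchronized, no append is lost and the final length is exactly twice the per-thread count.

```java
// Hypothetical demo class, assuming Java 8+ for the lambda.
class SharedBufferDemo {

    // Two threads each append perThread characters to the same buffer.
    static int appendFromTwoThreads(int perThread) {
        StringBuffer shared = new StringBuffer();
        Runnable task = () -> {
            for (int i = 0; i < perThread; i++) {
                shared.append('*'); // synchronized, so appends never interleave
            }
        };
        Thread first = new Thread(task);
        Thread second = new Thread(task);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt status
            throw new IllegalStateException("interrupted while waiting", e);
        }
        return shared.length();
    }

    public static void main(String[] args) {
        System.out.println(appendFromTwoThreads(10000)); // 20000
    }
}
```

Replacing the StringBuffer with a StringBuilder in this sketch would remove that guarantee: StringBuilder is documented as unsafe for use by multiple threads, so the shared instance would need external synchronization.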

There is a Third Way - String.concat()


The java.lang.String concat() method is another way to concatenate strings. This method should be pretty efficient when concatenating a small number of strings (I usually use it when concatenating two strings; for more than two I use StringBuilder). The concat() method allocates a char buffer of exactly the size of the destination string, fills it from the underlying buffers of the two original strings (using System.arraycopy(), which is considered very efficient), and returns a new string based on the newly allocated buffer. Here is the method code (taken from JDK 1.6):

public String concat(String str) {
    int otherLen = str.length();
    if (otherLen == 0) {
        return this;
    }
    char buf[] = new char[count + otherLen];
    getChars(0, count, buf, 0);
    str.getChars(0, otherLen, buf, count);
    return new String(0, count + otherLen, buf);
}
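A short usage sketch may help show how concat() behaves from the caller's side. The class and method names are hypothetical. It illustrates three points visible in or implied by the code above: the receiver is not modified, an empty argument returns the receiver itself, and a null argument fails (str.length() is called on it), whereas the + operator would append the text "null".

```java
// Hypothetical demo class for String.concat().
class ConcatDemo {

    // Joins exactly two strings, the case concat() is suited for.
    static String joinTwo(String left, String right) {
        return left.concat(right);
    }

    public static void main(String[] args) {
        String base = "foo";
        System.out.println(joinTwo(base, "bar"));    // foobar
        System.out.println(base);                    // foo (receiver unchanged)
        System.out.println(base.concat("") == base); // true (same object returned)

        String missing = null;
        System.out.println(base + missing);          // foonull
        try {
            base.concat(missing);
        } catch (NullPointerException e) {
            System.out.println("concat(null) throws NullPointerException");
        }
    }
}
```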


Example Java Source Code For String Concatenation


Now let’s implement each of the four methods mentioned in the article. Nothing fancy here, plain implementations of (+), String.concat(), StringBuffer.append() & StringBuilder.append().


class Clock {

 private final long startTime;

 public Clock() {
  startTime = System.currentTimeMillis();
 }

 public long getElapsedTime() {
  return System.currentTimeMillis() - startTime;
 }
}

public class StringConcatenationExample {

 static final int N = 50000;

 public static void main(String args[]) {

  // Concatenation using + operator

  Clock clock = new Clock();

  // String to be used for concatenation
  String string1 = "";
  for (int i = 1; i <= N; i++) {

   // String concatenation using +
   string1 = string1 + "*";
  }
  // Recording the time taken to concatenate
  System.out.println("Using + Elapsed time: " + clock.getElapsedTime());

  // Concatenation using String.concat() method.

  clock = new Clock();
  String string2 = "";

  for (int i = 1; i <= N; i++) {

   // String concatenation using String.concat()
   string2 = string2.concat("*");
  }
  // Recording the time taken to concatenate
  System.out.println("Using String.concat Elapsed time: "
    + clock.getElapsedTime());

  // Concatenation using StringBuffer

  clock = new Clock();
  StringBuffer stringBuffer = new StringBuffer();
  for (int i = 1; i <= N; i++) {

   // String concatenation using StringBuffer
   stringBuffer.append("*");
  }
  String string3 = stringBuffer.toString();
  System.out.println("Using StringBuffer Elapsed time: "
    + clock.getElapsedTime());

  // Concatenation using StringBuilder

  clock = new Clock();
  StringBuilder stringBuilder = new StringBuilder();
  for (int i = 1; i <= N; i++) {

   // String concatenation using StringBuilder
   stringBuilder.append("*");
  }
  String string4 = stringBuilder.toString();
  System.out.println("Using StringBuilder Elapsed time: "
    + clock.getElapsedTime());

 }
}


The output is:


Using + Elapsed time: 1578
Using String.concat Elapsed time: 764
Using StringBuffer Elapsed time: 5
Using StringBuilder Elapsed time: 3

From the performance figures above you can see that concatenation using StringBuilder is the most efficient of the four methods.
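The figures above come from a single run timed with System.currentTimeMillis(), so small values such as 3 and 5 ms are close to the clock's resolution. As a hedged sketch of a slightly more careful measurement, the helper below repeats a task several times and keeps the best time measured with System.nanoTime(). The class and method names are hypothetical, Java 8 or later is assumed for the lambda, and this is still not a rigorous benchmark: the JIT compiler may optimize the task in ways that distort the numbers.

```java
// Hypothetical timing helper, not a rigorous benchmark harness.
class TimingSketch {

    // Builds a string of n asterisks with a StringBuilder.
    static String buildWithBuilder(int n) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < n; i++) {
            sb.append('*');
        }
        return sb.toString();
    }

    // Runs the task several times and returns the fastest run in
    // nanoseconds; early runs also serve as warm-up for the JIT.
    static long bestOfNanos(Runnable task, int runs) {
        long best = Long.MAX_VALUE;
        for (int r = 0; r < runs; r++) {
            long start = System.nanoTime();
            task.run();
            best = Math.min(best, System.nanoTime() - start);
        }
        return best;
    }

    public static void main(String[] args) {
        long nanos = bestOfNanos(() -> buildWithBuilder(50000), 5);
        System.out.println("StringBuilder best of 5 runs (ns): " + nanos);
    }
}
```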

Conclusion

For simple operations, String.concat is preferable to (+) if we do not want to create a new StringBuffer/StringBuilder instance. But for a large number of concatenations we should not use the concatenation operator (+): as the performance results show, it brings the application to its knees and spikes CPU utilization. For the best performance, the clear choice is StringBuilder, as long as you do not need thread safety or synchronization.
