Advantage and Disadvantages of String Implementation in JAVA •KANG DADANG Blog

Advantages of the String implementation in JAVA

1. Compilation creates unique strings. At compile time, strings are resolved as far as possible. This includes applying the concatenation operator and converting other literals to strings. So hi7 and (hi+7) both get resolved at compile time to the same string, and are identical objects in the class string pool. Compilers differ in their ability to achieve this resolution. You can always check your compiler (e.g., by decompiling some statements involving concatenation) and change it if needed.

2. Because String objects are immutable, a substring operation doesn’t need to copy the entire underlying sequence of characters. Instead, a substring can use the same char array as the original string and simply refer to a different start point and endpoint in the char array. This means that substring operations are efficient, being both fast and conserving of memory; the extra object is just a wrapper on the same underlying char array with different pointers into that array.

3. Strings are implemented in the JDK as an internal char array with index offsets (actually a start offset and a character count). This basic structure is extremely unlikely to be changed in any version of Java.

4. Strings have strong support for internationalization. It would take a large effort to reproduce the internationalization support for an alternative class.

5. The close relationship with StringBuffers allows Strings to reference the same char array used by the StringBuffer.

This is a double-edged sword. For typical practice, when you use a StringBuffer to manipulate and append characters and data types, and then convert the final result to a String, this works just fine. The StringBuffer provides efficient mechanisms for growing, inserting, appending, altering, and other types of String manipulation. The resulting String then efficiently references the same char array with no extra character copying. This is very fast and reduces the number of objects being used to a minimum by avoiding intermediate objects. However, if the StringBuffer object is subsequently altered, the char array in that StringBuffer is copied into a new char array that is now referenced by the StringBuffer. The String object retains the reference to the previously shared char array. This means that copying overhead can occur at unexpected points in the application. Instead of the copying occurring at the toString( ) method call, as might be expected, any subsequent alteration of the StringBuffer causes a new char array to be created and an array copy to be performed. To make the copying overhead occur at predictable times, you could explicitly execute some method that makes the copying occur, such as StringBuffer.setLength( ). This allows StringBuffers to be reused with more predictable performance.

The disadvantages of the String implementation are

1. Not being able to subclass String means that it is not possible to add behavior to String for your own needs.

2. The previous point means that all access must be through the restricted set of currently available String methods, imposing extra overhead.

3. The only way to increase the number of methods allowing efficient manipulation of String characters is to copy the characters into your own array and manipulate them directly, in which case String is imposing an extra step and extra objects you may not need.

4. Char arrays are faster to process directly.

5. The tight coupling with String Buffer can lead to unexpectedly high memory usage. When StringBuffer toString( ) creates a String, the current underlying array holds the string, regardless of the size of the array (i.e., the capacity of the StringBuffer).

For example, a StringBuffer with a capacity of 10,000 characters can build a string of 10 characters. However, that 10-character String continues to use a 10,000-char array to store the 10 characters. If the StringBuffer is now reused to create another 10-character string, the StringBuffer first creates a new internal 10,000-char array to build the string with; then the new String also uses that 10,000-char array to store the 10 characters. Obviously, this process can continue indefinitely, using vast amounts of memory where not expected.The advantages of Strings can be summed up as ease of use, internationalization support, and compatibility to existing interfaces. Most methods expect a String object rather than a char array, and String objects are returned by many methods. The disadvantage of Strings boils down to inflexibility. With extra work, most things you can do with String objects can be done faster and with less intermediate object-creation overhead by using your own set of char array manipulation methods.