The StringBuffer
/StringBuilder
classes in Java are primarily used to modify String values without having to initialize a new String object everytime.
Is there a specific reason, it doesn't use a LinkedList
instead of an Char Array as it's underlying data structure?
Inserting a char into a Array will always result in a O(n) time to copy all elements to the next index, where as that that be done in O(1) time in case of a LinkedList
.
StringBuilder
has random access operations such as charAt
or substring
, which would be extremely slow with a linked list.
In fact, array lists aren't particularly slower than linked lists even when it comes to other operations like insertion. Typically StringBuilder
won't be used to create strings of a million characters so it's unlikely that we need to resize the buffer too many times.
I have to correct you that insertion at the end always requires a O(n)
copy of elements. The worst-case is indeed O(n)
but the amortized complexity is O(1)
because we don't just allocate one element at a time. When the array isn't big enough to make another insertion, most implementations double the size of the array.
Insertions in the middle always require the copy of elements at the right of the insertion, so yes it is pretty slow but it's not a typical use-case for a StringBuilder
where most insertions happen at the end. Also, linked lists have the same average complexity on insertions in the middle since they first have to reach the right node by iterating the list.
Another advantage of arrays compared to linked list is data locality. Array lists are faster to iterate than linked lists because when the processor loads a piece of memory around an element of an array, it will also cache some of the neighbours of this element which will therefore be returned faster. On the other hand, all elements of a linked list may live in very distant memory locations, which is not cache-friendly.
Because we double the size of the array of every resize, dynamic arrays can have a pretty big memory footprint (but at least we won't need to copy elements too often). Linked lists also have a fairly large memory footprint since they require one additional reference and a pointer for each element while elements are compactly stored in an array. On average, I'd say a typical array list will have a smaller memory footprint than a linked list but I may be wrong. This is particularly the case for primitive types - such as char
- because linked lists require wrapper objects (at least in Java since there are no pointers) whereas we can use much more compact primitive arrays.
Finally, I used StringBuilder
in my answer instead of StringBuffer
because this is the recommended implementation for most use-cases. StringBuffer
is only preferrable when thread-safety is a hard requirement. Otherwise, StringBuilder
will have better performance.
PS: Python's most prominent data structure is list
and guess what... it's implemented with a dynamic array! Resizable arrays are very often a better choice than linked lists. The only case in which a linked list is notably more performant is when the application focuses on elements close to the head of the list and makes frequent insertions or deletions in this area.