High Performance String Concatenation in ColdFusion

I was spending a bit of time last night preparing for the CF - Java session I am giving at Bentley in a couple of weeks, and I decided to run an actual test on the difference between using a ColdFusion String and using a Java StringBuilder for concatenation. The results were even more striking than I had expected, so I decided to pass them on. (Sorry for the spoiler if you're going to attend my session at Bentley; I will have more than just this to talk about. :)) If you want to learn more about leveraging Java with ColdFusion or a number of other topics, check out RIAUnleashed at Bentley College in Waltham, MA on November 13th.

So, back to my story. I took a string:

appendstring = "There's an old saying in Tennessee -- I know it's in Texas, probably in Tennessee -- that says, fool me once, shame on -- shame on you. Fool me -- you can't get fooled again";

God bless the wisdom of GWB.

And I appended it in a loop using ColdFusion Strings:

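string = ""; // start from an empty ColdFusion string
for (i=1;i<=iterations;i++) {
    // each pass builds a brand-new string from the old one plus appendstring
    string = string & appendstring;
}

And I also appended it in a loop using a Java StringBuilder:

// one mutable buffer that is appended to in place
stringbldr = createObject("java", "java.lang.StringBuilder").init();
for (i=1;i<=iterations;i++) {
    stringbldr.append(appendstring);
}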

The results... With Strings I could append this sentence together 20,000 times in 90 seconds... not bad, right?

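As it turns out, String concatenation does not scale linearly. It looks exponential on the chart, though it is really polynomial: every append copies the whole string built so far, so the total work grows at least with the square of the number of appends. Either way, it deteriorates quickly. Take a look at the chart below: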

Now a closer look at the same data, but extending to a scale meaningful to the StringBuilder:

With a StringBuilder, I could append the same string 1 MILLION times in 5 seconds! I actually had the system running out of memory at about 7 seconds (so be aware of that, I suppose). Trying to get to 1 million iterations with Strings would take roughly "until the end of time"... okay, maybe not that long, but long enough that you would think a process had hung on your server last Saturday.
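To put rough numbers on that gap, here is a small Java sketch that only counts character copies; it does not time anything. It assumes appendstring is about 170 characters long (a round figure chosen for illustration) and ignores the extra copying a StringBuilder does when its buffer grows. The class and method names are made up for this example.

```java
public class ConcatCost {
    // Characters copied when a string of length m is appended n times by
    // building a brand-new String on every pass (immutable concatenation).
    // Pass k has to write all k*m characters of the new result.
    static long copiedByStringConcat(long m, long n) {
        long total = 0;
        for (long k = 1; k <= n; k++) {
            total += k * m;
        }
        return total; // same value as m * n * (n + 1) / 2
    }

    // Lower bound for an append-in-place buffer: each character is written once.
    // Buffer regrowth adds some copying on top, which is ignored here.
    static long copiedByBuilder(long m, long n) {
        return m * n;
    }

    public static void main(String[] args) {
        long m = 170; // assumed length of appendstring, for illustration only
        for (long n : new long[] {1_000, 20_000, 1_000_000}) {
            System.out.println(n + " appends: concat=" + copiedByStringConcat(m, n)
                    + " builder=" + copiedByBuilder(m, n));
        }
    }
}
```

Under these assumptions, doubling the number of appends roughly quadruples the copying for plain Strings but only doubles it for the builder, which is why the two curves part ways so quickly.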

Am I saying ColdFusion sucks at concatenation? No, not at all. The fact is you will have the same results concatenating strings in Java. In fact, a ColdFusion String IS A Java String. The difference is simple: ColdFusion does not have a keyword or object that does high performance string concatenation. There are hacks like appending to an array and then using ArrayToList(), but how awkward is that?
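If you want to check the "same results in Java" claim yourself, a minimal sketch along these lines should do. It is not the harness used for the charts; the class name, the placeholder text, and the iteration count are all assumptions, and the count is kept small so the String version finishes quickly.

```java
public class ConcatTiming {
    // Repeated concatenation: each pass allocates a new String and copies
    // everything accumulated so far.
    static String withString(String piece, int iterations) {
        String result = "";
        for (int i = 1; i <= iterations; i++) {
            result = result + piece;
        }
        return result;
    }

    // Appending in place to one mutable buffer.
    static String withStringBuilder(String piece, int iterations) {
        StringBuilder builder = new StringBuilder();
        for (int i = 1; i <= iterations; i++) {
            builder.append(piece);
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        String piece = "fool me once, shame on -- shame on you. "; // placeholder text
        int iterations = 5_000; // assumed value, small enough to run in a moment

        long start = System.nanoTime();
        String a = withString(piece, iterations);
        long stringNanos = System.nanoTime() - start;

        start = System.nanoTime();
        String b = withStringBuilder(piece, iterations);
        long builderNanos = System.nanoTime() - start;

        System.out.println("same result:   " + a.equals(b));
        System.out.println("String:        " + stringNanos / 1_000_000 + " ms");
        System.out.println("StringBuilder: " + builderNanos / 1_000_000 + " ms");
    }
}
```

Timings from a single run like this are noisy (JIT warm-up, garbage collection), so treat them as a rough indication and raise the iteration count gradually to see the trend.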

Next time you need to make a big string, just use a StringBuilder; it's really no big deal. :) Remember, ColdFusion just makes your Java development faster, because the app you write using ColdFusion actually executes as Java.

Best of luck on whatever project you are working on today, :)

Jason

Update (2:04 PM, October 23, 2009): With some of the interesting comments about other high performance concatenation methods, I decided to give a couple of them a try under my test. The chart below includes String Concatenation, Appending a StringBuilder, Appending an Array and then Flattening it, and using CFSaveContent. Take a look at the chart below:

I think the results were a bit surprising at first, but after some discussion I think I actually get it. First, we know string concatenation is no good; that one is easy. Next, we know StringBuilders are linear, and they did what we previously expected.

Next up, an Array. You can see from the chart that an array is wicked fast, and then the last iteration is costly because that is where it converts to a string. This made sense to me... the difference here is that the array is only consumable as a string after everything is together. I moved the ArrayToList call inside my loop to make that part equivalent, and while it was still on the same order of magnitude as a StringBuilder, it was several times slower. I guess if you have something that you can completely construct as an array and then convert only when you need to consume it as a String, this is a viable option.
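For comparison, the same collect-then-flatten idea looks roughly like this in plain Java. This is an analogue of the array approach, not the CFML I tested; the class and method names are made up, and String.join with an empty delimiter stands in for ArrayToList.

```java
import java.util.ArrayList;
import java.util.List;

public class JoinPieces {
    // Collecting is cheap: the list only stores references to the pieces.
    static List<String> collect(String piece, int iterations) {
        List<String> parts = new ArrayList<>();
        for (int i = 1; i <= iterations; i++) {
            parts.add(piece);
        }
        return parts;
    }

    // The character copying is paid once, when a String is actually needed.
    static String flatten(List<String> parts) {
        return String.join("", parts);
    }

    public static void main(String[] args) {
        List<String> parts = collect("fool me once. ", 1_000); // placeholder text and count
        String result = flatten(parts);
        System.out.println("pieces: " + parts.size() + ", characters: " + result.length());
    }
}
```

Calling flatten inside the loop would copy the whole accumulated text on every pass, which matches the slowdown described above when the conversion was moved into the loop.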

Lastly, CFSaveContent... this one perplexed me, so I dropped a line to an old co-worker at Adobe. His insight was interesting: CFSaveContent essentially captures everything you are writing to the ColdFusion output buffer, which is already incredibly efficient and sitting there waiting for strings (HTML output) to be thrown at it, and then sticks the captured content in a variable. That does make me wonder what would happen if there were contention for the output buffer... but either way, in a single-request environment this is incredibly efficient.
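The general idea of "write to an output buffer, then grab what was written" can be sketched in plain Java as follows. To be clear, this is not how ColdFusion implements CFSaveContent (I do not know those internals); StringWriter and PrintWriter are just standard classes standing in for the output buffer, and the class and method names are made up.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

public class CaptureOutput {
    // Writes a few fragments to an in-memory buffer and returns what was captured.
    static String capture(String name) {
        StringWriter buffer = new StringWriter();   // in-memory character buffer
        PrintWriter out = new PrintWriter(buffer);  // stand-in for the page's output stream
        out.print("Hello ");
        out.print(name);
        out.print(", how are you.");
        out.flush();                                // push everything into the buffer
        return buffer.toString();                   // the captured content as one String
    }

    public static void main(String[] args) {
        System.out.println(capture("Jason"));
    }
}
```

Each print call appends to the same buffer, so the cost pattern is closer to a StringBuilder than to repeated String concatenation.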

Well, very interesting stuff. I guess I would still prefer to use a StringBuilder, it just makes more sense to me to use something designed to manage strings. I can deal with an extra second and a half over 300,000 iterations. It's the hours it would take using a String that I take issue with. :)

That of course is a coding preference, and I encourage you to do whatever is appropriate for your situation.

Cheers again, :) Jason

Comments
StringBuilder and StringBuffer are considerably faster than using ColdFusion strings for concatenation.

The reason for this is that, under the hood, ColdFusion strings are java.lang.String. The String class is immutable, so in order to concat one to another you have to create a new string, copy the existing string and the appended string into it, and then reassign it to the original variable.

StringBuilder and StringBuffer are mutable, so concatenation is done in place rather than through a series of assignments and object creations.

What I didn't realise was that the StringBuilder documentation says: "Where possible, it is recommended that this class be used in preference to StringBuffer as it will be faster under most implementations."
# Posted By Stephen Moretti | 10/23/09 12:18 PM
I recall that someone did a test which used cf's savecontent to do concatenation compared with java's stringbuffer and stringbuilder. The results were surprising in that the savecontent method came out on top.
# Posted By Quan Tran | 10/23/09 12:31 PM
Absolutely correct, Stephen. Immutability of Strings is one of the topics I will touch on in my session, but yes, every concatenation operation on strings requires a new object, and that is the performance penalty. A StringBuilder/StringBuffer will just keep modifying the existing object, so the performance is relatively linear (increasing the buffer causes some "stepping", but it's very minor).

The difference between a StringBuilder and a StringBuffer is that a StringBuffer handles synchronization, which makes it multi-thread safe but slower. It's not that often that you will do a string concatenation across multiple threads, but if you do, then you will need a StringBuffer. A StringBuilder will be faster because it does not attempt to synchronize across threads. StringBuilders were introduced with JDK 1.5, so if you're running an older version of ColdFusion on an older JVM they may not be available.

Anyhow, it is a good tool to have in your toolkit.

Cheers!
Jason
# Posted By Jason Delmore | 10/23/09 12:40 PM
You mentioned using ArrayToList() as a hack to achieve the same performance as using StringBuilder. Can you run your tests again and include this? It would be interesting to see.
# Posted By tony petruzzi | 10/23/09 1:11 PM
I've added information and comments about using Arrays and CFSavecontent for string concatenation. I think I will stick with StringBuilders so that I do not need to catch output. I will usually be concatenating in a component that I do not want any output coming out of, and now with CF9 I will probably write components without any tags in them either. The array method is still pretty interesting, but since the difference is relatively marginal, I still think a StringBuilder makes more sense to me. But again, to each their own. :)
# Posted By Jason Delmore | 10/23/09 2:33 PM
Last I checked, cfsavecontent is using a StringBuffer to append its content together, so if you had the following string in a cfsavecontent block:

Hello #name#, how are you.

It would result in StringBuffer code like:

sb.append("Hello ").append(nameVariableHere).append(", how are you.");

Kevan
# Posted By Kevan Stannard | 10/25/09 8:22 AM
John Whish wrote a similar post a while ago with what appear to be similar results, although your graphs are a nice touch. :)

http://www.aliaspooryorik.com/blog/index.cfm/e/pos...

If it comes down to preference, I'd stick with CF arrays over using a StringBuilder since I consider myself a CF guy and not a java guy.
# Posted By Tony Nelson | 10/25/09 9:39 PM
Hi Kevan,

I'm not sure where you are getting that from. The CFSaveContent tag itself is one of the tags that is actually written in CFML (along with dump, trace, and cache). The CFML is very straightforward... grab the stuff generated from the tag. Now, the underlying implementation may be doing essentially what you are saying. I don't know... I will try to coax one of the CF developers to comment. :)

@Tony: My point is a CF guy IS a Java guy. Don't be a Java hater. :)

Jason
# Posted By Jason Delmore | 10/28/09 12:11 PM