
An updated version of this post is available here: Converting a hundred million integers to strings per second.
In this post I compare the performance of several methods of integer to string conversion in C++:
- sprintf
- std::stringstream
- std::to_string from C++11
- boost::format from the Boost Format library
- boost::lexical_cast
- karma::generate from the Boost Spirit Parser framework
- fmt::format_int, fmt::format, fmt::format_to and fmt::compile from the {fmt} library
- Public-domain ltoa implementation
- decimal_from function suggested by Alf P. Steinbach
To measure the performance I used a
benchmark from Boost Karma.
This benchmark generates 10,000,000 random integers and converts them to strings
using different methods, measuring the conversion time of each. I’ve replaced the
nonportable itoa with sprintf and added several other methods.
Apart from adding new conversion methods, I’ve also noticed that the benchmark
used unnecessary conversion to std::string in some tests
to compensate for string operations in the others. To get more useful results,
I’ve split every such test in two: one that does conversion to std::string and
one that doesn’t. Tests that do unnecessary conversion to std::string have the suffix
+std::string. They are suboptimal, but I’ve included them for completeness.
Here are the results ordered by the time it took a method to convert 10,000,000 integers to strings (obviously smaller is better); time ratio is the ratio of conversion time to the best time:
I consider these results pretty exciting. First, they show that fmt::format_int
is the fastest of the tested methods, about 24% faster than
cppx::decimal_from, the next contender, and a whopping 30x (not 30%) faster than
boost::format. Here’s the code to convert an integer to a string with
fmt::format_int:
fmt::format_int f(42);
// The result can be converted to std::string using f.str() or
// accessed as a C string using f.c_str().
auto s = f.c_str(); // s == "42"
Note that fmt::format_int automatically manages the output buffer, unlike
sprintf and karma::generate, which require error-prone manual memory
management. In the case of karma::generate you can probably use an output
iterator such as back_insert_iterator for automatic memory management, but the
performance will likely suffer.
Another remarkable and surprising (to me) observation is that
sprintf is not particularly good for integer formatting. It is more than 6
times slower than fmt::format_int. One possible reason for this is that
sprintf parses the format string, but so do fmt::format and fmt::format_to
which are 1.8 - 2.6 times faster than sprintf. The good thing is that you
don’t have to use sprintf in an attempt to sacrifice safety for performance.
There are much faster or at least equally slow but safer methods.
One recent addition to the benchmark is fmt::compile, which does constexpr
format string compilation. As can be seen from the results, the combination of
fmt::compile and fmt::format_to is almost as fast as an artisanal
integer-to-string converter optimized by hand (cppx::decimal_from). Thanks to
Louis Dionne for the idea and to Hana Dusíková for the proof-of-concept
implementation of format string compilation.
The benchmark results were obtained on macOS Mojave with Apple LLVM version
10.0.1 (clang-1001.0.46.4) and the following compiler flags: -O3 -DNDEBUG.
Running the benchmark:
$ git clone --recursive https://github.com/fmtlib/format-benchmark.git
$ cd format-benchmark
$ cmake .
$ make
$ ./int-generator-test.py
You can find out more about fmt::format_int and fmt::format in the {fmt}
library repository on GitHub and in the
documentation.
Update:
Since I don’t have ltoa on my platform, I’ve added a basic
public-domain implementation of this function from
here. Let me know in the
comment section if there is a better version available somewhere.
Update 2:
Added the decimal_from function suggested by Alf P.
Steinbach. Like sprintf and ltoa, it requires a user-provided buffer.
Update 3: Inspired by a lesson learned from Alexandrescu’s talk that “no work is less work than some work”, I’ve come up with a faster method of integer to string conversion. Unlike other methods, it does one pass over the digits. All other methods I know do two passes and can be divided into two categories:
- Count digits (pass 1), then convert digits to chars writing from the end of the buffer (pass 2).
- Convert digits to chars writing from the beginning of the buffer (pass 1). Reverse the string in the buffer (pass 2).
Instead of doing this, I just convert digits to chars writing from the end of the buffer and return the pointer to the start of the converted string. In most cases there is some space left at the beginning of the buffer, but that’s fine because the same is true for the second category of methods above; they just have this space at the end of the buffer. This avoids unnecessary copying within a buffer that is often discarded anyway.
I’ve implemented this method in the fmt::format_int class available in the
{fmt} library.
Update 4:
Added side effects to make sure that the code being tested is not optimized
away by a super clever compiler (I wish there existed one). This is implemented
by computing a sum of lengths of all formatted strings using strlen.
The strlen function is used even in cases where std::string::size could be
used to make sure the same extra computation is done for all methods. Note that
since this adds a more or less constant factor to all the methods, high
performers are penalized more.
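A minimal sketch of this kind of side effect, assuming snprintf as a stand-in for the method under test (total_formatted_length is a hypothetical name, not the function used in the benchmark):

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <vector>

// Converts every value and returns the sum of the lengths of the results.
// Because the sum depends on every formatted string, the compiler cannot
// discard the conversions as long as the caller uses the returned value.
std::size_t total_formatted_length(const std::vector<int>& values) {
  std::size_t total = 0;
  char buffer[16];  // assumption: enough for a 32-bit int with sign and null
  for (int value : values) {
    std::snprintf(buffer, sizeof(buffer), "%d", value);
    // strlen is used even where the length is already known, so that every
    // method would pay the same extra cost.
    total += std::strlen(buffer);
  }
  return total;
}
```

Printing or otherwise using the returned total after the timed loop is what keeps the work observable.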
Update 5:
Fixed links to the {fmt} library.
Update 6 (25 Nov 2019):
Added fmt::compile which does constexpr format string compilation and
updated the test results.
Last modified on 2013-09-07