Home Home > 2014 > 10 > 08 > Ruby: Do not use += in loops
Sign up | Login

Ruby: Do not use += in loops

October 8th, 2014 by

Hi,
My motivation for writing this blog post is to have one simple place where I can point everybody using += in a loop. I will describe here why it can kill the performance of any application or library and will also show some measurement to demonstrate it.

Let’s start with practical measurement. I created a simple measurement script which looks like this:

require "benchmark"

N = 100000
Benchmark.bm(15) do |x|
  x.report("Array#+=") do
    arr = []
    N.times { arr += [1] }
  end

  x.report("Array#concat") do
    arr = []
    N.times { arr.concat [1] }
  end

  x.report("String#+=") do
    s = ""
    N.times { s += "1" }
  end

  x.report("String#<<") do
    s = ""
    N.times { s << "1"  }
  end
end

And result for N = 10 000 looks like this:

                      user     system      total        real
Array#+=          0.280000   0.020000   0.300000 (  0.302993)
Array#concat      0.000000   0.000000   0.000000 (  0.002291)
String#+=         0.040000   0.000000   0.040000 (  0.041442)
String#<<         0.000000   0.000000   0.000000 (  0.002437)

and for N = 100 000 looks like this:

                      user     system      total        real
Array#+=         31.410000   6.940000  38.350000 ( 38.635201)
Array#concat      0.030000   0.000000   0.030000 (  0.028933)
String#+=         3.590000   0.020000   3.610000 (  3.635690)
String#<<         0.030000   0.000000   0.030000 (  0.029524)

As can be seen, it does not raise in a linear way and now it is time for some theory to explain why.

When you push elements to the same object, the complexity class of the loop is O(n), since it grows in a linear way. For += it needs to create a copy of the array or string first and then append the new part. The copy complexity class is also O(n) as it depends on array size (the approximation would be O(n/2), which means an O(n) complexity class because half is constant). Copy optimization mechanisms can help, but the complexity class will remain the same. The result is that you need to perform n times the copy operation, which is O(n). So += complexity class is O(n^2). That explains why the times do not raise linearly but quadratically.

Therefore, += should never, ever, be used in loops. If you need to have a new object, then create an empty Array or String before the loop and push new elements into it. It is more efficient. Sure Ruby is not your tool if you are looking for a way to reduce every millisecond, but there is no excuse to use the worse algorithm as it can make application unusable performance-wise.

Both comments and pings are currently closed.

Comments are closed.