in_batches_of

Rails is nice. Rails is REAL nice, but you would expect me to say that: I am a Rails developer.

One thing that Rails is not nice about is efficiency.
It's elegant, it's compact, but the simplest things can have unforeseen consequences unless you understand, at a fundamental level, what's going on.

The persistence layer of Rails is ActiveRecord (specifically, most of it is tied up in the ActiveRecord::Base class).

When you want to do something with every row in a table, or with a subset of rows, you are tempted to write something like this:

ModelClass.find(:all).each do |item|
  ... do stuff ...
end

Take it from me, that would be bad.

What it would do is instantiate every row in the table as an object and build an array of those objects in memory. That's fine if you have 10, 100 or 1,000 rows in that table, but what happens when you have 100,000 or a million? Well, we would be in for a long wait, and the process might well run out of memory before it finishes.

In the past I have used an idiom of handling the operation in batches, and written code for it each time. But, in the interests of DRY methods, and with the assistance of Baz (link just as soon as I get his blog address), I present another way.

ModelClass.in_batches_of(1000) do |item|
  ... do stuff ...
end

OK! So, I cheated. Here's how to do it.

class ActiveRecord::Base

  # Execute the block given for every object in the selection specified by options.
  # Note: the block must perform some action which will remove the object from
  # the selection, or infinite looping will ensue.
  def self.in_batches_of(count = 100, options = {})
    # no block would mean infinite loop.
    return unless block_given?

    options_with_limit = options.merge(:limit => count)

    # get the first block
    batch = find(:all, options_with_limit)

    while batch.size > 0
      batch.each { |item| yield(item) }

      # get the next block
      batch = find(:all, options_with_limit)
    end
  end

end

Now, there is one limitation: the block of code given must stop each row it handles from matching the options a second time (by updating or deleting it, say). Without this, the same batch comes back on every pass and the code loops forever.
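To make that limitation concrete, here is a small sketch of the same loop shape run against an in-memory stand-in rather than a real table. The `job` struct, the `processed` flag and the `find_unprocessed` lambda are all made up for illustration; the lambda simply plays the part of a `find(:all, ...)` whose options only match unprocessed rows.

```ruby
# Hypothetical in-memory stand-in for a table; this is not ActiveRecord.
job  = Struct.new(:id, :processed)
rows = (1..10).map { |i| job.new(i, false) }

# Plays the role of find(:all, options_with_limit) when the options
# select only rows that have not been processed yet.
find_unprocessed = lambda { |limit| rows.reject(&:processed).first(limit) }

seen  = []
batch = find_unprocessed.call(3)

while batch.size > 0
  batch.each do |item|
    seen << item.id
    # This is the important line: it drops the row out of the selection.
    # Without it, the next fetch returns the same three rows forever.
    item.processed = true
  end

  batch = find_unprocessed.call(3)
end
```

Each pass picks up the next three unprocessed rows, so the loop ends once every row has been flagged.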

It would be simple to extend the above to remove this stricture, but for my purposes at the moment I leave this as an exercise for the reader.

I am almost certain that the resultant code would be nowhere near as cute.
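For the curious, here is one rough way the exercise might go; it is a sketch and not something I have run against a real database. The idea is to walk the rows in id order and remember the last id seen, so the block no longer has to change the rows. The names `each_in_batches`, `fetch`, `row` and `table` are invented for this example, and it assumes ids are positive integers. Against ActiveRecord, the fetch step would presumably be something like `find(:all, :conditions => ["id > ?", last_id], :order => "id", :limit => limit)`.

```ruby
# Sketch: yield every row in id order, one batch at a time.
# `fetch` is a hypothetical callable: fetch.call(last_id, limit) must return
# up to `limit` rows whose id is greater than last_id, ordered by id.
def each_in_batches(fetch, count = 100)
  return unless block_given?

  last_id = 0 # assumes ids are positive integers
  loop do
    batch = fetch.call(last_id, count)
    break if batch.empty?

    batch.each { |item| yield(item) }

    # Remember where we got to, so the next fetch starts after this row.
    last_id = batch.last.id
  end
end

# In-memory stand-in for a table, purely for illustration.
row   = Struct.new(:id, :name)
table = (1..7).map { |i| row.new(i, "row #{i}") }

fetch = lambda do |last_id, limit|
  table.select { |r| r.id > last_id }.sort_by(&:id).first(limit)
end

visited = []
each_in_batches(fetch, 3) { |r| visited << r.id } # the block does not touch the rows
```

The trade-off is that the caller's own ordering is no longer respected, since the walk depends on ordering by id.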
