
Bulk Queueing

Mike Perham edited this page Oct 22, 2024 · 13 revisions

The push_bulk method allows us to push a large number of jobs to Redis in a single call. This cuts out the per-job Redis network round-trip latency. I wouldn't recommend pushing more than 1000 jobs per call, but YMMV based on network quality, size of job args, etc. A very large batch can add noticeable Redis command processing latency.

This method takes the same arguments as #push except that args is expected to be an Array of Arrays. All other keys are duplicated for each job. Each job is run through the client middleware pipeline and each job gets its own JID as normal.

It returns an array of JIDs, one for each pushed job. The number of jobs pushed can be less than the number given if any client middleware stops the processing for one or more jobs.
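A quick sketch of the Array-of-Arrays shape (the ids here are made up; `SomeJob` is a placeholder job class):

```ruby
# Build the shape push_bulk expects for "args": one inner array
# of positional arguments per job.
ids = [101, 102, 103]
args = ids.map { |id| [id] }
# args == [[101], [102], [103]]

# Each inner array becomes one SomeJob.perform(id) invocation, e.g.:
# Sidekiq::Client.push_bulk("class" => "SomeJob", "args" => args)
```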

```ruby
200.times do |idx|
  # each loader job will push 1000 jobs of some other type
  LoaderWorker.perform_async(idx)
end

class LoaderWorker
  include Sidekiq::Job

  SIZE = 1000

  def perform(idx)
    # assume we want to create a job for each of 200,000 database records;
    # query for our window of 1000 records (ordered, so the offset windows
    # are stable and don't overlap)
    array_of_args = SomeModel.order(:id).limit(SIZE).offset(idx * SIZE).pluck(:id).map { |el| [el] }

    # push 1000 jobs in one network call to Redis, saving 999 round trips
    Sidekiq::Client.push_bulk("class" => SomeJob, "args" => array_of_args)
  end
end
```
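As a sanity check on the fan-out above (pure Ruby, no Sidekiq or database required): 200 loader jobs, each covering a LIMIT/OFFSET window of 1000 ids, tile 200,000 records exactly:

```ruby
SIZE = 1000

# one half-open range of row offsets per loader job
windows = 200.times.map { |idx| (idx * SIZE...(idx + 1) * SIZE) }

# the windows cover the whole id space with no gaps or overlaps
windows.sum(&:size)  # => 200_000
```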

You can reference the push_bulk code in lib/sidekiq/client.rb:79-111

perform_bulk

Sidekiq 6.3.0 introduced the perform_bulk method, which is a thin wrapper around push_bulk.

```ruby
class LoaderWorker
  include Sidekiq::Job

  def perform
    # zip with no arguments wraps each id in its own array,
    # producing the Array-of-Arrays shape push_bulk expects
    array_of_args = SomeModel.pluck(:id).zip

    # you can change the default batch_size if needed, default is 1000
    SomeJob.perform_bulk(array_of_args, batch_size: 500)
  end
end
```
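The `.zip` call above is a shape trick, not a pairing operation: Ruby's `Array#zip` with no arguments wraps each element in a one-element array, which is exactly the Array-of-Arrays format push_bulk consumes:

```ruby
# Array#zip with no arguments: each element becomes its own inner array.
[5, 6, 7].zip  # => [[5], [6], [7]]
```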

This functionality is only available if you use the Sidekiq::Job API; it does not work with ActiveJob.
