Autoscaling with batcher #3713

cceyda · 2024-05-27T04:05:17Z

How is scaleTarget calculated when the batcher is used? Is it based on the shape of the input "instances", or on the number of batched requests?

Let's say I'm using the batcher with a (max) batch size of 32 and the autoscaler with scaleMetric: rps (or concurrency).
If my scaleTarget is 64, am I getting (assuming perfect batching conditions):

64 calls to preprocess with "instances" shape (32, ...), effectively making it 32*64=2048 instances to process in total (on one pod)

OR

2 calls to preprocess with "instances" shape (32, ...), effectively making it 32*2=64 instances to process in total (on one pod)

If it is the second case, then what happens if scaleTarget < maxBatchSize?
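
To make the two readings concrete, here is a small arithmetic sketch. It is only a calculation, not a description of what KServe actually does: the function name `instances_per_pod` and the dictionary keys are hypothetical, and it assumes perfect batching with every incoming request carrying a single instance.

```python
def instances_per_pod(scale_target: int, max_batch_size: int) -> dict:
    """Compare the two readings of scaleTarget described in the question.

    Pure arithmetic under perfect batching. It does not query KServe and
    does not say which reading matches the real autoscaler behaviour.
    """
    # Reading 1: the target counts batched calls to preprocess,
    # so each counted unit carries a full batch of instances.
    calls_if_batches = scale_target
    instances_if_batches = scale_target * max_batch_size

    # Reading 2: the target counts individual requests before batching
    # (assumption: one instance per request), so the batcher folds them
    # into fewer calls. The result can be fractional.
    instances_if_requests = scale_target
    calls_if_requests = scale_target / max_batch_size

    return {
        "target_counts_batches": {
            "preprocess_calls": calls_if_batches,
            "instances": instances_if_batches,
        },
        "target_counts_requests": {
            "preprocess_calls": calls_if_requests,
            "instances": instances_if_requests,
        },
    }


# The numbers from the question: batch size 32, scaleTarget 64.
result = instances_per_pod(scale_target=64, max_batch_size=32)
print(result)

# A target smaller than the batch size, as in the follow-up question.
small = instances_per_pod(scale_target=16, max_batch_size=32)
print(small)
```

Under the second reading, a target below the batch size gives fewer than one full batch per pod (0.5 of a call for a target of 16), which is exactly the case the follow-up question asks about.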

Replies: 0 comments

Select a reply

Autoscaling with batcher #3713

cceyda May 27, 2024

Replies: 0 comments

cceyda
May 27, 2024