How is the scaleTarget calculated when the auto batcher is used? Is it based on the shape of the input 'instances', or on the batched request count?
Let's say I'm using the batcher with a (max) batch size of 32 and the autoscaler with scaleMetric: rps (or concurrency).
Then, if my scaleTarget is 64, am I getting (assuming perfect batching conditions):
64 calls to preprocess with "instances" shape (32,...), effectively making it 32*64 instances to process in total (on one pod)
OR
2 calls to preprocess with "instances" shape (32,...), effectively making it 32*2=64 instances to process in total (on one pod)
If it is the second case, then what happens if scaleTarget < maxBatchSize?
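To make the two readings concrete, here is a minimal sketch of the arithmetic only. It does not describe how the autoscaler or batcher actually behave. The function name `per_pod_load` and the flag `target_counts_batches` are hypothetical, and it assumes each incoming request carries a single instance and that batches fill perfectly.

```python
import math

def per_pod_load(scale_target, max_batch_size, target_counts_batches):
    """Return (preprocess_calls, total_instances) on one pod.

    Hypothetical model of the two readings in the question, assuming
    one instance per incoming request and perfect batching.
    """
    if target_counts_batches:
        # Reading 1: scaleTarget counts batched requests, each a full batch.
        calls = scale_target
        instances = scale_target * max_batch_size
    else:
        # Reading 2: scaleTarget counts incoming requests before batching.
        instances = scale_target
        calls = math.ceil(scale_target / max_batch_size)
    return calls, instances

print(per_pod_load(64, 32, target_counts_batches=True))   # (64, 2048)
print(per_pod_load(64, 32, target_counts_batches=False))  # (2, 64)
print(per_pod_load(16, 32, target_counts_batches=False))  # (1, 16)
```

The last call is the scaleTarget < maxBatchSize case: under the second reading, a pod running at its target load would only ever see a single partial batch of 16, never a full batch of 32.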