Client-side load balancing #4530

Open
swankjesse opened this issue Jan 4, 2019 · 20 comments
Labels
enhancement Feature not a bug
Milestone

Comments

@swankjesse
Collaborator

We’re using OkHttp for server-to-server connections on high-latency HTTP/2 connections. I’d like some load-balancing features:

  • Multiple HTTP/2 connections to the same hostname/IP address. We currently have code that aggressively tries to deduplicate connections to the same destination. But our destinations are virtual server-side load balancers that front application servers. We want to cycle out these application servers without clients having to wait for TLS handshakes.

  • Preallocated connections. Seeding the connection pool before we need it, so when we do we don’t have to pay for a TLS handshake.

  • Client-side connection TTLs. Long-lived connections can hide problems. By forcing connections to be recreated at a regular interval (say, 1 hour) we can detect performance problems (thundering herds, etc.) before they bite.
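
A minimal sketch of the client-side TTL idea, using only the JDK. `ConnectionTtlPolicy` and its methods are hypothetical names, not an OkHttp API. The jitter is an added assumption that spreads expiry so connections opened together are not all recreated at the same instant; pass `0.0` to get a fixed interval.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ThreadLocalRandom;

/** Hypothetical policy: decides when a pooled connection is too old to reuse. */
final class ConnectionTtlPolicy {
  private final Duration baseTtl;
  private final double jitterFraction; // assumption: 0.1 spreads expiry over +/-10%

  ConnectionTtlPolicy(Duration baseTtl, double jitterFraction) {
    if (jitterFraction < 0.0 || jitterFraction >= 1.0) {
      throw new IllegalArgumentException("jitterFraction must be in [0, 1)");
    }
    this.baseTtl = baseTtl;
    this.jitterFraction = jitterFraction;
  }

  /** Picks a per-connection deadline around createdAt + baseTtl. */
  Instant deadlineFor(Instant createdAt) {
    // Random offset in [-jitterFraction, +jitterFraction).
    double offset = (ThreadLocalRandom.current().nextDouble() * 2.0 - 1.0) * jitterFraction;
    long ttlMillis = Math.round(baseTtl.toMillis() * (1.0 + offset));
    return createdAt.plusMillis(ttlMillis);
  }

  /** True once the deadline has been reached; the caller would then stop reusing the connection. */
  static boolean isExpired(Instant deadline, Instant now) {
    return !now.isBefore(deadline);
  }
}
```

A pool would compute the deadline once when a connection is established and check `isExpired` before handing the connection to a new call.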

@swankjesse swankjesse added the enhancement Feature not a bug label Jan 4, 2019
@yschimke
Collaborator

yschimke commented Jan 4, 2019

+1 to preallocated connections. I've done horrors like this before:

https://github.com/yschimke/okurl/blob/master/src/test/kotlin/commands/uberprices.kts#L18

warmup("https://api.mapbox.com/robots.txt", "https://api.uber.com/robots.txt")

@yschimke
Collaborator

yschimke commented Jan 4, 2019

How intelligent do you plan the load balancing strategies to be? Or just round robin? Do you want resilience by connecting to multiple backends when multiple IPs are available?

@swankjesse swankjesse added this to the Backlog milestone Feb 18, 2019
@swankjesse
Collaborator Author

I think we probably want to make the load balancing strategies pluggable. I’m generally not into pluggable things (why not just include a good implementation in the box!) but in this case I think being pluggable gives us options we can’t build otherwise.

One strategy I really like is from what gRPC calls an external load balancer.
https://grpc.io/grpc/core/md_doc_load-balancing.html
https://grpc.io/blog/loadbalancing
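
To make "pluggable" concrete, here is a rough sketch of what a per-call strategy interface could look like, with round robin as the simplest implementation. `LoadBalancingStrategy` and `RoundRobinStrategy` are hypothetical names invented for illustration; nothing here is an existing or planned OkHttp API.

```java
import java.net.InetSocketAddress;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical plug-in point: picks a backend for each call. Not an OkHttp API. */
interface LoadBalancingStrategy {
  InetSocketAddress select(List<InetSocketAddress> backends);
}

/** Simplest possible strategy: cycle through the backends in order. */
final class RoundRobinStrategy implements LoadBalancingStrategy {
  private final AtomicInteger next = new AtomicInteger();

  @Override public InetSocketAddress select(List<InetSocketAddress> backends) {
    if (backends.isEmpty()) {
      throw new IllegalArgumentException("no backends");
    }
    // floorMod keeps the index non-negative even after the counter overflows.
    int index = Math.floorMod(next.getAndIncrement(), backends.size());
    return backends.get(index);
  }
}
```

A strategy that takes advice from an external balancer, as in the gRPC documents linked above, could implement the same interface and return whatever backend the external service recommends.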

@esiqveland

I am interested in seeing this happen, primarily for increased resilience by always being connected to multiple backends, spreading requests among them.

  1. gRPC does load balancing per request. Is this something that could fit into okhttp?

  2. The current Dns interface only lets you return a List<InetAddress>. This makes it a lot less practical to implement something like a DNS SRV lookup for a service, which results in an address and a port. Now changing Dns is probably a long shot, but an (address, port) pair is something I want you to consider when looking into client-side load balancing.
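
As an illustration of the (address, port) point, the sketch below models a resolver that returns SRV-style records and converts them to `InetSocketAddress` endpoints. `SrvRecord`, `ServiceResolver`, and `SrvEndpoints` are hypothetical types, not part of OkHttp. Ordering uses only the SRV priority field (lower values are preferred per RFC 2782); weight-based selection within a priority is left out for brevity.

```java
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Hypothetical value type mirroring the fields of a DNS SRV record. */
final class SrvRecord {
  final int priority;
  final int weight;
  final int port;
  final String target;

  SrvRecord(int priority, int weight, int port, String target) {
    this.priority = priority;
    this.weight = weight;
    this.port = port;
    this.target = target;
  }
}

/** Hypothetical resolver contract that, unlike a plain address lookup, carries a port. */
interface ServiceResolver {
  List<SrvRecord> lookup(String serviceName);
}

final class SrvEndpoints {
  private SrvEndpoints() {}

  /** Orders records by ascending priority and converts them to unresolved host:port endpoints. */
  static List<InetSocketAddress> toEndpoints(List<SrvRecord> records) {
    List<SrvRecord> sorted = new ArrayList<>(records);
    sorted.sort(Comparator.comparingInt((SrvRecord r) -> r.priority));
    List<InetSocketAddress> result = new ArrayList<>();
    for (SrvRecord r : sorted) {
      result.add(InetSocketAddress.createUnresolved(r.target, r.port));
    }
    return result;
  }
}
```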

@swankjesse
Collaborator Author

@esiqveland

  1. Yes, I wanna do per-call load balancing.
  2. I’d like the load balancer to have the option of taking advice from a DNS service.

@yschimke
Collaborator

Is this still a 4.1 target?

@swankjesse swankjesse modified the milestones: 4.1, Backlog Jul 22, 2019
@swankjesse
Collaborator Author

Dropped to backlog. 4.1 should be small!

@yschimke
Collaborator

Also related to #5424

Servers may indicate they don't want coalescing.

@yschimke
Collaborator

Implicit requirement: Need visibility into effectiveness of these controls, whether servers are misconfigured and working against it, and whether there is pathological behaviour e.g. connection churn due to connection pool mis-sizing.

Ideally APIs would also allow for handling two cases

  • open connections on preferred networks e.g. company wifi, even when existing 3g connection may be present.
  • actively managing connections e.g. call noNewStreams on connections that should not be used for new streams.

@yschimke
Collaborator

@swankjesse Any interest in kicking off discussion?

@swankjesse
Collaborator Author

Yes! Gimme a couple days to write up some use cases. Can you write some too? I'd like to start by scoping what problems we're interested in addressing, and then move to modeling a solution.

@yschimke
Collaborator

Actor: Developer working on Android network performance
Goal: Gather performance related metrics and tune connection pooling

For my app, I want to understand performance related metrics. Particularly before and after tuning the connection pool or making server related changes.

  • Rate of connection churn
  • Length of retained reused connections
  • Connection success rate stats e.g. by response code, by exceptions
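
A small sketch of how the first two metrics might be aggregated, using only the JDK. `ConnectionPoolMetrics` and its `on*` hooks are hypothetical; they would need to be driven by whatever connection events the client exposes (OkHttp's `EventListener` is one plausible source). Success rates by response code or exception are omitted to keep the sketch short.

```java
import java.time.Duration;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical metrics sink for connection pool behaviour. */
final class ConnectionPoolMetrics {
  private final AtomicLong opened = new AtomicLong();
  private final AtomicLong closed = new AtomicLong();
  private final AtomicLong closedLifetimeMillis = new AtomicLong();
  private final AtomicLong calls = new AtomicLong();
  private final AtomicLong callsOnReusedConnection = new AtomicLong();

  void onConnectionOpened() {
    opened.incrementAndGet();
  }

  void onConnectionClosed(Duration lifetime) {
    closed.incrementAndGet();
    closedLifetimeMillis.addAndGet(lifetime.toMillis());
  }

  void onCall(boolean reusedConnection) {
    calls.incrementAndGet();
    if (reusedConnection) callsOnReusedConnection.incrementAndGet();
  }

  /** Connections opened per call: near 0 means good reuse, near 1 means heavy churn. */
  double churnPerCall() {
    long c = calls.get();
    return c == 0 ? 0.0 : (double) opened.get() / c;
  }

  /** Mean lifetime of connections that have already been closed. */
  Duration averageLifetime() {
    long n = closed.get();
    return n == 0 ? Duration.ZERO : Duration.ofMillis(closedLifetimeMillis.get() / n);
  }

  /** Fraction of calls that were served on an already-open connection. */
  double reuseRatio() {
    long c = calls.get();
    return c == 0 ? 0.0 : (double) callsOnReusedConnection.get() / c;
  }
}
```

Taking a snapshot of these values before and after a pool or server change gives the comparison described above.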

@yschimke
Collaborator

yschimke commented Sep 21, 2019

Actor: Developer working on Android app functionality
Goal: Debug specific connection problems and work around them without disposing of the client

If I have issues with certain phones and faulty connections, I'd like ways to debug and report problems: which network type (wifi, cell) a connection is over, and a way to close a connection that is persistently failing (ideally this is fixed by OkHttp without intervention).

Expose enough state to allow hooking tightly into developer tooling like fbflipper, chucker, to provide visibility into app performance.

@yschimke
Collaborator

yschimke commented Sep 21, 2019

Actor: Developer working on Android app functionality
Goal: Making client functional decisions based on network types and availability.

I only want to download video over fast connections, or hit an intranet API when over a corporate wifi. Close an existing low latency stream and proactively reconnect on a fast stream before first use.

Hook into Android network events to optimise. Relate this to app foreground state, or battery usage.

@yschimke
Collaborator

yschimke commented Sep 21, 2019

> Yes! Gimme a couple days to write up some use cases. Can you write some too? I'd like to start by scoping what problems we're interested in addressing, and then move to modeling a solution.

These can be different requirements, wholly distinct features, but they seem best addressed cohesively, e.g. when do you want to proactively open and hold onto connections? When do you want to close/avoid persistent connections to save battery in the background? When do you want your app to start downloading and caching content opportunistically? When is load balancing appropriate?

Observe, understand, tune/optimise.

@yschimke
Collaborator

yschimke commented Sep 22, 2019

Also relates to #3997 "Tune ConnectionPool pruning"

@sboishtyan

Hello, what is the status of the task?

We have a scenario where we send logs, and our backend team gives us a bunch of host URLs, so we have to round-robin through them. Our current implementation is an Interceptor that handles request failures, modifies the URL, and retries the request.

The code is ugly but works, so I'm interested in advice; maybe OkHttpClient has a better API for our scenario.
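
The rotate-and-retry logic described above can be sketched independently of OkHttp. `HostCall` and `RoundRobinFailover` are hypothetical names; `HostCall.execute` stands in for "rewrite the request URL to this host and proceed", which is the part an interceptor would supply. This is an illustration of the approach, not the commenter's actual code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for sending one request to one host. */
interface HostCall<T> {
  T execute(String host) throws IOException;
}

final class RoundRobinFailover {
  private final List<String> hosts;
  private final AtomicInteger cursor = new AtomicInteger();

  RoundRobinFailover(List<String> hosts) {
    if (hosts.isEmpty()) {
      throw new IllegalArgumentException("no hosts");
    }
    this.hosts = new ArrayList<>(hosts); // defensive copy
  }

  /** Tries each host at most once, starting from the next host in the rotation. */
  <T> T call(HostCall<T> call) throws IOException {
    int start = Math.floorMod(cursor.getAndIncrement(), hosts.size());
    IOException lastFailure = null;
    for (int i = 0; i < hosts.size(); i++) {
      String host = hosts.get((start + i) % hosts.size());
      try {
        return call.execute(host);
      } catch (IOException e) {
        lastFailure = e; // remember the failure and fall through to the next host
      }
    }
    // Every host failed; the list is non-empty, so lastFailure is set.
    throw lastFailure;
  }
}
```

Retrying on every `IOException` is only safe for requests that are idempotent or safe to resend, which is usually acceptable for log uploads but worth checking.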

@swankjesse
Collaborator Author

@sboishtyan that solution isn't that ugly! It's potentially part of what a load balancer would do. Biggest challenge on this issue is what the API looks like. It has the potential to be too simple or too complex!

@sboishtyan

@swankjesse Could I help you with something that helps you design the API?

@swankjesse
Collaborator Author

I’m not interested in contributions here right now. But if you have further details on your use case, please share.

4 participants