Continuous memory growth upon upgrading from 1.57.0 to 1.57.1 #38327

Open
alangenfeld opened this issue Dec 20, 2024 · 0 comments
What version of gRPC and what language are you using?

grpcio 1.57.0 and 1.57.1

What operating system (Linux, Windows,...) and version?

various

What runtime / compiler are you using (e.g. python version or version of gcc)

python 3.X (various)

What did you do?

This issue represents reports from Dagster users in dagster-io/dagster#18997.

What did you expect to see?

When upgrading the grpcio python package in the system processes that are grpc clients from 1.57.0 to 1.57.1, memory utilization should not change drastically.

What did you see instead?

For some users, memory consumption in 1.57.1 grows continuously for the life of the process. Downgrading to 1.57.0 brings memory consumption back to normal.

Anything else we should know about your project / environment?

A Dagster deployment involves a gRPC server process that loads users' Dagster code artifacts and two system processes (a webserver and a daemon) that communicate with the gRPC server. These system processes regularly call the gRPC server to fetch various pieces of information. We use a threaded gRPC server, and the client system processes make blocking requests, not asyncio.
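For anyone trying to reproduce this outside Dagster, here is a minimal sketch of how the growth could be measured. It is not Dagster code: `read_rss_bytes`, `sample_rss`, and `grows_continuously` are hypothetical helper names, and `make_request` stands in for whatever blocking gRPC call the client process makes. The RSS reader assumes Linux (`/proc/self/statm`) and falls back to peak RSS elsewhere on Unix, which can only ever go up.

```python
import os
import sys
from typing import Callable, List


def read_rss_bytes() -> int:
    """Best-effort resident set size of the current process, in bytes."""
    try:
        # Linux: the second field of /proc/self/statm is resident pages.
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        return resident_pages * os.sysconf("SC_PAGE_SIZE")
    except (OSError, ValueError, IndexError):
        # Fallback for other Unix systems: peak RSS, not current RSS.
        import resource

        peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
        # ru_maxrss is reported in bytes on macOS and in KiB on Linux.
        return peak if sys.platform == "darwin" else peak * 1024


def sample_rss(
    make_request: Callable[[], object],
    iterations: int,
    sample_every: int,
    read_rss: Callable[[], int] = read_rss_bytes,
) -> List[int]:
    """Call make_request repeatedly, recording RSS every sample_every calls."""
    samples = []
    for i in range(1, iterations + 1):
        make_request()  # hypothetical stand-in for a blocking gRPC request
        if i % sample_every == 0:
            samples.append(read_rss())
    return samples


def grows_continuously(samples: List[int], min_fraction: float = 0.9) -> bool:
    """True if at least min_fraction of consecutive samples increase."""
    if len(samples) < 2:
        return False
    increases = sum(b > a for a, b in zip(samples, samples[1:]))
    return increases / (len(samples) - 1) >= min_fraction
```

Running the same loop under grpcio 1.57.0 and 1.57.1 and comparing the two sample series should show whether the growth reproduces in isolation. The 0.9 threshold is an arbitrary choice for illustration.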

A previous report, #36117, was closed under the assumption that the issue was related to a resolved CPython asyncio bug, python/cpython#111246. Since we observe this in a non-asyncio setup, I don't believe that is an accurate assessment.

Looking at the changes between 1.57.0 and 1.57.1 (v1.57.0...v1.57.1), the only meaningful change is #34557, a backport of #34549, which introduced a new heap allocation: https://github.com/grpc/grpc/pull/34549/files#r1341637876 . It seems extremely likely that this change is related to the observed changes in memory utilization.

This is well outside my area of expertise, but cross-referencing https://github.com/abseil/abseil-cpp/blob/master/absl/strings/cord.h#L214-L252, here are some guesses:

  • a reference counting issue is preventing the releaser from running
  • the releaser is running, but the heap allocations are causing fragmentation preventing memory from being freed back to the OS
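
One rough way to tell these two guesses apart, sketched below under the assumption that the process runs on Linux with glibc's allocator: ask glibc to hand free heap memory back to the OS with `malloc_trim(0)` and compare RSS before and after. If most of the growth disappears, the memory had been freed but was retained by the allocator (consistent with the fragmentation guess). If almost none does, the allocations are still live (consistent with the releaser not running, or with some other leak). `trim_glibc_heap` and `classify_growth` are hypothetical names, and the 10% tolerance is arbitrary.

```python
import ctypes
import ctypes.util
from typing import Optional


def trim_glibc_heap() -> Optional[bool]:
    """Call glibc's malloc_trim(0).

    Returns True if memory was released to the OS, False if not,
    and None if malloc_trim is unavailable (non-glibc platforms).
    """
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return None
    try:
        libc = ctypes.CDLL(libc_name)
        malloc_trim = libc.malloc_trim
    except (OSError, AttributeError):
        return None
    malloc_trim.argtypes = [ctypes.c_size_t]
    malloc_trim.restype = ctypes.c_int
    return malloc_trim(0) == 1


def classify_growth(
    baseline_rss: int,
    rss_before_trim: int,
    rss_after_trim: int,
    tolerance: float = 0.1,
) -> str:
    """Heuristically interpret RSS readings taken around a heap trim."""
    growth = rss_before_trim - baseline_rss
    if growth <= 0:
        return "no-growth"
    reclaimed = rss_before_trim - rss_after_trim
    if reclaimed >= (1 - tolerance) * growth:
        # Nearly all growth came back: freed memory the allocator was holding.
        return "freed-but-retained"
    if reclaimed <= tolerance * growth:
        # Nearly nothing came back: the allocations are still live.
        return "still-allocated"
    return "mixed"
```

This is only a heuristic: other threads keep allocating while the readings are taken, so a "mixed" result is inconclusive.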