TensorFlow binary crashes on Apple M1 in x86_64 Docker container #52845
Comments
Thanks @mohantym The links just reference the warning above which I believe is innocuous since Docker can emulate the image's platform. TensorFlow doesn't publish official linux/arm64/v8 images (would require an aarch64 TensorFlow build), but I would think that would remove the warning. Note that the problem is specifically with TensorFlow's assumptions about the emulated platform and not the image or other libraries, which run fine when emulating linux/amd64:
I suspect |
Hi @sanatmpa1! Could you please look at this issue? |
I am taking a class where we use TensorFlow inside Docker containers, and everybody in that class with an M1 Mac, including me, had this exact same issue. Unfortunately nobody has found a fix, so I am going to subscribe to this issue as well. I hope some kind of workaround or solution exists! |
Hi, I have the exact same issue. It is hindering my development process. While my app is deployed on an x86 server, I do need to use my M1 mac with emulation to develop code locally and to push it to production. All other major data science packages work correctly under x86 rosetta emulation: pandas, scikit-learn, torch, transformers, spacy, xgboost, lightgbm. I appreciate the great work you are doing with TensorFlow. I would be really grateful if you could take the time to help the data scientists / ML engineers out there who are using ARM-based development laptops. Thanks a lot, Alex PS: I am not interested in forks like tensorflow-macos etc as I need my work to be cross-platform. |
apple/tensorflow_macos#164 (comment) https://github.com/ARM-software/Tool-Solutions/tree/master/docker/tensorflow-aarch64 But as someone still needs to use this in emulation, I suppose that it could be a QEMU bug with |
Did anybody find any way to run tensorflow inside a docker container on any M1, M1 Pro or M1 Max device? Would really love to know any workaround so I can start building containers with tf. Thanks in advance for any tips! |
If the point is to have a published x86 wheel without AVX, we already have an open ticket, so it is better to add a comment there instead of opening a new ticket: If instead you want AVX TCG support in QEMU, e.g. on M1, there is already an open ticket at: |
So I do think this is due to AVX instructions. If I install an unofficial wheel (e.g., from yaroslavvb/tensorflow-community-wheels#198) and run a variant of the
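One way to sanity-check the AVX hypothesis from inside the container is to look at the CPU feature flags the kernel reports. The following is a minimal sketch, not something posted in the thread: the helper names `parse_cpu_flags` and `has_avx` are made up for illustration, and what `/proc/cpuinfo` shows under emulation depends on the emulator (it may even be the arm64 host's feature list), so treat the result as a hint rather than proof.

```python
from pathlib import Path


def parse_cpu_flags(cpuinfo_text):
    """Collect the feature flags listed in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # x86 kernels label the line "flags"; arm64 kernels use "Features".
        if key.strip().lower() in ("flags", "features"):
            flags.update(value.split())
    return flags


def has_avx(cpuinfo_text):
    """Return True if the plain 'avx' flag is among the reported CPU features."""
    return "avx" in parse_cpu_flags(cpuinfo_text)


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print("AVX reported:", has_avx(cpuinfo.read_text()))
    else:
        print("No /proc/cpuinfo here (not a Linux environment?)")
```

If no `avx` flag is reported, a wheel compiled with AVX enabled would be expected to hit an unsupported instruction, which fits the behavior described here.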
Thanks for the lead @bhack. I agree, some solutions which you mention are: |
For the first point I don't know if anyone at @Intel-tensorflow is interested to publish an SSE4.x only wheel in https://pypi.org/project/intel-tensorflow/ |
@dwyatte Thanks a lot for the tip. With an unofficial wheel I was able to get Tensorflow running within Docker on an Apple M1 processor 🚀 |
@gabac One you built or one that is available online? I'm facing the same issue... |
E.g. if you use pip as a package manager use e.g. |
Thanks, that did the trick! Unfortunately, Docker M1 Mac seems to be pretty slow... :( (not talking about training...) |
For performance you need to use |
any update? |
Any luck with this issue? I get this when I try to import tensorflow in Python |
While this issue was originally opened around emulating TensorFlow on x86_64 in Docker, it does look like there are now Dockerfile
|
@dwyatte, thanks for confirming. If your issue is resolved, could you please close this issue? |
Sure, I think we can close this now. QEMU also appears to have merged AVX instructions, so once that is pulled into Docker, it might also be possible to run via emulation. https://gitlab.com/qemu-project/qemu/-/issues/164#note_1140802183 |
@sachinprasadhs Will Google release prebuilt ARM64 Docker images to Docker Hub? I’m especially interested in an ARM64 tensorflow/serving image. |
Thanks for reaching out! I'm not aware of any plans to release prebuilt ARM64 Docker images. |
@learning-to-play It would be great for the community if we had prebuilt images for all architectures that we support. 🙏 |
* Bump ujson from 1.35 to 5.4.0 Bumps [ujson](https://github.com/ultrajson/ultrajson) from 1.35 to 5.4.0. - [Release notes](https://github.com/ultrajson/ultrajson/releases) - [Commits](ultrajson/ultrajson@v1.35...5.4.0) --- updated-dependencies: - dependency-name: ujson dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> * Meedan 2116 update image scoring (#244) * CHECK-2116 update alegre image endpoint to return correctly ordered scoring, add init perl to start.sh file while we're here * CHECK-2116 update alegre image endpoint to return correctly ordered scoring, add init perl to start.sh file while we're here * CHECK-2116 fix typo * CHECK-2116 update test for new scoring setup * CHECK-2116 update contract test * Meedan 2120 add limits (#245) * CHECK-2120 initial push on adding limit to all search responses * CHECK-2120 fix typo * CHECK-2120 remove bad id after testing context hashes on dev * CHECK-2120 update variable name * CHECK-2120 update test * CHECK-2120 fix typo * CHECK-2120 refactor audio similarity to make search function less complex * CHECK-2120 fix more minor code climate issue * Change Alegre port to 3100 to avoid conflict on Mac Monterey (#246) Port 5000, which Alegre currently runs on, is now used by AirPlay on Macs running Monterey. As a result, there is an error that port is in use when our application tries to use that port in development. To fix this, I modified the external port to 3100, which it seems to have been at some point in the past (reflected by Readme). For internal consistency, I went ahead and updated the internal port to 5000, as well, even though it wasn't really necessary. 
Fixes CHECK-2147 * Meedan 2178 delete with context (#247) * CHECK-2120 initial push on adding limit to all search responses * CHECK-2120 fix typo * CHECK-2120 remove bad id after testing context hashes on dev * CHECK-2120 update variable name * CHECK-2120 update test * CHECK-2120 fix typo * CHECK-2120 refactor audio similarity to make search function less complex * CHECK-2120 fix more minor code climate issue * CHECK-2178 add deletion conditional on context uniqueness * CHECK-2178 fix code climate issues * CHECK-2178 remove context on text until we are able to do something with it in next ticket * add type checking * and of course we want is list * CHECK-2178 add prints to diagnose these last bugs * CHECK-2178 work on type mismatch now * CHECK-2178 fix tests with updated input data * CHECK-2178 fix typo in function params and update tests to reflect added context * CHECK-2178 add context to test * CHECK-2178 remove prints * CHECK-2139 add parameters to establish min cutoff score from ES as we… (#250) * CHECK-2139 add parameters to establish min cutoff score from ES as well as per-model thresholding * CHECK-2139 resolve codeclimate suggestion * Use community version of Tensorflow that works with M1 The TensorFlow binary downloaded from a normal TensorFlow 2.3.1 pip install (from requirements) was crashing when we used the linux/x86_64 emulated arch with M1 macs (which is needed because TensorFlow does not yet have an arm-supported version). To solve this, we are using a community wheel of Tensorflow 2.3.1 compiled as we need it. More on this here: tensorflow/tensorflow#52845 Paired with Ahmed! CHECK-2147 * Fixes creating text graphs When I was trying to generate text clusters locally, it didn’t fail, but no clusters were returned. It worked well for images. Looks like some changes to text similarity were not reflects in the graph writer. Looks like "model" should now be "models" and "text" should be "content". I'm not sure, so I'll ask Devin to review it. 
Fixes CHECK-2212. * CHECK-2179 initial push on using context in text like other media (#249) * CHECK-2179 initial push on using context in text like other media * CHECK-2179 alter logic of delete to allow to attempt to delete any not-multi-context doc * CHECK-2179 re-add missing var * CHECK-2131 add errbit notification for broken search result (#253) * CHECK-2131 add errbit notification for broken search result * CHECK-2131 remove now irrelevant test * CHECK-2131 old test is changed due to minor change from API - fix maybe? * CHECK-2131 make test more robust * CHECK-2131 switch args * CHECK-2131 More test fixes * CHECK-2131 this set of tests man! * CHECK-2131 more fixing on these tests * CHECK-2387 don't allow nil thresholds (#255) * CHECK-2387 don't allow nil thresholds * CHECK-2387 ah the old zero is not game in python * CHECK-2284 update documentation to more explicitly call out that swagger docs wont work out of box (#257) * CHECK-2284 update documentation to more explicitly call out that swagger docs wont work out of box * MEEDAN-2284 fix whitespace * CHECK-2437 add support for using analyzers by language (#258) * CHECK-2437 add support for using analyzers by language * CHECK-2437 remove old dependencies from half-implementation of analyzers * CHECK-2437 shift es client * CHECK-2437 add tests for new use case * CHECK-2437 add fix for tests to actually pass * Meedan 2437 multiple analyzer indices (#261) * CHECK-2437 add support for using analyzers by language * CHECK-2437 remove old dependencies from half-implementation of analyzers * CHECK-2437 shift es client * CHECK-2437 add tests for new use case * CHECK-2437 add fix for tests to actually pass * CHECK-2437 resolve code review fixes * Optionally allow language override * CHECK-2437 add ascii folding and other minor tweaks (#262) * Change order of analyzer filters * remove draft lines * CHECK-1716 Add explicit model returns for all responses, also sneak in some language analyzer changes (#264) * CHECK-1716 
Add explicit model returns for all responses, also sneak in some language analyzer changes * CHECK-1716 add updates to test fixtures * CHECK-1716 add more test fixes * CHECK-2608 version bump cld (#265) * CHECK-2608 add test function (#266) * Fixing PostgreSQL Dockerfile All CI builds were failing with this error: ``` W: The repository 'http://apt.postgresql.org/pub/repos/apt stretch-pgdg Release' does not have a Release file. E: Failed to fetch http://apt.postgresql.org/pub/repos/apt/dists/stretch-pgdg/11/binary-amd64/Packages 404 Not Found [IP: 147.75.85.69 80] E: Some index files failed to download. They have been ignored, or old ones used instead. The command '/bin/sh -c apt-get update && apt-get install -y gawk postgresql-plperl-$PG_MAJOR && localedef -i ru_RU -c -f UTF-8 -A /usr/share/locale/locale.alias ru_RU.UTF-8 && rm -rf /var/lib/apt/lists/*' returned a non-zero code: 100 Service 'postgres' failed to build : Build failed ``` Here's an announcement: https://www.postgresql.org/message-id/[email protected] Fixed by installing the packages from the archive repository. * CHECK-2690 remove vectors from responses for alegre text (#268) * Meedan 2690 remove vectors from response (#269) * CHECK-2690 remove vectors from responses for alegre text * CHECK-2690 apply stripper to every case * CHECK-2690 minor fix * CHECK-2702 fix thresholding function for audio (#270) * CHECK-2702 fix thresholding function for audio * CHECK-2702 fix tests * invert index * CHECK-2782 update matching to reject mismatched lengths (#273) * Bump pyjwt from 1.6.4 to 2.4.0 (#236) Bumps [pyjwt](https://github.com/jpadilla/pyjwt) from 1.6.4 to 2.4.0. - [Release notes](https://github.com/jpadilla/pyjwt/releases) - [Changelog](https://github.com/jpadilla/pyjwt/blob/master/CHANGELOG.rst) - [Commits](jpadilla/pyjwt@1.6.4...2.4.0) --- updated-dependencies: - dependency-name: pyjwt dependency-type: direct:production ... 
Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333 dependabot[bot]@users.noreply.github.com> * Bump joblib from 1.0.1 to 1.2.0 (#260) Bumps [joblib](https://github.com/joblib/joblib) from 1.0.1 to 1.2.0. - [Release notes](https://github.com/joblib/joblib/releases) - [Changelog](https://github.com/joblib/joblib/blob/master/CHANGES.rst) - [Commits](joblib/joblib@1.0.1...1.2.0) --- updated-dependencies: - dependency-name: joblib dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333 dependabot[bot]@users.noreply.github.com> * Bump certifi from 2018.10.15 to 2022.12.7 (#272) Bumps [certifi](https://github.com/certifi/python-certifi) from 2018.10.15 to 2022.12.7. - [Release notes](https://github.com/certifi/python-certifi/releases) - [Commits](certifi/python-certifi@2018.10.15...2022.12.07) --- updated-dependencies: - dependency-name: certifi dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333 dependabot[bot]@users.noreply.github.com> * Bump mako from 1.0.7 to 1.2.2 (#256) Bumps [mako](https://github.com/sqlalchemy/mako) from 1.0.7 to 1.2.2. - [Release notes](https://github.com/sqlalchemy/mako/releases) - [Changelog](https://github.com/sqlalchemy/mako/blob/main/CHANGES) - [Commits](https://github.com/sqlalchemy/mako/commits) --- updated-dependencies: - dependency-name: mako dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333 dependabot[bot]@users.noreply.github.com> * Bump protobuf from 3.9.2 to 3.18.3 (#259) Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.9.2 to 3.18.3. 
- [Release notes](https://github.com/protocolbuffers/protobuf/releases) - [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py) - [Commits](protocolbuffers/protobuf@v3.9.2...v3.18.3) --- updated-dependencies: - dependency-name: protobuf dependency-type: direct:production ... Signed-off-by: dependabot[bot] <[email protected]> Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333 dependabot[bot]@users.noreply.github.com> * Update article.py * Update bulk_similarity_controller.py * Update bulk_similarity_controller.py Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333 dependabot[bot]@users.noreply.github.com> Co-authored-by: Devin Gaffney <[email protected]> Co-authored-by: Christa Hartsock <[email protected]> Co-authored-by: Christa Hartsock <[email protected]> Co-authored-by: Caio Almeida <[email protected]>
This issue is about Docker; please check the title. |
Please make sure that this is a bug. As per our
GitHub Policy,
we only address code/doc bugs, performance issues, feature requests and
build/installation issues on GitHub. tag:bug_template
System information
Describe the current behavior
Describe the expected behavior
Clean exit
Standalone code to reproduce the issue
Requires an Apple M1 (arm64) host OS:
docker run tensorflow/tensorflow:latest python -c "import tensorflow as tf"
This was previously mentioned in #42387 but unfortunately closed. When importing TensorFlow in an x86_64 docker container on an Apple M1, TensorFlow crashes. As far as I can tell, this should work as I can import and use other Python packages in the same container without problems (including things like numpy). It's unclear whether this is something that can be avoided at the TensorFlow level or an unavoidable bug in qemu ([1], [2]), but I wanted to reraise the issue.
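Because a crash on import takes the whole interpreter down, it can help to run the import in a child process and inspect how it exited. The sketch below uses only the Python standard library; `probe_import` is a hypothetical helper name, and it assumes the crash surfaces as a signal or a non-zero exit status, which the report does not spell out.

```python
import subprocess
import sys


def probe_import(module_name):
    """Import a module in a child process so a hard crash cannot take down the caller."""
    result = subprocess.run(
        [sys.executable, "-c", f"import {module_name}"],
        capture_output=True,
        text=True,
    )
    if result.returncode == 0:
        return True, "import succeeded"
    if result.returncode < 0:
        # On POSIX, a negative return code means the child was killed by that signal number.
        return False, f"killed by signal {-result.returncode}"
    return False, f"exited with status {result.returncode}"


if __name__ == "__main__":
    for name in ("numpy", "tensorflow"):
        ok, detail = probe_import(name)
        print(f"{name}: {detail}")
```

Running this inside the emulated container would show whether `numpy` imports cleanly while `tensorflow` is killed, which is the contrast the report describes.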