NVML Shared Library Not Found on ARM64 platform with Mali GPU #76
Comments
Claude Sonnet 3.5 response: The issue is indeed that the code is trying to use NVIDIA-specific libraries on a non-NVIDIA system. The code needs modification to handle ARM64 platforms with Mali GPUs correctly. Here's how we can address this:
Here's an updated version of the `linux_device_capabilities` function:

    import psutil
    import pyudev
    from tinygrad import Device

    # DEBUG, DeviceCapabilities, DeviceFlops and CHIP_FLOPS are assumed to be
    # defined elsewhere in the same module.

    def linux_device_capabilities() -> DeviceCapabilities:
        if DEBUG >= 2: print(f"tinygrad {Device.DEFAULT=}")

        context = pyudev.Context()

        # Check for Mali GPU. 'mali0' is a device name rather than a udev
        # subsystem, so enumerate all devices and match on sys_name.
        mali_gpu = next((device for device in context.list_devices() if 'mali' in device.sys_name.lower()), None)
        if mali_gpu:
            gpu_name = mali_gpu.properties.get('DEVNAME', 'Unknown Mali GPU')
            return DeviceCapabilities(
                model="Linux Box (ARM Mali)",
                chip=gpu_name,
                memory=psutil.virtual_memory().total // 2**20,
                flops=DeviceFlops(fp32=100, fp16=200, int8=400)  # Placeholder values, adjust as needed
            )

        # Check for NVIDIA GPU
        try:
            import pynvml
            pynvml.nvmlInit()
            handle = pynvml.nvmlDeviceGetHandleByIndex(0)
            gpu_name = pynvml.nvmlDeviceGetName(handle)
            gpu_memory_info = pynvml.nvmlDeviceGetMemoryInfo(handle)
            if DEBUG >= 2: print(f"NVIDIA device {gpu_name=} {gpu_memory_info=}")
            return DeviceCapabilities(
                model=f"Linux Box ({gpu_name})",
                chip=gpu_name,
                memory=gpu_memory_info.total // 2**20,
                flops=CHIP_FLOPS.get(gpu_name, DeviceFlops(fp32=0, fp16=0, int8=0))
            )
        except ImportError:
            pass  # NVIDIA libraries not available
        except pynvml.NVMLError:
            pass  # NVIDIA GPU not found or error accessing it

        # Fallback for other devices
        return DeviceCapabilities(
            model=f"Linux Box (Device: {Device.DEFAULT})",
            chip=f"Unknown Chip (Device: {Device.DEFAULT})",
            memory=psutil.virtual_memory().total // 2**20,
            flops=DeviceFlops(fp32=0, fp16=0, int8=0)
        )

This updated function will check for a Mali GPU first, then try the NVIDIA path through pynvml, and otherwise fall back to generic device information.
You'll need to add the `pyudev` import at the top of the file. Also, note that the FLOPS values for the Mali GPU are placeholders. You might want to adjust these based on the specific Mali GPU model you're using, or implement a more sophisticated detection method for Mali GPU capabilities.

This change should allow the Exo project to run on your ARM64 platform with a Mali GPU without throwing the NVIDIA-related error. It will detect the Mali GPU and provide some basic information about it. Remember to install pyudev first (for example, with `pip install pyudev`).

After making these changes, the project should be able to start without the NVIDIA error on your ARM64 platform with a Mali GPU.
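If adding a pyudev dependency is undesirable, the Mali check can also be approximated with the standard library alone. The following is only a sketch: the helper name `find_mali_nodes` is hypothetical, and it assumes the vendor Mali kernel driver exposes a device node such as `/dev/mali0`. Systems using a different driver stack (for example a DRM-based one) may not create such a node, in which case this check would report nothing.

```python
from pathlib import Path


def find_mali_nodes(dev_root: str = "/dev") -> list[str]:
    """Return the names of entries under dev_root that start with 'mali'.

    Hypothetical helper: assumes the Mali driver creates nodes like
    /dev/mali0. An empty list means no such node was found.
    """
    root = Path(dev_root)
    if not root.is_dir():
        return []
    # Sort so the result is stable regardless of directory listing order.
    return sorted(p.name for p in root.glob("mali*"))


if __name__ == "__main__":
    nodes = find_mali_nodes()
    print("Mali device nodes:", nodes if nodes else "none found")
```

Taking the directory as a parameter keeps the helper easy to exercise against a temporary directory on machines that have no Mali GPU.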
The issue has been resolved. Could you please merge the changes? Thank you. @stephanj
When trying to run the Exo project on an ARM64 platform with a Mali GPU, I am encountering the following error:
The error message indicates that the Exo project is attempting to use the NVIDIA Management Library (NVML), which is not available on ARM64 platforms with Mali GPUs.
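To confirm that the NVML shared library really is missing on a given machine, independently of Exo, a small check along these lines may help. It is a sketch under assumptions: the function name `nvml_library_loadable` is made up for illustration, and the default library name `libnvidia-ml.so.1` is the one the NVIDIA driver normally installs on Linux.

```python
import ctypes


def nvml_library_loadable(lib_name: str = "libnvidia-ml.so.1") -> bool:
    """Return True if the named shared library can be loaded, else False.

    Hypothetical helper: the default name assumes a Linux system where the
    NVIDIA driver would install NVML as libnvidia-ml.so.1.
    """
    try:
        ctypes.CDLL(lib_name)
    except OSError:
        # Raised when the dynamic loader cannot find or open the library.
        return False
    return True


if __name__ == "__main__":
    if nvml_library_loadable():
        print("NVML shared library found")
    else:
        print("NVML shared library not found; NVIDIA code paths should be skipped")
```

On an ARM64 board with only a Mali GPU this would be expected to report that the library is not found, which matches the error described above.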