The C APIs work with MPI and offload, and the Python APIs work with offload without MPI, but the combination of all three does not work. This is likely a bug in the software stack; last tested with Intel MPI (IMPI) 2021.12 and oneAPI 2024.1. Failing command and output:
```
%make clean; make -j -C src/kernel/ YK_CXXOPT=-O1 offload=1 mpi=1 ranks=2 py-yk-api-test
[0] MPI startup(): Number of NICs: 1
[0] MPI startup(): ===== NIC pinning on sdp7814 =====
[0] MPI startup(): Rank Pin nic
[0] MPI startup(): 0 enp1s0
Error: failure in zeMemGetAllocProperties 78000001
[0#908140:908140@sdp7814] MPI startup(): I_MPI_ROOT=/opt/intel/oneapi/mpi/2021.12
[0#908140:908140@sdp7814] MPI startup(): ONEAPI_ROOT=/opt/intel/oneapi
[0#908140:908140@sdp7814] MPI startup(): I_MPI_HYDRA_BOOTSTRAP=ssh
[0#908140:908140@sdp7814] MPI startup(): I_MPI_OFFLOAD=2
[0#908140:908140@sdp7814] MPI startup(): I_MPI_DEBUG= 5
[0#908140:908140@sdp7814] MPI startup(): I_MPI_PRINT_VERSION=1
Error: failure in zeMemGetAllocProperties 78000001
Abort(881416975) on node 0 (rank 0 in comm 0): Fatal error in PMPI_Comm_split_type: Unknown error class, error stack:
PMPI_Comm_split_type(468)..................: MPI_Comm_split(MPI_COMM_WORLD, color=1, key=0, new_comm=0x5563a6824b5c) failed
PMPI_Comm_split_type(448)..................:
MPIR_Comm_split_type_impl(90)..............:
MPIDI_Comm_split_type(114).................:
MPIR_Comm_split_type_node_topo(262)........:
compare_info_hint(329).....................:
MPIDI_Allreduce_intra_composition_beta(788):
MPIDI_NM_mpi_allreduce(147)................:
MPIR_Allreduce_intra_auto(60)..............:
MPIR_Allreduce_intra_recursive_doubling(56):
MPIR_Localcopy(56).........................:
MPIDI_GPU_Localcopy(1135)..................:
MPIDI_GPU_ILocalcopy(1040).................: Error returned from GPU API
```
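The `78000001` returned by `zeMemGetAllocProperties` appears to correspond to `ZE_RESULT_ERROR_UNINITIALIZED` (`0x78000001`) in the Level Zero spec, and the stack shows the abort happening during a GPU local copy inside the allreduce that `MPI_Comm_split_type` performs. To narrow down whether the bug is in the YASK Python bindings or lower in the IMPI/Level Zero stack, a minimal standalone mpi4py script (hypothetical; mpi4py is assumed and is not part of YASK) can issue the same MPI call under the same settings:

```python
# split_type_test.py -- hypothetical standalone reproducer (not YASK code).
# Run it the way the failing test runs, e.g.:
#   I_MPI_OFFLOAD=2 I_MPI_DEBUG=5 mpirun -n 2 python3 split_type_test.py
from mpi4py import MPI

comm = MPI.COMM_WORLD

# Same call that fails in the stack above (PMPI_Comm_split_type),
# issued from Python with no YASK code involved.
node_comm = comm.Split_type(MPI.COMM_TYPE_SHARED, key=0)

print(f"rank {comm.Get_rank()} of {comm.Get_size()}: "
      f"node-local rank {node_comm.Get_rank()}")
node_comm.Free()
```

If this also aborts, the problem is below the YASK layer; if it passes, the interaction with the YASK offload runtime (e.g., device allocations made before the split) is the more likely trigger. Rerunning with `I_MPI_OFFLOAD` unset would show whether the GPU-aware transport is required to hit it.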