
MPI support #3349

Open · 4 tasks
xq114 opened this issue Feb 9, 2023 · 7 comments

Comments

@xq114 (Contributor) commented Feb 9, 2023

In what scenario do you need this feature?

Support building and running MPI programs; see the example at https://mpitutorial.com/tutorials/mpi-hello-world/

Describe the possible solution

  • Support finding and enabling MPI-related tools/libraries: mpiexec, mpicc (Unix-only), etc. xmake-repo already packages msmpi and mpich, but there is no detection for mpiexec or for other MPI implementations. This part could live in detect.sdks.find_mpi.
  • Support compiling MPI programs: add_requires("mpi") / add_packages("mpi"), abstracting MPI as a package (see the sketch after this list).
  • Support running MPI programs: an MPI program cannot be launched directly; it has to go through mpiexec, e.g. mpiexec -n 4 ./hello, and the command line may also specify hosts and other options. This conflicts with the xmake run model, so something else is needed, e.g. xmake run --mpiexec="-n 4" hello. Since the process count -n is so common, dedicated support for it could also be considered.
  • Support third-party libraries that depend on MPI: as with blas, a system-fetched package (any implementation works) and an xmake-repo-installed package (one implementation must be chosen) are mutually exclusive; there is no way to fall back to a specific package after the system fetch fails (unless fetching is moved up to the on_load stage, which conflicts with xmake's design).
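As a rough illustration of the interface proposed above (nothing here exists in xmake yet; the virtual "mpi" package name and the --mpiexec option are hypothetical):

    -- xmake.lua sketch of the proposed usage, assuming a virtual "mpi" package
    add_requires("mpi")          -- would resolve to msmpi/mpich/openmpi per platform

    target("hello")
        set_kind("binary")
        add_files("src/hello.c")
        add_packages("mpi")      -- inject the chosen implementation's flags and links

Running would then go through the launcher, e.g. xmake run --mpiexec="-n 4" hello.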

Describe alternatives you've considered

No response

Additional information

No response

@waruqi waruqi added this to the v2.7.8 milestone Feb 9, 2023
@waruqi (Member) commented Feb 9, 2023

Let's do this in the next version.

@waruqi (Member) commented Apr 3, 2023

I haven't had time to look into this recently. It would help if someone could provide download links and documentation for the MPI tools. PRs for the basic find_xxx detection and package fetching would also be welcome.

@waruqi waruqi modified the milestones: v2.7.8, v2.7.9 Apr 5, 2023
@waruqi waruqi removed this from the v2.7.9 milestone May 15, 2023
@xtlsoft commented Jun 24, 2023

For compiling, just add this line to your target:

set_toolset("cc", "gcc@mpicc")

and it just works!
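For context, a fuller xmake.lua built around that one line might look like the sketch below (untested across implementations; routing the linker through the wrapper is an assumption, so that the MPI libraries also get picked up at link time):

    -- Sketch: drive compilation and linking through the MPI wrapper compiler,
    -- telling xmake to treat it as gcc-compatible via the "gcc@<program>" syntax.
    target("mpi_hello")
        set_kind("binary")
        add_files("src/*.c")
        set_toolset("cc", "gcc@mpicc")   -- compile C through mpicc
        set_toolset("ld", "gcc@mpicc")   -- link through the wrapper as well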

@blackbhc commented Aug 4, 2024

> I haven't had time to look into this recently. It would help if someone could provide download links and documentation for the MPI tools. PRs for the basic find_xxx detection and package fetching would also be welcome.

Hi, I ran into this problem recently as well, so here are a few small suggestions that I hope will help.

MPI itself is just a communication protocol for inter-process communication in distributed computing. It is mainly used for large cross-node parallel jobs on supercomputers and counts as basic infrastructure in HPC. So HPC developers, mostly researchers running large-scale numerical simulations, do have a real need for MPI support.

There are currently three mainstream implementations: openmpi, mpich, and the MPI implementation shipped with the Intel compilers. The MPI standard relates to these three implementations much as the blas interface relates to the openblas and gslcblas libraries.

The implementations expose very similar interfaces, in two parts:

  • Compiler wrappers mpicc/mpicxx, etc.:
    • Underneath they still invoke the C/C++/Fortran compiler with exactly the same compile flags, but additionally link in some shared libraries (which depend on the hardware platform and the specific MPI implementation).
    • To integrate this part into xmake, it should be enough to bind the platform- and implementation-specific shared libraries for the mpicxx-style compiler interfaces. I tried this with mpich by manually calling add_links for the corresponding libraries; it compiles, but needs extra adjustments, e.g. after exporting a CMakeLists I had to manually replace the g++/clang++ entries with mpicxx. (A rough sketch of this manual wiring follows after this list.)
  • Launchers mpirun (or mpiexec):
    • Because MPI communicates over the network, an MPI program has to be started in a specific communication mode before its processes can run. The basic syntax is mpirun -np 12 mpi_program, where the number after -np is the process count.
    • The proposal above, xmake run --mpiexec="-n 4" hello, is a good reference for this.
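The manual wiring mentioned above looked roughly like this (a sketch under assumptions: the directories are placeholders and the library name varies across distros and mpich versions):

    -- Hypothetical hand-wired mpich target, standing in for what a future
    -- add_requires("mpi") could automate; adjust the paths for your system.
    target("mpi_hello")
        set_kind("binary")
        add_files("src/hello.c")
        add_includedirs("/usr/include/mpich")   -- assumed header location
        add_linkdirs("/usr/lib/mpich")          -- assumed library location
        add_links("mpich")                      -- core library; sometimes named "mpi"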

In addition, in the Intel HPC Toolkit's MPI implementation, mpicxx and mpirun are both shell scripts. If you consider integrating MPI, these Intel scripts show the basic MPI workflow quite directly and are a good reference.


@waruqi (Member) commented Aug 5, 2024

> Hi, I ran into this problem recently as well, so here are a few small suggestions... [quoting the comment above]

OK, thanks. I don't have time to support this for now, but I'll refer to this when I get to it. If any user is interested, feel free to submit PRs for some of the basic pieces, such as adding mpicc/mpicxx support.

