Home of iDMA, a modular, parametrizable, and highly flexible Data Movement Accelerator (DMA) architecture targeting a wide range of platforms, from ultra-low-power edge nodes to high-performance computing systems. iDMA is part of the PULP (Parallel Ultra-Low-Power) platform, where it serves as the cluster-level DMA in the Snitch Cluster and the PULP Cluster.
iDMA currently implements the following protocols:
iDMA is built around the idea of splitting the DMA engine into three distinct parts:
- Frontend: The frontend implements the communication with the platform and emits transfer requests.
- Midend: Midend(s) transform a transfer request from the frontend into generic 1D transfers, which the backend can handle.
- Backend: The backend receives a 1D transfer (src_addr, dst_addr, length) and executes it on the transport protocol's manager interface.
The interfaces between the parts are well defined, making it easy to adapt iDMA to a new system or to add new capabilities.
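To make the midend's role concrete, here is a minimal software sketch (not RTL, and not iDMA's actual interface) of how a strided 2D request could be flattened into the 1D (src_addr, dst_addr, length) transfers the backend consumes. The function name split_2d and the parameters row_length, src_stride, dst_stride, and num_rows are hypothetical names chosen for illustration.

```python
from typing import Iterator, NamedTuple


class Transfer1D(NamedTuple):
    """A generic 1D transfer, mirroring the (src_addr, dst_addr, length) tuple."""
    src_addr: int
    dst_addr: int
    length: int  # number of bytes to copy


def split_2d(src_addr: int, dst_addr: int, row_length: int,
             src_stride: int, dst_stride: int,
             num_rows: int) -> Iterator[Transfer1D]:
    """Flatten a strided 2D request into one 1D transfer per row.

    All parameter names are illustrative assumptions, not iDMA signals.
    """
    for row in range(num_rows):
        yield Transfer1D(
            src_addr=src_addr + row * src_stride,
            dst_addr=dst_addr + row * dst_stride,
            length=row_length,
        )


if __name__ == "__main__":
    # Copy three 64-byte rows out of a 256-byte-wide source into a packed buffer.
    for transfer in split_2d(0x1000, 0x8000, 64, 256, 64, 3):
        print(f"src=0x{transfer.src_addr:x} dst=0x{transfer.dst_addr:x} "
              f"len={transfer.length}")
```

In this model the backend never needs to know about strides or dimensions. It only ever sees independent 1D transfers, which is what keeps the interface between midend and backend narrow.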
The latest documentation can be accessed pre-built.
If you use iDMA in your work or research, you can cite us:
@misc{benz2023highperformance,
title={A High-performance, Energy-efficient Modular {DMA} Engine Architecture},
author={Thomas Benz and Michael Rogenmoser and Paul Scheffler and Samuel Riedel and Alessandro Ottaviano and Andreas Kurth and Torsten Hoefler and Luca Benini},
year={2023},
eprint={2305.05240},
archivePrefix={arXiv},
primaryClass={cs.AR}
}
The following systems/publications make use of iDMA:
An Open-Source Platform for High-Performance Non-Coherent On-Chip Communication
@article{Kurth2020AnOP,
title={An Open-Source Platform for High-Performance Non-Coherent On-Chip Communication},
author={Andreas Kurth and Wolfgang R{\"o}nninger and Thomas Emanuel Benz and Matheus A. Cavalcante and Fabian Schuiki and Florian Zaruba and Luca Benini},
journal={IEEE Transactions on Computers},
year={2020},
volume={71},
pages={1794-1809},
url={https://api.semanticscholar.org/CorpusID:221640945}
}
PsPIN: A high-performance low-power architecture for flexible in-network compute
@article{Girolamo2020PsPINAH,
title={PsPIN: A high-performance low-power architecture for flexible in-network compute},
author={Salvatore Di Girolamo and Andreas Kurth and Alexandru Calotoiu and Thomas Emanuel Benz and Timo Schneider and Jakub Ber{\'a}nek and Luca Benini and Torsten Hoefler},
journal={ArXiv},
year={2020},
volume={abs/2010.03536},
url={https://api.semanticscholar.org/CorpusID:222177442}
}
Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra
@article{Scheffler2020IndirectionSS,
title={Indirection Stream Semantic Register Architecture for Efficient Sparse-Dense Linear Algebra},
author={Paul Scheffler and Florian Zaruba and Fabian Schuiki and Torsten Hoefler and Luca Benini},
journal={2021 Design, Automation \& Test in Europe Conference \& Exhibition (DATE)},
year={2020},
pages={1787-1792},
url={https://api.semanticscholar.org/CorpusID:226964339}
}
A RISC-V in-network accelerator for flexible high-performance low-power packet processing
@article{Girolamo2021ARI,
title={A RISC-V in-network accelerator for flexible high-performance low-power packet processing},
author={Salvatore Di Girolamo and Andreas Kurth and Alexandru Calotoiu and Thomas Emanuel Benz and Timo Schneider and Jakub Ber{\'a}nek and Luca Benini and Torsten Hoefler},
journal={2021 ACM/IEEE 48th Annual International Symposium on Computer Architecture (ISCA)},
year={2021},
pages={958-971},
url={https://api.semanticscholar.org/CorpusID:235416184}
}
A 10-core SoC with 20 Fine-Grain Power Domains for Energy-Proportional Data-Parallel Processing over a Wide Voltage and Temperature Range
@article{Benz2021A1S,
title={A 10-core SoC with 20 Fine-Grain Power Domains for Energy-Proportional Data-Parallel Processing over a Wide Voltage and Temperature Range},
author={Thomas Emanuel Benz and Luca Bertaccini and Florian Zaruba and Fabian Schuiki and Frank K. G{\"u}rkaynak and Luca Benini},
journal={ESSCIRC 2021 - IEEE 47th European Solid State Circuits Conference (ESSCIRC)},
year={2021},
pages={263-266},
url={https://api.semanticscholar.org/CorpusID:240003121}
}
PATRONoC: Parallel AXI Transport Reducing Overhead for Networks-on-Chip targeting Multi-Accelerator DNN Platforms at the Edge
@article{Jain2023PATRONoCPA,
title={PATRONoC: Parallel AXI Transport Reducing Overhead for Networks-on-Chip targeting Multi-Accelerator DNN Platforms at the Edge},
author={Vikram Jain and Matheus A. Cavalcante and Nazareno Bruschi and Michael Rogenmoser and Thomas Emanuel Benz and Andreas Kurth and Davide Rossi and Luca Benini and Marian Verhelst},
journal={2023 60th ACM/IEEE Design Automation Conference (DAC)},
year={2023},
pages={1-6},
url={https://api.semanticscholar.org/CorpusID:260351087}
}
Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra
@article{Scheffler2023SparseSS,
title={Sparse Stream Semantic Registers: A Lightweight ISA Extension Accelerating General Sparse Linear Algebra},
author={Paul Scheffler and Florian Zaruba and Fabian Schuiki and Torsten Hoefler and Luca Benini},
journal={ArXiv},
year={2023},
volume={abs/2305.05559},
url={https://api.semanticscholar.org/CorpusID:258564420}
}
Iguana: An End-to-End Open-Source Linux-capable RISC-V SoC in 130nm CMOS
@article{benziguana,
title={Iguana: An End-to-End Open-Source Linux-capable RISC-V SoC in 130nm CMOS},
author={Benz, Thomas and Scheffler, Paul and Sch{\"o}nleber, Jannis and Benini, Luca}
}
Cheshire: A Lightweight, Linux-Capable RISC-V Host Platform for Domain-Specific Accelerator Plug-In
@article{Ottaviano2023CheshireAL,
title={Cheshire: A Lightweight, Linux-Capable RISC-V Host Platform for Domain-Specific Accelerator Plug-In},
author={Alessandro Ottaviano and Thomas Emanuel Benz and Paul Scheffler and Luca Benini},
journal={ArXiv},
year={2023},
volume={abs/2305.04760},
url={https://api.semanticscholar.org/CorpusID:258557988}
}
MemPool: A Scalable Manycore Architecture with a Low-Latency Shared L1 Memory
@article{Riedel2023MemPoolAS,
title={MemPool: A Scalable Manycore Architecture with a Low-Latency Shared L1 Memory},
author={Samuel Riedel and Matheus A. Cavalcante and Renzo Andri and Luca Benini},
journal={ArXiv},
year={2023},
volume={abs/2303.17742},
url={https://api.semanticscholar.org/CorpusID:257900957}
}
OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs
@article{Khalilov2023OSMOSISEM,
title={OSMOSIS: Enabling Multi-Tenancy in Datacenter SmartNICs},
author={Mikhail Khalilov and Marcin Chrapek and Siyuan Shen and Alessandro Vezzu and Thomas Emanuel Benz and Salvatore Di Girolamo and Timo Schneider and Daniele Di Sensi and Luca Benini and Torsten Hoefler},
journal={ArXiv},
year={2023},
volume={abs/2309.03628},
url={https://api.semanticscholar.org/CorpusID:261582327}
}
iDMA is released under the Solderpad Hardware License v0.51 (SHL-0.51); see LICENSE.
We are happy to accept pull requests and issues from any contributors. See CONTRIBUTING.md for additional information.
iDMA can be integrated directly after cloning this repository. However, regenerating the configuration registers, building the documentation, and running the source-code checks require the following tools:
- bender >= v0.24.0
- morty >= v0.6.0
- Verilator = v4.202
- Verible >= v0.0-1051-gd4cd328
- Python3 >= 3.8, including the libraries listed in requirements.txt
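Before running the make targets, it can help to confirm that the tools are on your PATH. The following is a minimal sketch, not part of the repository. The executable names in REQUIRED_TOOLS are assumptions about a typical installation (Verible, for instance, ships several binaries, and verible-verilog-lint is used here as a stand-in). The sketch only checks for presence, not for the versions listed above.

```python
import shutil
from typing import Iterable, List

# Assumed executable names; adjust these to match your installation.
REQUIRED_TOOLS = ["bender", "morty", "verilator", "verible-verilog-lint", "python3"]


def missing_tools(tools: Iterable[str] = REQUIRED_TOOLS) -> List[str]:
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All listed tools were found on PATH (versions not checked).")
```

Version requirements still need to be verified by hand, for example by asking each tool for its version.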
Use make doc to build the documentation. The output is located at doc/build.
We currently do not include a free and open-source simulation setup. However, if you have access to the Questa Advanced Simulator, a simulation can be launched using:
make idma_sim_all
cd target/sim/vsim
$VSIM -c -do "source compile.tcl; quit"
$VSIM -c -t 1ps -voptargs=+acc \
+job_file=jobs/backend_rw_axi/simple.txt \
-logfile rw_axi_simple.log \
-wlf rw_axi_simple.wlf \
tb_idma_backend_rw_axi \
-do "source start.tcl; run -all"
With the GUI:
make idma_sim_all
cd target/sim/vsim
$VSIM -c -do "source compile.tcl; quit"
$VSIM -t 1ps -voptargs=+acc \
+job_file=jobs/backend_rw_axi/simple.txt \
-logfile rw_axi_simple.log \
-wlf rw_axi_simple.wlf \
tb_idma_backend_rw_axi \
-do "source start.tcl; source wave/backend_rw_axi.do; run -all"
Where:
- +job_file=jobs/backend_rw_axi/simple.txt can point to any valid job file
- -logfile rw_axi_simple.log denotes the log file
- -wlf rw_axi_simple.wlf specifies a wave file
- tb_idma_backend_rw_axi can be any of the supplied testbenches