Welcome to actorch, a deep reinforcement learning framework for fast prototyping based on
PyTorch. The following algorithms have been implemented so far:
- REINFORCE
- Advantage Actor-Critic (A2C)
- Actor-Critic Kronecker-Factored Trust Region (ACKTR)
- Trust Region Policy Optimization (TRPO)
- Proximal Policy Optimization (PPO)
- Advantage-Weighted Regression (AWR)
- Deep Deterministic Policy Gradient (DDPG)
- Distributional Deep Deterministic Policy Gradient (D3PG)
- Twin Delayed Deep Deterministic Policy Gradient (TD3)
- Soft Actor-Critic (SAC)
The main features are:
- Support for OpenAI Gymnasium environments
- Support for custom observation/action spaces
- Support for custom multimodal-input, multimodal-output models
- Support for recurrent models (e.g. RNNs, LSTMs, GRUs, etc.)
- Support for custom policy/value distributions
- Support for custom preprocessing/postprocessing pipelines
- Support for custom exploration strategies
- Support for normalizing flows
- Batched environments (both for training and evaluation)
- Batched trajectory replay
- Batched and distributional value estimation (e.g. batched and distributional Retrace and V-trace)
- Data parallel and distributed data parallel multi-GPU training and evaluation
- Automatic mixed precision training
- Integration with Ray Tune for experiment execution and hyperparameter tuning at any scale
- Effortless experiment definition through Python-based configuration files
- Built-in visualization tool to plot performance metrics
- Modular object-oriented design
- Detailed API documentation
For Windows, make sure the latest Visual C++ runtime is installed.
To install with pip, first install Python 3.6 or later, then open a terminal and run:
pip install actorch
To install in a Conda virtual environment, clone or download and extract the repository,
navigate to <path-to-repository>/bin and run the installation script (install.sh for
Linux/macOS, install.bat for Windows). actorch and its dependencies (pinned to a specific
version) will be installed in a Conda virtual environment named actorch-env.
NOTE: you can directly use actorch-env and the actorch package in the local project
directory for development (see For development).
To install using Docker, first install Docker and NVIDIA Container Runtime.
Then clone or download and extract the repository, navigate to <path-to-repository>, open a
terminal and run:
docker build -t <desired-image-name> . # Build image
docker run -it --runtime=nvidia <desired-image-name> # Run container from image
actorch and its dependencies (pinned to a specific version) will be installed in the
specified Docker image.
NOTE: you can directly use the actorch package in the local project directory inside
a Docker container run from the specified Docker image for development (see For development).
To install from source, first install Python 3.6 or later.
Then clone or download and extract the repository, navigate to <path-to-repository>, open a
terminal and run:
pip install .
For development, first install Python 3.6 or later and Git.
Then clone or download and extract the repository, navigate to <path-to-repository>, open a
terminal and run:
pip install -e .[all]
pre-commit install -f
This will install the package in editable mode (any change to the package in the local
project directory is automatically reflected in the environment-wide package installed
in the site-packages directory of your environment) along with its development, test
and optional dependencies.
Additionally, it installs a Git commit hook.
Each time you commit, unit tests, static type checkers, code formatters and linters are
run automatically. Run pre-commit run --all-files to check that the hook was successfully
installed. For more details, see pre-commit's documentation.
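One way to confirm that an editable install resolves to the local project directory is to ask Python where it finds the package. The following is a minimal sketch using only the standard library; `package_location` is a hypothetical helper name and is not part of actorch.

```python
import importlib.util


def package_location(name):
    """Return the file a top-level package is imported from, or None if it is not installed."""
    # find_spec locates the package without importing it
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    # After an editable install, this path is expected to point inside the
    # local project directory rather than inside site-packages
    print(package_location("actorch"))
```

If the printed path is `None`, the package is not visible in the active environment, which usually means the wrong environment is activated.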
In this example we will solve the OpenAI Gymnasium environment CartPole-v1 using REINFORCE.
Copy the following configuration into a file named REINFORCE_CartPole-v1.py (with the
same indentation):
import gymnasium as gym
from torch.optim import Adam

from actorch import *


experiment_params = ExperimentParams(
    run_or_experiment=REINFORCE,
    stop={"training_iteration": 50},
    resources_per_trial={"cpu": 1, "gpu": 0},
    checkpoint_freq=10,
    checkpoint_at_end=True,
    log_to_file=True,
    export_formats=["checkpoint", "model"],
    config=REINFORCE.Config(
        train_env_builder=lambda **config: ParallelBatchedEnv(
            lambda **kwargs: gym.make("CartPole-v1", **kwargs),
            config,
            num_workers=2,
        ),
        train_num_episodes_per_iter=5,
        eval_freq=10,
        eval_env_config={"render_mode": None},
        eval_num_episodes_per_iter=10,
        policy_network_model_builder=FCNet,
        policy_network_model_config={
            "torso_fc_configs": [{"out_features": 64, "bias": True}],
        },
        policy_network_optimizer_builder=Adam,
        policy_network_optimizer_config={"lr": 1e-1},
        discount=0.99,
        entropy_coeff=0.001,
        max_grad_l2_norm=0.5,
        seed=0,
        enable_amp=False,
        enable_reproducibility=True,
        log_sys_usage=True,
        suppress_warnings=True,
    ),
)
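To give some intuition for the `discount` setting in the configuration above, the sketch below computes discounted returns, the quantity that REINFORCE-style methods use to weight log-probabilities. It is a plain-Python illustration, not actorch's implementation; the function name `discounted_returns` is hypothetical.

```python
def discounted_returns(rewards, discount=0.99):
    """Compute G_t = r_t + discount * G_{t+1} for every step of one episode."""
    returns = []
    running = 0.0
    # Walk the episode backwards so each return reuses the next one
    for reward in reversed(rewards):
        running = reward + discount * running
        returns.append(running)
    returns.reverse()
    return returns


if __name__ == "__main__":
    # CartPole-v1 gives a reward of 1 per step; earlier steps accumulate more return
    print(discounted_returns([1.0, 1.0, 1.0], discount=0.5))
```

A discount closer to 1 makes early actions responsible for rewards further in the future, which suits CartPole-v1, where episodes can last up to 500 steps.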
Open a terminal in the directory where you saved the configuration file and run
(if you installed actorch in a virtual environment, you first need to activate
it, e.g. conda activate actorch-env if you installed actorch using Conda):
pip install gymnasium[classic_control] # Install dependencies for CartPole-v1
actorch run REINFORCE_CartPole-v1.py # Run experiment
NOTE: training artifacts (e.g. checkpoints, metrics, etc.) are saved in nested subdirectories.
This might cause issues on Windows, since the maximum path length is 260 characters. In that case,
move the configuration file (or set local_dir) to a higher-level directory (e.g. Desktop),
shorten the configuration file name, and/or shorten the algorithm name
(e.g. DistributedDataParallelREINFORCE.rename("DDPR")).
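If you suspect the path limit mentioned in the note above, a short check can tell you whether a given artifact path is too long. This is a hedged sketch: `exceeds_windows_max_path` is a hypothetical helper, and it assumes the classic 260-character limit, which counts the terminating null character and therefore leaves 259 usable characters.

```python
WINDOWS_MAX_PATH = 260  # classic limit, including the terminating null character


def exceeds_windows_max_path(path, limit=WINDOWS_MAX_PATH):
    """Return True if the path string is too long for the classic Windows limit."""
    # The limit includes the null terminator, so a path of `limit` characters is already too long
    return len(str(path)) >= limit


if __name__ == "__main__":
    print(exceeds_windows_max_path("C:\\experiments\\REINFORCE_CartPole-v1"))
```

Pass the absolute path of the deepest artifact you expect (for example a checkpoint file) to get a meaningful answer.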
Wait for a few minutes until the training ends. The mean cumulative reward over the last 100 episodes should exceed 475, which means that the environment was successfully solved. You can now plot the performance metrics saved in the auto-generated TensorBoard (or CSV) log files using Plotly (or Matplotlib):
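The solved criterion described above (mean cumulative reward over the last 100 episodes above 475) can be checked with a few lines of Python. This is an illustrative sketch, not part of actorch; `is_solved` and its parameter names are hypothetical.

```python
from collections import deque


def is_solved(episode_returns, window=100, threshold=475.0):
    """Return True if the mean of the last `window` episode returns exceeds `threshold`."""
    # A bounded deque keeps only the most recent `window` values
    recent = deque(episode_returns, maxlen=window)
    if len(recent) < window:
        return False  # not enough episodes yet to apply the criterion
    return sum(recent) / len(recent) > threshold


if __name__ == "__main__":
    print(is_solved([500.0] * 100))
```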
pip install actorch[vistool] # Install dependencies for VisTool
cd experiments/REINFORCE_CartPole-v1/<auto-generated-experiment-name>
actorch vistool plotly tensorboard
You can find the generated plots in the plots directory.
Congratulations, you ran your first experiment!
See examples for additional configuration file examples.
HINT: since a configuration file is a regular Python script, you can use all the features of the language (e.g. inheritance).
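As a rough illustration of the hint above, ordinary class inheritance can be used to share settings between experiment variants. The sketch below uses plain Python only; `BaseSettings`, `LongHorizonSettings` and `as_dict` are hypothetical names and not part of the actorch API. The resulting dictionary could then be unpacked into whatever configuration object you use.

```python
class BaseSettings:
    """Settings shared by all variants (values taken from the example configuration)."""

    discount = 0.99
    entropy_coeff = 0.001
    seed = 0

    @classmethod
    def as_dict(cls):
        # Collect public data attributes, base classes first so subclasses override them
        settings = {}
        for klass in reversed(cls.__mro__):
            for name, value in vars(klass).items():
                is_method = isinstance(value, (classmethod, staticmethod)) or callable(value)
                if not name.startswith("_") and not is_method:
                    settings[name] = value
        return settings


class LongHorizonSettings(BaseSettings):
    """Variant that only overrides the discount factor."""

    discount = 0.999


if __name__ == "__main__":
    print(LongHorizonSettings.as_dict())
```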
@misc{DellaLibera2022ACTorch,
  author = {Luca Della Libera},
  title = {{ACTorch}: a Deep Reinforcement Learning Framework for Fast Prototyping},
  year = {2022},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/lucadellalib/actorch}},
}