# PyTorch implementation of the Hierarchical Intentional-Unintentional Soft Actor-Critic (HIU-SAC) algorithm

This repository contains the source code used for the experiments in the paper *Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies*.
The algorithm has been tested on continuous control tasks in RoboLearn environments.
Some videos can be found at https://sites.google.com/view/hrl-concurrent-discovery
The code has been tested with PyTorch 1.0.1 and Python 3.5 (or later).
## Installation

It is recommended to first create either a virtualenv or a conda environment.

- Option 1: Conda environment. First install either Miniconda (recommended) or Anaconda (see their official installation instructions).
  ```bash
  # Create the conda environment
  conda create -n <condaenv_name> python=3.5

  # Activate the conda environment
  conda activate <condaenv_name>
  ```
- Option 2: Virtualenv. First install pip and virtualenv (see their official installation instructions).
  ```bash
  # Create the virtual environment
  virtualenv -p python3.5 <virtualenv_name>

  # Activate the virtual environment
  source <virtualenv_name>/bin/activate
  ```
- Clone this repository
  ```bash
  git clone https://github.com/domingoesteban/hiu_sac
  ```
- Install the requirements of this repository
  ```bash
  cd hiu_sac
  pip install -r requirements.txt
  ```
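To confirm the environment meets the tested versions mentioned above, a quick check (a standalone snippet, not part of this repository):

```python
# Sanity check of the Python and PyTorch versions
# (a standalone snippet, not part of this repository).
import sys
import torch

print(sys.version)        # the code was tested with Python 3.5 (or later)
print(torch.__version__)  # the code was tested with PyTorch 1.0.1
```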
## Running an experiment

- Run HIU-SAC in one of the environments. Options: `navigation2d`, `reacher`, `pusher`, `centauro`. (A conceptual sketch of the policy composition follows this list.)
  ```bash
  # python train.py -e <env_name>
  python train.py -e navigation2d
  ```
- Visualize the learned policy (Specify the log directory that is printed during the learning process)
  ```bash
  python eval.py <path_to_log_directory>
  ```
- Plot the learning curves for the compound and composable tasks (specify the log directory that is printed during the learning process). (A generic plotting sketch also follows this list.)
  ```bash
  python eval.py <path_to_log_directory> -p
  ```
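`eval.py -p` generates the learning-curve plots itself. If you prefer to plot logged returns with your own script, a generic matplotlib sketch is shown below; the file name and column layout are hypothetical assumptions, not this repository's actual log format:

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical log format (an assumption, not the repository's actual format):
# a CSV with a header row and one column of average returns per task,
# e.g. 'compound,subtask_0,subtask_1'.
returns = np.genfromtxt('progress.csv', delimiter=',', names=True)
for task in returns.dtype.names:
    plt.plot(returns[task], label=task)
plt.xlabel('Training iteration')
plt.ylabel('Average return')
plt.legend()
plt.show()
```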
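For intuition about what is being learned: in the paper, a compound-task policy is obtained by composing the composable (sub-task) policies, and all of them are discovered concurrently. Below is a minimal, illustrative PyTorch sketch of that idea; the class names, network sizes, deterministic heads, and mixing rule are our assumptions for illustration, not this repository's actual API or architecture:

```python
import torch
import torch.nn as nn


class SubPolicy(nn.Module):
    """Action head for a single composable (sub-)task.
    (Illustrative only: the real algorithm uses stochastic Gaussian policies.)"""

    def __init__(self, obs_dim, act_dim, hidden=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        self.mean = nn.Linear(hidden, act_dim)

    def forward(self, obs):
        return self.mean(self.body(obs))


class CompoundPolicy(nn.Module):
    """Composes the sub-policies with state-dependent mixing weights."""

    def __init__(self, obs_dim, act_dim, n_subtasks=2, hidden=64):
        super().__init__()
        self.sub_policies = nn.ModuleList(
            [SubPolicy(obs_dim, act_dim, hidden) for _ in range(n_subtasks)])
        self.weights = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, n_subtasks), nn.Softmax(dim=-1))

    def forward(self, obs):
        # (batch, n_subtasks, act_dim): one action proposal per sub-task
        proposals = torch.stack([p(obs) for p in self.sub_policies], dim=1)
        # (batch, n_subtasks, 1): how much each sub-policy contributes
        w = self.weights(obs).unsqueeze(-1)
        return (w * proposals).sum(dim=1)  # compound action, (batch, act_dim)


policy = CompoundPolicy(obs_dim=6, act_dim=2)
actions = policy(torch.randn(8, 6))  # batch of 8 observations
print(actions.shape)                 # torch.Size([8, 2])
```

In HIU-SAC itself, both the sub-task policies and their composition are trained concurrently with soft actor-critic updates from a single experience stream; see the paper for the actual formulation.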
## Citation

If this repository was useful for your research, we would appreciate it if you cited it:
```bibtex
@article{esteban2019hiusac,
  title={Hierarchical Reinforcement Learning for Concurrent Discovery of Compound and Composable Policies},
  author={Domingo Esteban and Leonel Rozo and Darwin G. Caldwell},
  journal={arXiv preprint arXiv:1905.09668},
  year={2019}
}
```