(A mathematical explanation of each method is given within its notebook)
- Behavior Cloning (BC), sketched in code after this list
- Dataset Aggregation (DAgger)
- Generative Adversarial Imitation Learning (GAIL)
- Adversarial Inverse Reinforcement Learning (AIRL)
- Soft Q Imitation Learning (SQIL), whose reward relabeling is sketched after this list
- SQIL with Soft Actor-Critic (SQIL-SAC)
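
Since BC is the simplest of these methods (plain supervised learning on expert state-action pairs), here is a minimal, self-contained sketch. The network shape, dummy data, and hyperparameters are illustrative placeholders, not the notebooks' actual setup:

```python
import torch
import torch.nn as nn

# Placeholder expert dataset: observed states and the actions the expert took.
expert_states = torch.randn(1024, 4)           # e.g. 4-dim observations
expert_actions = torch.randint(0, 2, (1024,))  # discrete actions {0, 1}

# Simple policy network mapping states to action logits.
policy = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# BC reduces imitation to supervised learning: maximize the likelihood
# of the expert's actions under the policy.
for epoch in range(20):
    logits = policy(expert_states)
    loss = loss_fn(logits, expert_actions)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

SQIL's central idea (Reddy et al., 2019) is also compact: discard the environment reward, relabel expert transitions with reward 1 and the agent's own transitions with reward 0, then run ordinary soft Q-learning (or SAC, in the SQIL-SAC variant) on the combined replay buffer. A sketch of just the relabeling step, assuming a hypothetical boolean mask marking which transitions in a batch came from the expert:

```python
import torch

def sqil_rewards(is_expert: torch.Tensor) -> torch.Tensor:
    """SQIL reward relabeling: expert transitions get reward 1,
    the agent's own transitions get reward 0, and the environment
    reward is ignored entirely."""
    return is_expert.float()
```
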
- Tiapkin, D., Belomestny, D., Calandriello, D., Moulines, E., Naumov, A., Perrault, P., Valko, M. and Ménard, P., 2023. Demonstration-regularized RL. arXiv preprint arXiv:2310.17303.
- Ross, S., Gordon, G. and Bagnell, D., 2011, June. A reduction of imitation learning and structured prediction to no-regret online learning. In Proceedings of the fourteenth international conference on artificial intelligence and statistics (pp. 627-635). JMLR Workshop and Conference Proceedings.
- Ho, J. and Ermon, S., 2016. Generative adversarial imitation learning. Advances in neural information processing systems, 29.
- Fu, J., Luo, K. and Levine, S., 2017. Learning robust rewards with adversarial inverse reinforcement learning. arXiv preprint arXiv:1710.11248.
- Reddy, S., Dragan, A.D. and Levine, S., 2019. SQIL: Imitation learning via reinforcement learning with sparse rewards. arXiv preprint arXiv:1905.11108.