Deep Reinforcement Learning

Advantage Actor-Critic (A2C) algorithm on Breakout (left) and Space Invaders (right)

Deep Deterministic Policy Gradient (DDPG) on MuJoCo virtual creatures. The objective is to make these creatures walk.

Neural Style Transfer