In this assignment, we use Monte-Carlo (MC) methods and Temporal Difference (TD) learning on a couple of games and toy problems. The problems are listed below:
- Train an agent that plays Tic-Tac-Toe using Monte-Carlo methods.
- Train an agent that learns the optimal policy with TD methods in the Frozen-Lake environment.
- Build a Deep Q-Network (DQN) that plays Atari Breakout and achieves the best possible score. I was not able to implement this component of the assignment, so I instead built a DQN that plays the cart-pole game.
Details of the problems are included in the respective folders.
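
As a rough illustration of how the two families of methods differ, here is a minimal sketch that is not taken from the code in this repository. The function names `discounted_returns` and `q_learning_update`, the dictionary-based Q-table, and the default values of `alpha` and `gamma` are all assumptions made for the example. MC methods wait until an episode ends and learn from the full discounted return, while TD methods such as Q-learning update after every step by bootstrapping from the current estimate of the next state.

```python
from collections import defaultdict


def discounted_returns(rewards, gamma=0.9):
    """Monte-Carlo target: the return G_t for every step of one finished episode."""
    returns = []
    g = 0.0
    # Walk the episode backwards so that G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
        returns.append(g)
    returns.reverse()
    return returns


def q_learning_update(q, state, action, reward, next_state, n_actions,
                      alpha=0.1, gamma=0.9, done=False):
    """TD target: bootstrap from the best estimated value of the next state."""
    # A terminal next state has no future value.
    best_next = 0.0 if done else max(q[(next_state, a)] for a in range(n_actions))
    td_error = reward + gamma * best_next - q[(state, action)]
    q[(state, action)] += alpha * td_error
    return q[(state, action)]


# Hypothetical usage: a three-step episode that ends with reward 1.
q = defaultdict(float)  # Q-table with every entry initialised to 0 (an assumption)
print(discounted_returns([0.0, 0.0, 1.0], gamma=0.5))  # [0.25, 0.5, 1.0]
print(q_learning_update(q, state=14, action=2, reward=1.0,
                        next_state=15, n_actions=4, done=True))  # 0.1
```

Sarsa and Expected Sarsa follow the same pattern and differ only in how the next-state value is chosen: Sarsa uses the value of the action actually taken next, and Expected Sarsa uses the policy-weighted average over next actions.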
.
├── Q_1
│   ├── Mc_OffPolicy_agent.dat
│   ├── Mc_OnPolicy_agent
│   ├── Monte-Carlo_Methods(3).html
│   ├── Monte-Carlo_Methods.ipynb
│   ├── __pycache__
│   ├── base_agent.py
│   ├── best_td_agent.dat
│   ├── gym-tictactoe
│   ├── human_agent.py
│   ├── mc_agents.py
│   └── td_agent.py
├── Q_2
│   ├── Expected_Sarsa.py
│   ├── Frozen_Lake_Through_TD_Methods.html
│   ├── Frozen_Lake_Through_TD_Methods.ipynb
│   ├── Q_Learning.py
│   ├── Sarsa.py
│   ├── __pycache__
│   └── frozen_lake.py
├── Q_3
│   ├── DQN_Agent.py
│   ├── Function_Approximation_DQN.html
│   ├── Function_Approximation_DQN.ipynb
│   ├── __pycache__
│   └── cartpole-dqn.h5
├── README.md
└── assignment.pdf
7 directories, 21 files

- Q_* - Contains the files for the respective problems, along with the trained models.
- assignment.pdf - Contains all the problem statements of the assignment.
At the time of doing the assignment, I didn't have sufficient knowledge of deep learning to implement the last part of the assignment. I would like to complete that part now.