Index:

Lab1:

Probability and Statistics:
Markov Chains, Sampling from Distributions

Lab2:

Multi-Armed Bandits:
Study of algorithms such as UCB, Thompson Sampling, Epsilon-Greedy, REINFORCE, and Softmax for the multi-armed bandit problem with Bernoulli and Gaussian reward distributions.
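
As a rough illustration of one of the algorithms listed for this lab, here is a minimal epsilon-greedy sketch on a Bernoulli bandit. It is not the lab's actual code: the function name `epsilon_greedy_bandit`, the arm probabilities, and the hyperparameters are all assumptions chosen for the example.

```python
import random

def epsilon_greedy_bandit(probs, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Bernoulli bandit (hypothetical helper).

    probs: success probability of each arm (illustrative values only).
    Returns the per-arm value estimates and pull counts.
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    counts = [0] * n_arms
    values = [0.0] * n_arms
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if rng.random() < probs[arm] else 0.0  # Bernoulli reward
        counts[arm] += 1
        # Incremental sample-mean update of the arm's value estimate.
        values[arm] += (reward - values[arm]) / counts[arm]
    return values, counts

values, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(values, counts)
```

With these assumed probabilities, most pulls should end up on the last arm, while the small exploration rate keeps sampling the others.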

Lab3:

DP Methods for RL:
Policy and Value Iteration for GridWorld
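
For orientation, value iteration on a toy grid might look like the sketch below. The grid layout is an assumption made for this example (4x4, deterministic moves, reward of -1 per step, goal in the bottom-right corner), not necessarily the GridWorld used in the lab, and `value_iteration` is a hypothetical name.

```python
def value_iteration(size=4, gamma=1.0, tol=1e-9):
    """Value iteration on an assumed size x size deterministic grid."""
    goal = (size - 1, size - 1)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    V = {(r, c): 0.0 for r in range(size) for c in range(size)}

    def step(state, move):
        r, c = state[0] + move[0], state[1] + move[1]
        # Moving off the grid leaves the agent where it was.
        return (r, c) if 0 <= r < size and 0 <= c < size else state

    while True:
        delta = 0.0
        for s in V:
            if s == goal:
                continue  # terminal state keeps value 0
            # Bellman optimality backup with a reward of -1 per step.
            best = max(-1.0 + gamma * V[step(s, m)] for m in moves)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            break

    # Greedy policy with respect to the converged values.
    policy = {s: max(moves, key=lambda m: V[step(s, m)]) for s in V if s != goal}
    return V, policy

V, policy = value_iteration()
print(V[(0, 0)], policy[(0, 0)])
```

Under these assumptions each state's value is the negative of its step distance to the goal, so the greedy policy simply walks toward the corner.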

Lab4:

Model Free RL Algorithms:
Monte Carlo Control, SARSA, and Q-Learning for MountainCar (continuous env) and Taxi (discrete env).
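
The tabular Q-learning update can be sketched on a much smaller problem than Taxi. The corridor environment below is a stand-in invented for this example (start at the left end, reward 1 on reaching the right end), and `q_learning_chain` and its hyperparameters are assumptions, not the lab's code.

```python
import random

def q_learning_chain(n_states=6, episodes=500, alpha=0.5, gamma=0.9,
                     epsilon=0.1, seed=0):
    """Tabular Q-learning on an assumed 1-D corridor environment."""
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]  # actions: 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy behaviour policy; ties break toward "right".
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            # Off-policy target: bootstrap from the best next action,
            # with no bootstrapping from the terminal state.
            bootstrap = 0.0 if s_next == goal else gamma * max(Q[s_next])
            Q[s][a] += alpha * (r + bootstrap - Q[s][a])
            s = s_next
    return Q

Q = q_learning_chain()
print([round(max(q), 3) for q in Q])
```

SARSA would differ only in the target, using the value of the action actually taken in `s_next` instead of the maximum.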

Lab5:

Linear Function Approximation and Policy Gradients:
~~Monte Carlo Control, SARSA, Q-Learning with function approximation~~, DQN, and A2C

Mini Project:

Literature survey, implementation and evaluation of Proximal Policy Optimization for various tasks.

Others:

Other code and assignments from various MOOCs.
