Reinforcement Learning beginner to master - AI in Python
This is the most complete Reinforcement Learning course on Udemy. In it you will learn the basics of Reinforcement Learning, one of the three main paradigms of machine learning (alongside supervised and unsupervised learning). You will implement from scratch adaptive algorithms that learn to solve control tasks from experience. You will also learn to combine these algorithms with Deep Learning techniques and neural networks, giving rise to the branch known as Deep Reinforcement Learning.
This course will give you the foundation you need to understand new algorithms as they emerge. It will also prepare you for the next courses in this series, in which we will go much deeper into different branches of Reinforcement Learning and look at some of the more advanced algorithms.
The course is focused on developing practical skills. Therefore, after learning the most important concepts of each family of methods, we will implement one or more of their algorithms from scratch in Jupyter notebooks.
This course is divided into three parts and covers the following topics:
Part 1 (Tabular methods):
– Markov decision process
– Dynamic programming
– Monte Carlo methods
– Temporal difference methods (SARSA, Q-Learning)
– N-step bootstrapping
Part 2 (Continuous state spaces):
– State aggregation
– Tile Coding
Part 3 (Deep Reinforcement Learning):
– Deep SARSA
– Deep Q-Learning
– REINFORCE
– Advantage Actor-Critic (A2C)
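To give a flavour of the tabular methods in Part 1, here is a minimal sketch of Q-Learning in plain Python. It is not taken from the course notebooks: the corridor environment, the function name `q_learning` and all hyperparameter values are assumptions chosen only for illustration.

```python
import random


def q_learning(n_states=5, episodes=500, alpha=0.1, gamma=0.99,
               epsilon=0.1, seed=0):
    """Tabular Q-Learning on a hypothetical 1-D corridor.

    States are 0..n_states-1. Action 0 moves left, action 1 moves right.
    Entering the last state ends the episode with reward 1; every other
    step gives reward 0.
    """
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(n_states)]  # q[state][action]
    goal = n_states - 1

    for _ in range(episodes):
        state = 0
        while state != goal:
            # Epsilon-greedy action selection, breaking ties at random.
            if rng.random() < epsilon or q[state][0] == q[state][1]:
                action = rng.randrange(2)
            else:
                action = 0 if q[state][0] > q[state][1] else 1

            # Deterministic transition of the illustrative corridor.
            next_state = max(0, state - 1) if action == 0 else state + 1
            reward = 1.0 if next_state == goal else 0.0

            # Q-Learning target: bootstrap from the best next action.
            # The terminal state has value 0 by definition.
            best_next = 0.0 if next_state == goal else max(q[next_state])
            td_error = reward + gamma * best_next - q[state][action]
            q[state][action] += alpha * td_error

            state = next_state
    return q


q_table = q_learning()
# Greedy action in every non-terminal state (0 = left, 1 = right).
greedy_policy = [0 if left > right else 1 for left, right in q_table[:-1]]
print(greedy_policy)
```

In this toy task the greedy policy should end up preferring "right" in every state, since that is the shortest path to the only reward. SARSA differs only in the target: it bootstraps from the action actually taken next instead of the maximum.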


1. [IMPORTANT] English captions available for sections 1-4 (Text lesson)
2. Welcome (Video lesson)
Advanced Reinforcement Learning in Python: from DQN to SAC
https://www.udemy.com/course/advanced-reinforcement/?referralCode=2C96ADF61C80DD7FD392
Advanced Reinforcement Learning in Python: cutting-edge DQNs
https://www.udemy.com/course/advanced-deep-qnetworks/?referralCode=7430E30376CCFEB8BEE9
3. Reinforcement Learning series (Text lesson)
4. Course structure (Video lesson)
5. Environment setup [Important] (Text lesson)
6. Setup (Video lesson)
Link to the code repository:
https://github.com/escape-velocity-labs/beginner_master_rl
7. Complete code (Text lesson)
8. Elements common to all control tasks (Video lesson)
9. The Markov decision process (MDP) (Video lesson)
10. Types of Markov decision process (Video lesson)
11. Trajectory vs episode (Video lesson)
12. Reward vs Return (Video lesson)
13. Discount factor (Video lesson)
14. Policy (Video lesson)
15. State values v(s) and action values q(s,a) (Video lesson)
16. Bellman equations (Video lesson)
17. Solving a Markov decision process (Video lesson)
18. Setup - MDP in code (Text lesson)
19. MDP in code - Part 1 (Video lesson)
20. MDP in code - Part 2 (Video lesson)
21. Introduction to Dynamic Programming (Video lesson)
22. Value iteration (Video lesson)
23. Setup - Value iteration (Text lesson)
24. Coding - Value iteration 1 (Video lesson)
25. Coding - Value iteration 2 (Video lesson)
26. Coding - Value iteration 3 (Video lesson)
27. Coding - Value iteration 4 (Video lesson)
28. Coding - Value iteration 5 (Video lesson)
29. Policy iteration (Video lesson)
30. Setup - Policy iteration (Text lesson)
31. Coding - Policy iteration 1 (Video lesson)
32. Policy evaluation (Video lesson)
33. Coding - Policy iteration 2 (Video lesson)
34. Policy Improvement (Video lesson)
35. Coding - Policy iteration 3 (Video lesson)
36. Coding - Policy iteration 4 (Video lesson)
37. Policy iteration in practice (Video lesson)
38. Generalized Policy Iteration (GPI) (Video lesson)
39. Monte Carlo methods (Video lesson)
40. Solving control tasks with Monte Carlo methods (Video lesson)
41. On-policy Monte Carlo control (Video lesson)
42. Setup - On-policy Monte Carlo control (Text lesson)
43. Coding - On-policy Monte Carlo control 1 (Video lesson)
44. Coding - On-policy Monte Carlo control 2 (Video lesson)
45. Coding - On-policy Monte Carlo control 3 (Video lesson)
46. Setup - Constant alpha Monte Carlo (Text lesson)
47. Coding - Constant alpha Monte Carlo (Video lesson)
48. Off-policy Monte Carlo control (Video lesson)
49. Setup - Off-policy Monte Carlo control (Text lesson)
50. Coding - Off-policy Monte Carlo 1 (Video lesson)
51. Coding - Off-policy Monte Carlo 2 (Video lesson)
52. Coding - Off-policy Monte Carlo 3 (Video lesson)
53. Temporal difference methods (Video lesson)
54. Solving control tasks with temporal difference methods (Video lesson)
55. Monte Carlo vs temporal difference methods (Video lesson)
56. SARSA (Video lesson)
57. Setup - SARSA (Text lesson)
58. Coding - SARSA 1 (Video lesson)
59. Coding - SARSA 2 (Video lesson)
60. Q-Learning (Video lesson)
61. Setup - Q-Learning (Text lesson)
62. Coding - Q-Learning 1 (Video lesson)
63. Coding - Q-Learning 2 (Video lesson)
64. Advantages of temporal difference methods (Video lesson)
72. Setup - Classic control tasks (Text lesson)
73. Coding - Classic control tasks (Video lesson)
74. Working with continuous state spaces (Video lesson)
75. State aggregation (Video lesson)
76. Setup - Continuous state spaces (Text lesson)
77. Coding - State aggregation 1 (Video lesson)
78. Coding - State aggregation 2 (Video lesson)
79. Coding - State aggregation 3 (Video lesson)
80. Tile coding (Video lesson)
81. Coding - Tile coding 1 (Video lesson)
82. Coding - Tile coding 2 (Video lesson)
83. Coding - Tile coding 3 (Video lesson)
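
As a rough idea of what the state aggregation lessons in Part 2 deal with, the sketch below maps a continuous observation onto a grid of discrete bins so that a tabular method can index it. It is an illustrative assumption, not the course's code: the helper `make_aggregator`, the bounds and the number of bins are all hypothetical.

```python
def make_aggregator(low, high, bins):
    """Return a function that maps a continuous observation to bin indices.

    low, high: per-dimension bounds of the observation (assumed known).
    bins:      number of equal-width bins per dimension.
    """
    def aggregate(observation):
        indices = []
        for value, lo, hi, n in zip(observation, low, high, bins):
            # Clip into the valid range, then scale to [0, n] and floor.
            clipped = min(max(value, lo), hi)
            index = int((clipped - lo) / (hi - lo) * n)
            # A value exactly at the upper bound belongs to the last bin.
            indices.append(min(index, n - 1))
        return tuple(indices)

    return aggregate


# Hypothetical 2-D observation (position, velocity) with made-up bounds.
aggregate = make_aggregator(low=(-1.2, -0.07), high=(0.6, 0.07), bins=(20, 20))
state = aggregate((-0.5, 0.0))
print(state)
```

The resulting tuple can be used directly as a key into a Q-table. Tile coding builds on the same idea by overlaying several such grids, each shifted by a small offset, so that nearby observations share some but not all of their bins.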
