Artificial Intelligence: Reinforcement Learning in Python
When people talk about artificial intelligence, they usually don’t mean supervised and unsupervised machine learning.
These tasks are pretty trivial compared to what we think of AIs doing – playing chess and Go, driving cars, and beating video games at a superhuman level.
Reinforcement learning has recently become popular for doing all of that and more.
Much like deep learning, a lot of the theory was worked out in the 70s and 80s, but only recently have we been able to observe first-hand the amazing results that are possible.
In 2016 we saw Google’s AlphaGo beat the world champion in Go.
We saw AIs playing video games like Doom and Super Mario.
Self-driving cars have started driving on real roads with other drivers and even carrying passengers (Uber), all without human assistance.
If that sounds amazing, brace yourself for the future because the law of accelerating returns dictates that this progress is only going to continue to increase exponentially.
Learning about supervised and unsupervised machine learning is no small feat. To date I have over TWENTY FIVE (25!) courses just on those topics alone.
And yet reinforcement learning opens up a whole new world. As you’ll learn in this course, the reinforcement learning paradigm is very different from both supervised and unsupervised learning.
It’s led to new and amazing insights both in behavioral psychology and neuroscience. As you’ll learn in this course, there are many analogous processes when it comes to teaching an agent and teaching an animal or even a human. It’s the closest thing we have so far to a true artificial general intelligence.

What’s covered in this course?
- The multi-armed bandit problem and the explore-exploit dilemma
- Ways to calculate means and moving averages and their relationship to stochastic gradient descent
- Markov Decision Processes (MDPs)
- Dynamic Programming
- Monte Carlo
- Temporal Difference (TD) Learning (Q-Learning and SARSA)
- Approximation Methods (i.e., how to plug a deep neural network or other differentiable model into your RL algorithm)
- How to use OpenAI Gym, with zero code changes
- Project: Apply Q-Learning to build a stock trading bot
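To give a flavor of the first two topics above, here is a minimal epsilon-greedy bandit sketch in plain numpy. It uses the incremental sample-mean update that the course relates to stochastic gradient descent; the arm means, epsilon, and step count are invented for illustration and are not taken from the course materials.

```python
import numpy as np

def run_epsilon_greedy(true_means, epsilon=0.1, n_steps=10000, seed=0):
    """Epsilon-greedy on a Gaussian-reward bandit, illustrative values only."""
    rng = np.random.default_rng(seed)
    k = len(true_means)
    estimates = np.zeros(k)   # running sample-mean estimate per arm
    counts = np.zeros(k)      # number of pulls per arm
    total_reward = 0.0
    for _ in range(n_steps):
        if rng.random() < epsilon:
            a = int(rng.integers(k))       # explore: pick a random arm
        else:
            a = int(np.argmax(estimates))  # exploit: pick the current best arm
        r = rng.normal(true_means[a], 1.0) # sample a noisy reward
        counts[a] += 1
        # incremental mean update: Q_new = Q_old + (1/n) * (r - Q_old)
        estimates[a] += (r - estimates[a]) / counts[a]
        total_reward += r
    return estimates, total_reward / n_steps

est, avg = run_epsilon_greedy([0.1, 0.5, 0.9])
```

The update `estimates[a] += (r - estimates[a]) / counts[a]` is the running-average trick: with step size 1/n it is the exact sample mean, and replacing 1/n with a constant step size gives the SGD-style update used for nonstationary bandits.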
If you’re ready to take on a brand new challenge, and learn about AI techniques that you’ve never seen before in traditional supervised machine learning, unsupervised machine learning, or even deep learning, then this course is for you.
See you in class!
“If you can’t implement it, you don’t understand it”
- Or as the great physicist Richard Feynman said: “What I cannot create, I do not understand”.
- My courses are the ONLY courses where you will learn how to implement machine learning algorithms from scratch
- Other courses will teach you how to plug your data into a library, but do you really need help with 3 lines of code?
- After doing the same thing with 10 datasets, you realize you didn’t learn 10 things. You learned 1 thing, and just repeated the same 3 lines of code 10 times…
Suggested Prerequisites:
- Calculus
- Probability
- Object-oriented programming
- Python coding: if/else, loops, lists, dicts, sets
- Numpy coding: matrix and vector operations
- Linear regression
- Gradient descent
WHAT ORDER SHOULD I TAKE YOUR COURSES IN?:
- Check out the lecture “Machine Learning and AI Prerequisite Roadmap” (available in the FAQ of any of my courses, including the free Numpy course)
UNIQUE FEATURES
- Every line of code explained in detail – email me any time if you disagree
- No wasted time “typing” on the keyboard like other courses – let’s be honest, nobody can really write code worth learning about in just 20 minutes from scratch
- Not afraid of university-level math – get important details about algorithms that other courses leave out
CURRICULUM
- 6. Section Introduction: The Explore-Exploit Dilemma (Video lesson)
- 7. Applications of the Explore-Exploit Dilemma (Video lesson)
- 8. Epsilon-Greedy Theory (Video lesson)
- 9. Calculating a Sample Mean (pt 1) (Video lesson)
- 10. Epsilon-Greedy Beginner's Exercise Prompt (Video lesson)
- 11. Designing Your Bandit Program (Video lesson)
- 12. Epsilon-Greedy in Code (Video lesson)
- 13. Comparing Different Epsilons (Video lesson)
- 14. Optimistic Initial Values Theory (Video lesson)
- 15. Optimistic Initial Values Beginner's Exercise Prompt (Video lesson)
- 16. Optimistic Initial Values Code (Video lesson)
- 17. UCB1 Theory (Video lesson)
- 18. UCB1 Beginner's Exercise Prompt (Video lesson)
- 19. UCB1 Code (Video lesson)
- 20. Bayesian Bandits / Thompson Sampling Theory (pt 1) (Video lesson)
- 21. Bayesian Bandits / Thompson Sampling Theory (pt 2) (Video lesson)
- 22. Thompson Sampling Beginner's Exercise Prompt (Video lesson)
- 23. Thompson Sampling Code (Video lesson)
- 24. Thompson Sampling With Gaussian Reward Theory (Video lesson)
- 25. Thompson Sampling With Gaussian Reward Code (Video lesson)
- 26. Exercise on Gaussian Rewards (Video lesson)
- 27. Why don't we just use a library? (Video lesson)
- 28. Nonstationary Bandits (Video lesson)
- 29. Bandit Summary, Real Data, and Online Learning (Video lesson)
- 30. (Optional) Alternative Bandit Designs (Video lesson)
- 31. Suggestion Box (Video lesson)
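The Bayesian bandit lessons above culminate in Thompson sampling. As a rough sketch of the idea: keep a Beta posterior per arm, draw one sample from each posterior, and pull the arm with the highest draw. The Bernoulli win probabilities below are invented for illustration.

```python
import numpy as np

def thompson_sampling(true_probs, n_steps=5000, seed=0):
    """Beta-Bernoulli Thompson sampling; environment invented for illustration."""
    rng = np.random.default_rng(seed)
    k = len(true_probs)
    alpha = np.ones(k)  # Beta posterior: 1 + successes per arm
    beta = np.ones(k)   # Beta posterior: 1 + failures per arm
    for _ in range(n_steps):
        samples = rng.beta(alpha, beta)   # one posterior draw per arm
        a = int(np.argmax(samples))       # pull the arm with the highest draw
        r = rng.random() < true_probs[a]  # Bernoulli reward
        alpha[a] += r
        beta[a] += 1 - r
    return alpha, beta

alpha, beta = thompson_sampling([0.2, 0.5, 0.75])
means = alpha / (alpha + beta)  # posterior mean win rate per arm
```

Because posterior draws for clearly inferior arms rarely win the argmax, exploration fades automatically as the posteriors sharpen; no epsilon schedule is needed.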
- 34. MDP Section Introduction (Video lesson)
- 35. Gridworld (Video lesson)
- 36. Choosing Rewards (Video lesson)
- 37. The Markov Property (Video lesson)
- 38. Markov Decision Processes (MDPs) (Video lesson)
- 39. Future Rewards (Video lesson)
- 40. Value Functions (Video lesson)
- 41. The Bellman Equation (pt 1) (Video lesson)
- 42. The Bellman Equation (pt 2) (Video lesson)
- 43. The Bellman Equation (pt 3) (Video lesson)
- 44. Bellman Examples (Video lesson)
- 45. Optimal Policy and Optimal Value Function (pt 1) (Video lesson)
- 46. Optimal Policy and Optimal Value Function (pt 2) (Video lesson)
- 47. MDP Summary (Video lesson)
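The Bellman equation lessons above revolve around one identity. In the standard notation, the state-value function of a policy $\pi$ satisfies

```latex
V^{\pi}(s) = \sum_{a} \pi(a \mid s) \sum_{s',\, r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, V^{\pi}(s') \,\bigr]
```

and the optimal value function replaces the expectation over actions with a max:

```latex
V^{*}(s) = \max_{a} \sum_{s',\, r} p(s', r \mid s, a)\,\bigl[\, r + \gamma\, V^{*}(s') \,\bigr]
```

Everything in the dynamic programming, Monte Carlo, and TD sections can be read as a different way of solving or approximating these two equations.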
- 48. Dynamic Programming Section Introduction (Video lesson)
- 49. Iterative Policy Evaluation (Video lesson)
- 50. Designing Your RL Program (Video lesson)
- 51. Gridworld in Code (Video lesson)
- 52. Iterative Policy Evaluation in Code (Video lesson)
- 53. Windy Gridworld in Code (Video lesson)
- 54. Iterative Policy Evaluation for Windy Gridworld in Code (Video lesson)
- 55. Policy Improvement (Video lesson)
- 56. Policy Iteration (Video lesson)
- 57. Policy Iteration in Code (Video lesson)
- 58. Policy Iteration in Windy Gridworld (Video lesson)
- 59. Value Iteration (Video lesson)
- 60. Value Iteration in Code (Video lesson)
- 61. Dynamic Programming Summary (Video lesson)
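As a taste of the value iteration lessons above, here is a minimal sketch on a toy environment of my own invention (a 1-D chain, not the course's gridworld): sweep the states, back up each one with the Bellman optimality equation, and stop when the values stop changing.

```python
import numpy as np

def value_iteration(n_states=4, gamma=0.9, tol=1e-8):
    """Value iteration on a 1-D chain: actions move left/right, reward +1
    for entering the terminal rightmost state. Illustrative toy MDP only."""
    V = np.zeros(n_states)
    terminal = n_states - 1
    while True:
        delta = 0.0
        for s in range(terminal):          # terminal state's value stays 0
            best = -np.inf
            for step in (-1, +1):          # actions: left, right
                s2 = min(max(s + step, 0), terminal)
                r = 1.0 if s2 == terminal else 0.0
                best = max(best, r + gamma * V[s2])  # Bellman optimality backup
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

V = value_iteration()
```

For this 4-state chain the optimal policy is "always go right", so the values converge to [gamma^2, gamma, 1, 0] = [0.81, 0.9, 1.0, 0.0].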
- 62. Monte Carlo Intro (Video lesson)
- 63. Monte Carlo Policy Evaluation (Video lesson)
- 64. Monte Carlo Policy Evaluation in Code (Video lesson)
- 65. Monte Carlo Control (Video lesson)
- 66. Monte Carlo Control in Code (Video lesson)
- 67. Monte Carlo Control without Exploring Starts (Video lesson)
- 68. Monte Carlo Control without Exploring Starts in Code (Video lesson)
- 69. Monte Carlo Summary (Video lesson)
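The core of the Monte Carlo prediction lessons above, reduced to a sketch: play whole episodes, compute the discounted return from each visited state, and average. The environment below is the same invented 1-D chain as before, under a uniform random policy; it is an illustration, not the course's code.

```python
import numpy as np

def mc_policy_evaluation(n_episodes=20000, gamma=0.9, seed=0):
    """Every-visit Monte Carlo prediction on a toy 1-D chain (illustrative)."""
    rng = np.random.default_rng(seed)
    n_states, terminal = 4, 3
    returns_sum = np.zeros(n_states)
    returns_cnt = np.zeros(n_states)
    for _ in range(n_episodes):
        s, trajectory = 0, []
        while s != terminal:
            step = rng.choice([-1, 1])           # uniform random policy
            s2 = min(max(s + step, 0), terminal)
            r = 1.0 if s2 == terminal else 0.0
            trajectory.append((s, r))
            s = s2
        G = 0.0
        for s, r in reversed(trajectory):        # backward pass builds returns
            G = r + gamma * G
            returns_sum[s] += G
            returns_cnt[s] += 1
    return returns_sum / np.maximum(returns_cnt, 1)

V = mc_policy_evaluation()
```

Unlike dynamic programming, nothing here needs the transition probabilities: the averages converge to the value function from sampled experience alone, which is the whole point of the Monte Carlo section.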
- 78. Approximation Methods Section Introduction (Video lesson)
- 79. Linear Models for Reinforcement Learning (Video lesson)
- 80. Feature Engineering (Video lesson)
- 81. Approximation Methods for Prediction (Video lesson)
- 82. Approximation Methods for Prediction Code (Video lesson)
- 83. Approximation Methods for Control (Video lesson)
- 84. Approximation Methods for Control Code (Video lesson)
- 85. CartPole (Video lesson)
- 86. CartPole Code (Video lesson)
- 87. Approximation Methods Exercise (Video lesson)
- 88. Approximation Methods Section Summary (Video lesson)
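The approximation-for-prediction idea above can be sketched as semi-gradient TD(0) with a linear model V(s) = w · x(s). With one-hot features this reduces to tabular TD(0), but swapping in richer features (or a differentiable model) is exactly the plug-in point the section describes. The environment is the same invented 1-D chain, under a uniform random policy; parameters are illustrative.

```python
import numpy as np

def semi_gradient_td0(n_episodes=10000, gamma=0.9, alpha=0.01, seed=0):
    """Semi-gradient TD(0) prediction with a linear value model (illustrative)."""
    rng = np.random.default_rng(seed)
    n_states, terminal = 4, 3
    w = np.zeros(n_states)
    x = np.eye(n_states)                 # one-hot feature vector per state
    for _ in range(n_episodes):
        s = 0
        while s != terminal:
            s2 = min(max(s + int(rng.choice([-1, 1])), 0), terminal)
            r = 1.0 if s2 == terminal else 0.0
            v_next = 0.0 if s2 == terminal else w @ x[s2]
            # TD error; "semi-gradient" because we differentiate only through
            # the current estimate w @ x[s], not the bootstrapped target
            td_error = r + gamma * v_next - w @ x[s]
            w += alpha * td_error * x[s]
            s = s2
    return w

w = semi_gradient_td0()
```

The gradient step touches only the features of the visited state; with a neural network, `x[s]` and the `w += ...` line would be replaced by the model's forward pass and a backprop step on the same td_error.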
