
Multi-Armed-Bandit

Implementation of the Multi-Armed Bandit problem where each arm returns continuous numerical rewards. Covers Epsilon-Greedy, UCB1, and Thompson Sampling with detailed explanations.

Multi-Armed Bandit Implementation on Numerical Data

This repository explores the Multi-Armed Bandit problem with continuous numerical rewards instead of the traditional Bernoulli setting (where each pull returns only 0 or 1). It provides a comprehensive overview of the fundamental concepts, alongside practical implementations of three popular algorithms:
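For concreteness, here is a minimal sketch of such a continuous-reward environment, assuming each arm's reward is Gaussian noise around a fixed true mean. The class name `GaussianBandit` and its interface are illustrative assumptions, not the notebook's exact code:

```python
import numpy as np

class GaussianBandit:
    """A k-armed bandit whose arms return continuous (Gaussian) rewards.

    Illustrative sketch only; the notebook's own environment may differ.
    """

    def __init__(self, means, sigma=1.0, seed=None):
        self.means = np.asarray(means, dtype=float)  # true mean reward of each arm
        self.sigma = sigma                           # shared reward standard deviation
        self.rng = np.random.default_rng(seed)

    @property
    def n_arms(self):
        return len(self.means)

    def pull(self, arm):
        # A real-valued reward, not just the 0/1 of the Bernoulli case.
        return self.rng.normal(self.means[arm], self.sigma)
```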

✅ Epsilon-Greedy – Balancing exploration and exploitation through a probability-based approach.

✅ UCB1 (Upper Confidence Bound) – Optimizing decision-making with confidence intervals.

✅ Thompson Sampling – A Bayesian approach to adaptive learning.
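To make the three strategies concrete, below are minimal sketches written against the hypothetical `GaussianBandit` environment above. They are illustrative implementations under those assumptions, not the notebook's exact code; all three maintain per-arm pull counts and use the incremental sample-mean update `mean += (r - mean) / count`.

Epsilon-Greedy: with probability epsilon pick a uniformly random arm (explore); otherwise pick the arm with the best sample mean so far (exploit).

```python
def epsilon_greedy(bandit, n_steps=1000, epsilon=0.1, seed=None):
    """Epsilon-greedy action selection (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = bandit.n_arms
    counts = np.zeros(n)   # pulls per arm
    means = np.zeros(n)    # running sample mean per arm
    rewards = []
    for _ in range(n_steps):
        if rng.random() < epsilon:
            arm = int(rng.integers(n))     # explore: random arm
        else:
            arm = int(np.argmax(means))    # exploit: best sample mean
        r = bandit.pull(arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]  # incremental mean update
        rewards.append(r)
    return rewards
```

UCB1: pull each arm once, then always pick the arm maximizing the optimistic index `mean_i + sqrt(2 * ln(t) / n_i)`. Note that UCB1's classical regret guarantees assume bounded rewards; applying the same index to unbounded Gaussian rewards is a common heuristic.

```python
def ucb1(bandit, n_steps=1000):
    """UCB1 index policy (illustrative sketch)."""
    n = bandit.n_arms
    counts = np.zeros(n)
    means = np.zeros(n)
    rewards = []
    for t in range(1, n_steps + 1):
        if t <= n:
            arm = t - 1   # initial round-robin: pull each arm once
        else:
            # Optimistic upper confidence bound per arm.
            arm = int(np.argmax(means + np.sqrt(2 * np.log(t) / counts)))
        r = bandit.pull(arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
        rewards.append(r)
    return rewards
```

Thompson Sampling: keep a posterior distribution over each arm's mean, sample one value from each posterior, and pull the argmax. The sketch assumes a known unit reward variance and an N(0, 1) prior, which gives a conjugate Normal posterior with mean `n * x̄ / (n + 1)` and variance `1 / (n + 1)`.

```python
def thompson_sampling(bandit, n_steps=1000, seed=None):
    """Gaussian Thompson Sampling under a known-variance, N(0, 1)-prior
    assumption (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    n = bandit.n_arms
    counts = np.zeros(n)
    means = np.zeros(n)   # sample mean of observed rewards per arm
    rewards = []
    for _ in range(n_steps):
        post_mean = counts * means / (counts + 1.0)  # conjugate Normal posterior mean
        post_std = 1.0 / np.sqrt(counts + 1.0)       # posterior std-dev shrinks with pulls
        arm = int(np.argmax(rng.normal(post_mean, post_std)))
        r = bandit.pull(arm)
        counts[arm] += 1
        means[arm] += (r - means[arm]) / counts[arm]
        rewards.append(r)
    return rewards
```

A quick usage example comparing the three sketches on the same (hypothetical) environment:

```python
bandit = GaussianBandit(means=[0.2, 0.5, 1.0, 0.8], sigma=1.0, seed=0)
for algo in (epsilon_greedy, ucb1, thompson_sampling):
    total = sum(algo(bandit, n_steps=5000))
    print(f"{algo.__name__}: average reward {total / 5000:.3f}")
```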

The notebook includes detailed explanations, code implementations, and visualizations to help you understand how these algorithms work in real-world scenarios.

📌 Ideal for: Data scientists, AI researchers, and anyone interested in reinforcement learning.

Feel free to explore, experiment, and contribute! 🚀
