A multi-armed bandit problem - or, simply, a bandit problem - is a sequential allocation problem defined by a set of actions. At each time step, a unit resource is allocated to an action and some observable payoff is obtained. The goal is to maximize the total payoff obtained in a sequence of allocations. The name bandit refers to the colloquial term for a slot machine (a "one-armed bandit" in American slang). In a casino, a sequential allocation problem is obtained when the player is facing many slot machines at once (a "multi-armed bandit") and must repeatedly choose where to insert the next coin.
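As a concrete illustration of this interaction protocol (a minimal sketch, not taken from the monograph), the following Python snippet simulates a player who allocates each coin uniformly at random among three slot machines; the Bernoulli payoff probabilities in `means` are hypothetical.

import random

means = [0.2, 0.5, 0.7]            # hypothetical payoff probability of each arm (Bernoulli payoffs)
n_rounds = 1000
total_payoff = 0

for _ in range(n_rounds):
    arm = random.randrange(len(means))                   # allocate the unit resource to one arm
    payoff = 1 if random.random() < means[arm] else 0    # observe only the chosen arm's payoff
    total_payoff += payoff                               # the goal is to make this total large

print(total_payoff)

A uniformly random player clearly wastes many coins on the worse machines; the interesting question is how much better a learning player can do.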
Multi-armed bandit problems are the most basic examples of sequential decision problems with an exploration-exploitation trade-off: the balance between staying with the option that gave the highest payoffs in the past and exploring new options that might give higher payoffs in the future. Although the study of bandit problems dates back to the 1930s, exploration-exploitation trade-offs arise in several modern applications, such as ad placement, website optimization, and packet routing. Mathematically, a multi-armed bandit is defined by the payoff process associated with each option.
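A standard heuristic that makes this trade-off explicit is epsilon-greedy: with a small probability pick an arm at random (explore), otherwise pick the arm with the best empirical mean so far (exploit). The sketch below is a generic textbook version, not an algorithm taken from this monograph, and reuses the hypothetical Bernoulli arms from above.

import random

def epsilon_greedy(means, n_rounds=1000, eps=0.1):
    k = len(means)
    counts = [0] * k          # number of times each arm was played
    sums = [0.0] * k          # total payoff collected from each arm
    total = 0.0
    for _ in range(n_rounds):
        if random.random() < eps or 0 in counts:
            arm = random.randrange(k)                                   # explore a random arm
        else:
            arm = max(range(k), key=lambda i: sums[i] / counts[i])      # exploit the best empirical mean
        payoff = 1.0 if random.random() < means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += payoff
        total += payoff
    return total

print(epsilon_greedy([0.2, 0.5, 0.7]))

With a fixed exploration rate eps, a small fraction of rounds is always "wasted" on inferior arms; more refined strategies shrink the exploration over time or use confidence bounds instead.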
In this book, the focus is on two extreme cases in which the analysis of regret is particularly simple and elegant: independent and identically distributed payoffs and adversarial payoffs. Besides the basic setting of finitely many actions, it also analyzes some of the most important variants and extensions, such as the contextual bandit model. This monograph is an ideal reference for students and researchers with an interest in bandit problems.
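For concreteness, the regret mentioned above is usually defined as follows (a standard formulation; the monograph's exact notation may differ). If X_{i,t} denotes the payoff of arm i at round t and I_t the arm played at round t, then the regret after n rounds over K arms is

R_n = \max_{i=1,\dots,K} \sum_{t=1}^{n} X_{i,t} \,-\, \sum_{t=1}^{n} X_{I_t,t},

that is, the gap between the payoff of the single best arm in hindsight and the payoff actually collected by the player.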
Contents: 1: Introduction. 2: Stochastic bandits: fundamental results. 3: Adversarial bandits: fundamental results. 4: Contextual bandits. 5: Linear bandits. 6: Nonlinear bandits. 7: Variants. Acknowledgements. References.