Classifier systems are an intriguing approach to a broad range of machine learning problems, based on automated generation and evaluation of condition/action rules. In reinforcement learning tasks they simultaneously address the two major problems of learning a policy and generalising over it (and related objects, such as value functions). Despite over 20 years of research, however, classifier systems have met with mixed success, for reasons which were often unclear. Finally, in 1995 Stewart Wilson claimed a long-awaited breakthrough with his XCS system, which differs from earlier classifier systems in a number of respects, the most significant of which is the way in which it calculates the value of rules for use by the rule generation system. Specifically, XCS (like most classifier systems) employs a genetic algorithm for rule generation, and the way in which it calculates rule fitness differs from earlier systems. Wilson described XCS as an accuracy-based classifier system and earlier systems as strength-based. The two differ in that in strength-based systems the fitness of a rule is proportional to the return (reward/payoff) it receives, whereas in XCS it is a function of the accuracy with which return is predicted. The difference is thus one of credit assignment, that is, of how a rule's contribution to the system's performance is estimated. XCS is a Q-learning system; in fact, it is a proper generalisation of tabular Q-learning, in which rules aggregate states and actions. In XCS, as in other Q-learners, Q-values are used to weight action selection.
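To make the distinction concrete, here is a minimal Python sketch, not taken from the book: it contrasts a tabular Q-learning update with strength-based and accuracy-based fitness updates for a single rule. All names and parameters (beta, eps0, alpha, gamma) are illustrative assumptions, and XCS details such as the niche GA and the relative-accuracy normalisation over the action set are deliberately omitted.

# Minimal sketch, not the book's implementation: it contrasts the two
# credit-assignment schemes described above. Parameter names and values
# (alpha, gamma, beta, eps0) are illustrative assumptions.

def q_update(q, state, action, reward, best_next_q, alpha=0.2, gamma=0.9):
    """One tabular Q-learning step: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next_q - old)

class Rule:
    """A condition/action rule with the statistics both schemes maintain."""
    def __init__(self, condition, action):
        self.condition = condition   # e.g. a ternary string such as "01#"
        self.action = action
        self.prediction = 0.0        # estimate of the return the rule receives
        self.error = 0.0             # running estimate of |return - prediction|
        self.fitness = 0.0           # used by the genetic algorithm

def strength_update(rule, payoff, beta=0.2):
    """Strength-based credit assignment: fitness tracks the return itself."""
    rule.prediction += beta * (payoff - rule.prediction)
    rule.fitness = rule.prediction

def accuracy_update(rule, payoff, beta=0.2, eps0=10.0):
    """Accuracy-based credit assignment (XCS-like, simplified): fitness
    derives from how accurately the rule predicts its return."""
    rule.prediction += beta * (payoff - rule.prediction)
    rule.error += beta * (abs(payoff - rule.prediction) - rule.error)
    accuracy = 1.0 if rule.error < eps0 else eps0 / rule.error
    rule.fitness += beta * (accuracy - rule.fitness)

The practical consequence is the one the book analyses: an overgeneral rule that receives high payoff in some states and low payoff in others can retain high strength under the first scheme, while its large prediction error drives its fitness down under the second.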
From the reviews:
"This book is a monograph on learning classifier systems ... . The main objective of the book is to compare strength-based classifier systems with accuracy-based systems. ... The book is equipped with nine appendices. ... The biggest advantage of the book is its readability. The book is well written and is illustrated with many convincing examples." (Jerzy W. Grzymal-Busse, Mathematical Reviews, Issue 2005 k)
1 Introduction.- 1.1 Two Example Machine Learning Tasks.- 1.2 Types of Task.- 1.2.1 Supervised and Reinforcement Learning.- 1.2.2 Sequential and Non-sequential Decision Tasks.- 1.3 Two Challenges for Classifier Systems.- 1.3.1 Problem 1: Learning a Policy from Reinforcement.- 1.3.2 Problem 2: Generalisation.- 1.4 Solution Methods.- 1.4.1 Method 1: Reinforcement Learning Algorithms.- 1.4.2 Method 2: Evolutionary Algorithms.- 1.5 Learning Classifier Systems.- 1.5.1 The Tripartite LCS Structure.- 1.5.2 LCS = Policy Learning + Generalisation.- 1.5.3 Credit Assignment in Classifier Systems.- 1.5.4 Strength and Accuracy-based Classifier Systems.- 1.6 About the Book.- 1.6.1 Why Compare Strength and Accuracy.- 1.6.2 Are LCS EC- or RL-based.- 1.6.3 Moving in Design Space.- 1.7 Structure of the Book.- 2 Learning Classifier Systems.- 2.1 Types of Classifier Systems.- 2.1.1 Michigan and Pittsburgh LCS.- 2.1.2 XCS and Traditional LCS?.- 2.2 Representing Rules.- 2.2.1 The Standard Ternary Language.- 2.2.2 Other Representations.- 2.2.3 Summary of Rule Representation.- 2.2.4 Notation for Rules.- 2.3 XCS.- 2.3.1 Wilson’s Motivation for XCS.- 2.3.2 Overview of XCS.- 2.3.3 Wilson’s Explore/Exploit Framework.- 2.3.4 The Performance System.- 2.3.4.1 The XCS Performance System Algorithm.- 2.3.4.2 The Match Set and Prediction Array.- 2.3.4.3 Action Selection.- 2.3.4.4 Experience-weighting of System Prediction.- 2.3.5 The Credit Assignment System.- 2.3.5.1 The MAM Technique.- 2.3.5.2 The Credit Assignment Algorithm.- 2.3.5.3 Sequential and Non-sequential Updates.- 2.3.5.4 Parameter Update Order.- 2.3.5.5 XCS Parameter Updates.- 2.3.6 The Rule Discovery System.- 2.3.6.1 Random Initial Populations.- 2.3.6.2 Covering.- 2.3.6.3 The Niche Genetic Algorithm.- 2.3.6.4 Alternative Mutation Schemes.- 2.3.6.5 Triggering the Niche GA.- 2.3.6.6 Deletion of Rules.- 2.3.6.7 Classifier Parameter Initialisation.- 2.3.6.8 Subsumption Deletion.- 2.4 SB-XCS.- 2.4.1 Specification of SB-XCS.- 2.4.2 Comparison of SB-XCS and Other Strength LCS.- 2.5 Initial Tests of XCS and SB-XCS.- 2.5.1 The 6 Multiplexer.- 2.5.2 Woods2.- 2.6 Summary.- 3 How Strength and Accuracy Differ.- 3.1 Thinking about Complex Systems.- 3.2 Holland’s Rationale for CS-1 and his Later LCS.- 3.2.1 Schema Theory.- 3.2.2 The Bucket Brigade.- 3.2.3 Schema Theory + Bucket Brigade = Adaptation.- 3.3 Wilson’s Rationale for XCS.- 3.3.1 A Bias towards Accurate Rules.- 3.3.2 A Bias towards General Rules.- 3.3.3 Complete Maps.- 3.3.4 Summary.- 3.4 A Rationale for SB-XCS.- 3.5 Analysis of Populations Evolved by XCS and SB-XCS.- 3.5.1 SB-XCS.- 3.5.2 XCS.- 3.5.3 Learning Rate.- 3.6 Different Goals, Different Representations.- 3.6.1 Default Hierarchies.- 3.6.2 Partial and Best Action Maps.- 3.6.3 Complete Maps.- 3.6.4 What do XCS and SB-XCS Really Learn?.- 3.7 Complete and Partial Maps Compared.- 3.7.1 Advantages of Partial Maps.- 3.7.2 Disadvantages of Partial Maps.- 3.7.3 Complete Maps and Strength.- 3.7.4 Contrasting Complete and Partial Maps in RL Terminology.- 3.7.5 Summary of Comparison.- 3.8 Ability to Express Generalisations.- 3.8.1 Mapping Policies and Mapping Value Functions.- 3.8.2 Adapting the Accuracy Criterion.- 3.8.3 XCS-hard and SB-XCS-easy Functions.- 3.8.4 Summary of Generalisation and Efficiency.- 3.9 Summary.- 4 What Should a Classifier System Learn?.- 4.1 Representing Boolean Functions.- 4.1.1 Truth Tables.- 4.1.2 On-sets and Off-sets.- 4.1.3 Sigma Notation.- 4.1.4 Disjunctive Normal Form.- 4.1.5 Representing Functions with Sets of Rules.-
4.2 How Should a Classifier System Represent a Solution?.- 4.2.1 The Value of a Single Rule.- 4.2.2 The Value of a Set of Rules.- 4.2.3 Complete and Correct Representations.- 4.2.4 Minimal Representations.- 4.2.5 Non-overlapping Representations.- 4.2.6 Why XCS Prefers Non-overlapping Populations.- 4.2.7 Should we Prefer Non-overlapping Populations?.- 4.2.8 Optimal Rule Sets: [O]s.- 4.2.9 Conflicting Rules.- 4.2.10 Representation in XCS.- 4.3 How Should We Measure Performance?.- 4.3.1 Measures of Performance.- 4.3.2 Measures of Population State.- 4.3.3 Measuring Performance and Measuring State.- 4.3.4 New Population State Metrics.- 4.3.5 Testing XCS with %[PI].- 4.3.6 Testing XCS with %[m-DNF].- 4.3.7 Summary of Metrics and Properties.- 4.4 Summary.- 5 Prospects for Adaptation.- 5.1 Known Problems with Strength LCS.- 5.2 Methodology for Rule Type Analysis.- 5.3 Analysis of Rule Types.- 5.3.1 Correct and Incorrect Actions.- 5.3.2 Overgeneral Rules.- 5.3.3 Strong Overgeneral Rules.- 5.3.4 Fit Overgeneral Rules.- 5.3.5 Parallel Definitions of Strength and Fitness.- 5.4 When are Strong and Fit Overgenerals Possible?.- 5.4.1 Biases in the Reward Function are Relevant.- 5.4.2 Competition for Action Selection.- 5.4.3 Competition for Reproduction.- 5.5 Strong Overgenerals in XCS.- 5.5.1 Biases between Actions do not Produce Strong Overgenerals.- 5.5.2 Some Properties of Accuracy-based Fitness.- 5.6 Strong Overgenerals in SB-XCS.- 5.6.1 When are Strong Overgenerals Impossible in SB-XCS?.- 5.6.2 What Makes Strong Overgenerals Possible in SB-XCS?.- 5.7 Fit Overgenerals and the Survival of Rules under the GA.- 5.7.1 Comparison on an Unbiased Reward Function.- 5.7.2 Comparison on a Biased Reward Function.- 5.7.3 Discussion.- 5.8 Designing Strong and Fit Overgenerals for XCS.- 5.8.1 Biased Variance Functions.- 5.8.2 Empirical Results.- 5.8.3 Avoiding Fit Overgenerals.- 5.8.4 SB-XCS and Biased Variance Functions.- 5.9 Strong and Fit Undergeneral Rules.- 5.10 Why Bias the Reward Function.- 5.10.1 Some State-actions are more Important than Others.- 5.10.2 A Rule Allocation Bias can Focus Resources.- 5.11 Rule Allocation Reconsidered.- 5.11.1 Knowing What Not to Do.- 5.11.2 Managing Exploration.- 5.11.3 Complete and Partial Maps Revisited.- 5.11.4 Alternatives to Biasing the Reward Function.- 5.11.5 Can SB-XCS Avoid Strong and Fit Overgenerals?.- 5.12 Sequential Tasks.- 5.12.1 The Need to Pass Values Back.- 5.12.2 The Need for Discounting.- 5.12.3 How Q-functions become Biased.- 5.12.4 Examples.- 5.12.5 Woods2 Revisited.- 5.12.6 When Will the Value Function be Unbiased?.- 5.13 What Tasks can we Solve with SB-XCS?.- 5.14 Extensions.- 5.14.1 Fitness Sharing.- 5.14.2 Other Factors Contributing to Strong Overgenerals.- 5.14.3 Qualitative and Quantitative Approaches.- 5.15 Summary.- 6 Classifier Systems and Q-learning.- 6.1 Classifier Systems and Q-learning.- 6.1.1 Q-learning in Classifier Systems.- 6.1.2 Is it Really Q-learning?.- 6.1.3 XCS is a Proper Generalisation of Tabular Q-learning.- 6.1.4 Summary.- 6.2 The GA-view and RL-view Revisited.- 6.2.1 How SB-XCS Determines Policies.- 6.2.2 How XCS Determines Policies.- 6.2.3 Three Approaches to Determining a Policy.- 6.2.4 The GA-view and the RL-view.- 6.2.5 Combining Evolution and Q-learning.- 6.3 XCS is Closer to Tabular Q-learning than to SB-XCS.- 6.4
Summary.- 7 Conclusion.- 7.1 The Capacities of Various Types of LCS.- 7.2 Contributions.- 7.3 The Take-home Message.- 7.4 Open Problems and Future Work.- 7.4.1 Fitness Sharing and Strength-based Fitness.- 7.4.2 Further Study of Accuracy-based Fitness.- 7.5 Concluding Remarks.- 7.5.1 The Moral of the Story: The Need for a Complex Systems Design Methodology.- 7.5.2 Classifier Systems and Reinforcement Learning.- 7.5.3 The Future.- Appendices.- A Evaluation of Macroclassifiers.- B Example XCS Cycle.- B.1 The Performance System Algorithm.- B.2 The Credit Assignment Algorithm.- B.3 The Rule Discovery Algorithm.- C Learning from Reinforcement.- C.1 Three Learning Paradigms.- C.1.1 Supervised Learning.- C.1.2 Reinforcement Learning.- C.1.3 Unsupervised Learning.- C.2 The Explore/Exploit Dilemma: a Feature of RL.- C.3 Sequential and Non-sequential Tasks.- C.3.1 Immediate Reward and Long-term Value.- C.3.2 Sequential Decisions Imply RL.- C.3.3 Episodic and Continuing Tasks.- C.4 The Agent’s Goal: Maximising Return.- C.4.1 Return and Reward.- C.4.2 Sequential Formulations of Return.- C.5 Formalising RL Tasks.- C.5.1 Environment.- C.5.2 Learning Agent.- C.5.3 Agent-environment Interaction.- C.6 Summary.- D Generalisation Problems.- D.1 Why Generalise?.- D.1.1 The Curse of Dimensionality.- D.1.2 The Need for Generalisation.- D.2 Generalisation in RL.- D.2.1 Generalising Over Policies and Value Functions.- D.3 State Aggregation.- D.4 State Space and Generalisation Space.- D.5 Summary.- E Value Estimation Algorithms.- E.1 The Value of State-actions.- E.2 Non-sequential RL: Estimating Reward Functions.- E.2.1 The Value of State-actions in Non-sequential Tasks.- E.3 Estimating Expectations with Sample Averages.- E.3.1 Incremental Updates.- E.3.2 A General Form of Incremental Update.- E.3.3 Setting StepSize in Incremental Updates.- E.3.4 A Prediction Algorithm for Non-sequential RL.- E.4 Sequential RL: Estimating Long-term Value Functions.- E.4.1 Long-term Value Functions.- E.4.2 The Value of State-actions in Sequential Tasks.- E.4.3 The Value of a Policy.- E.4.4 Estimating Values with Monte Carlo Methods.- E.4.5 Estimating Values with Temporal Difference Methods.- E.4.6 Russell and Norvig’s Maze: A Sequential RL Task.- E.4.7 Summary of Sequential RL.- E.5 State Aggregation.- E.5.1 Fixed and Adaptive Aggregation Schemes.- E.5.2 The Value of Aggregations I: Return.- E.5.3 The Value of Aggregations II: Predictive Utility.- E.6 Storing Value Estimates.- E.6.1 Storing Estimates of Aggregations.- E.6.2 Sparse Estimators, Models and Search.- E.6.3 Function Approximators.- E.7 Summary.- F Generalised Policy Iteration Algorithms.- F.1 Policy Improvement.- F.2 Optimal Policies.- F.3 Generalised Policy Iteration.- F.3.1 How Well must we Evaluate a Policy?.- F.3.2 Convergence Properties of GPI Control Algorithms.- F.3.3 Initialising Value Functions.- F.3.4 What Characterises GPI Algorithms?.- F.4 State-value Functions.- F.5 Summary.- G Evolutionary Algorithms.- G.1 Evolution.- G.2 Elements of EAs.- G.2.1 A Generic EA.- G.2.2 Population-based Search.- G.2.3 Fitness Functions.- G.2.4 Probabilistic Selection of Parents.- G.2.5 Genetic Operators.- G.2.6 Replacement.- G.3 EAs as Search.- G.3.1 Local and Global Optima.- G.4 The Generalisation Problem.- G.5 Niching and Mating Restriction.- G.5.1 Fitness Sharing.- G.5.2 Crowding.- G.5.3 Mating Restriction.- G.6 RL with EAs.- G.6.1 Non-associative RL with an EA.- G.6.2 Associative RL with an EA.- G.6.3 Sequential RL with an EA.- G.7 Comparing GPI and EA Methods for RL.- G.7.1 
Similarities between GPI and EA Methods.- G.8 Summary.- H The Origins of Sarsa.- H.1 Modified Connectionist Q-learning.- H.2 ZCS’s Implicit Bucket Brigade.- H.3 Who Invented Sarsa?.- I Notation.- References.
From: Silicon Valley Fine Books, Sunnyvale, CA, U.S.A.
Condition: Fine. First Edition, hardcover. 307 pages. Fine, a very sharp copy with a few light pressure marks on cover. Item code C17812
Quantity: 1 available
From: Lucky's Textbooks, Dallas, TX, U.S.A.
Condition: New. Item code ABLIING23Mar2912160256813
Quantity: More than 20 available
From: GreatBookPrices, Columbia, MD, U.S.A.
Condition: New. Item code 2146228-n
Quantity: More than 20 available
From: Ria Christie Collections, Uxbridge, United Kingdom
Condition: New. In. Item code ria9781852337704_new
Quantity: More than 20 available
From: GreatBookPricesUK, Woodford Green, United Kingdom
Condition: New. Item code 2146228-n
Quantity: More than 20 available
From: Grand Eagle Retail, Mason, OH, U.S.A.
Hardcover. Condition: new. The Distinguished Dissertations series is published on behalf of the Conference of Professors and Heads of Computing and the British Computer Society, who annually select the best British PhD dissertations in computer science for publication. The dissertations are selected on behalf of the CPHC by a panel of eight academics. Each dissertation chosen makes a noteworthy contribution to the subject and reaches a high standard of exposition, placing all results clearly in the context of computer science as a whole. In this way computer scientists with significantly different interests are able to grasp the essentials - or even find a means of entry - to an unfamiliar research topic. Machine learning promises both to create machine intelligence and to shed light on natural intelligence. A fundamental issue for either endeavour is that of credit assignment, which we can pose as follows: how can we credit individual components of a complex adaptive system for their often subtle effects on the world? For example, in a game of chess, how did each move (and the reasoning behind it) contribute to the outcome? This text studies aspects of credit assignment in learning classifier systems, which combine evolutionary algorithms with reinforcement learning methods to address a range of tasks from pattern classification to stochastic control to simulation of learning in animals. Credit assignment in classifier systems is complicated by two features: 1) their components are frequently modified by evolutionary search, and 2) components tend to interact. Classifier systems are re-examined from first principles and the result is, primarily, a formalization of learning in these systems, and a body of theory relating types of classifier systems, learning tasks, and credit assignment pathologies. Most significantly, it is shown that each of the two main approaches has difficulties with certain tasks which the other does not. Shipping may be from multiple locations in the US or from the UK, depending on stock availability. Item code 9781852337704
Quantity: 1 available
From: BuchWeltWeit Ludwig Meier e.K., Bergisch Gladbach, Germany
Book. Condition: New. This item is printed on demand - it takes 3-4 days longer. New stock. 328 pp. English. Item code 9781852337704
Quantity: 2 available
From: moluna, Greven, Germany
Hardcover. Condition: New. This is a print-on-demand item and will be printed for you after your order. There are few texts that deal with learning classifier systems at all; most include only a chapter or two on them, and are out of date. The study of learning classifier systems has made great progress in the last few years, and is an increasingly ac. Item code 4289774
Quantity: More than 20 available
From: California Books, Miami, FL, U.S.A.
Condition: New. Item code I-9781852337704
Quantity: More than 20 available
From: GreatBookPricesUK, Woodford Green, United Kingdom
Condition: As New. Unread book in perfect condition. Item code 2146228
Quantity: More than 20 available