The new version has two additions. First, at the suggestion of Stephen Stigler we have replaced the Table of Contents by what he calls an Analytic Table of Contents. Following the title of each section or subsection is a description of the content of the section. This material helps the reader in several ways, for example: by giving a synopsis of the book, by explaining where the various data tables are and what they deal with, by telling what theory is described where. We did several distinct full studies for the Federalist papers as well as many minor side studies. Some or all may offer information both to the applied and the theoretical reader. We therefore try to give in this Contents more than the few cryptic words in a section heading to speed readers in finding what they want. Second, we have prepared an extra chapter dealing with authorship work published from about 1969 to 1983. Although a chapter cannot comprehensively cover a field where many books now appear, it can mention most of the book-length works and the main thread of authorship studies published in English. We found biblical authorship studies so extensive and complicated that we thought it worthwhile to indicate some papers that would bring out the controversies that are taking place. We hope we have given the flavor of developments over the 15 years mentioned. We have also corrected a few typographical errors.
Analytic Table of Contents

1. The Federalist Papers As a Case Study
1.1. Purpose. To study how Bayesian inference works in a large-scale data analysis, we chose to try to resolve the problem of the authorship of the disputed Federalist papers.
1.2. The Federalist papers. The Federalist papers were written by Hamilton, Madison, and Jay. Jay's papers are known. Of the 77 papers originally published in newspapers, 12 are in dispute between Hamilton and Madison, and 3 may be regarded as joint by them. Historians have varied in their attributions.
1.3. Early work. Frederick Williams and Frederick Mosteller found that sentence length and its variability within papers did not discriminate. Tables 1.3–1, 2, 3, 4 show that they found some discriminating power in percentage of nouns, of adjectives, of one- and two-letter words, and of the's. Together these variables could have decided whether Hamilton or Madison wrote all the disputed papers, if that were the problem, but the problem is to make an effective assignment for each paper.
1.4. Recent work—pilot study. We call marker words those which one author often uses and the other rarely uses. Douglass Adair found while (Hamilton) versus whilst (Madison). We found enough (Hamilton) and upon (Hamilton); see Tables 1.4–1, 2 for incidence and rates. Tables 1.4–3, 4, 5 give an overview of marker words for Federalist and non-Federalist writings. Alone, they would not settle the dispute compellingly.
1.5. Plots and honesty. Some say that the dispute is not a matter of honesty but a matter of memory. Hamilton was hurried in his annotation by an impending duel, but Madison had plenty of time. Editing may be a hazard. We want to use many words as discriminating variables.
1.6. The plan of the book.
2. Words and Their Distributions
2.1. Why words? Hamilton and Madison use the same words at different rates, and so their rates offer a vehicle for discrimination. Some words like by and to vary relatively little in their rates as context changes; others like war vary a lot, as the empirical distributions in the four tables show. Generally, less meaningful words offer more stability.
2.2. Variation with time. A separate study, illustrated in Table 2.2–2 by Madison's rates for 11 function words over a 26-year period, examines the stability of rates through time. We desire stability because we need additional text of known authorship to choose words and their rates for discriminating between authors. Among function words, some pronouns and auxiliary verbs seem unstable.
2.3. How frequency of use varies. For establishing a mathematical model, we need to find out empirically how rates of use by an author vary from one chunk of writing to another.
2.4. Correlations between rates for different words. Theoretical study shows that the correlation between the rates of occurrence for different words should ordinarily be small but negative. An empirical study whose results appear in Table 2.4–1 shows that these correlations are ordinarily negligible for our work.
2.5. Pools of words. Three pools of words produced potential discriminators.
2.6. Word counts and their accuracies. Some word counts were carried out by hand using slips of paper, one word per slip. Others were done by a high-speed computer which constructed a concordance (see the sketch following this chapter's entries).
2.7. Concluding remarks. Although words offer only one set of discriminators, one needs a large enough pool of potential discriminators to offer a good chance of success. We need to avoid selection and regression effects. Ideally we want enough data to get a grip on the distribution theory for the variables to be used.
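As a small illustration of the kind of variable Chapter 2 works with, the sketch below counts a few function and marker words in a block of text and converts the counts to rates per thousand words. It is not the book's procedure (the original counts were made by hand and with a concordance program); the word list, the per-thousand normalization, and the tokenization rule are assumptions made here for illustration.

```python
import re
from collections import Counter

# Hypothetical word list mixing function words with the marker words named in
# Sections 1.4 and 2.1; the book's own pools are larger and chosen differently.
WORDS = ["by", "to", "upon", "on", "while", "whilst", "enough", "the"]

def rates_per_thousand(text, words=WORDS):
    """Return occurrences of each listed word per thousand words of running text."""
    tokens = re.findall(r"[a-z']+", text.lower())   # crude tokenizer, assumed here
    counts = Counter(tokens)
    scale = 1000.0 / max(len(tokens), 1)
    return {w: counts[w] * scale for w in words}

# Usage with any block of text of known authorship:
# rates = rates_per_thousand(open("paper_51.txt").read())
```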
3. The Main Study. In the main study, we use Bayes' theorem to determine odds of authorship for each disputed paper by weighting the evidence from words. Bayesian methods enter centrally in estimating the word rates and choosing the words to use as discriminators. We use not one but an empirically based range of prior distributions. We present the results for the disputed papers and examine the sensitivity of the results to various aspects of the analysis. After a brief guide to the chapter, we describe some views of probability as a degree of belief and we discuss the need and the difficulties of such an interpretation.
3.1. Introduction to Bayes' theorem and its applications. We give an overview, abstracted from technical detail, of the ideas and methods of the main study, and we describe the principal sources of difficulties and how we go about meeting them.
3.1A. An example applying Bayes' theorem with both initial odds and parameters known.
3.1B. Selecting words and weighting their evidence.
3.1C. Initial odds.
3.1D. Unknown parameters.
3.2. Handling unknown parameters of data distributions. We begin to set out the components of our Bayesian analysis.
3.2A. Choosing prior distributions.
3.2B. The interpretation of the prior distributions.
3.2C. Effect of varying the prior.
3.2D. The posterior distribution of (?, ?).
3.2E. Negative binomial.
3.2F. Final choices of underlying constants.
3.3. Selection of words. The prior distributions are the route for allowing for and protecting against selection effects in the choice of words. We use an unselected pool of 90 words for estimating the underlying constants of the priors, and we assume the priors apply to the populations of words from which we developed our pool of 165 words. We then selectively reduce that pool to the final 30 words. We describe a stratification of words into word groups and our deletion of two groups because of contextuality.
3.4. Log odds. We compute the logarithm of the odds factor that changes initial odds to final odds and call it simply log odds. The computations use the posterior modal estimates as if they were exact and are made under the various choices of underlying constants, using both negative binomial and Poisson models (see the sketch following this chapter's entries).
3.4A. Checking the method.
3.4B. The disputed papers.
3.5. Log odds by words and word groups.
3.5A. Word groups.
3.5B. Single words.
3.5C. Contributions of marker and high-frequency words.
3.6. Late Hamilton papers. We assess the log odds for four of the late Federalist papers, written by Hamilton after the newspaper articles appeared and not used in any of our other analyses. The log odds all favor Hamilton, very strongly for all but the shortest paper.
3.7. Adjustments to the log odds. Through special studies, we estimate the magnitude of effects on the log odds of various approximations and imperfect assumptions underlying the main computations and results presented in Section 3.4. Percentage reductions in log odds are a good way to extrapolate from the special studies to the main study.
3.7A. Correlation.
3.7B. Effects of varying the underlying constants that determine the prior distributions.
3.7C. Accuracy of the approximate log odds calculation.
3.7D. Changes in word counts.
3.7E. Approximate adjusted log odds for the disputed papers.
3.7F. Can the odds be believed?
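The sketch below shows, in schematic form, the accumulation Section 3.4 describes: each word contributes the logarithm of a likelihood ratio, and the contributions are added to the log of the initial odds. For brevity it uses a plain Poisson model with assumed rates per thousand words; the book's preferred model is the negative binomial, and its rates come from posterior modal estimates rather than the invented numbers here.

```python
import math

def poisson_log_pmf(k, mean):
    """log P(K = k) for a Poisson variable with the given mean."""
    return k * math.log(mean) - mean - math.lgamma(k + 1)

def log_odds_for_hamilton(counts, rates_h, rates_m, length_thousands, log_initial_odds=0.0):
    """Sum log likelihood ratios over words; positive totals favor Hamilton.

    counts[w]        -- occurrences of word w in the unknown paper
    rates_h/rates_m  -- assumed rates per thousand words for each author
    """
    total = log_initial_odds
    for w, k in counts.items():
        mu_h = rates_h[w] * length_thousands
        mu_m = rates_m[w] * length_thousands
        total += poisson_log_pmf(k, mu_h) - poisson_log_pmf(k, mu_m)
    return total

# Hypothetical numbers only: a 2,000-word paper that never uses "upon".
print(log_odds_for_hamilton({"upon": 0, "whilst": 1},
                            rates_h={"upon": 3.0, "whilst": 0.1},
                            rates_m={"upon": 0.2, "whilst": 0.5},
                            length_thousands=2.0))
```

A negative total, as in this made-up case, points toward Madison; Section 3.7 then adjusts such totals for correlation and other approximations.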
4. Theoretical Basis of the Main Study. This chapter is a sequence of technical sections supporting the methods and results of the main study presented in Chapter 3. We set out the distributional assumptions, our methods of determining final odds of authorship, and the logical basis of the inference. We explain our methods for choosing prior distributions. We develop theory and approximate methods to explore the adequacy of the assumptions and to support the methods and the findings.
4.1. The negative binomial distribution. We review and develop properties of the negative binomial family of distributions.
4.1A. Standard properties.
4.1B. Distributions of word frequency.
4.1C. Parametrization.
4.1D. Estimation.
4.2. Analysis of the papers of known authorship. We treat the choice of prior distributions, the determination of the posterior distribution, and the computational problem in finding posterior modes.
4.2A. The data: notations and distributional assumptions.
4.2B. Object of the analysis.
4.2C. Prior distributions: assumptions.
4.2D. The posterior distribution.
4.2E. The modal estimates.
4.2F. An alternative choice of modes.
4.2G. Choice of initial estimate.
4.3. Abstract structure of the main study. We describe an abstract structure for our problem; we derive the appropriate formulas for our application of Bayes' theorem and give a formal basis for the method of bracketing the prior distribution. The treatment is abstracted both from the notation of words and their distributions and from numerical evaluations.
4.3A. Notation and assumptions.
4.3B. Stages of analysis.
4.3C. Derivation of the odds formula.
4.3D. Historical information.
4.3E. Odds for single papers.
4.3F. Prior distributions for many nuisance parameters.
4.3G. Summary.
4.4. Odds factors for the negative binomial model. We develop properties of the Poisson and negative binomial families of distributions. The discussion of appropriate shapes for the likelihood ratio function may suggest new ways to choose the form of distributions (see the display after the Section 4.7 entries below).
4.4A. Odds factors for an unknown paper.
4.4B. Integration difficulties in evaluation of ?.
4.4C. Behavior of likelihood ratios.
4.4D. Summary.
4.5. Choosing the prior distributions. We give methods for choosing sets of underlying constants to bracket the prior distributions and we explore the effects of varying the prior on the log odds. Choices are based in part on empirical analysis but also on heuristic considerations of reasonableness, analogy, and tractability.
4.5A. Estimation of ?1 and ?2: first analysis.
4.5B. Estimation of ?1 and ?2: second analysis.
4.5C. Estimation of ?3.
4.5D. Estimation of ?4 and ?5.
4.5E. Effect of varying the set of underlying constants.
4.5F. Upon: a case study.
4.5G. Summary.
4.6. Magnitudes of adjustments required by the modal approximation to the odds factor. We study, by example, the effect of using the posterior mode as if it were exact. To make the assessment we develop some general asymptotic theory of posterior densities.
4.6A. Ways of studying the approximation.
4.6B. Normal theory for adjusting the negative binomial modal approximation.
4.6C. Approximations to expectations.
4.6D. Notes on asymptotic methods.
4.7. Correlations. We study the magnitudes of effects of erroneous assumptions: the effects of correlations between rates for different words.
4.7A. Independence and odds.
4.7B. Adjustment for a pair of words.
4.7C. Example. The words upon and on.
4.7D. Study of 15 word pairs.
4.7E. Several words.
4.7F. Further theory.
4.7G. Summary.
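As a reminder of the distribution on which the odds factors of Sections 4.1 and 4.4 rest, the display below records the negative binomial probability function in one standard mean/shape parametrization and the single-word odds factor built from it. The notation (mean count $$\mu$$, shape $$\kappa$$) is chosen here for illustration and is not necessarily the book's own parametrization; the book also integrates such factors against prior distributions rather than plugging in point values.

$$
P(X = x) = \frac{\Gamma(x+\kappa)}{x!\,\Gamma(\kappa)}
\left(\frac{\kappa}{\kappa+\mu}\right)^{\kappa}
\left(\frac{\mu}{\kappa+\mu}\right)^{x},
\qquad x = 0, 1, 2, \ldots,
$$

with mean $$\mu$$ and variance $$\mu(1 + \mu/\kappa)$$. A single word with observed count $$x$$ in the unknown paper then contributes the odds factor

$$
\lambda = \frac{P(x \mid \text{Hamilton's parameters})}{P(x \mid \text{Madison's parameters})},
\qquad
\text{final log odds} = \text{initial log odds} + \sum_{\text{words}} \log \lambda .
$$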
4.8. Studies of regression effects. To study the adequacy of assumptions, we compare the performance of the log odds for the disputed papers with theoretical expectations.
4.8A. Introduction.
4.8B. The study of word rates.
4.8C. Total log odds for the final 30 words.
4.8D. Log odds by word group.
4.8E. Theory for the Poisson model.
4.8F. Theory for the negative binomial model.
4.8G. Two-point formulas for expectations of negative binomial log odds.
4.9. A logarithmic penalty study.
4.9A. Probability predictions.
4.9B. The Federalist application: procedure.
4.9C. The Federalist application: the penalty function.
4.9D. The Federalist application: numerical results.
4.9E. The Federalist application: adjusted log odds.
4.9F. The choice of penalty function.
4.9G. An approximate likelihood interpretation.
4.10. Techniques in the final choice of words. This section provides details of a special difficulty, and its possible general value lies in illustrating how to investigate the effects of a split into two populations of what was thought to be a single population.
4.10A. Systematic variation in Madison's writing.
4.10B. Theory.
5. Weight-Rate Analysis
5.1. The study, its strengths and weaknesses. Using a screening set of papers, we choose words and weights to use in a linear discriminant function for distinguishing authors. We use a calibrating set to allow for selection and regression effects. A stronger study would use the covariance structure of the rates for different words in choosing the weights; we merely allow for it through the calibrating set. The zero-rate words also weaken the study because we have not allowed for length of paper as we have done in the main study and in a robust one reported later.
5.2. Materials and techniques. Using the pool of words described in Chapter 2, we develop a linear discriminant function $$\tilde y = \sum_i W_i \tilde x_i$$, where $$W_i$$ is the weight assigned to the ith word and $$\tilde x_i$$ is the rate for that word. The $$W_i$$ are chosen so that $$\tilde y$$ tends to be high if Hamilton is the author, low if Madison is. Ideally the weights are proportional to the difference between the authors' rates and inversely proportional to the sum of the variances. By a simplified and robust calculation, an index of importance of a word was created. We use it to cut the number of words used to 20 (see the sketch below).
5.3. Results for the screening and calibrating sets. The 20 words, their weights, and estimated importances are displayed in Table 5.3–1, with upon outstanding by a factor of 4. Table 5.3–2 shows the results of applying the weights to the screening set of papers. Hamilton's 23 papers average .87 and all exceed .40, while Madison's 25 average -.41 and all are below -.19. For the calibrating set Hamilton averages .92 and Madison -.38. The smallest Hamilton score is .31, and the largest Madison score is .15 (zero plays no special role here).
5.4. Regression effects. As a rough measure of separation, we use the number of standard deviations between the Hamilton and Madison means. For the whole set of 20 words, the separation regresses from 6.9 standard deviations in the screening set to 4.5 in the calibrating set. In Section 5.3, we see almost no change from screening to calibrating set in the average separations; the loss comes from increased standard deviations. In a general way, as the groups of words become more contextual the regression effect is larger. Group 1, the word upon, actually gains strength from screening to calibrating set.
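The fragment below is a minimal sketch of the weight-rate idea described in Section 5.2, under assumptions stated in the comments: it forms weights proportional to the difference of the two authors' mean rates divided by the sum of their variances, and scores a paper by the weighted sum of its rates. It omits the book's robust importance index, the cut to 20 words, and the calibrating-set adjustment.

```python
import statistics

def weight_rate_weights(rates_h, rates_m):
    """rates_h[w], rates_m[w]: lists of the rate of word w (per thousand words)
    across the Hamilton and Madison screening papers (hypothetical inputs)."""
    weights = {}
    for word in rates_h:
        diff = statistics.mean(rates_h[word]) - statistics.mean(rates_m[word])
        spread = statistics.variance(rates_h[word]) + statistics.variance(rates_m[word])
        weights[word] = diff / spread          # large positive weight: Hamilton marker
    return weights

def weight_rate_score(paper_rates, weights):
    """y = sum_i W_i * x_i; tends to be high for Hamilton, low for Madison."""
    return sum(weights[w] * paper_rates.get(w, 0.0) for w in weights)
```

In the book the resulting scores are then recalibrated on a fresh set of papers to allow for the selection and regression effects discussed in Sections 5.1 and 5.4.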
5.5. Results for the disputed papers. After displaying the numerical outcome of the weight-rate discriminant function for the disputed papers in Table 5.5–1, we carry out two types of analyses, one based on significance tests and one based on likelihood ratios. In Table 5.5–2 we show two t-statistics and corresponding P-values for each paper, first for testing that the paper is a Hamilton paper, and second for testing that the paper is a Madison paper. We compute $$t_j = \frac{y - \overline y_j}{s_j \sqrt{1 + (1/n_j)}}$$, where j = Hamilton or Madison, y is the value for the disputed paper from Table 5.5–1, $$\overline y_j$$ and $$s_j$$ are the mean and standard deviation for author j in the calibrating set, and $$n_j = 25$$, the number of papers in each calibrating set (see the sketch following these Contents). Except for paper 55, the P-values for the Hamilton hypotheses are all very small (less than .004); the P-values for the Madison hypotheses are large, the smallest being .087. Paper 55 is further from Madison than from Hamilton, but both P-values are significant. Table 5.5–3 gives log likelihood ratios for the joint and disputed papers, assuming normal distributions and using the means and variances in the calibrating set. To allow for the uncertainty in estimating the means and variances, conservative 90 per cent confidence limits are shown for the log likelihood ratio, and a Bayesian log odds is calculated using the t-distribution. Except for paper 55, which goes slightly in Hamilton's favor, the odds favor Madison for the disputed papers.
6. A Robust Hand-Calculated Bayesian Analysis
6.1. Why a robust study? Because the main study leans on parametric assumptions and heavy calculations, we want a study to check ourselves that depends less on distributional assumptions and that has calculations that a human being can do by hand.
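To make the test in Section 5.5 concrete, the sketch below computes a t-statistic and two-sided P-value for a single disputed-paper score against one author's calibrating set. The numbers in the usage lines are invented, and the use of a Student-t survival function with n - 1 degrees of freedom is an implementation choice made here for illustration, not the book's exact procedure.

```python
import math
from scipy import stats

def author_t_and_p(y, calib_scores):
    """t-statistic and two-sided P-value for testing that the paper whose
    weight-rate score is y comes from the author with these calibrating scores."""
    n = len(calib_scores)
    mean = sum(calib_scores) / n
    s = math.sqrt(sum((v - mean) ** 2 for v in calib_scores) / (n - 1))
    t = (y - mean) / (s * math.sqrt(1 + 1 / n))
    p = 2 * stats.t.sf(abs(t), df=n - 1)      # n - 1 degrees of freedom assumed here
    return t, p

# Hypothetical scores: one disputed paper tested against each calibrating set.
# t_h, p_h = author_t_and_p(-0.45, hamilton_calibrating_scores)
# t_m, p_m = author_t_and_p(-0.45, madison_calibrating_scores)
```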