The main idea is to demonstrate that errors in judgment happen all the time and are not random occurrences. The book also presents the complex character of these mistakes as a combination of bias and noise, and it recommends tools for managing the problem and maintaining strict decision hygiene.
Introduction: Two Kinds of Error
The introduction presents the book’s central theme, handling human errors, and describes two types of such errors: noise and bias. It also includes a graphic in which A is on target, B is noisy, C is biased, and D shows a mix of noise and bias.
Part I: Finding Noise
This part explores the difference between noise and bias, showing that public and private organizations can be noisy. It reviews two areas: sentencing (public sector) and insurance (private sector).
1. Crime and Noisy Punishment
This chapter presents the results of various research projects that convincingly demonstrate that judges’ decisions depend on many irrelevant factors such as lunchtime and weather. It discusses Marvin Frankel’s organization “The Lawyers’ Committee for Human Rights” and its legislative achievement in establishing sentencing guidelines. Here are data from a study of the results: “expected difference in sentence length between judges was 17%, or 4.9 months, in 1986 and 1987. That number fell to 11%, or 3.9 months, between 1988 and 1993.” In 2005 the Supreme Court changed the guidelines from mandatory to advisory, and the variance between sentences by different judges nearly doubled.
2. A Noisy System
This chapter discusses noise in the insurance business. First, it describes the result of a noise audit at an insurance company, which found a 55% variation in underwriters’ premium estimates even though executives expected around 10%. It then analyzes how this could happen and concludes that it resulted from the illusion of agreement. The further discussion covers the psychological processes that lead to this illusion, the costs of high noise levels, and the need to measure noise regularly and take steps to decrease it.
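To make the audit idea concrete, here is a minimal sketch of how such a figure could be computed. It assumes the audit reports the median relative difference between pairs of professionals judging the same case, which this summary does not spell out; the function names and the premium figures are hypothetical.

```python
from itertools import combinations
from statistics import median


def relative_difference(a: float, b: float) -> float:
    """Absolute difference between two judgments as a fraction of their mean."""
    return abs(a - b) / ((a + b) / 2)


def noise_index(quotes_by_case: dict[str, list[float]]) -> float:
    """Median relative difference over all pairs of judges on the same case."""
    diffs = [
        relative_difference(a, b)
        for quotes in quotes_by_case.values()
        for a, b in combinations(quotes, 2)
    ]
    return median(diffs)


# Hypothetical premiums (in dollars) quoted by three underwriters on two cases.
quotes = {
    "case_1": [9_500, 16_700, 12_000],
    "case_2": [20_000, 31_000, 24_000],
}
print(f"noise index: {noise_index(quotes):.0%}")
```

Pairs are only formed within a case, so differences between cases (which are legitimate) never count as noise.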
3. Singular Decisions
This chapter discusses singular decisions vs. recurrent decisions and concludes that singular decisions are also quite noisy. The main point here is that a singular decision is a recurrent decision that happens to be made only once, so people should apply the same noise-reducing techniques in both cases.
Part II: Your Mind Is a Measuring Instrument
Part II investigates the nature of human judgment and explores how to measure accuracy and error. It discusses how human decisions are susceptible to both bias and noise. This part makes an interesting point: “judgment can therefore be described as measurement in which the instrument is a human mind. Implicit in the notion of measurement is the goal of accuracy—to approach truth and minimize error.”
4. Matters of Judgment
This chapter presents a case study about CEO selection as an example of the judgment process overloaded with relevant and irrelevant information. First, it offers the idea of internal signal: “The essential feature of this internal signal is that the sense of coherence is part of the experience of judgment. It is not contingent on a real outcome. As a result, the internal signal is just as available for nonverifiable judgments as it is for real, verifiable ones.” Further, it reviews ways to evaluate judgment even if results are often inconclusive. It also discusses the value of consistency and defines noise as an inconsistency that damages the system’s credibility.
5. Measuring Error
This chapter discusses how much bias and noise contribute to error. The main point here is that decision-makers should handle noise as rigorously as bias because it can cause similar levels of damage. This chapter also provides a few simple statistical tools for measuring bias and noise.
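As an illustration of the kind of tool involved, the sketch below uses the standard statistical decomposition of mean squared error into squared bias plus squared noise. It treats bias as the mean error and noise as the population standard deviation of the errors; the function name and the forecast figures are hypothetical.

```python
from statistics import mean, pstdev


def decompose_error(judgments: list[float], true_value: float) -> tuple[float, float, float]:
    """Return (bias, noise, mse) for a set of judgments of one known quantity."""
    errors = [j - true_value for j in judgments]
    bias = mean(errors)                    # average error: the shared, directional part
    noise = pstdev(errors)                 # spread of errors around that average
    mse = mean([e ** 2 for e in errors])   # overall error
    return bias, noise, mse


# Hypothetical forecasts of a quantity whose true value turned out to be 44.
bias, noise, mse = decompose_error([52, 48, 55, 45, 60], true_value=44)
print(f"bias={bias:.2f}  noise={noise:.2f}  mse={mse:.2f}")

# The identity MSE = bias^2 + noise^2 holds exactly (up to rounding).
assert abs(mse - (bias ** 2 + noise ** 2)) < 1e-9
```

Because both terms are squared, cutting noise by a given amount lowers overall error exactly as much as cutting bias by the same amount, which is why the two deserve equal attention.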
6. The Analysis of Noise
This chapter demonstrates the use of tools to analyze noise in sentencing. It uses the breakdown of the system noise into the Level and the Pattern noises:
- Level noise is variability in the average level of judgments by different judges.
- Pattern noise is variability in judges’ responses to particular cases.
It also gives the formula: $(\text{System Noise})^2 = (\text{Level Noise})^2 + (\text{Pattern Noise})^2$
The conclusion: “Level noise is when judges show different levels of severity. Pattern noise is when they disagree with one another on which defendants deserve more severe or more lenient treatment. And part of pattern noise is occasion noise—when judges disagree with themselves.”
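The breakdown can be checked on a small table of judgments. The sketch below assumes a balanced design, in which every judge rules on every case, and uses population variances; the function name and the sentences (in years) are hypothetical.

```python
from statistics import mean


def decompose_system_noise(sentences: list[list[float]]) -> tuple[float, float, float]:
    """sentences[i][j] is judge i's sentence for case j.

    Returns squared (level noise, pattern noise, system noise).
    """
    n_judges, n_cases = len(sentences), len(sentences[0])
    cells = [(i, j) for i in range(n_judges) for j in range(n_cases)]

    judge_means = [mean(row) for row in sentences]
    case_means = [mean([sentences[i][j] for i in range(n_judges)]) for j in range(n_cases)]
    grand_mean = mean(judge_means)

    # Level noise: how much judges differ in average severity.
    level_var = mean([(m - grand_mean) ** 2 for m in judge_means])
    # Pattern noise: what is left after removing judge severity and case difficulty.
    pattern_var = mean(
        [(sentences[i][j] - judge_means[i] - case_means[j] + grand_mean) ** 2 for i, j in cells]
    )
    # System noise: disagreement among judges about the same case.
    system_var = mean([(sentences[i][j] - case_means[j]) ** 2 for i, j in cells])
    return level_var, pattern_var, system_var


# Hypothetical sentences from three judges on the same three cases.
level, pattern, system = decompose_system_noise([[2, 5, 8], [4, 6, 11], [3, 10, 5]])
assert abs(system - (level + pattern)) < 1e-9
```

Separating occasion noise from stable pattern noise would need the same judge to see the same case more than once, which this table does not include.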
7. Occasion Noise
This chapter discusses the noise that comes from multiple small, difficult-to-measure factors. Experiments with repeated estimates of unknown quantities demonstrated that the best assessment comes from averaging several estimates by the same person, although the first estimate is usually closer to the truth than the later ones. The chapter compares such repeated individual estimates to a single estimate by a crowd, finds the parallel valid, and names the effect “the crowd within.” This chapter also discusses sources of occasion noise such as mood, gullibility, weather, and so on. The main point is that individuals are not constantly the same, and their behavior and decisions depend on multiple factors. It refers to interesting research demonstrating a 19% drop in the likelihood of granting asylum if the previous two asylum hearings ended in approval. The conclusions are: “Judgment is like a free throw: however hard we try to repeat it precisely, it is never exactly identical.” and “Although you may not be the same person you were last week, you are less different from the ‘you’ of last week than you are from someone else today. Occasion noise is not the largest source of system noise.”
8. How Groups Amplify Noise
This chapter reviews group decision-making and finds it even noisier than individual decision-making. This happens because the number and influence of irrelevant factors increase: “Who speaks first, who speaks last, who speaks with confidence, who is wearing black, who is seated next to whom, who smiles or frowns or gestures at the right moment.” The chapter reviews groups’ music downloads, various referenda, and web comments in the UK and the USA. The chapter also discusses informational cascades, in which a slight change in the sequence of presentations creates a path-dependent dynamic of support for one decision. The final part of the chapter discusses group polarization, in which an idea that initially gets slightly more support than the others ends up with ever higher support as people rush to join the majority. It generally leads to higher levels of noise and errors. The conclusion: “Since many of the most important decisions in business and government are made after some sort of deliberative process, it is especially important to be alert to this risk. Organizations and their leaders should take steps to control noise in the judgments of their individual members.”
Part III: Noise in Predictive Judgments
Part III explores predictive judgment, the use of rules and algorithms, and the superiority of these methods over humans in predictive power.
9. Judgments and Models
This chapter compares the accuracy of predictions made by professionals, by machines, and by simple rules. The conclusion is that the professionals come third in this competition. To reach this conclusion, the chapter compares predictions of a new employee’s performance based on human judgment with predictions based on formal modeling and algorithms. The model beats humans not only in this case but also in clinical predictions. Moreover, this is true not only for formal models built from outcome data but also for models of individual judges: the model of a person predicts future outcomes better than this person’s own judgment.
10. Noiseless Rules
This chapter explores why algorithms are better than experts and shows that noise is a significant factor in human judgment’s inferiority. Predictions are accurate to the extent that prediction matches outcome as measured by the percent concordant (PC). A PC of 50% is a random match, and a higher PC means more predictive power. Here is a nice graph for complexity increase:
The chapter analyses this and concludes that, generally, simple rules work better. However, AI machine learning produces even better results. The chapter then reviews an example of better bail decisions. In the end, the chapter discusses the reasons people distrust algorithms and rules.
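For readers more used to correlation coefficients, PC can be related to correlation with a short formula. The sketch below assumes that prediction and outcome are jointly normal, in which case PC = 1/2 + arcsin(r)/π; that assumption and the function name are mine, not something this summary states.

```python
import math


def percent_concordant(r: float) -> float:
    """Probability that, of two random cases, the one predicted higher also turns out higher.

    Assumes prediction and outcome follow a bivariate normal distribution
    with correlation r.
    """
    return 0.5 + math.asin(r) / math.pi


for r in (0.0, 0.20, 0.60, 1.0):
    print(f"r = {r:.2f}  ->  PC = {percent_concordant(r):.0%}")
```

Under this assumption a correlation of .20 gives a PC of about 56%, matching the figure quoted later in this summary, and even a fairly strong correlation of .60 gives only about 70%.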
11. Objective Ignorance
This chapter discusses an essential limit on predictive accuracy: most judgments are made in a state of objective ignorance because many things the future depends on cannot be known. The chapter reviews the meaning of objective ignorance in depth and provides multiple examples, from pundits to judges and bail panels. One fascinating point here is the denial of ignorance and human overconfidence, which adds a lot to the noise, lowering decision-making quality.
12. The Valley of the Normal
Finally, this chapter shows that objective ignorance affects not just the ability to predict events but even the capacity to understand them, which is an essential part of the answer to the puzzle of why noise tends to be invisible. The chapter also describes a large-scale longitudinal project tracing thousands of children and families over decades, analyzing predictions and outcomes. The result: “The main conclusion of the challenge is that a large mass of predictive information does not suffice for the prediction of single events in people’s lives—and even the prediction of aggregates is quite limited.” In other words, it demonstrated the difference between knowledge based on data and an understanding of the situation that could produce a valid prediction. In the end, the chapter provides the following list on the limits of understanding:
- “Correlations of about .20 (PC = 56%) are quite common in human affairs.”
- “Correlation does not imply causation, but causation does imply correlation.”
- “Most normal events are neither expected nor surprising, and they require no explanation.”
- “In the valley of the normal, events are neither expected nor surprising—they just explain themselves.”
- “We think we understand what is going on here, but could we have predicted it?”
Part IV: How Noise Happens
Part IV explores psychological causes of noise, “including personality and cognitive style; idiosyncratic variations in the weighting of different considerations; and the different uses that people make of the very same scales.”
13. Heuristics, Biases, and Noise
This chapter presents three important judgment heuristics on which System 1 extensively relies. It shows how these heuristics cause predictable, directional errors (statistical bias) as well as noise. For example, these errors could be aiming at the same bull’s eye but hitting different spots or aiming at different bull’s eyes but hitting the same place. The authors discuss substitution, conclusion, and other psychological biases. They caution against blaming errors on unspecified biases and against distorting evidence to fit a prejudgment based on first impressions. They also suggest that biases shared across a group create systematic bias, whereas biases that differ from person to person simply create more noise.
14. The Matching Operation
This chapter focuses on matching, a particular operation of System 1, and discusses the errors it can produce. These errors mainly come down to a mismatch between measurement scales: an impression formed on one scale is translated into an estimate on another, and the estimate is wrong because the two scales do not correspond.
15. Scales
This chapter turns to an indispensable accessory in all judgments: the scale on which the judgments are made. It shows that the choice of an appropriate scale is a prerequisite for good judgment and that ill-defined or inadequate scales are an important source of noise. Here the authors provide the formula for measuring noisy scales:
$\text{Variance of Judgments} = \text{Variance of Just Punishments} + (\text{Level Noise})^2 + (\text{Pattern Noise})^2$
They also provide a graphic representation for punitive scales:
16. Patterns
This chapter explores the psychological source of what may be the most intriguing type of noise: the patterns of responses that different people have to different cases. Like individual personalities, these patterns are not random and are mostly stable over time, but their effects are not easily predictable. Here is another formula:
$(\text{Pattern Noise})^2 = (\text{Stable Pattern Noise})^2 + (\text{Occasion Noise})^2$
17. The Sources of Noise
This chapter summarizes the previous discussion about noise and its components. It also proposes an answer to the puzzle raised earlier: why is noise, despite its ubiquity, rarely considered an important problem? Here is a combined graphical representation of Mean Square Error (MSE):
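For reference, the components discussed so far can be stacked into a single set of equations. The first line is the standard decomposition of mean squared error into squared bias plus squared noise, which I am supplying as an assumption about what the chart conveys; the other two lines restate the formulas quoted earlier in this summary.

```latex
\begin{align*}
\text{MSE} &= \text{Bias}^2 + (\text{System Noise})^2 \\
(\text{System Noise})^2 &= (\text{Level Noise})^2 + (\text{Pattern Noise})^2 \\
(\text{Pattern Noise})^2 &= (\text{Stable Pattern Noise})^2 + (\text{Occasion Noise})^2
\end{align*}
```

Substituting from the bottom up expresses overall error as the sum of four squared terms: bias, level noise, stable pattern noise, and occasion noise.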
Part V: Improving Judgments
Part V explores ways to improve human judgment.
18. Better Judges for Better Judgments
This chapter discusses the characteristics of superior judges. The authors look at characteristics such as intelligence and cognitive style. They also distinguish between true experts, who produce verifiable predictions, and respect-experts, who are people with credentials making unverifiable statements.
19. Debiasing and Decision Hygiene
This chapter reviews many attempts to counteract psychological biases, with some clear failures and some clear successes. It also briefly reviews debiasing strategies and suggests a promising one: asking a designated decision observer to search for diagnostic signs that could indicate, in real time, that a group’s work is being affected by one or several familiar biases. The observer works from a checklist to ensure proper coverage of biases and decision points. The authors look at ex post and ex ante debiasing and provide some experimental data on both. They also discuss the limitations of debiasing. Overall, they suggest strict decision hygiene to decrease both bias and noise.
20. Sequencing Information in Forensic Science
This chapter reviews the case of forensic science, which illustrates the importance of sequencing information. The search for coherence leads people to form early impressions based on the limited evidence available and then to confirm their emerging prejudgment. This makes it important not to be exposed to irrelevant information early in the judgment process. The authors review an example of fingerprint analysis and how various biases and noise impacted its quality. They also stress the need for a second opinion that has to be independent to be meaningful.
21. Selection and Aggregation in Forecasting
This chapter reviews the case of forecasting, which illustrates the value of one of the most important noise-reduction strategies: aggregating multiple independent judgments. The “wisdom of crowds” principle is based on the averaging of multiple independent judgments, which is guaranteed to reduce noise. Beyond straight averaging, there are other methods for aggregating judgments, also illustrated by the example of forecasting. The authors here refer to Tetlock’s “Good Judgment Project” and discuss its mixed results.
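The reason averaging is guaranteed to reduce noise is a basic statistical fact: the standard deviation of the mean of n independent, equally noisy judgments is the individual noise divided by the square root of n. The sketch below states that rule and checks it with a small simulation; the function names, the normal-distribution assumption, and the numbers are hypothetical.

```python
import math
import random
from statistics import pstdev


def noise_after_averaging(sigma: float, n: int) -> float:
    """Noise (standard deviation) of the average of n independent judgments."""
    return sigma / math.sqrt(n)


def simulated_noise(sigma: float, n: int, trials: int = 2000, seed: int = 0) -> float:
    """Estimate the same quantity by simulating many averaged panels of n judges."""
    rng = random.Random(seed)
    averages = [sum(rng.gauss(0, sigma) for _ in range(n)) / n for _ in range(trials)]
    return pstdev(averages)


# Averaging 100 independent judgments cuts noise by 90%: 10 -> 1.
print(noise_after_averaging(10, 100))
# The simulation for panels of 25 should land close to the theoretical 10 / 5 = 2.
print(round(simulated_noise(10, 25), 2))
```

The guarantee depends on independence. If the judges influence one another, their errors become correlated and the reduction is smaller, which is why the chapter on forensic science insists on independent second opinions.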
22. Guidelines in Medicine
This chapter offers a review of noise in medicine and efforts to reduce it. It points to the importance and general applicability of a noise-reduction strategy previously introduced with the example of criminal sentencing: judgment guidelines. Guidelines can be a powerful noise-reduction mechanism because they directly reduce between-judge variability in final judgments. Here the authors pay special attention to psychiatry, a field with low levels of consistency between specialists’ judgments.
23. Defining the Scale in Performance Ratings
This chapter turns to a challenge in business life: performance evaluations. Efforts to reduce noise there demonstrate the critical importance of using a shared scale grounded in an outside view. This is an important decision hygiene strategy for a simple reason: judgment entails the translation of an impression onto a scale, and if different judges use different scales, there will be noise. Here the authors suggest that a relative scale is more appropriate than an absolute one.
24. Structure in Hiring
This chapter explores the related but distinct topic of personnel selection, which has been extensively researched over the past hundred years. It illustrates the value of an essential decision hygiene strategy: structuring complex judgments. By structuring, the authors mean decomposing a judgment into its component parts, managing the process of data collection to ensure the inputs are independent of one another, and delaying the holistic discussion and the final judgment until all these inputs have been collected.
25. The Mediating Assessments Protocol
This chapter proposes a general approach to option evaluation called the mediating assessments protocol, or MAP for short. MAP starts from the premise that “options are like candidates” and describes schematically how structured decision making, along with the other decision hygiene strategies mentioned above, can be introduced in a typical decision process for both recurring and singular decisions.
Part VI: Optimal Noise
Part VI explores the proper level of noise, considering that it is neither possible nor even desirable to eradicate it completely.
26. The Costs of Noise Reduction
This chapter reviews the first two of seven major objections to efforts to reduce or eliminate noise:
- First, reducing noise can be expensive; it might not be worth the trouble. The steps that are necessary to reduce noise might be highly burdensome. In some cases, they might not even be feasible.
- Second, some strategies introduced to reduce noise might introduce errors of their own. Occasionally, they might produce systematic bias. If all forecasters in a government office adopted the same unrealistically optimistic assumptions, their forecasts would not be noisy, but they would be wrong. If all doctors at a hospital prescribed aspirin for every illness, they would not be noisy, but they would make plenty of mistakes.
27. Dignity
This chapter reviews five more objections, which are also common and which are likely to be heard in many places in coming years, especially with increasing reliance on rules, algorithms, and machine learning:
- Third, if we want people to feel that they have been treated with respect and dignity, we might have to tolerate some noise. Noise can be a by-product of an imperfect process that people end up embracing because the process gives everyone (employees, customers, applicants, students, those accused of crime) an individualized hearing, an opportunity to influence the exercise of discretion, and a sense that they have had a chance to be seen and heard.
- Fourth, noise might be essential to accommodate new values and hence to allow moral and political evolution. If we eliminate noise, we might reduce our ability to respond when moral and political commitments move in new and unexpected directions. A noise-free system might freeze existing values.
- Fifth, some strategies designed to reduce noise might encourage opportunistic behavior, allowing people to game the system or evade prohibitions. A little noise, or perhaps a lot of it, might be necessary to prevent wrongdoing.
- Sixth, a noisy process might be a good deterrent. If people know that they could be subject to either a small penalty or a large one, they might steer clear of wrongdoing, at least if they are risk-averse. A system might tolerate noise as a way of producing extra deterrence.
- Finally, people do not want to be treated as if they are mere things, or cogs in some kind of machine. Some noise-reduction strategies might squelch people’s creativity and prove demoralizing.
28. Rules or Standards?
This chapter presents the authors’ general conclusion that even when the objections to various methods such as rigid guidelines are given their due, noise reduction remains a worthy and even an urgent goal. It defends this conclusion by exploring a dilemma that people face every day, even if they are not always aware of it.
Review and Conclusion: Taking Noise Seriously
Here the authors once again summarize the main points of the book. They strongly recommend paying attention to noise and making a serious effort to limit it to acceptable levels, while stressing that it is neither possible nor reasonable to remove it altogether.
MY TAKE ON IT:
I think this is an excellent book on the problem of poor decision-making, which causes myriad issues and costs lots of treasure and, in some cases, lots of blood. The division of the problem into noise and bias is very effective, and the specific suggestions for improvement via checklists, independent second opinions, explicit recognition of various biases, and, overall, strict decision hygiene could be highly valuable. However, I would not hold my breath anticipating improvements. I believe the problem lies more in the absence of solid feedback for decision-makers in government and at the top levels of big corporations, which makes these people irresponsible and therefore uninterested in improving decision-making processes.