Book Review: “How to Lie with Statistics” by Darrell Huff

“According to one random survey, I should be much happier than I am”

– A quote from the book

“How to Lie with Statistics” by Darrell Huff is a classic on interpreting data and charts. It was first published in 1954, and some of the data points it presents surely go back nearly a hundred years! However, the book contains timeless insights into how data and its presentation can be misused and misconstrued to bias inferences.

The book is written in a witty, humorous style, which makes it easy to finish in one or two sittings. As the author says in the introduction, “This book is a sort of primer in ways to use statistics to deceive.”

Here are a few key takeaways for me:

  1. To test whether a random sample is indeed “random” – answer the question: Does every name or thing in the whole group have an equal chance to be in the sample?
  2. In sampling there are at least 3 sources of error: the sample of the population may not be random and therefore not representative of that population; any questionnaire is itself only a sample of the possible questions that could be asked in a given context; and the answers given are no more than samples of the respondent’s attitudes and experiences on each question
  3. It is not necessary that a poll be rigged, that is, that the results be deliberately twisted in order to create a false impression. The tendency of the sample to be biased in a consistent direction can rig it automatically!
  4. When you are told that something is an average you still don’t know very much about it unless you can find out which of the common kinds of average it is – mean, median, or mode
  5. Only when there is a substantial number of trials involved is the law of averages a useful description or prediction
  6. The only way to think about IQs and many other sampling results is in ranges
  7. It is an interesting fact that the death rate or number of deaths often is a better measure of the incidence of an ailment than direct incidence figures – simply because the quality of reporting and record-keeping is so much higher on fatalities!
  8. One of the trickiest ways to misrepresent statistical data is by means of a map. A map introduces a fine bag of variables in which facts can be concealed and relationships distorted
  9. Any percentage figure based on a small number of cases is likely to be misleading. It is more informative to give the figure itself
  10. Percentiles are deceptive too. The odd thing about percentiles is that a student with a 99-percentile rating is probably quite a bit superior to one standing at 90, while those at the 40 and 60 percentiles may be of nearly equal achievement. This comes from the habit that so many characteristics have of clustering about their own average
  11. In order to look a phoney statistic in the eye and face it down, ask the following 5 questions
    1. Who says so?
      1. Look for conscious bias (e.g. suppression of unfavorable data, shifting measurement units, using an improper measure (mean vs. median))
      2. Look sharply for unconscious bias. It’s even more dangerous. Especially in case of citations, make sure that the authority stands behind the information, not merely somewhere alongside it
    2. How does he know?
      1. Watch out for evidence of a biased sample, one that has been selected improperly or has selected itself
      2. Is the sample large enough to permit any reliable conclusion?
    3. What is missing?
      1. A correlation given without a measure of reliability (probable error, standard error) is not to be taken very seriously
      2. If you are handed an index, you may ask what’s missing there
      3. Sometimes it is percentages that are given and raw figures that are missing, and this can be deceptive too
    4. Did somebody change the subject?
      1. Watch out for a switch somewhere between the raw figure and the conclusion. One thing is all too often reported as another
      2. As just indicated, more reported cases of a disease are not always the same thing as more cases of the disease
      3. In surveys, saying and doing may not be the same thing at all
      4. Beware of firsters. Almost anybody can claim to be first in something if he is not too particular about what it is
    5. Does it make sense?
      1. The impressively precise figure is something else that contradicts common sense
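
To make takeaways 4 and 10 concrete, here is a minimal Python sketch. The salary figures are invented for illustration (they are not from the book), and the percentile part assumes scores follow a normal distribution with mean 100 and standard deviation 15, which is only an approximation of how real test scores behave. The function names are hypothetical.

```python
from statistics import NormalDist, mean, median, mode

# Hypothetical salaries (in thousands), invented for illustration only.
salaries = [20, 20, 20, 25, 30, 35, 40, 50, 300]


def summarize_averages(values):
    """Return the three common kinds of "average" for the same data."""
    return {"mean": mean(values), "median": median(values), "mode": mode(values)}


def percentile_gap(low_pct, high_pct, dist=NormalDist(mu=100, sigma=15)):
    """Score gap between two percentile ranks.

    ASSUMPTION: scores are normally distributed (mean 100, SD 15 by default).
    """
    return dist.inv_cdf(high_pct / 100) - dist.inv_cdf(low_pct / 100)


# Takeaway 4: one data set, three very different "averages".
averages = summarize_averages(salaries)
print(averages)  # mean 60, median 30, mode 20

# Takeaway 10: scores cluster around the average, so percentile ranks
# near the middle are packed closer together than those near the top.
top_gap = percentile_gap(90, 99)     # 9 percentile points apart
middle_gap = percentile_gap(40, 60)  # 20 percentile points apart
print(round(top_gap, 1), round(middle_gap, 1))
```

With these made-up salaries, a single large value pulls the mean to 60 while the median is 30 and the mode is 20, so “the average salary” can mean three different things. Under the normality assumption, the gap between the 90th and 99th percentiles comes out at roughly 15.7 points, about twice the gap of roughly 7.6 points between the 40th and 60th.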

In my assessment, some of the insights from the book seem to have morphed into modern-day management aphorisms, e.g. “correlation is not causation”; “when anecdote and data differ, trust the anecdote more”; and “more fiction has been written in Excel than in Word”!

A good book; it can be purchased from Amazon!
