Cognitive distortions in recent decades

Cognitive distortions are thinking patterns that are strongly associated with internalizing disorders such as depression and anxiety.
Historical traces of cognitive distortions in millions of books published over the course of the last two centuries in English, Spanish, and German show a pronounced “hockey stick” pattern: over the past two decades, the textual analogs of cognitive distortions surged well above historical levels, including those of World War I and II, after declining or stabilizing for most of the 20th century.

These results point to the possibility that recent socioeconomic changes, new technology, and social media are associated with a surge of cognitive distortions.

The theory underlying cognitive-behavioral therapy (CBT), the gold standard for the treatment of depression and other internalizing disorders, holds that cognitive distortions are associated with internalizing disorders; they reflect negative affectivity and avoidant behavioral patterns in the context of environmental stress. Language is closely intertwined with this dynamic.
Recent research shows that individuals with internalizing disorders express significantly higher levels of cognitive distortions in their language to the point that their prevalence may be used as an index of vulnerability for depression.

Examples of CDS n-grams shown inside gray boxes, surrounded by plausible context words that may vary without affecting whether the n-gram marks the expression of a cognitive distortion of the given type (e.g., mindreading, emotional reasoning, or labeling and mislabeling). CDS were designed to capture the expression of a particular cognitive distortion type, regardless of its specific lexical context. The team of experts defined n-grams to mark 12 commonly distinguished types of cognitive distortions. Note that our prevalence measurements count only the CDS n-gram occurrence regardless of context (“everyone thinks,” “still feels,” and “I am a”).

The research analyzes the prevalence of a large set of markers of cognitive distortions over the past 125 years in a collection of more than 14 million books published in English, Spanish, and German. Specifically, it examines the longitudinal prevalence of hundreds of short sequences of one to five words (n-grams). The n-grams, labeled cognitive distortion schemata (CDS), were designed by a team of CBT experts, computational linguists, and bilingual native speakers, and externally validated by a panel of CBT experts, to capture the expression of 12 types of cognitive distortions. The CDS n-grams were designed as short, unambiguous, stand-alone statements that express the core of a particular cognitive distortion type, using highly frequent terms.
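To make the counting step concrete, here is a minimal Python sketch of how CDS n-grams might be counted in a text regardless of their surrounding context. The three n-grams are the examples quoted above; the tokenizer, the function names, and the normalization by token count are assumptions made for illustration, not the procedure used in the study.

```python
import re
from collections import Counter

# Example CDS n-grams quoted in the text; the full set has hundreds of entries.
CDS_NGRAMS = ["everyone thinks", "still feels", "i am a"]

def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())

def count_ngrams(tokens, ngrams):
    """Count each target n-gram wherever it occurs, ignoring context words."""
    targets = {tuple(g.split()) for g in ngrams}
    counts = Counter()
    for n in {len(t) for t in targets}:
        for i in range(len(tokens) - n + 1):
            window = tuple(tokens[i:i + n])
            if window in targets:
                counts[" ".join(window)] += 1
    return counts

def prevalence(tokens, ngrams):
    """Occurrences per token: an assumed stand-in for relative frequency."""
    counts = count_ngrams(tokens, ngrams)
    total = len(tokens)
    return {g: (counts[g] / total if total else 0.0) for g in ngrams}

sample = "Everyone thinks I am a failure, and she still feels that I am a fraud."
tokens = tokenize(sample)
counts = count_ngrams(tokens, CDS_NGRAMS)
rates = prevalence(tokens, CDS_NGRAMS)
```

In this toy sentence the marker "i am a" is counted twice, whatever words follow it, which mirrors the context-free counting described above.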

(A–C) Median z scores of time series of CDS n-gram prevalence from 1855 to 2020 (125 y) in US English (A), Spanish (B), and German (C) with year markers added for major historical events. All time series reveal stable or declining levels for most of the 20th century followed by a sharp surge of cognitive distortions in the past three decades.
US English shows declining levels from 1899 to 1978, with minor peaks around 1914 and 1940 (World War I and World War II) and notably 1968. This decline is followed by a surge of CDS prevalence starting in 1978 that continues to 2019.
For Spanish we find stable levels from 1895 to the early 1980s at which point a trend occurs toward higher CDS prevalence levels above any of those previously observed.
German shows stable CDS prevalence levels, with the exception of strong peaks around and after World War I and World War II, until 2007 at which point a sudden surge occurs.
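The "median z score" summary in these panels can be sketched as follows: each n-gram's yearly prevalence series is standardized on its own, and the median across n-grams is then taken for each year. This is an illustrative sketch only. The use of the population standard deviation, the function names, and the toy data are assumptions.

```python
from statistics import mean, median, pstdev

def zscore(series):
    """Standardize one n-gram's yearly series to mean 0 and unit variance."""
    mu = mean(series)
    sigma = pstdev(series)  # population standard deviation (an assumption)
    if sigma == 0:
        return [0.0 for _ in series]
    return [(x - mu) / sigma for x in series]

def median_zscore(series_by_ngram):
    """Per-year median of the z-scored series across all n-grams."""
    standardized = [zscore(s) for s in series_by_ngram.values()]
    return [median(year_values) for year_values in zip(*standardized)]

# Toy yearly prevalence values for three hypothetical n-grams.
toy = {
    "ngram_a": [1.0, 2.0, 3.0],
    "ngram_b": [10.0, 20.0, 30.0],
    "ngram_c": [5.0, 5.0, 8.0],
}
summary = median_zscore(toy)
```

Standardizing first puts rare and frequent n-grams on the same scale, so a few very common n-grams do not dominate the summary curve.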
CDS prevalence for English, Spanish, and German superimposed with a null-model estimate of random n-gram prevalence.
Colored bands indicate 95% confidence intervals of yearly z-score values estimated with 10,000-fold bootstrap of the set of individual CDS time series.
Gray band indicates 95% confidence interval of a null model of 10,000 sets of 241 randomly chosen n-grams with the same length distribution as the English (US) CDS set.
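The colored confidence bands can be approximated with a percentile bootstrap over the set of individual CDS time series: resample the series with replacement, recompute the yearly median, and read off the 2.5th and 97.5th percentiles. The sketch below assumes the series are already z-scored. The function name, the percentile method, and the smaller default number of resamples are assumptions, not details confirmed by the source.

```python
import random
from statistics import median

def bootstrap_band(z_series, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap band for the per-year median across series."""
    rng = random.Random(seed)
    n_series = len(z_series)
    n_years = len(z_series[0])
    samples = [[] for _ in range(n_years)]
    for _ in range(n_boot):
        # Resample whole time series (not individual years) with replacement.
        resample = [z_series[rng.randrange(n_series)] for _ in range(n_series)]
        for t in range(n_years):
            samples[t].append(median(s[t] for s in resample))
    lo_idx = int(n_boot * alpha / 2)
    hi_idx = n_boot - 1 - lo_idx
    lower, upper = [], []
    for year_samples in samples:
        year_samples.sort()
        lower.append(year_samples[lo_idx])
        upper.append(year_samples[hi_idx])
    return lower, upper

# Toy z-scored series for five hypothetical n-grams over four years.
toy_z = [
    [-1.0, -0.5, 0.2, 1.3],
    [-0.8, -0.6, 0.1, 1.4],
    [-1.2, 0.0, 0.3, 0.9],
    [-0.5, -0.9, 0.4, 1.0],
    [-1.1, -0.2, 0.0, 1.2],
]
lower, upper = bootstrap_band(toy_z, n_boot=500)
```

The gray null-model band could be built in a similar way, by drawing random sets of n-grams with a matching length distribution instead of resampling the CDS set.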
(A–L) CDS n-gram prevalence from 1855 to 2019 (median z score smoothed by 10-y rolling mean), for English, Spanish, and German, grouped by cognitive distortion type, namely
(A) catastrophizing, (B) dichotomous reasoning, (C) disqualifying the positive, (D) emotional reasoning, (E) fortune telling, (F) labeling and mislabeling, (G) magnification and minimization, (H) mental filtering, (I) mindreading, (J) overgeneralizing, (K) personalizing, and (L) should statements. Nearly all time series reveal a universal hockey-stick pattern of recently surging CDS n-gram prevalence levels across cognitive distortion types.
The value C indicates the log (base 10) of the total frequency of CDS n-grams in the specific cognitive distortion category as an indication of the order of magnitude of its contribution to our observations.
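The two quantities named in this caption, the 10-year rolling mean and the value C, are simple to compute. The sketch below is a rough illustration: the source does not say whether the window is trailing or centered, so a trailing window is assumed, and the function names and inputs are hypothetical.

```python
import math

def rolling_mean(values, window=10):
    """Trailing rolling mean; years without a full window yield None."""
    smoothed = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        smoothed.append(running / window if i >= window - 1 else None)
    return smoothed

def magnitude_c(total_frequency):
    """Base-10 log of a category's total n-gram frequency (the value C)."""
    return math.log10(total_frequency)

# Toy yearly values standing in for a median z-score series.
yearly = [float(v) for v in range(1, 13)]
smoothed = rolling_mean(yearly, window=10)
c_value = magnitude_c(1000)
```

A category with a total frequency of 1,000 has C = 3, and one with 1,000,000 has C = 6, so C works as a quick order-of-magnitude label for each panel.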

While the differences between the languages are interesting, perhaps the most important point is that the expression of cognitive distortions has increased in all three languages over the past three decades. The result is a distinct hockey-stick pattern: a surge in the prevalence of CDS, which serve as lexical markers of cognitive distortions.
