Cooperation versus social welfare

Understanding and promoting cooperative behaviour among self-interested individuals is a critical concern in physical, biological, and social sciences. Numerous foundational mechanisms for the evolution of cooperation have been identified, and these mechanisms have served as the basis for developing tools and interventions designed to sustain and enhance cooperative behaviour. However, since both foundational mechanisms and the derived tools and interventions often involve costs affecting individuals or institutions, striving for maximum cooperation can sometimes harm social welfare, defined as the total population payoff. Herein, we review existing evolutionary mechanisms for the evolution of cooperation as well as tools and interventions based on these mechanisms, emphasising the often-overlooked hidden costs that may lead to a misalignment between cooperation and social welfare. By explicitly incorporating these hidden factors into the models, we analyse the conditions under which they reduce social welfare, across a broad range of social dilemma games and evolutionary forces. Additionally, we review experimental studies that support and inform mathematical models and agent-based simulations.
We highlight the settings in which considering social welfare is crucial because misalignment is most likely to occur. Ultimately, we argue that social welfare, not just cooperation, should be the primary optimisation objective when designing interventions for social good. We also suggest several key directions for further exploring this often-overlooked issue. Overall, we show that hidden costs often influence the alignment between cooperation and social welfare, challenging the common prioritisation of cooperation alone.

Highlights
  • Emphasised optimising social welfare, not just cooperation, for effective interventions.
  • Revealed often-overlooked costs in cooperation mechanisms, impacting social welfare.
  • Reward-based interventions are more consistently aligned with social welfare than punishment.
  • Direct and indirect reciprocities exhibit high alignment between cooperation and social welfare.
  • While optional participation can sustain cooperation, managing hidden costs is crucial to maximising social welfare.
  • Unevenly distributed benefits occur in interconnected population structures.

Several foundational mechanisms have been proposed to explain the evolution of cooperation, such as kin and group selection and direct, indirect, and network reciprocity. Building on these mechanisms, several tools and interventions have been identified to promote the evolution of cooperation, including peer and institutional incentives and optional participation. In these works, the emphasis is often placed on the level of cooperation that a given approach can induce.

Since mutual cooperation is usually collectively more desirable than mutual defection and unilateral cooperation, ensuring high levels of cooperation usually also enhances the overall welfare of the population. Nonetheless, both foundational mechanisms and the derived tools and interventions often entail costs that affect individuals or institutions. For example, at the individual level, applying direct reciprocity can impose significant cognitive costs, as it requires remembering previous interactions to guide future behaviour. At the institutional level, organizations may bear significant financial burdens to promote cooperation, such as implementing compliance systems that disincentivise self-interested action. These costs have the potential to decrease social welfare, defined as the total payoff of the population. Thus, balancing the promotion of cooperation against the associated costs is critical for optimising social welfare.

Game models

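Interactions are modelled using pairwise games with a payoff matrix of the form (rows give the focal player's action, columns the co-player's action, and entries the focal player's payoff):

\[
\begin{array}{c|cc}
 & C & D \\ \hline
C & R & S \\
D & T & P
\end{array}
\]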

Each player has the option to either cooperate (C) or defect (D). Mutual cooperation results in a payoff of R, mutual defection results in a payoff of P, unilateral cooperation leads to a payoff of S, and unilateral defection results in a payoff of T. In our analysis, we employ a widely used parameterization for pairwise games. Specifically, we normalize R = 1 and P = 0; furthermore, we bound T between 0 and 2 and S between -1 and 1. Based on the ordering of these payoffs, we identify four prototypical games:

  1. The Prisoner’s Dilemma (PD): 2 ≥ T > 1 > 0 > S ≥ -1. This game is characterized by the combination of two properties: players prefer mutual defection to unilateral cooperation (S < 0) and prefer unilateral defection to mutual cooperation (T > 1). Therefore, defection is a strictly dominant strategy.
  2. The Snowdrift (SD) game: 2 ≥ T > 1 > S > 0. This game is characterized by the combination of two properties: players prefer unilateral defection to mutual cooperation (T > 1) and prefer unilateral cooperation to mutual defection (S > 0).
  3. The Stag Hunt (SH) game: 1 > T ≥ 0 > S ≥ -1. This game is characterized by the combination of two properties: players prefer mutual defection to unilateral cooperation (S < 0) and prefer mutual cooperation to unilateral defection (T < 1).
  4. The Harmony (H) game: 1 > T ≥ 0, 1 ≥ S > 0. This game is characterized by the combination of two properties: players prefer mutual cooperation to unilateral defection (T < 1) and prefer unilateral cooperation to mutual defection (S > 0). Therefore, cooperation is a strictly dominant strategy.

Pairwise games used to model interactions. The payoffs for mutual cooperation and mutual defection are normalized to R=1 and P=0, respectively, while T varies from 0 to 2 and S from -1 to 1. The four distinct squares represent the four prototypical pairwise games: the Prisoner’s Dilemma (PD), Stag Hunt (SH), Harmony (H), and Snowdrift (SD) games.
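
The four regions of the T-S plane can be expressed as a small classifier. The following is a minimal sketch, assuming R = 1 and P = 0 as above. The function name `classify_game` and the "boundary" label for points lying exactly on T = 1 or S = 0 are illustrative choices, not part of the original model.

```python
def classify_game(T: float, S: float) -> str:
    """Return the game label for a (T, S) pair, with R = 1 and P = 0."""
    if not (0 <= T <= 2 and -1 <= S <= 1):
        raise ValueError("expected 0 <= T <= 2 and -1 <= S <= 1")
    if T == 1 or S == 0:
        # Assumption: points on a border between two squares get their own label.
        return "boundary"
    if T > 1:
        # Unilateral defection beats mutual cooperation.
        return "PD" if S < 0 else "SD"
    # Mutual cooperation beats unilateral defection.
    return "SH" if S < 0 else "H"


for T, S in [(1.2, -0.5), (1.2, 0.1), (0.8, -0.5), (0.8, 0.5)]:
    print(f"T={T}, S={S}: {classify_game(T, S)}")
```

For instance, the setting S=0.1, T=1.2 falls in the Snowdrift square.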

Peer reward and peer punishment

To provide peer incentives, individuals incur a personal cost to either punish a violator (peer punishment) or reward a cooperator (peer reward) following an interaction. Consequently, the violator’s payoff decreases, while the cooperator’s payoff increases.

Peer punishment is a form of retaliation or negative reciprocity, wherein individuals incur a cost to impose a cost on another. In peer punishment, players independently decide to penalise defectors. Conversely, rewarding is prevalent in human societies, reflecting positive reciprocity towards prosocial, well-behaved, or otherwise kind actions. Unlike punishment, rewarding involves incurring a cost for the benefit of another. In peer rewarding, players independently choose to reward other cooperators.

Our model introduces two new strategies: the peer (social) punisher (PP) and the peer (social) rewarder (PR). Both cooperate in the pairwise game. After the game, PP pays a cost ε to punish a defecting co-player, and PR pays a cost ε to reward a cooperating co-player. The punished player's payoff decreases by δ, and the rewarded player's payoff increases by δ.

We consider minimal models of peer incentives in the one-shot pairwise game, with three strategies: unconditional cooperator (C), unconditional defector (D), and either PP or PR. The payoff matrices for peer punishment and peer reward cases are given as follows, respectively
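
The rules stated for PP and PR determine how each pairing's payoffs change. The sketch below tabulates the resulting three-strategy matrices under those rules. It is an illustration rather than a reproduction of the published matrices: the row-player convention, the strategy orderings (C, D, PP) and (C, D, PR), the function names, and the value ε = 1 are assumptions made here.

```python
def peer_punishment_matrix(R, S, T, P, eps, delta):
    """Row player's payoffs; assumed strategy order is C, D, PP."""
    return [
        [R, S, R],          # C neither punishes nor is punished
        [T, P, T - delta],  # D loses delta when facing a punisher
        [R, S - eps, R],    # PP pays eps to punish a defecting co-player
    ]


def peer_reward_matrix(R, S, T, P, eps, delta):
    """Row player's payoffs; assumed strategy order is C, D, PR."""
    return [
        [R, S, R + delta],              # C is rewarded by a PR co-player
        [T, P, T],                      # D is never rewarded
        [R - eps, S, R + delta - eps],  # PR pays eps for each cooperator it rewards
    ]


def pair_total(matrix, i, j):
    """Combined payoff of the two players when strategy i meets strategy j."""
    return matrix[i][j] + matrix[j][i]


# R = 1, P = 0, S = 0.1, T = 1.2; eps = 1 and delta = 3 are illustrative values.
params = dict(R=1.0, S=0.1, T=1.2, P=0.0, eps=1.0, delta=3.0)
punish = peer_punishment_matrix(**params)
reward = peer_reward_matrix(**params)

print("C meets D:  ", pair_total(punish, 0, 1))  # S + T
print("PP meets D: ", pair_total(punish, 2, 1))  # (S - eps) + (T - delta)
print("PR meets PR:", pair_total(reward, 2, 2))  # 2 * (R + delta - eps)
```

Under these assumptions, every act of punishment removes ε + δ from the pair's combined payoff, whereas every act of reward adds δ - ε, which is positive whenever δ > ε.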

To understand cooperation dynamics under peer incentives, we study the stationary distribution and the gradient of selection of the system for different incentive impacts (δ). For a small impact (δ=1), the stationary distribution concentrates near D and on the C-D edge of the triangle simplex, for both peer reward and peer punishment. For large δ (i.e. more cost-efficient incentivisation), reward only shifts these convergence points along this edge, whereas punishment can move them to more cooperative outcomes on the C-PP edge (compare panels b and c).

Impact of peer punishment on strategy evolution for varying impact of punishment, namely: δ=1 in (a), δ=3 in (b), and δ=5 in (c). The arrows represent the gradient of selection of the system in a triangle simplex, showing the most probable evolutionary trajectory as the system moves away from its current state. The colours on the left represent a mapping of the magnitude of the gradient, with colours close to red indicating larger gradients and colours nearing blue indicating smaller gradients. The colours on the right represent a mapping of the stationary distribution, where darker hues correspond to a larger stationary distribution. Parameters are set as S=0.1, T=1.2.

Impact of peer reward on strategy evolution for varying impact of reward, namely: δ=1 in (a), δ=3 in (b), and δ=5 in (c). Parameters are set as S=0.1, T=1.2.

Institutional reward and institutional punishment

To reward a cooperator or punish a defector, the institution must incur a cost: δ/a for rewarding a cooperator and δ/b for punishing a defector, where a and b are positive constants representing the efficiency ratios for providing each type of incentive. As a result, a rewarded cooperator’s payoff increases by δ and a punished defector’s payoff decreases by δ.
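
The cost structure above already hints at why the two incentive types can affect total payoff differently. The following sketch computes the direct effect of one round of incentives on total payoff. It rests on two assumptions made here for illustration: the institution's spending is counted against social welfare, and any behavioural response to the incentives is ignored. The function names are hypothetical.

```python
def welfare_change_reward(n_cooperators: int, delta: float, a: float) -> float:
    """Direct change in total payoff when every cooperator is rewarded.

    Each cooperator gains delta; the institution spends delta / a per reward.
    Assumption: institutional spending is subtracted from social welfare.
    """
    if a <= 0:
        raise ValueError("efficiency ratio a must be positive")
    return n_cooperators * (delta - delta / a)


def welfare_change_punishment(n_defectors: int, delta: float, b: float) -> float:
    """Direct change in total payoff when every defector is punished.

    Each defector loses delta; the institution spends delta / b per punishment.
    Assumption: institutional spending is subtracted from social welfare.
    """
    if b <= 0:
        raise ValueError("efficiency ratio b must be positive")
    return -n_defectors * (delta + delta / b)


print(welfare_change_reward(10, delta=3.0, a=1.0))      # reward is welfare-neutral when a = 1
print(welfare_change_reward(10, delta=3.0, a=3.0))      # positive when a > 1
print(welfare_change_punishment(10, delta=3.0, b=3.0))  # negative for any b > 0
```

In this accounting, the direct effect of a reward is non-negative whenever a ≥ 1, while the direct effect of punishment is always negative, so punishment can raise welfare only through the change in behaviour it induces.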

Minimising institutional spending vs maximising social welfare

Recent research has examined cost optimisation of institutional incentives for promoting cooperation. These studies aim to minimise institutional expenses while preserving high levels of cooperation, with the broader goal of maximising social welfare, in line with this review.
In particular, these studies typically formulate a bi-objective optimisation problem: ensure a specified minimum level of cooperation at the lowest institutional cost. Although they offer insight into optimal incentive policies, they do not necessarily guarantee maximal social welfare. Indeed, the two objectives are often misaligned.

Direct reciprocity

Consider a minimal model in (finitely) repeated games with three strategies: i) AllC (unconditional cooperator – always cooperates in each round), ii) AllD (unconditional defector – always defects in each round), and iii) TFT (Tit-for-Tat – cooperates in the first round and thereafter copies the co-player’s previous move).

We adopt the following payoff matrix, where ϕ stands for the number of rounds in the repeated game:
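
As a rough illustration of how payoffs accumulate over ϕ rounds for these three strategies, the sketch below simulates each pairing directly. It is not the matrix used in the analysis: it omits any cognitive or other cost that may be attached to TFT, and the payoff values (a PD with S = -0.5, T = 1.5), the choice ϕ = 10, and all function names are assumptions made here.

```python
def all_c(co_player_last):
    """AllC: cooperate in every round."""
    return "C"


def all_d(co_player_last):
    """AllD: defect in every round."""
    return "D"


def tft(co_player_last):
    """TFT: cooperate first, then copy the co-player's previous move."""
    return "C" if co_player_last is None else co_player_last


def repeated_payoff(strategy_a, strategy_b, rounds, R=1.0, S=-0.5, T=1.5, P=0.0):
    """Total payoff of player A against player B over `rounds` rounds."""
    stage = {("C", "C"): R, ("C", "D"): S, ("D", "C"): T, ("D", "D"): P}
    last_a = last_b = None
    total = 0.0
    for _ in range(rounds):
        move_a = strategy_a(last_b)
        move_b = strategy_b(last_a)
        total += stage[(move_a, move_b)]
        last_a, last_b = move_a, move_b
    return total


strategies = {"AllC": all_c, "AllD": all_d, "TFT": tft}
phi = 10  # illustrative number of rounds

for row_name, row in strategies.items():
    payoffs = [repeated_payoff(row, col, phi) for col in strategies.values()]
    print(row_name, payoffs)
```

In this cost-free version, TFT earns ϕR against AllC and against itself, and S + (ϕ - 1)P against AllD, while AllD earns T + (ϕ - 1)P against TFT.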



Conclusions and future directions

As mutual cooperation tends to be collectively more beneficial than mutual defection, achieving high levels of cooperation typically results in enhanced population welfare.

However, since these mechanisms and interventions often involve costs and benefits that affect individual payoffs, striving for the highest levels of cooperation may sometimes be harmful to social welfare.

| Mechanism | PD | H | SD | SH | Parameters |
|---|---|---|---|---|---|
| Peer reward | 1 | 1 | 1 | 1 | |
| Peer punishment | 0.53 | 0.52 | 0.70 | 0.57 | |
| Institutional reward | 1 | 1 | 0.8 | 1 | a=b=1 |
| Institutional reward | 1 | 1 | 1 | 1 | a=b=3 |
| Institutional punishment | 0.48 | 0.2 | 0.4 | 0.4 | a=b=1 |
| Institutional punishment | 0.51 | 0.3 | 0.42 | 0.45 | a=b=3 |
| Direct reciprocity | 1 | 0.85 | 0.98 | 0.99 | |
| Indirect reciprocity | 1 | 0.72 | 0.99 | 0.93 | L7 |
| Indirect reciprocity | 1 | 0.78 | 1 | 0.93 | L6 |
| Optional participation | 0.48 | 0.49 | 0.51 | 0.46 | a=1 |
| Optional participation | 0.35 | 0.12 | 0.17 | 0.27 | a=3 |
| Network reciprocity | 1 | 1 | 0.924 | 0.99 | SL |
| Network reciprocity | 1 | 0.955 | 0.896 | 1 | SF |

Table 1. Summary of the levels of alignment for different mechanisms of cooperation across pairwise games: Prisoner’s Dilemma (PD), Harmony (H), Snowdrift (SD), and Stag Hunt (SH). Parameters are defined as follows: a and b are the efficiency ratios for reward and punishment, respectively; L6 is the stern-judging norm and L7 is the staying norm; SL denotes a square lattice and SF a scale-free network.

Overall, we found that the various mechanisms and interventions suffer from misalignment to different degrees. Table 1 provides a summary of the levels of alignment across the different mechanisms and interventions analysed in this review. Specifically,

  • Misalignment is far more likely with punishment-based interventions, while reward-based interventions are much more likely to align cooperation and social welfare. First, for peer incentives, peer punishment tends to be more effective at increasing cooperation, but peer reward more effectively improves overall social welfare. Misalignment from peer punishment is especially likely in specific parameter ranges of specific pairwise games. Second, for institutional incentives, institutional reward generally aligns cooperation with social welfare better than institutional punishment, particularly when institutions are highly cost-efficient. Misalignment with institutional punishment is more common when the impact of punishment is either too weak or too strong.
  • Optional participation may result in significant misalignment. Enabling alternative choices (e.g., opting out of participating in the interaction) can sustain cooperation, but this involves hidden costs. The alignment between cooperation and social welfare is reduced when the benefit of choosing an alternative is high, suggesting the importance of managing the associated benefits for optional participation effectively.
  • Direct and indirect reciprocity often ensure a high level of alignment between cooperation and social welfare, particularly for the PD, SH and SD games. In general, both cooperation and social welfare decrease as the complexity and verification costs of conditional strategies increase, for example TFT and leading-eight norms such as standing and judging.
  • For network reciprocity, heterogeneous networks often foster higher cooperation levels than homogeneous networks. This increase in cooperation improves the total social welfare of the population, but it also leads to inequality among nodes of different degrees.

Our study primarily focused on cooperation; however, the arguments presented are also relevant to other prosocial behaviours, including trust, fairness, moral behaviour, honest signalling, social cohesion, safe and responsible technology development, collective risk avoidance, and compliance with pandemic interventions. A critical direction for future investigation is to systematically re-evaluate the existing evolutionary mechanisms underlying these prosocial behaviours to determine whether they effectively promote social welfare. Such an investigation could provide valuable insights into the mechanisms that drive human prosocial collective action across various domains, potentially informing strategies to enhance societal well-being.

Our focus here is on maximising the overall social welfare of a population. However, it is often beneficial to consider how welfare is distributed, as a fairer distribution can be more desirable even if the total welfare remains constant. We have shown that, for cooperation dilemmas, this issue can arise in heterogeneous networks, where players at hub nodes can easily accumulate significantly more wealth than those with few connections. Moreover, this tension between fairness and efficiency can arise even in well-mixed populations if we consider asymmetric interactions such as those modelled by the Ultimatum game. In these games, fairness between providers and recipients is often preferred, even if it means sacrificing some total welfare when the rejection of unfair offers is encouraged.
