“Fast, slow, and metacognitive thinking in AI”
Inspired by the “thinking fast and slow” cognitive theory of human decision making, we propose a multi-agent cognitive architecture (SOFAI) that is based on “fast”/“slow” solvers and a metacognitive module. We then present experimental results on the behavior of an instance of this architecture for AI systems that make decisions about navigating in a constrained environment. We show that combining the two decision modalities through a separate metacognitive function allows for higher decision quality with less resource consumption compared to employing only one of the two modalities. Analyzing how the system achieves this, we also provide evidence for the emergence of several human-like behaviors, including skill learning, adaptability, and cognitive control.
It is generally acknowledged that AI still lacks many capabilities which would naturally be included in a notion of (human) intelligence, such as generalizability, adaptability, robustness, explainability, causal analysis, abstraction, common sense reasoning, metacognition, and ethical judgement.
To have these capabilities, humans employ a complex and seamless integration of learning and reasoning, supported by both implicit and explicit knowledge. This integration is related to the so-called “thinking fast and slow” theory of human decision making, according to which both kinds of knowledge, and both intuitive/unconscious processes and deliberate ones, support creating an internal model of the world and making high-quality decisions based on it.
The proposed architecture, called SOFAI, for Slow and Fast AI, ingests incoming problem instances that are solved either by System 1 (“fast”) agents (also called “solvers”), which react by exploiting only past experience, or by System 2 (“slow”) agents, which are deliberately activated when there is a need to reason and/or search for solutions of higher quality than those expected from the System 1 agents.
System 1 (S1) solvers act solely by leveraging past experience (generated by them or by other solvers), and thus do not systematically reason about incoming problem instances. Just like the human System 1, they rely on implicit knowledge (that is, training data in AI terms). On the other hand, System 2 (S2) solvers may exploit both implicit and explicit knowledge (that is, symbolic representations in AI terms) and employ multi-step systematic reasoning processes such as search, logical inference, sequential sampling, or chain of thought. S2 solvers typically engage in deeper reasoning that involves multiple steps, with their complexity often scaling with the size of the input problem. This explains why S2 solvers are generally computationally slower than S1 solvers, which motivates the terms “fast” and “slow” to distinguish between them. However, before being able to make decisions, S1 solvers need to learn from available data, so they require additional offline time for training. For example, an LLM or another ML solver has the characteristics of an S1 solver, while a symbolic planner or a search algorithm has the features of an S2 solver.
The metacognition (MC) module in SOFAI is an agent that determines which solver will make the next move (real-time MC), compares past trajectories with S2-only simulated ones to possibly update its real-time behavior (reflective MC), and stores each newly executed move in the model of Self (learning MC).

We also examine how SOFAI can achieve this efficiency, exploring the emergence of human-like capabilities, with particular focus on three of them:
- Skill learning, the human ability to leverage experience to internalize some decision processes, which then pass from System 2 to System 1. Typical examples of skills that go through this process in humans are driving or reading.
- Adaptability, the human ability to recognize one’s capabilities and limitations in order to use these capabilities optimally to make decisions. For example, visually impaired people learn to use other senses (e.g., touch) to complete tasks (e.g., reading).
- Cognitive control, the human ability to recognize when a scenario is high-risk or is subject to behavioral constraints, and to act accordingly and more carefully. For example, a human completing a relatively simple math operation (e.g., the sum of two double-digit numbers) in a high-stakes context (e.g., a math exam) usually engages a slow solver.

Our experiments show that SOFAI performs well on all three capabilities. More precisely:
- Skill learning: Initially, SOFAI uses mostly S2 solvers, and later shifts to utilizing mostly S1 solvers when sufficient experience over moves and trajectories is collected.
- Adaptability: Given several versions of the S1 solvers, with different levels of competence, SOFAI tunes their use to make sure that solution quality is kept sufficiently high.
- Cognitive control: In a high-risk scenario, SOFAI employs solvers in a risk-averse fashion, and subsequently makes decisions that violate fewer constraints.
Cognitive control in SOFAI
Cognitive control is the psychological process that enables the employment of cognitive processes to produce more accurate results while suppressing automatic, but less reliable, responses.
In SOFAI, we introduce a risk aversion parameter (ra) to indicate how critical it is to solve the problem at hand accurately. Humans can often cognitively control their behavior to be more careful when a problem is more critical. … High risk aversion leads SOFAI to minimal constraint violation, while low risk aversion produces a high level of constraint violation. High risk aversion drives the MC module to choose the solver that produces fewer violations and is likely to produce more accurate results. In fact, with ra = 1 SOFAI adopts the S2 solver more frequently than with ra = 0.
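One way to picture the effect of ra is a toy rule in which S1 is trusted only when its confidence reaches the risk aversion level. This is an assumed rule chosen for illustration, not the decision procedure used in SOFAI; the functions `choose_solver` and `s2_usage` and the sample confidence values are hypothetical.

```python
from typing import Sequence


def choose_solver(s1_confidence: float, ra: float) -> str:
    """Assumed rule: accept S1 only if its confidence is at least ra."""
    if not 0.0 <= ra <= 1.0:
        raise ValueError("ra must lie in [0, 1]")
    return "S1" if s1_confidence >= ra else "S2"


def s2_usage(confidences: Sequence[float], ra: float) -> float:
    """Fraction of decisions routed to S2 for a given risk aversion."""
    picks = [choose_solver(c, ra) for c in confidences]
    return picks.count("S2") / len(picks)


# Hypothetical S1 confidences over four decisions.
confidences = [0.2, 0.5, 0.8, 0.95]
for ra in (0.0, 0.6, 1.0):
    print(ra, s2_usage(confidences, ra))
```

Under this rule, ra = 0 never calls S2 and ra = 1 calls it whenever S1 is less than fully confident, which matches the qualitative trend reported above.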
The primary hypothesis was that SOFAI could outperform both the S1 and S2 solvers individually. This hypothesis is confirmed by the experimental results observed in the grid navigation instance while utilizing the SOFAI architecture. This indicates that integrating S1 and S2 solvers through SOFAI’s centralized metacognition module can enhance decision-making. This is due to the ability of the real-time MC module to orchestrate between different solvers and decide on the best one to adopt, given the goals of saving time, avoiding constraint violations, and maximizing reward.
The experimental results also show emerging behaviors that can be observed in animal and especially human thinking: adaptability, skill learning, and cognitive control.
Human beings have the ability to leverage cognitive resources to best deal with incoming tasks.
SOFAI shows itself to be able to do just that: when offered a set of S1 and S2 solvers, SOFAI dynamically adapts its decision-making process to leverage the strengths of the available solvers, ensuring the best goal-driven outcomes. The metacognitive module is primarily responsible for this behavior.
