“Reframing neuroergonomics in an evolutionary and active inference context”
Everyday situations, such as feeling nauseous in virtual-reality environments or getting dizzy when reading as a car passenger, reveal how easily our senses can become confused when modern technology disrupts the innate relationship between the physical environment and human sensory systems.
Such disruptions expose the vulnerability of the human senses to conflicting input arising in technologically altered environments. Even in the absence of direct sensory conflict, in complex technological settings such as digital factories and modern operating rooms, the convergence of multiple competing stimuli within and across sensory modalities further amplifies sensory load and cognitive strain.
The common denominator of all such problems is that our ancient sensory processing and perceptual systems do not fit well with the technological world we have created. This evolutionary mismatch is already significant, but it will become even more critical as mixed reality concepts and advanced digital technologies integrate more deeply into our daily lives.
Focusing on sensory mismatch and sensory strain as two significant ramifications of the Anthropocene, we reframe neuroergonomics in an evolutionary and active inference context.
Our reframing argues that neuroergonomics must prioritize technology design that respects evolutionarily tuned priors, and should additionally deploy measured epigenetic, gene–culture, and learning-driven interventions as complementary levers to support adaptive change.
Thus, we highlight the importance of aligning and grounding neuroergonomic design with the human sensory system according to constraints and affordances defined by human evolutionary history.

(Left) Sensory capacities that are likely to be more genetically entrenched are assumed to remain essentially constant over the Holocene (left y-axis; the peach-colored plane illustrates this assumed constancy).
In contrast, the complexity of land- and soundscapes (right y-axis; blue-to-green gradients) increases markedly with rising degrees of urbanization in recent years (z-axis), especially since industrialization.
Modern multisensory urban landscapes may be both quantitatively richer and qualitatively different from those that shaped our perception.
Cultural evolution has far outpaced biological evolution, creating mismatches between evolved multisensory priors and present-day informational demands.

(Right) A speculative, heuristic illustration of two conceptual layers that may affect human perception and cognition: the “core”, which is shared by the whole species, shows only small individual differences, and changes over evolutionary time scales, and the “mantle” of individual differences acquired over a lifetime. Two humans in very different environments begin life very similar, and their mantles then diverge through modulatory influences.
Within a multimodal framework, sensory conflict and strain can be fruitfully modeled as natural expressions of active inference under the Free Energy Principle. Sensory conflict or mismatch arises whenever different modalities deliver incongruent information, producing prediction errors which signal that the brain’s generative model is insufficient to explain the current environment. For instance, when motion is seen but not felt in virtual reality setups, the conflict with deeply entrenched physical regularities in the brain’s predictive model evokes the “full-body response” of cybersickness. While the connection to active inference and its generative brain-modeling approach is direct for sensory conflict, it is more nuanced for sensory strain. Sensory strain reflects the adaptive, resource-intensive regulation required to stabilize inference under uncertainty: reallocating precision and attentional gain across modalities and spatiotemporal scales to sustain goal-directed perception and action in noisy, dynamic environments.
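The link between incongruent modalities and prediction error can be made concrete with a toy calculation. The sketch below is an illustrative assumption, not a model taken from this article: it treats each modality as a Gaussian cue with a mean and a precision (inverse variance), fuses the cues by precision weighting, and reports the summed precision-weighted squared prediction error as a crude index of conflict. The function name `fuse_cues` and the example numbers are hypothetical.

```python
def fuse_cues(cues):
    """Precision-weighted fusion of Gaussian cues (toy illustration).

    cues: list of (mean, precision) pairs, one per sensory modality.
    Returns (fused_mean, fused_precision, conflict), where conflict is the
    sum of precision-weighted squared deviations of each cue from the
    fused estimate.
    """
    fused_precision = sum(p for _, p in cues)
    fused_mean = sum(m * p for m, p in cues) / fused_precision
    conflict = sum(p * (m - fused_mean) ** 2 for m, p in cues)
    return fused_mean, fused_precision, conflict


# Hypothetical VR case: vision signals self-motion of 1.0 m/s, the
# vestibular system signals 0.0 m/s, and both are equally reliable.
vr = fuse_cues([(1.0, 4.0), (0.0, 4.0)])

# Congruent case: both modalities agree on 1.0 m/s.
congruent = fuse_cues([(1.0, 4.0), (1.0, 4.0)])

print(vr)         # (0.5, 8.0, 2.0)
print(congruent)  # (1.0, 8.0, 0.0)
```

In this toy setting the conflict term is zero when the cues agree and grows with both the size of the disagreement and the precision assigned to each cue, which is one simple way to read the claim that strongly entrenched priors make visual-vestibular mismatch hard to explain away.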
In particular, earlier models of attentional effort have been mapped to the free energy minimization framework, which can be extended to multiple modalities. In a generative spatiotemporal scale space, strain appears as coordinated control between the evolutionarily entrenched core (fast, local predictions and rapid exogenous adjustments) and higher, slower layers (abstract, forward-looking predictions and endogenous, reasoning-based loops). Precision-weight optimization determines which information streams are amplified or suppressed at each scale, depending on reliability, priors, and current goals.
Within the active inference framework, minimizing uncertainty directly motivates the deployment of attentional effort required to navigate sensory strain. Operationally, this involves model updating, precision reweighting, and the selection of actions and attentional policies: attentional gain and noise suppression mechanisms enhance attended objects along the sensory pathways, and learned experiences modulate selection policies over time.
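For orientation, the quantity minimized under the Free Energy Principle can be written in its standard textbook form. The notation below is an assumption introduced here for illustration (observations $o$, hidden states $s$, generative model $p$, approximate posterior $q$); it is not notation used elsewhere in this article.

```latex
F[q, o] \;=\; \mathbb{E}_{q(s)}\!\left[\ln q(s) - \ln p(o, s)\right]
        \;=\; D_{\mathrm{KL}}\!\left[\,q(s)\,\|\,p(s \mid o)\,\right] \;-\; \ln p(o)
```

Because the KL term is non-negative, $F$ is an upper bound on surprise, $-\ln p(o)$. Model updating corresponds to changing $q$ to reduce the KL term, while action and attention change which observations $o$ are sampled and how strongly they are weighted.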
The active inference perspective offers a concrete instantiation of this regulation: precision weights bias selection toward the target stream and away from distractors, integrating context and priors to maintain coherent tracking under load. Strain, in this sense, is a graded expression of how much precision must be reallocated across the scale space to keep inference stable, increasing when distractor salience rises or when target predictability falls. It also reflects the cost of maintaining long-horizon beliefs (e.g., conversational goals) against short-horizon exogenous overrides, with precision weights dynamically re-tuned as the sensory scene evolves, for example in a listening scenario. Because precision is both context- and experience-dependent, identical scenes can induce different strain profiles across individuals, as priors and learned contingencies shape the weighting of candidate streams and the thresholds for reallocation. Ultimately, sensory strain marks the ongoing effort of a unified active inference system to minimize immediate and future surprise by continuously tuning precision across spatiotemporal scales, balancing rapid exogenous responses with sustained endogenous control to preserve perceptual stability in complex multisensory settings.
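The idea of strain as a graded amount of precision reallocation can be illustrated with a deliberately simple toy model. Everything in the sketch below is an assumption made for illustration rather than the authors' formulation: the bottom-up (exogenous) allocation is taken to be proportional to salience times predictability, the goal-directed allocation gives the target a fixed share (`target_share`, a hypothetical parameter), and strain is scored as the Kullback-Leibler divergence between the two allocations.

```python
import math


def default_allocation(salience, predictability):
    """Bottom-up allocation: each stream's share of salience * predictability."""
    drive = [s * p for s, p in zip(salience, predictability)]
    total = sum(drive)
    return [d / total for d in drive]


def strain(salience, predictability, target=0, target_share=0.8):
    """Toy strain index in nats (illustrative assumption, not a validated measure).

    Returns the KL divergence from the bottom-up allocation to a goal-directed
    allocation in which the target stream receives `target_share` and the
    distractors split the remainder in their bottom-up proportions.
    Returns 0.0 if the target already receives at least `target_share`.
    """
    q = default_allocation(salience, predictability)
    if q[target] >= target_share:
        return 0.0  # no reallocation needed
    rest = sum(w for i, w in enumerate(q) if i != target)
    p = [target_share if i == target else (1 - target_share) * w / rest
         for i, w in enumerate(q)]
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Stream 0 is the target (e.g., a conversation partner); stream 1 is a distractor.
quiet = strain([1.0, 0.5], [1.0, 1.0])          # weak distractor
loud = strain([1.0, 2.0], [1.0, 1.0])           # salient distractor
unpredictable = strain([1.0, 0.5], [0.5, 1.0])  # less predictable target

print(quiet < loud, quiet < unpredictable)  # True True
```

In this sketch the index rises when the distractor becomes more salient or the target becomes less predictable, matching the qualitative pattern described above. Individual differences could be represented by giving different listeners different `predictability` values for the same scene.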
Future neuroergonomic research should adopt a first-principles, evolution-anchored approach to what environments humans are built for, especially given the growing sensory conflict and sensory strain that characterize the Anthropocene. This requires acknowledging that human perception operates through both core, species-typical constraints and mantle-level, experience-dependent plasticity, which jointly determine how far sensory and attentional mechanisms can be adapted through training or technology in contexts that induce conflict or sustained strain. Design must therefore build on this dual structure, respecting genetically entrenched limits while leveraging modulatory learning processes to scaffold safe adaptation. Within a multimodal active inference framework, we show that sensory conflict and sensory strain can be fruitfully modeled, offering a unified theoretical framework while acknowledging that empirical validation across contexts is still needed. Advances in neurophysiological decoding and attention-aware modality selection may help mitigate these Anthropocene-specific pressures by dynamically prioritizing and routing information. This potentially yields more resilient and effective human–technology interactions that are better aligned with human perceptual and attentional capacities in increasingly complex environments.
