Humans and animals are not always rational. “Decoding reward–curiosity conflict in decision-making from irrational behaviors” discusses the fact humans not only rationally exploit rewards but also explore an environment owing to their curiosity. However, the mechanism of such curiosity-driven irrational behavior is largely unknown.
The article develops a decision-making model for a two choice task based on the free energy principle, which is a theory integrating recognition and action selection. The model describes irrational behaviors depending on the curiosity level.
Applying it to rat behavioral data, found that the rat had negative curiosity, reflecting conservative selection sticking to more certain options and that the level of curiosity was upregulated by the expected future information obtained from an uncertain environment. The decoding approach presented, can be a fundamental tool for identifying the neural basis for reward–curiosity conflicts.
Animals and humans perceive the external world through their sensory systems and make decisions accordingly. Generally, they cannot make optimal decisions because of the uncertainty of the environment as well as the limited computational capacity of the brain and time constraints associated with decision-making. In fact, they perform irrational actions. For example, people play lotteries and gamble despite low reward expectations. In this case, they face a dilemma between low expected reward and curiosity regarding whether a reward will be acquired. Thus, understanding how animals control the balance between reward and curiosity is important for clarifying the whole decision-making process. However, a method is yet to be established for quantifying the reward–curiosity balance has yet been established.
Some irrational behaviors emerge because of the strength of curiosity. For example, conservative individuals avoid uncertainty and prefer to select an action that leads to predictable outcomes. Conversely, inquisitive individuals strongly desire to know the environment rather than rewards and prefer to select an action that leads to unpredictable outcomes. Too conservative and inquisitive natures can be interpreted as autism spectrum disorder and attention deficit hyperactivity disorder, patients with which are known to substantially avoid and seek novel information, respectively. Rational individuals fall midway between these two extremes. In an ambiguous environment, they select an action to efficiently understand the environment, and if the environment becomes clear, they select an action to efficiently exploit the rewards. Therefore, curiosity has a major impact on behavioral patterns, and it is believed that animals control the balance between reward and curiosity in a context-dependent manner.
Decision-making has been modeled primarily by reinforcement learning (RL), which is a theory for describing reward-seeking adaptive behavior in which animals not only exploit rewards but also explore the environment. In RL, explorative behavior was addressed by a passive, random choice of action. However, animals actively explore the environment by selecting actions that minimize the uncertainty of the environment given their curiosity.
Recently, the free energy principle (FEP) was proposed by Karl Friston under the Bayesian brain hypothesis, in which the brain optimally recognizes the outside world according to Bayesian estimation.
The FEP addresses not only the recognition of the external world but also the information-seeking action selection, which minimizes the uncertainty of the recognition of the external world, known as “active inference”. Furthermore, FEP proposed a score of action, called expected free energy, which consists of the expected reward and curiosity with the same unit. Thus, action selection can be formulated by maximizing both reward and curiosity. Note that curiosity can be regarded as information gain, that is, the extent to which we expect our recognition to be updated by the new observation through the action.
However, FEP assumes that the weighting of rewards and curiosity is always even and constant. Although a previous FEP study modeled active inference in a two-choice task, it assumed a constant intensity of curiosity and thus could not treat actual animal behaviors in which the weights of rewards and curiosity are expected to change over time.
Hence, conventional theories such as RL and FEP are limited in describing the conflict between reward and curiosity.
In this study, we extended FEP by incorporating a meta-parameter that controls the conflict dynamics between reward and curiosity, called the reward–curiosity decision-making (ReCU) model. The ReCU model can exhibit various behavioral patterns, such as greedy behavior toward reward, information-seeking behaviors with high curiosity and conservative behaviors avoiding uncertainty. Moreover, we developed a machine learning method called the inverse FEP (iFEP) method to estimate the internal variables of decision-making information processing.
Applying the iFEP method to a behavioral time series in a two-choice
task, we successfully estimated the internal variables, such as variations
in curiosity, recognition of reward availability and its confidence.