Towards a dual learning system in humans: sensory and category information bias learning through dopamine
Alejandro Sospedra
2026
Abstract
Reinforcement learning (RL) models traditionally frame striatal dopaminergic activity (sDA) as a reward prediction error (RPE) signal. However, recent studies have proposed that sDA also regulates parameters such as the learning rate or integration timescale. We propose that sDA strategically regulates the balance between recency and primacy biases in decision making by adjusting the integration timescale and the exploration-exploitation index. Recency and primacy have, in turn, been shown to depend on the ratio of sensory information (SI; stimulus perceptual strength) to category information (CI; the frequency with which pieces of evidence favor a response option). We therefore aimed to explore how SI and CI interact with sDA dynamics to support efficient learning. We conducted a visual decision-making study with 105 healthy students. Participants identified the dominant color (red or blue) in sequences of 10 static dot clouds and rated their confidence. They earned virtual coins for correct responses and lost coins for responses that were too slow. We manipulated both SI (the difference between the numbers of red and blue dots) and CI (the ratio of red to blue frames). Analyses included reverse correlation to compute integration kernels. We designed a new reinforcement learning drift diffusion model (RLDDM) whose parameters varied with SI and CI levels. Recency increased with the size of the SI-CI difference, with total information (SI + CI), and with performance. In our RLDDM, the SI-CI difference signals volatility, thereby increasing the discount rate, while both the average size of RPEs and the average expected value (Q) modulate the learning rate. These findings reveal how the relation between statistical traits of evidence might drive online adjustments of sDA mechanisms to support learning.
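The reverse correlation analysis mentioned in the abstract can be illustrated with a minimal sketch. It is not the study's actual pipeline: it assumes a simple choice-triggered average as the kernel estimator, and the function names (`integration_kernel`, `simulate_choices`), the simulated observer, and the `recency_index` summary are all hypothetical choices made for illustration. Signed evidence per frame stands in for SI (red minus blue dots), with positive values favoring red.

```python
import numpy as np

def integration_kernel(evidence, chose_red):
    """Choice-triggered average of signed evidence at each frame position.

    evidence:  (n_trials, n_frames) array; positive values favor red.
    chose_red: (n_trials,) boolean array of responses.
    """
    evidence = np.asarray(evidence, dtype=float)
    chose_red = np.asarray(chose_red, dtype=bool)
    # Mean evidence before "red" choices minus mean evidence before "blue" choices.
    return evidence[chose_red].mean(axis=0) - evidence[~chose_red].mean(axis=0)

def simulate_choices(evidence, weights, noise_sd, rng):
    # Hypothetical observer (assumption): weighted sum of frames plus Gaussian noise.
    decision_variable = evidence @ weights + rng.normal(0.0, noise_sd, len(evidence))
    return decision_variable > 0

rng = np.random.default_rng(0)
n_trials, n_frames = 20000, 10  # 10 frames per sequence, as in the task
evidence = rng.normal(0.0, 1.0, size=(n_trials, n_frames))

# Assumed recency-biased observer: later frames receive larger weights.
recency_weights = np.linspace(0.2, 1.0, n_frames)
chose_red = simulate_choices(evidence, recency_weights, noise_sd=1.0, rng=rng)

kernel = integration_kernel(evidence, chose_red)
# Illustrative summary: positive values mean late frames influence choice more.
recency_index = kernel[n_frames // 2:].mean() - kernel[: n_frames // 2].mean()
```

For an observer that weights all frames equally the kernel is flat, a rising kernel indicates recency, and a falling kernel indicates primacy. A logistic regression of choice on per-frame evidence is a common alternative estimator.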


