class: center, middle, inverse, title-slide

# Neural mechanisms of learning and decision-making
### Shany Grossman and Ondrej Zika
### Max Planck Research Group NeuroCode
### Last update: 29 October, 2021

---

## Overview

### Part 1: Basic RL and dopamine (Ondrej)

- one very good point

### Part 2: Advanced RL (Shany)

- an excellent point
- yalla!

---

## Part 1: Basic RL and dopamine

### Introduction

- Learning by reinforcement
- Pavlovian/Operant conditioning
- Hebbian learning as a biological counterpart
- mention single-cell recordings in amygdala? acquired predictability
- What is RL and where does it come from?

---

### Associative (Pavlovian) learning

- forming stimulus-outcome associations

#### Rescorla-Wagner model (multi-stimulus RL)

- description and example of different alpha simulations
- pitfalls (reinstatement, second-order conditioning)

---

### Dopamine

- anatomy of dopaminergic afferents and BG
- large variability in terms of computation
- involved in movement control (Parkinson's) and vigor, go/no-go pathways
- involvement in reward (believed to signal obtained reward)
- timing (Walton), state signalling (Starkweather), distributed processing RPE (Daw recent)

---

### DA and RPE

- activity associated with obtaining food -> reward?
- maybe not reward, but the difference between expected and obtained reward (Schultz)
- asymmetry in firing rate potential (relevant later for aversive learning)
- replicated in many single-cell recording, fMRI and optogenetic studies

---

### How rare this is

- a link between a normative strategy and a feature of the nervous system
- yes, but if you think about it, it is not a surprise, since evolution is based on trial-and-error learning

---

### TD-learning

- taking temporal information into account, predicting aggregate reward over actions
- stimuli predicting highest average reward elicit highest DA responses

---

### Value

- predicting future rewards
- keeping track of *relative* values of different stimuli/features
- useful framework to study decision making
- softmax
- vmPFC/OFC role in representing value as a common currency across categories
- maybe mention metaRL

---

### Appetitive versus aversive learning

- issue with dopamine floor
- absolute prediction error, end/avoidance of aversive stimulus is rewarding (Moutoussis 2008)
- major differences between appetitive and aversive learning (discuss fight/flight pathway, PAG etc.)

---

### Learning rates and uncertainty

- how to decide how much to learn?
- noise in observations
- changeability of the environment

---

### Metalearning

- learning the volatility
- learning the structure

---
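
### Sketch: Rescorla-Wagner with different alphas

The Rescorla-Wagner slide in Part 1 calls for an example of simulations with different alpha values. Below is a minimal sketch of what that could look like, not the authors' actual simulation. It assumes a single cue (the simplest case of the multi-stimulus model), a starting value of 0, and a hypothetical schedule of 20 rewarded trials followed by 20 unrewarded ones; the function name `rescorla_wagner` and all parameter values are illustrative.

```python
def rescorla_wagner(outcomes, alpha, v0=0.0):
    """Single-cue Rescorla-Wagner update (illustrative sketch).

    Returns a list with the initial value followed by the value
    estimate after each trial.
    """
    values = [v0]
    for r in outcomes:
        delta = r - values[-1]                     # prediction error
        values.append(values[-1] + alpha * delta)  # scaled by learning rate
    return values


# Hypothetical schedule: 20 rewarded trials (acquisition),
# then 20 unrewarded trials (extinction).
outcomes = [1.0] * 20 + [0.0] * 20

for alpha in (0.1, 0.3, 0.9):
    v = rescorla_wagner(outcomes, alpha)
    print(f"alpha={alpha}: V after acquisition={v[20]:.3f}, "
          f"V after extinction={v[40]:.3f}")
```

A higher alpha tracks the change in reward contingency faster, but it would also follow noise in the outcomes more closely, which connects to the later slide on learning rates and uncertainty.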