Unsupervised pretraining in biological neural networks

All experimental procedures were conducted in accordance with Institutional Animal Care and Use Committee (IACUC) guidelines and received ethical approval from the IACUC board at the HHMI Janelia Research Campus.

Experimental methods

Animals

We performed 89 recordings in 19 mice bred to express GCaMP6s in excitatory neurons: TetO-GCaMP6s × camK2a-tTa mice (available as RRID:IMSR_JAX:024742 and RRID:IMSR_JAX:003010)52. Of these mice, 13 were male and 6 were female, and they ranged from 2 to 11 months of age. Mice were housed on a reversed light cycle and were pair housed with their siblings before and after surgery. The mice had a running wheel in their cage, as well as corncob bedding with Nestlets. During training and imaging periods, we replaced the running wheel with a tube, to potentially motivate the mice to run longer while head fixed. Owing to the stability of the cranial window surgery, we often used the same mice for multiple experiments in the laboratory: two of the mice were used in ref. 53. Two of the mice were raised in complete darkness. We did not see differences compared with normally reared mice, so we pooled them together. It was not possible to blind the experimenter with respect to behavioural experiments that involved water deprivation (see below). The number of mice in each experimental group was chosen to be comparable with other studies involving complex behaviour and imaging procedures such as ours. Mice were randomly assigned to groups where relevant.

We also used 23 C57 female mice for behaviour-only experiments. These mice were only implanted with a headbar and not a cranial window.

Surgical procedures

Surgeries were performed in adult mice (postnatal day 35 (P35)–P333) following procedures previously described54. In brief, mice were anaesthetized with isoflurane while a craniotomy was performed. Marcaine (no more than 8 mg kg−1) was injected subcutaneously beneath the incision area, and warmed fluids + 5% dextrose and 0.1 mg kg−1 buprenorphine (systemic analgesic) were administered subcutaneously along with 2 mg kg−1 dexamethasone via the intramuscular route. For mice with cranial windows, measurements were taken to determine the bregma–lambda distance and the location of a 4-mm circular window over the V1 cortex, as far lateral and caudal as possible without compromising the stability of the implant. A 4 + 5-mm double window was placed into the craniotomy so that the 4-mm window replaced the previously removed bone piece and the 5-mm window lay over the edge of the bone. After surgery, 5 mg kg−1 ketoprofen was administered subcutaneously and the mice were allowed to recover on heat. The mice were monitored for pain or distress, and 5 mg kg−1 ketoprofen was administered for 2 days following surgery.

Imaging acquisition

We used a custom-built two-photon mesoscope31 to record neural activity, and ScanImage55 for data acquisition. We used a custom online z-correction module (now part of ScanImage) to correct for z and xy drift during the recording. As previously described54, we used an upgrade of the mesoscope that allowed us to approximately double the number of recorded neurons using temporal multiplexing56.

The mice were free to run on an air-floating ball. Mice were acclimatized to running on the ball for several sessions before training and imaging.

Visual stimuli

We showed virtual reality corridors to the mice on three perpendicular LED tablet screens, which surrounded each mouse (covering 270° of their visual field of view). To present the stimuli, we used PsychToolbox-3 in MATLAB57. The virtual reality corridors were each 4 m long, with 2 m of grey space between corridors. The corridors were shown in a random order. The mice moved forward in the virtual reality corridors by running faster than a threshold of 6 cm s−1; the corridor then advanced at a constant speed (60 cm s−1) for as long as the mice kept running faster than the threshold. Running was detected using an optical tracking sensor placed close to the ball.
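The closed-loop rule can be summarized as a simple per-frame update. The sketch below is a minimal Python illustration of that rule, not the PsychToolbox-3/MATLAB implementation used in the experiments; the variable names and the frame-based loop are assumptions.

```python
RUN_THRESHOLD = 6.0    # cm/s: minimum running speed to advance the corridor
CORRIDOR_SPEED = 60.0  # cm/s: constant virtual speed while above threshold

def update_position(position, run_speed, dt):
    """Advance the virtual corridor only while the mouse runs above threshold."""
    if run_speed > RUN_THRESHOLD:
        position += CORRIDOR_SPEED * dt  # corridor moves at a fixed speed
    return position                      # otherwise the corridor stays still
```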

The virtual reality corridors were created by concatenating four random crops from one of four large texture images: circle, leaf, rock and brick (Extended Data Fig. 1a). Grating stimuli (with angles of 0° and 45°) were also used to create the virtual reality corridors for a subset of mice (‘unsupervised gratings’ and VRg mice).

For behaviour-only experiments, each mouse was trained with rewards on one random pair of stimuli, such as leaf–circle or rock–brick. Mice with unsupervised pretraining (VRn mice) were pretrained on the same pair on which they were trained with rewards.

For imaging experiments, mice from both the task and the unsupervised cohorts were only trained on or exposed to one pair of stimuli, such as leaf–circle or rock–brick. For the mice exposed to the grating stimuli, we presented more than one pair of naturalistic stimuli before and after exposure to the grating stimuli and recorded the neural responses to the naturalistic stimulus pairs, so each mouse could have more than one imaging session before and after. We also presented more than one pair of naturalistic stimuli to the naive mice, so each naive mouse could have more than one imaging session for testing naive responses.

The two different types of leaf1-swap stimuli (Extended Data Figs. 6f and 7a) were introduced separately for task mice but are pooled together for statistical analysis. The two leaf1-swap stimuli were introduced in the same session for unsupervised and naive mice, but were treated as two different data points and pooled together for statistical analyses.

Water restriction procedure

Water restriction procedures were conducted according to the IACUC. During the virtual reality + reward training condition, animals received an average of 1 ml water per day (range of 0.8–1.2 ml depending on health status and behavioural performance). After the initiation of the restriction procedure, we gradually reduced the water amount from 2 ml per day to 1.5 ml per day and finally to 1 ml per day. The behaviour-only mice were water restricted for 5 days right before the virtual reality + reward training condition. Once the mice finished the virtual reality + reward training session, the remaining water (0.8–1.2 ml minus the amount received during the experiment) was provided 0.5 h after the training. During the whole water restriction period, body weight, appearance and behaviour were monitored using a standard quantitative health assessment system58.

Water reward delivery and lick detection

A capacitance detector was connected to the metal lick port to detect licking. Mice received a drop of water (2.5 µl) if they correctly licked inside the reward corridor. On day 1 of the virtual reality + reward training session, we always delivered the water passively (passive mode) so that the mice could get used to acquiring reward when stimuli were present. For all the behaviour-only mice (Fig. 5) and some of the imaging mice (Figs. 1–4), we switched to active-reward mode after day 1 so that the mice had to lick within the reward zone to trigger the water delivery. For some of the imaging mice (Figs. 1–4), we kept using the passive mode but added a delay (1 s or 1.5 s) between the sound cue and reward delivery. Given that mice started licking as soon as they entered the corridor and continued until they received the water, adding a delay versus active-reward mode did not change how the mice behaved (Figs. 1–4).

Behavioural training

All animals were handled via refined handling techniques for at least 3 days before being acclimated to head fixation on the ball. Animals were acclimated gradually (0.5–1 h per day) on the ball over at least 3 days until they could be head fixed without exhibiting any signs of distress. Then, animals began a running training regimen (1 h per day), which lasted for at least 5 days to ensure they could run smoothly and continuously on the ball before being exposed to the closed-loop virtual linear corridor. For water-restricted mice, we trained them for 2 days to get used to acquiring water from the spout when no stimulus was presented, before the virtual reality + reward training session. The VRn and VRg pretraining groups of mice (Fig. 5) were trained to acquire water from the spout on the last 2 days of unsupervised pretraining with no stimuli presented, after the virtual reality session ended, to avoid associative learning between stimuli and rewards. For the group without pretraining (Fig. 5), learning to get reward from the spout was similarly carried out after the running training session on the last 2 days of running training.

For the behaviour-only experiment, all animals started training in the virtual reality + reward training session on a Monday and continued training for exactly 5 days. The first day of training consisted of passive reward training, during which the reward was always delivered to the mouse in the rewarded corridor at the beginning of the reward zone. The next 4 days consisted of active reward training, during which the mice were required to lick in the reward zone to trigger the reward. This ensured a consistent training schedule during the critical learning period. The beginning of the reward zone was randomly chosen per trial from a uniform distribution between 2 m and 3 m, and the zone continued until the end of the corridor. This meant that the earliest position in the rewarded corridor at which the mouse could receive the reward ranged from 2 m to 3 m.

For imaging mice, the sound cue was presented in all trial types, for task and unsupervised mice, and the time of the sound cue was randomly chosen per trial from a uniform distribution between positions 0.5 m and 3.5 m. For task mice, the sound cue indicated the beginning of the reward zone in the rewarded corridor. The reward was delivered if a lick was detected after the sound cue in the rewarded corridor. As mice kept licking (anticipatory licking) as soon as they entered the reward corridors and learned to lick right after the sound cue, the reward delivery locations were also approximately uniformly distributed between 0.5 m and 3.5 m (Fig. 1c). In some mice, the reward was delivered passively with a delay after the sound cue, but these mice still showed anticipatory licking before the sound cue. To rule out the possibility that licking was driven by signals related to the reward delivery itself, such as water coming out of the spout or the sound produced by the solenoid valve, we counted a lick response only if the mouse licked at least once inside the corridor before the sound cue. Although the rewards were absent, the sound cue was still presented in the unsupervised training experiment for consistency.

Data ***ysis

For analysis, we used Python 3 (ref. 59), primarily based on numpy and scikit-learn60,61, as well as Rastermap39. The figures were made using matplotlib and jupyter-notebook62,63.

Processing of calcium imaging data

Calcium imaging data were processed using Suite2p32, available on GitHub (www.github.com/MouseLand/suite2p). Suite2p performs motion correction, region of interest detection, cell classification, neuropil correction and spike deconvolution as previously described64. For non-negative deconvolution, we used a timescale of decay of 0.75 s (ref. 65). All our analyses were based on deconvolved fluorescence traces.
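As a rough illustration of this step, the snippet below sketches how such a pipeline can be launched through Suite2p's standard Python entry point with the 0.75-s decay timescale; the data path is hypothetical and all other settings are left at their defaults, so this is a sketch rather than the exact configuration used here.

```python
import suite2p

ops = suite2p.default_ops()
ops['tau'] = 0.75                         # decay timescale (s) used for deconvolution
db = {'data_path': ['/path/to/tiffs']}    # hypothetical location of the raw imaging data
output_ops = suite2p.run_s2p(ops=ops, db=db)
# deconvolved traces (spks.npy) are saved in the Suite2p output folder
```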

Neural selectivity (d′)

To compute the selectivity index d′, illustrated in Fig. 1f, we only selected data points inside the 0–4-m region of the corridors where the textures were shown. We excluded the data points in which the animal was not running, so that all data points included for calculating the selectivity index came from similar engagement or arousal levels of the mice. Note that these data points are computed from the original estimated deconvolved traces without interpolation. We first calculated the means (μ1 and μ2) and standard deviations (σ1 and σ2) of activities for any two corridors, then computed d′. The criterion for selective neurons was |d′| ≥ 0.3:

$$d^{\prime}=\frac{\mu_{1}-\mu_{2}}{\frac{\sigma_{1}}{2}+\frac{\sigma_{2}}{2}}$$
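A minimal numpy sketch of this computation is shown below, assuming act1 and act2 are (neurons × timepoints) deconvolved activity from two corridors, restricted to running periods inside the 0–4-m texture region; the variable names and the small constant added for numerical stability are illustrative.

```python
import numpy as np

def selectivity_dprime(act1, act2):
    """d' per neuron between two corridors (act*: neurons x timepoints)."""
    mu1, mu2 = act1.mean(axis=1), act2.mean(axis=1)
    sd1, sd2 = act1.std(axis=1), act2.std(axis=1)
    return (mu1 - mu2) / (0.5 * sd1 + 0.5 * sd2 + 1e-8)

# selective neurons: |d'| >= 0.3
# is_selective = np.abs(selectivity_dprime(act1, act2)) >= 0.3
```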

To make the density plots across the cortex (for example, Fig. 1i), we computed 2D histograms for each session based on the selective neurons in that session. We then applied a 2D Gaussian filter to this matrix and divided by the number of total recorded neurons in that session to get a density map for each mouse. Before averaging the density maps across mice, we assigned NaN to areas where no neurons were recorded. This ensured that the density was not underestimated in areas where not all mice had recorded neurons.
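The sketch below illustrates this procedure for one session, assuming xy_sel holds the cortical positions of the selective neurons and xy_all those of all recorded neurons; the bin count and smoothing width are illustrative, not the values used in the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def density_map(xy_sel, xy_all, bins=50, sigma=2.0):
    """Smoothed density of selective neurons, NaN where nothing was recorded."""
    h_sel, xe, ye = np.histogram2d(xy_sel[:, 0], xy_sel[:, 1], bins=bins)
    h_all, _, _ = np.histogram2d(xy_all[:, 0], xy_all[:, 1], bins=[xe, ye])
    dens = gaussian_filter(h_sel, sigma) / xy_all.shape[0]  # fraction of recorded neurons
    dens[gaussian_filter(h_all, sigma) == 0] = np.nan       # no neurons recorded here
    return dens
```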

For sequence similarity analyses (Fig. 2 and Extended Data Figs. 3 and 6), we used half of the leaf1 and circle1 trials (train trials) to compute the selectivity index d′ and we selected neurons based on the criterion |d′| ≥ 0.3. We then split the other half of the leaf1 and circle1 trials (test trials) into odd versus even trials to compute spatial tuning curves for odd and even trials separately for each selective neuron. From these spatial tuning curves, we used the position with the maximal response as the preferred position for each neuron. To compute tuning curves for other stimuli such as leaf2, circle2 and swap (which were not used to find selective neurons), we split all trials into odd and even trials. The preferred positions of the same neurons in different corridors or in odd versus even trials were used to compute a correlation coefficient (r).
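A short sketch of the final step, assuming tuning_a and tuning_b are (selective neurons × positions) spatial tuning curves from two trial splits (for example, odd versus even trials, or two different corridors); the names are illustrative.

```python
import numpy as np

def sequence_similarity(tuning_a, tuning_b):
    """Correlation of preferred positions between two sets of tuning curves."""
    pref_a = tuning_a.argmax(axis=1)   # preferred position per neuron (split A)
    pref_b = tuning_b.argmax(axis=1)   # preferred position per neuron (split B)
    return np.corrcoef(pref_a, pref_b)[0, 1]
```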

Coding direction and similarity index

To compute the coding direction (Figs. 2 and 3 and Extended Data Figs. 3, 6 and 7), for example, in leaf1 versus circle1, we first chose leaf1-selective and circle1-selective neurons based on their d′ from the train trials (using the top 5% of neurons selective for each of leaf1 and circle1, as in the sequence similarity analysis). Then, we acquired the neural activity for every position through interpolation, and normalized the neural activity r for each neuron by subtracting the baseline response in the grey portion of the corridor μgrey, and dividing by the average standard deviation of the responses of a neuron in each corridor:

$${\mathbf{r}}_{\rm norm}=\frac{{\mathbf{r}}-{\mu }_{\rm grey}}{\frac{{\sigma }_{\rm leaf1}}{2}+\frac{{\sigma }_{\rm circle1}}{2}}$$

We then computed the mean normalized activity μleaf1 of leaf1-selective neurons and the mean normalized activity μcircle1 of circle1-selective neurons at each position in each corridor. The coding direction $\mathbf{v}_t^{\rm proj}$ on a given trial t was defined as the difference

$${\mathbf{v}}_{t}^{\rm proj}={\mu }_{\rm leaf1}-{\mu }_{\rm circle1}$$

Note that this is equivalent to assigning weights of $\frac{1}{N{\rm trials}_{\rm leaf1}}$, $\frac{-1}{N{\rm trials}_{\rm circle1}}$ and 0, respectively, for positively selective, negatively selective and non-selective neurons, and using those weights as a projection vector for the neural data. We investigated the coding direction always on test trials not used for selecting neurons, either from held-out trials of leaf1 and circle1, or for trials with other stimuli. We averaged the responses across each trial type: $\mathbf{v}_{\rm leaf1}^{\rm proj}=\sum_{t\in {\rm leaf1}}^{N_{\rm leaf1}}\mathbf{v}_{t}^{\rm proj}$ (for example, Fig. 2i, left).

Average projections for each trial type were computed by averaging these projections within the texture area (0–4 m) (for example, Fig. 2i, right), denoted as $a_{\rm leaf1}^{\rm proj}$, $a_{\rm leaf2}^{\rm proj}$, $a_{\rm circle1}^{\rm proj}$ and $a_{\rm circle2}^{\rm proj}$. We then defined the similarity index (SI) on a per-stimulus basis, for example, for leaf2, as:

$$\begin{array}{l}dy={a}_{\rm leaf1}^{\rm proj}-{a}_{\rm leaf2}^{\rm proj}\\ dx={a}_{\rm leaf2}^{\rm proj}-{a}_{\rm circle1}^{\rm proj}\\ {\rm SI}_{\rm leaf2}=\frac{dx-dy}{dx+dy},\quad (-1\le {\rm SI}\le 1)\end{array}$$

which is quantified in Fig. 2j. We also computed the coding direction for different sets of selective neurons, for example, leaf1 versus leaf2, and then computed the similarity indices for leaf3 and circle1 (Extended Data Fig. 6c,d).
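The following sketch shows the similarity index computation for a test stimulus, assuming a_leaf1, a_test and a_circle1 are the average projections defined above; the function name and arguments are illustrative.

```python
def similarity_index(a_leaf1, a_test, a_circle1):
    """SI of a test stimulus relative to the leaf1 (+) vs circle1 (-) projection axis."""
    dy = a_leaf1 - a_test          # distance of the test projection from leaf1
    dx = a_test - a_circle1        # distance of the test projection from circle1
    return (dx - dy) / (dx + dy)   # 1 = leaf1-like, -1 = circle1-like

# e.g. SI_leaf2 = similarity_index(a_leaf1, a_leaf2, a_circle1)
```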

Reward-prediction neurons

Reward-prediction neurons were either selected using the clustering algorithm Rastermap (Fig. 4a–d) or using a d′ criterion (Fig. 4e–n). Using Rastermap, we selected the reward-prediction neurons based on their distinctive firing pattern of responding only inside the rewarded corridor, specifically before reward delivery. Using d′, we first interpolated the neural activity of single neurons based on their position inside the corridor and constructed a matrix (trials by positions). Only the leaf1 trials (rewarded for the task mouse cohort, and unrewarded for the unsupervised cohort) were chosen and divided into early-cue trials versus late-cue trials based on the sound cue position inside the corridor. We used cue position instead of reward position because the sound cues were played in each corridor at a random position, with or without reward, and these sound cue positions were highly correlated with reward positions in the rewarded corridor (Fig. 1c). We then calculated $d'_{\rm late\,vs\,early}$ as:

$${d}_{\rm late\,vs\,early}^{\prime}=\frac{{\mu }_{\rm late}-{\mu }_{\rm early}}{\frac{{\sigma }_{\rm late}}{2}+\frac{{\sigma }_{\rm early}}{2}}$$

and selected the reward-prediction neurons with $d'_{\rm late\,vs\,early} \ge 0.3$. The activity of the reward-prediction neural population in Fig. 4 (except Fig. 4d) was acquired following k-fold cross-validation. We randomly split all trials into ten folds. We used nine folds as training trials to compute $d'_{\rm late\,vs\,early}$. Trial-by-trial activity for the remaining fold (test trials) was computed by averaging across the reward-prediction neurons that met the selection criterion. We repeated this ten times until the average population activity for every fold (and thus every trial) was acquired.
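A cross-validated sketch of this selection is shown below, assuming act is a (trials × neurons × positions) array of interpolated activity on leaf1 trials and cue_pos the sound-cue position per trial. The ten-fold split and the d′ ≥ 0.3 criterion follow the text; the median split of cue positions and the exact averaging used inside the d′ computation are assumptions.

```python
import numpy as np
from sklearn.model_selection import KFold

def dprime_late_vs_early(act, is_late):
    """d' per neuron between late- and early-cue trials (act: trials x neurons x positions)."""
    mu_late, mu_early = act[is_late].mean(axis=(0, 2)), act[~is_late].mean(axis=(0, 2))
    sd_late = act[is_late].mean(axis=2).std(axis=0)    # across-trial s.d. of trial-averaged activity
    sd_early = act[~is_late].mean(axis=2).std(axis=0)
    return (mu_late - mu_early) / (0.5 * sd_late + 0.5 * sd_early + 1e-8)

def reward_prediction_activity(act, cue_pos, n_folds=10):
    """Held-out trial-by-trial activity averaged over selected reward-prediction neurons."""
    is_late = cue_pos > np.median(cue_pos)             # early- vs late-cue trials (assumed split)
    pop = np.zeros((act.shape[0], act.shape[2]))
    for train, test in KFold(n_folds, shuffle=True, random_state=0).split(act):
        sel = dprime_late_vs_early(act[train], is_late[train]) >= 0.3
        pop[test] = act[test][:, sel].mean(axis=1)     # average over selected neurons
    return pop
```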

To obtain reward-prediction activity aligned to the first lick (Fig. 4j), only rewarded trials (leaf1) with a first lick happening after 2 m from the corridor entry were included, to enable us to investigate the reward-prediction signal before licking starts. Owing to this criterion, one mouse was excluded because there were no trials with a first lick later than 2 m.

To obtain reward-prediction activity and activity of leaf1-selective neurons in leaf2 trials (Fig. 4k,l), one mouse was excluded due to having only one leaf2 trial without licking.

Running speed

To compare the running speed before and after learning, we selected the period when mice were running faster than 6 cm s−1 for at least 66 ms (a threshold that triggers the motion of the virtual reality; Extended Data Fig. 2d). For Extended Data Fig. 8e, the running speed was interpolated to the timepoints of the imaging frames using the function scipy.interpolate.interp1d. For Extended Data Fig. 2a–c, the running speed for every position (0–6 m, with a 0.1-m step size) was also acquired through the same interpolation method. Extended Data Fig. 2c shows the averaged running speed inside the texture areas (0–4 m).
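A sketch of the interpolation step, assuming sample_times and speed are the timestamps and values from the optical tracking sensor and frame_times the imaging-frame timestamps; the same call, applied to corridor position (0–6 m in 0.1-m steps) instead of frame times, gives the position-resolved speed.

```python
from scipy.interpolate import interp1d

def interp_to_frames(sample_times, speed, frame_times):
    """Interpolate running speed to the timepoints of the imaging frames."""
    f = interp1d(sample_times, speed, bounds_error=False, fill_value="extrapolate")
    return f(frame_times)
```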

Statistics and reproducibility

We performed paired, two-sided Student’s t-tests in Figs. 1d,j, 2f,j, 3b,c, 4g,k,l and Extended Data Figs. 1f,g, 2c,d, 4a,c,d, 5b, 6d,i, 7d,g, 8a,c and 9; and performed independent two-sided Student’s t-tests in Figs. 3e,h and 5f. Statistical significance was calculated as *P < 0.05, **P < 0.01 and ***P < 0.001. No adjustments were made for multiple comparisons. Error bars on all figures represent s.e.m. The exact P values are below for each figure. Where four values are reported, these are for V1, medial, lateral and anterior regions.

  • Fig. 1d: 0.714 before learning and 5.97 × 10−4 after learning

  • Fig. 1j: 0.940, 0.00726, 0.0341 and 0.0261 for task mice; 0.212, 3.21 × 10−4, 0.245 and 0.0202 for unsupervised mice; and 0.146, 0.318, 0.0632 and 0.655 for unsupervised grating mice

  • Fig. 2f: 7.97 × 10−6, 2.56 × 10−4, 1.84 × 10−5 and 4.1 × 10−3 for task mice; 1.72 × 10−7, 1.04 × 10−4, 1.25 × 10−4 and 3.07 × 10−3 for unsupervised mice; and 2.11 × 10−10, 1.25 × 10−5, 2.55 × 10−8 and 8.661 × 10−6 for naive mice

  • Fig. 2j: 0.0015, 2.94 × 10−4, 0.0020 and 0.0013 for task mice; 1.72 × 10−4, 7.24 × 10−5, 2.44 × 10−4 and 0.0082 for unsupervised mice; and 2.47 × 10−6, 5.97 × 10−9, 2.35 × 10−7 and 2.73 × 10−6 for naive mice

  • Fig. 3b: 0.0037, 0.0922, 0.0026 and 0.0136 task mice; and 2.34 × 10−5, 0.0081, 2.96 × 10−4 and 0.0146 for unsupervised mice

  • Fig. 3c: 0.0074

  • Fig. 3e: Psupervised vs naive: 0.991, 1.381 × 10−5, 0.352 and 0.053; Punsupervised vs naive: 0.797, 4.45 × 10−6, 0.282 and 0.727; and Punsupervised grating vs naive: 0.226, 0.284, 0.239 and 0.570

  • Fig. 3h: Psupervised vs naive: 0.084, 0.002, 2.81 × 10−4 and 6.2 × 10−7; Punsupervised vs naive: 5.26 × 10−5, 1.18 × 10−5, 9.32 × 10−5 and 0.011; and Punsupervised grating vs naive: 0.316, 0.705, 0.375 and 0.945

  • Fig. 4g: 0.0069 for task mice and 0.708 for unsupervised mice

  • Fig. 4k: 0.014

  • Fig. 4l: 0.180

  • Fig. 5f: PVRn vs no pretraining by day: 0.00493, 0.00555, 0.0554, 0.259 and 0.579; PVRn vs VRg by day: 0.538, 0.0348, 0.221, 0.541 and 0.978; and PVRg vs no pretraining by day: 0.00782, 0.148, 0.415, 0.476 and 0.510

  • Extended Data Fig. 1f (left): 0.206, 0.00497, 0.0297 and 0.0114 for task mice; and 0.0389, 1.15 × 10−4, 0.982 and 0.249 for unsupervised mice

  • Extended Data Fig. 1f (right): 0.562, 0.0219, 0.104 and 0.134 for task mice; and 0.017, 0.00130, 0.0129 and 0.00977 for unsupervised mice

  • Extended Data Fig. 1g (left): 0.419 for task mice and 0.0466 for unsupervised mice

  • Extended Data Fig. 1g (right): 0.574 for task mice and 0.126 for unsupervised mice

  • Extended Data Fig. 2c (left): 0.883 for circle1 and 0.660 for leaf1

  • Extended Data Fig. 2c (right): 0.154 for circle1 and 0.724 for leaf1

  • Extended Data Fig. 2d (left): 0.814 for circle1 and 0.533 for leaf1

  • Extended Data Fig. 2d (right): 0.017 for circle1 and 0.923 for leaf1

  • Extended Data Fig. 4a: Pcircle1 = 0.163, Pcircle2 = 0.923, Pleaf1 = 0.028 and Pleaf2 = 0.013

  • Extended Data Fig. 4c (V1): Pcircle1 = 0.492, Pcircle2 = 0.106, Pleaf1 = 0.018 and Pleaf2 = 0.041

  • Extended Data Fig. 4c (medial): Pcircle1 = 0.947, Pcircle2 = 0.190, Pleaf1 = 0.474 and Pleaf2 = 0.072

  • Extended Data Fig. 4c (lateral): Pcircle1 = 0.936, Pcircle2 = 0.150, Pleaf1 = 0.046 and Pleaf2 = 0.326

  • Extended Data Fig. 4c (anterior): Pcircle1 = 0.316, Pcircle2 = 0.112, Pleaf1 = 0.013 and Pleaf2 = 0.935

  • Extended Data Fig. 4d: PV1 = 0.531, Pmedial = 0.808, Plateral = 0.308 and Panterior = 0.015

  • Extended Data Fig. 5b: 0.772, 0.799, 0.474 and 0.978 for task mice; and 0.957, 0.476, 0.904 and 0.691 for unsupervised mice

  • Extended Data Fig. 6d: 0.0034, 5.46 × 10−4, 0.0059 and 0.082 for task mice; 0.019, 0.011, 0.0057 and 0.094 for unsupervised mice; 0.486, 0.654, 0.130 and 0.352 for naive mice; and 0.414, 0.217, 0.077 and 0.307 for unsupervised grating mice

  • Extended Data Fig. 6i: 6.18 × 10−4, 8.79 × 10−4, 0.0015 and 0.082 for task mice; 3.23 × 10−6, 5.34 × 10−6, 1.44 × 10−6 and 7.51 × 10−6 for unsupervised mice; and 6.45 × 10−7, 1.99 × 10−4, 5.14 × 10−5 and 0.0040 for naive mice

  • Extended Data Fig. 7d: 2.06 × 10−6, 5.47 × 10−4, 2.31 × 10−4 and 1.41 × 10−4 for task mice; 4.75 × 10−9, 3.30 × 10−5, 1.25 × 10−7 and 2.62 × 10−5 for unsupervised mice; and 4.71 × 10−10, 0.881, 3.94 × 10−6 and 3.33 × 10−7 for naive mice

  • Extended Data Fig. 7g: Pleaf1 vs leaf2 = 0.347

  • Extended Data Fig. 8a: 0.496, 0.151, 0.091 and 0.0069 for task mice; and 0.441, 0.632, 0.882 and 0.708 for unsupervised mice

  • Extended Data Fig. 8c: 0.277, 0.700, 0.548 and 0.0210 for task mice; and 0.183, 0.276, 0.235 and 0.0546 for unsupervised mice

  • Extended Data Fig. 9 (day 1): 0.0166 for VRn pretraining, 0.157 for VRg pretraining and 0.272 for no pretraining

  • Extended Data Fig. 9 (day 2): 0.0033 for VRn pretraining, 0.0088 for VRg pretraining and 0.0823 for no pretraining

  • Extended Data Fig. 9 (day 3): 0.0120 for VRn pretraining, 0.188 for VRg pretraining and 0.0050 for no pretraining

  • Extended Data Fig. 9 (day 4): 0.067 for VRn pretraining, 0.0356 for VRg pretraining and 0.0180 for no pretraining

  • Extended Data Fig. 9 (day 5): 0.1185 for VRn pretraining, 0.0230 for VRg pretraining and 0.0317 for no pretraining.

Retinotopy

Retinotopic maps for each imaging mouse were computed based on receptive field estimation using neural responses to natural images (at least 500 natural images repeated 3 times each). This proceeded in several steps:

  1. Obtained a well-fit convolutional encoding model of neural responses with an optimized set of 200 spatial kernels, using a reference mouse (Extended Data Fig. 1b).

  2. Fitted all neurons from our imaging mice to these kernels to identify the preferred kernel and the preferred spatial position (Extended Data Fig. 1c).

  3. Aligned the spatial position maps to a single map from the reference mouse.

  4. Outlined the brain regions in the reference mouse using spatial maps and approximately following the retinotopic maps from ref. 33.

Compared with previous approaches, ours took advantage of single-neuron responses rather than averaging over entire local populations, and, by using natural images, we could better drive neurons and obtain their specific receptive field models. The mapping procedure was sufficiently efficient that it could be performed in a new mouse with responses to only 500 test images, each repeated 3 times. Below we describe each step in detail.

For step 1, using the reference mouse, we used the following to model the response of neuron n to image img:

$${F}_{n}({\rm img})={a}_{n}\cdot (K\circ {\rm img})({k}_{n},{x}_{n},{y}_{n})$$

where an is a positive scalar amplitude, ∘ represents the convolution operation, xn and yn represent the position in the convolution map for neuron n, kn represents the index of the convolutional map, and K is a matrix of size 200 × 13 × 13 containing the convolutional filters. This model was fit to neural responses to a natural image dataset of approximately 5,000 images shown at a resolution of 120 × 480, which were downsampled to 30 × 120 for fitting. The kernels K were initialized with random Gaussian noise. An iterative expectation maximization-like algorithm was used to optimize the kernels, which alternated between: (1) finding the best position (xn, yn) for each neuron n, as well as the best kernel kn and the best amplitude an; and (2) optimizing K given a fixed assignment (xn, yn, kn, an) for each neuron n. The first part of the iteration was done in a brute force manner: responses of each kernel at each location for each image were obtained and correlated with the responses of each neuron. The highest correlated match for each neuron was then found and its corresponding (xn, yn, kn, an) was used to fit K. The best estimate for the kernels K was approximately equivalent to averaging the linear receptive fields of all cells n assigned to a kernel kn after alignment to their individual spatial centres (xn, yn). After each iteration, the kernels were translated so their centres of mass would be centred in the 13 × 13 pixel frame. The centre of mass was obtained after taking the absolute value of the kernel coefficients. After less than ten iterations, the kernels converged to a set of well-defined filters (Extended Data Fig. 1b).
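The brute-force assignment phase can be sketched as follows, assuming imgs is an (images × Ly × Lx) array of downsampled stimuli, resp a (neurons × images) response matrix and kernels a (200 × 13 × 13) array; the toy shapes, variable names and the omission of the amplitude fit are assumptions made for brevity.

```python
import numpy as np
from scipy.signal import fftconvolve

def zscore(x, axis):
    x = x - x.mean(axis=axis, keepdims=True)
    return x / (x.std(axis=axis, keepdims=True) + 1e-8)

def assign_neurons(imgs, resp, kernels):
    """Best kernel index and position (y, x) per neuron, by brute-force correlation."""
    n_img, Ly, Lx = imgs.shape
    resp_z = zscore(resp, axis=1)                        # (neurons, images)
    best_r = np.full(resp.shape[0], -np.inf)
    best = np.zeros((resp.shape[0], 3), dtype=int)       # (k, y, x) per neuron
    for k, K in enumerate(kernels):
        fmap = fftconvolve(imgs, K[None], mode="same")   # kernel response map per image
        fmap_z = zscore(fmap.reshape(n_img, -1), axis=0)
        r = resp_z @ fmap_z / n_img                      # neuron-by-position correlations
        rmax, pos = r.max(axis=1), r.argmax(axis=1)
        upd = rmax > best_r                              # neurons better explained by this kernel
        best[upd, 0] = k
        best[upd, 1], best[upd, 2] = np.unravel_index(pos[upd], (Ly, Lx))
        best_r[upd] = rmax[upd]
    return best, best_r
```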

For step 2, after the kernels K were estimated once, for a single reference recording, we used them for all recordings by repeating the first step of the iterative algorithm in step 1 with a slight modification. Instead of assigning each neuron (xn, yn, kn, an) independently, we averaged the 2D maximum-correlation maps of the nearest 50 neurons to each neuron, and then took their maximum. This essentially smoothed the spatial correlations to ensure robust estimation even for neurons with relatively little signal (Extended Data Fig. 1c).
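The smoothing step might look like the sketch below, assuming corrmaps is a (neurons × positions) array of maximum-correlation maps (maximum over kernels) and xy the cortical positions of the neurons; the neighbour count follows the text, the rest is illustrative.

```python
import numpy as np
from scipy.spatial import cKDTree

def smoothed_positions(corrmaps, xy, n_neighbors=50):
    """Preferred spatial position per neuron after neighbourhood averaging."""
    _, idx = cKDTree(xy).query(xy, k=n_neighbors)   # nearest neighbours of each neuron
    smoothed = corrmaps[idx].mean(axis=1)           # average their correlation maps
    return smoothed.argmax(axis=1)                  # best position per neuron
```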

For step 3, to align spatial maps to the reference mouse, we used kriging interpolation to find a tissue-to-retinotopy transformation f. Intuitively, we wanted to model the data from the alignment mouse as a smooth function f from a 2D space of (z, t) positions in tissue to another 2D space of retinotopic preferences (x, y). For a new mouse with tissue positions (z′, t′) and retinotopic positions (x′, y′), we could then optimize an affine transform A composed of a 2 × 2 matrix A1 and a 1 × 2 bias term A2 such that

$$({z}_{a}^{\prime},{t}_{a}^{\prime})={A}_{1}\cdot ({z}^{\prime},{t}^{\prime})+{A}_{2}$$

so that

$${\rm Cost}=\| f({z}_{a}^{\prime},{t}_{a}^{\prime})-({x}^{\prime},{y}^{\prime})\|^{2}$$

is minimized. To fit the smooth function f, we used kriging interpolation, so that f is the kriging transform

$$f({z}_{a}^{\prime},{t}_{a}^{\prime})=F(({z}_{a}^{\prime},{t}_{a}^{\prime}),(z,t))\cdot F((z,t),(z,t))\cdot {\rm Cov}((z,t),(x,y))$$

where F is a squared exponential kernel $F(a,b)=\exp (-\| a-b\|^{2}/{\sigma }^{2})$ with a spatial constant σ of 200 μm and Cov is the covariance between inputs and outputs. Note that we could precompute the second part of f as it does not depend on $({z}_{a}^{\prime},{t}_{a}^{\prime})$. We then optimized the affine transform A. A was initialized based on a grid search over possible translation values within ±500 μm. After the grid search, we used gradient descent on the values of A, allowing for translation and rotation, but with a regularization term on A1 to keep the matrix close to the identity. Finally, for some sessions, the optimization did not converge, in which case we restricted the matrix A1 to a fixed determinant, thus preventing a scaling transform.
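A compact sketch of this alignment is given below, with scipy's Gaussian RBF interpolator standing in for the kriging transform f and a generic optimizer standing in for the grid search plus gradient descent; the function and variable names, the optimizer choice and the regularization weight are assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator
from scipy.optimize import minimize

def align_to_reference(tissue_ref, retino_ref, tissue_new, retino_new,
                       sigma=200.0, lam=1.0):
    """Fit an affine map (A1, A2) so that f(A1 . tissue_new + A2) matches retino_new."""
    # stand-in for the kriging transform f: Gaussian RBF with ~sigma (um) length scale
    f = RBFInterpolator(tissue_ref, retino_ref, kernel="gaussian", epsilon=1.0 / sigma)

    def cost(params):
        A1, A2 = params[:4].reshape(2, 2), params[4:]
        pts = tissue_new @ A1.T + A2                  # affine-transformed tissue positions
        err = np.sum((f(pts) - retino_new) ** 2)      # retinotopic mismatch
        reg = lam * np.sum((A1 - np.eye(2)) ** 2)     # keep A1 close to the identity
        return err + reg

    x0 = np.concatenate([np.eye(2).ravel(), np.zeros(2)])  # initialize at the identity
    res = minimize(cost, x0, method="Nelder-Mead")
    return res.x[:4].reshape(2, 2), res.x[4:]
```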

For step 4, this final step was to delineate area borders on the reference mouse, which were then transformed to all mice as described in step 3. Similar to ref. 33, we computed the sign map and parcellated it into regions where the sign did not change. Ambiguities in the sign map were resolved by approximately matching areas to the data from ref. 33. Note that the exact outlines of the areas in some cases had different shapes from those in ref. 33. This is to be expected from two sources: (1) the maps in ref. 33 were computed from widefield imaging data, which effectively blurs over large portions of the cortex, thus obscuring some boundaries and regions; and (2) our specific cranial windows are in a different position from ref. 33. Nonetheless, we do not think the small mismatch in area shapes would have a large effect on our conclusions, given that we combined multiple areas into large regions.

Reporting summary

Further information on research design is available in the Nature Portfolio Reporting Summary linked to this article.
