fmri-nsd_fsaverage-alexnet_untrained
Model Summary
| Modality | fMRI |
|---|---|
| Training Dataset | Natural Scenes Dataset (NSD) (fsaverage surface space) |
| Species | Human |
| Stimuli | Images |
| Model Type | AlexNet (untrained) |
| Creator | Alessandro Gifford |
Description
These encoding models consist of a linear mapping (fitted through linear regression) from the image features of an untrained AlexNet (Krizhevsky et al., 2012) onto fMRI responses. Before being mapped onto fMRI responses, the image features are reduced to 250 principal components using principal component analysis.
The encoding models were trained on the Natural Scenes Dataset (NSD) (Allen et al., 2022), which comprises 7T fMRI responses of 8 subjects to 73k natural scenes from the COCO dataset (Lin et al., 2014). One encoding model was trained for each NSD subject and for each fMRI vertex.
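The pipeline described above (feature reduction with PCA, then a linear regression per vertex) can be sketched with NumPy alone. This is a minimal illustration on random stand-in data, not the authors' training code: the array sizes are toy values chosen for speed, and all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sizes (assumptions): the real models use 250 components
# and 163,842 vertices per hemisphere.
n_train, n_test, n_features, n_components, n_vertices = 200, 20, 512, 10, 30

# Random stand-ins for flattened network activations and fMRI responses
feats_train = rng.standard_normal((n_train, n_features))
feats_test = rng.standard_normal((n_test, n_features))
fmri_train = rng.standard_normal((n_train, n_vertices))

# PCA fitted on the training features only (SVD of the centered data)
mean = feats_train.mean(axis=0)
_, _, vt = np.linalg.svd(feats_train - mean, full_matrices=False)
components = vt[:n_components]                    # (n_components, n_features)
pcs_train = (feats_train - mean) @ components.T
pcs_test = (feats_test - mean) @ components.T

# Ordinary least squares with an intercept, solved for all vertices at once.
# Each column of `weights` is an independent regression for one vertex.
design_train = np.column_stack([pcs_train, np.ones(n_train)])
weights, *_ = np.linalg.lstsq(design_train, fmri_train, rcond=None)

# Predicted (in silico) responses for held-out images
design_test = np.column_stack([pcs_test, np.ones(n_test)])
fmri_pred = design_test @ weights                 # (n_test, n_vertices)
```

Solving the least-squares problem with a matrix of targets gives the same weights as fitting each vertex separately, which is why a single call suffices here.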
Preprocessing. The encoding models are trained on NSD’s data prepared in FreeSurfer’s fsaverage space, from the “betas_fithrf_GLMdenoise_RR” preprocessing version. Note that the NSD data were z-scored at each scan session, and as a consequence the in silico fMRI responses generated by the encoding models also live in z-scored space.
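Session-wise z-scoring can be illustrated as follows. This is a sketch under the assumption that each vertex is standardized across the trials of one scan session; the function name and the toy data are hypothetical and are not taken from the NSD or BERG code.

```python
import numpy as np

def zscore_per_session(betas, session_ids):
    """Z-score each vertex across the trials of each session separately.

    betas: array of shape (n_trials, n_vertices)
    session_ids: array of shape (n_trials,) giving the session of each trial
    """
    betas = np.asarray(betas, dtype=float)
    session_ids = np.asarray(session_ids)
    out = np.empty_like(betas)
    for session in np.unique(session_ids):
        idx = session_ids == session
        mu = betas[idx].mean(axis=0)
        sd = betas[idx].std(axis=0)
        sd[sd == 0] = 1.0  # avoid division by zero for constant vertices
        out[idx] = (betas[idx] - mu) / sd
    return out

rng = np.random.default_rng(0)
betas = rng.normal(loc=5.0, scale=2.0, size=(60, 4))  # 60 trials, 4 vertices
session_ids = np.repeat([1, 2, 3], 20)                # 3 sessions of 20 trials
z = zscore_per_session(betas, session_ids)
```

Because the models are fitted to responses in this standardized space, their predictions are best read as deviations from a session mean in units of standard deviation rather than as raw beta values.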
Model training partition. fMRI responses for up to 9,000 non-shared images (i.e., the images uniquely seen by each subject during the NSD experiment).
Model validation partition. fMRI responses for up to 485/1,000 shared images (i.e., the 485 shared images that were not seen three times by every subject during the NSD experiment).
Model testing partition. fMRI responses for 515/1,000 shared images (i.e., the 515 images that each subject saw exactly three times during the NSD experiment). The models are additionally tested out-of-distribution on NSD-synthetic, the out-of-distribution component of NSD consisting of fMRI responses from the same 8 NSD subjects to 284 NSD-synthetic images.
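The split of the 1,000 shared images into testing and validation partitions can be pictured with a small sketch. The repeat counts below are fabricated purely for illustration (they are not NSD data), and the variable names are hypothetical; only the rule itself (test images are those seen exactly three times by every subject) comes from the description above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_shared = 8, 1000

# Made-up repeat counts: how often each subject saw each shared image (0 to 3)
repeats = rng.integers(0, 4, size=(n_subjects, n_shared))
repeats[:, :515] = 3   # for illustration: 515 images fully repeated for everyone
repeats[0, 515:] = 2   # the rest are missing a repeat for at least one subject

# Test images: seen exactly three times by every subject
test_mask = (repeats == 3).all(axis=0)
test_img = np.flatnonzero(test_mask)
val_img = np.flatnonzero(~test_mask)

# A given subject contributes only the validation images it actually saw,
# which is why the validation partition holds "up to" 485 images.
subject = 1
val_img_subject = val_img[repeats[subject, val_img] > 0]
```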
Metadata
fmri
- lh_ncsnr: (163842,) - Left hemisphere noise-ceiling signal-to-noise ratio per vertex (computed on NSD-core)
- rh_ncsnr: (163842,) - Right hemisphere noise-ceiling signal-to-noise ratio per vertex (computed on NSD-core)
- lh_ncsnr_nsdsynthetic: (163842,) - Left hemisphere noise-ceiling signal-to-noise ratio per vertex (computed on NSD-synthetic)
- rh_ncsnr_nsdsynthetic: (163842,) - Right hemisphere noise-ceiling signal-to-noise ratio per vertex (computed on NSD-synthetic)
- lh_fsaverage_rois: dict - Left hemisphere ROI definitions on fsaverage surface
  - V1v: (710,) - Visual area 1 ventral
  - V1d: (828,) - Visual area 1 dorsal
  - V2v: (632,) - Visual area 2 ventral
  - V2d: (692,) - Visual area 2 dorsal
  - V3v: (567,) - Visual area 3 ventral
  - V3d: (669,) - Visual area 3 dorsal
  - hV4: (531,) - Human V4 complex
  - EBA: (3231,) - Extrastriate body area
  - FBA-1: (574,) - Fusiform body area 1
  - FBA-2: (0,) - Fusiform body area 2
  - mTL-bodies: (0,) - Medial temporal lobe body-selective region
  - OFA: (432,) - Occipital face area
  - FFA-1: (552,) - Fusiform face area 1
  - FFA-2: (0,) - Fusiform face area 2
  - mTL-faces: (0,) - Medial temporal lobe face-selective region
  - aTL-faces: (329,) - Anterior temporal lobe face-selective region
  - OPA: (2021,) - Occipital place area
  - PPA: (1859,) - Parahippocampal place area
  - RSC: (1298,) - Retrosplenial complex
  - OWFA: (317,) - Occipital word form area
  - VWFA-1: (1395,) - Visual word form area 1
  - VWFA-2: (474,) - Visual word form area 2
  - mfs-words: (490,) - Mid-fusiform sulcus word-selective region
  - mTL-words: (475,) - Medial temporal lobe word-selective region
  - early: (5758,) - Early visual cortex (V1-V3)
  - midventral: (867,) - Mid-level ventral stream
  - midlateral: (1091,) - Mid-level lateral stream
  - midparietal: (1079,) - Mid-level parietal regions
  - ventral: (9680,) - Ventral visual stream
  - lateral: (10253,) - Lateral visual stream
  - parietal: (5176,) - Parietal regions
  - nsdgeneral: (18461,) - NSD general visual cortex mask
- rh_fsaverage_rois: dict - Right hemisphere ROI definitions on fsaverage surface
  - V1v: (444,) - Visual area 1 ventral
  - V1d: (991,) - Visual area 1 dorsal
  - V2v: (887,) - Visual area 2 ventral
  - V2d: (725,) - Visual area 2 dorsal
  - V3v: (682,) - Visual area 3 ventral
  - V3d: (535,) - Visual area 3 dorsal
  - hV4: (765,) - Human V4 complex
  - EBA: (4421,) - Extrastriate body area
  - FBA-1: (206,) - Fusiform body area 1
  - FBA-2: (1234,) - Fusiform body area 2
  - mTL-bodies: (0,) - Medial temporal lobe body-selective region
  - OFA: (305,) - Occipital face area
  - FFA-1: (330,) - Fusiform face area 1
  - FFA-2: (1003,) - Fusiform face area 2
  - mTL-faces: (0,) - Medial temporal lobe face-selective region
  - aTL-faces: (283,) - Anterior temporal lobe face-selective region
  - OPA: (2849,) - Occipital place area
  - PPA: (1250,) - Parahippocampal place area
  - RSC: (1136,) - Retrosplenial complex
  - OWFA: (590,) - Occipital word form area
  - VWFA-1: (397,) - Visual word form area 1
  - VWFA-2: (649,) - Visual word form area 2
  - mfs-words: (0,) - Mid-fusiform sulcus word-selective region
  - mTL-words: (0,) - Medial temporal lobe word-selective region
  - early: (5634,) - Early visual cortex (V1-V3)
  - midventral: (1050,) - Mid-level ventral stream
  - midlateral: (1191,) - Mid-level lateral stream
  - midparietal: (1181,) - Mid-level parietal regions
  - ventral: (9393,) - Ventral visual stream
  - lateral: (10535,) - Lateral visual stream
  - parietal: (4818,) - Parietal regions
  - nsdgeneral: (19523,) - NSD general visual cortex mask
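The ncsnr values listed above can be turned into a noise ceiling (the percentage of response variance that is, in principle, explainable by the stimulus). The NSD data manual gives the relation NC = 100 · ncsnr² / (ncsnr² + 1/n), where n is the number of trials averaged per image. The sketch below applies that formula to a few made-up ncsnr values; whether the noise ceilings stored in this model's metadata were derived in exactly this way is an assumption.

```python
import numpy as np

def noise_ceiling_percent(ncsnr, n_trials):
    """Noise ceiling in percent for responses averaged over n_trials repeats."""
    ncsnr = np.asarray(ncsnr, dtype=float)
    return 100.0 * ncsnr**2 / (ncsnr**2 + 1.0 / n_trials)

# Made-up ncsnr values for four vertices
ncsnr = np.array([0.0, 0.5, 1.0, 2.0])

# Test images were each seen three times, so n_trials=3 is a natural choice there
nc = noise_ceiling_percent(ncsnr, n_trials=3)
```

A vertex with ncsnr of 0 has a noise ceiling of 0%, so no model can be expected to predict it; such vertices are usually excluded before summarizing accuracy.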
encoding_models
- train_img_num: (9000,) - Image indices used for training
- val_img_num: (485,) - Image indices used for validation
- test_img_num: (515,) - Image indices used for testing
- lh_correlation_nsdcore: (163842,) - Left hemisphere correlation on NSD core dataset
- rh_correlation_nsdcore: (163842,) - Right hemisphere correlation on NSD core dataset
- lh_r2_nsdcore: (163842,) - Left hemisphere R² on NSD core dataset
- rh_r2_nsdcore: (163842,) - Right hemisphere R² on NSD core dataset
- lh_noise_ceiling_nsdcore: (163842,) - Left hemisphere noise ceiling on NSD core dataset
- rh_noise_ceiling_nsdcore: (163842,) - Right hemisphere noise ceiling on NSD core dataset
- lh_explained_variance_nsdcore: (163842,) - Left hemisphere % explained variance on NSD core dataset
- rh_explained_variance_nsdcore: (163842,) - Right hemisphere % explained variance on NSD core dataset
- lh_correlation_nsdsynthetic: (163842,) - Left hemisphere correlation on NSD synthetic dataset
- rh_correlation_nsdsynthetic: (163842,) - Right hemisphere correlation on NSD synthetic dataset
- lh_r2_nsdsynthetic: (163842,) - Left hemisphere R² on NSD synthetic dataset
- rh_r2_nsdsynthetic: (163842,) - Right hemisphere R² on NSD synthetic dataset
- lh_noise_ceiling_nsdsynthetic: (163842,) - Left hemisphere noise ceiling on NSD synthetic dataset
- rh_noise_ceiling_nsdsynthetic: (163842,) - Right hemisphere noise ceiling on NSD synthetic dataset
- lh_explained_variance_nsdsynthetic: (163842,) - Left hemisphere % explained variance on NSD synthetic dataset
- rh_explained_variance_nsdsynthetic: (163842,) - Right hemisphere % explained variance on NSD synthetic dataset
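How the accuracy metrics above relate to one another can be sketched on synthetic data. This page does not state the exact definitions used, so the sketch makes two assumptions: R² is taken as the squared Pearson correlation between predicted and measured responses, and the explained variance is R² divided by a noise ceiling expressed as a fraction. All names and values are hypothetical.

```python
import numpy as np

def vertexwise_correlation(pred, target):
    """Pearson correlation between columns of pred and target (images x vertices)."""
    pred_c = pred - pred.mean(axis=0)
    target_c = target - target.mean(axis=0)
    num = (pred_c * target_c).sum(axis=0)
    den = np.sqrt((pred_c**2).sum(axis=0) * (target_c**2).sum(axis=0))
    return num / den

rng = np.random.default_rng(0)
n_images, n_vertices = 515, 6

# Synthetic measured responses and noisy predictions of them
target = rng.standard_normal((n_images, n_vertices))
pred = 0.6 * target + 0.8 * rng.standard_normal((n_images, n_vertices))

r = vertexwise_correlation(pred, target)
r2 = r**2                                  # assumption: R² as squared correlation

# Hypothetical noise ceiling as a fraction in (0, 1]
noise_ceiling = np.full(n_vertices, 0.6)
explained_variance = 100.0 * r2 / noise_ceiling   # percent of explainable variance
```

Normalizing by the noise ceiling makes vertices with different measurement reliability comparable, at the cost of unstable values where the noise ceiling is close to zero.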
Input
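| Type | numpy.ndarray |
|---|---|
| Shape | [batch_size, 3, height, width] |
| Description | The input should be a batch of RGB images. |
| Constraints | Integer pixel values in the range [0, 255]; square images (height equal to width). |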
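Before passing images to the model, it can help to check them against the input requirements stated on this page (a 4D batch, 3 RGB channels first, square images, integer values in [0, 255]). The helper below is a hypothetical convenience function, not part of BERG; it only uses NumPy.

```python
import numpy as np

def check_stimulus(stimulus):
    """Validate a stimulus batch of shape [batch_size, 3, height, width]."""
    stimulus = np.asarray(stimulus)
    if stimulus.ndim != 4:
        raise ValueError(f"expected 4 dimensions, got {stimulus.ndim}")
    if stimulus.shape[1] != 3:
        raise ValueError("expected 3 RGB channels on axis 1")
    if stimulus.shape[2] != stimulus.shape[3]:
        raise ValueError("expected square images (height equal to width)")
    if not np.issubdtype(stimulus.dtype, np.integer):
        raise ValueError("expected an integer dtype")
    if stimulus.size and (stimulus.min() < 0 or stimulus.max() > 255):
        raise ValueError("expected pixel values in the range [0, 255]")
    return stimulus

# Random stand-in images; the upper bound of integers() is exclusive
rng = np.random.default_rng(0)
stimulus = check_stimulus(rng.integers(0, 256, size=(4, 3, 224, 224)))
```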
Output
| Type | tuple of numpy.ndarray |
|---|---|
| Shape | ([batch_size, lh_vertices], [batch_size, rh_vertices]) |
| Description | The output is a tuple containing the left hemisphere (LH) and right hemisphere (RH) in silico fMRI responses for the batch images. |
Dimensions:
| Name | Description |
|---|---|
| batch_size | Number of stimuli in the batch. |
| lh_vertices | Number of selected LH vertices for which the in silico fMRI responses are generated. |
| rh_vertices | Number of selected RH vertices for which the in silico fMRI responses are generated. |
Parameters
Parameters used in get_encoding_model
This function loads the encoding model.
- model_id
  - Type: str
  - Required: Yes
  - Description: Unique identifier of the model to load.
  - Valid Values: fmri-nsd_fsaverage-alexnet_untrained
  - Example: "fmri-nsd_fsaverage-alexnet_untrained"
- subject
  - Type: int
  - Required: Yes
  - Description: Subject ID from the NSD dataset (1-8).
  - Valid Values: 1, 2, 3, 4, 5, 6, 7, 8
  - Example: 1
- selection
  - Type: dict
  - Required: No
  - Description: Specifies which outputs to include in the model responses. If not provided, fMRI responses are generated for all LH and RH fMRI vertices.
  - Properties:
    - roi
      - Type: str
      - Description: The region-of-interest (ROI) for which the in silico fMRI responses (of both hemispheres) are generated.
      - Valid values: "V1d", "V1v", "V2d", "V2v", "V3d", "V3v", "hV4", "OFA", "FFA-1", "FFA-2", "mTL-faces", "aTL-faces", "OWFA", "VWFA-1", "VWFA-2", "mfs-words", "mTL-words", "OPA", "PPA", "RSC", "EBA", "FBA-1", "FBA-2", "mTL-bodies", "early", "midventral", "midlateral", "midparietal", "parietal", "lateral", "ventral", "nsdgeneral"
    - lh_vertices
      - Type: numpy.ndarray
      - Description: Binary vector with ones indicating the left hemisphere (LH) vertices for which the in silico fMRI responses are generated. This vector must have exactly the same length as the number of LH fsaverage vertices (163,842). It is only used if the "roi" key is not provided, or has value None.
    - rh_vertices
      - Type: numpy.ndarray
      - Description: Binary vector with ones indicating the right hemisphere (RH) vertices for which the in silico fMRI responses are generated. This vector must have exactly the same length as the number of RH fsaverage vertices (163,842). It is only used if the "roi" key is not provided, or has value None.
- device
  - Type: str
  - Required: No
  - Description: Device to run the model on. "auto" will use CUDA if available, otherwise CPU.
  - Valid Values: "cpu", "cuda", "auto"
  - Example: "auto"
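The selection dictionary described above can be built in two ways: by naming an ROI, or by supplying one binary vector per hemisphere. The sketch below constructs both forms with NumPy. The helper `indices_to_mask` and the example vertex indices are hypothetical, and the integer dtype of the vectors is an assumption.

```python
import numpy as np

N_FSAVERAGE_VERTICES = 163842  # vertices per hemisphere, as stated above

def indices_to_mask(indices, n_vertices=N_FSAVERAGE_VERTICES):
    """Turn a list of vertex indices into a binary vector of length n_vertices."""
    mask = np.zeros(n_vertices, dtype=int)
    mask[np.asarray(indices, dtype=int)] = 1
    return mask

# Form 1: select a named ROI in both hemispheres
roi_selection = {"roi": "V1v"}

# Form 2: select arbitrary vertices per hemisphere (made-up indices);
# these vectors are only used when "roi" is absent or None
vertex_selection = {
    "lh_vertices": indices_to_mask([0, 10, 163841]),
    "rh_vertices": indices_to_mask([5, 6]),
}
```

Either dictionary would then be passed through the selection parameter, for example `berg.get_encoding_model("fmri-nsd_fsaverage-alexnet_untrained", subject=1, selection=roi_selection)`.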
Parameters used in encode
This function generates in silico neural responses using the encoding model previously loaded.
- model
  - Type: BaseModelInterface
  - Required: Yes
  - Description: An instantiated and loaded encoding model.
- stimulus
  - Type: numpy.ndarray
  - Required: Yes
  - Description: A batch of RGB images to be encoded. Images should be in integer format with values in the range [0, 255], and square dimensions (e.g. 224×224).
  - Example: "An array of shape [100, 3, 224, 224] representing 100 RGB images."
- return_metadata
  - Type: bool
  - Required: No
  - Description: Whether to return the encoding model's metadata together with the in silico neural responses.
  - Example: True
- show_progress
  - Type: bool
  - Required: No
  - Description: Whether to show a progress bar during encoding (for large batches).
  - Example: True
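Images loaded with common tools are often stored channels-last, as [batch_size, height, width, 3], whereas the stimulus described above is channels-first. A minimal conversion sketch follows; the helper name is hypothetical and the images are random stand-ins.

```python
import numpy as np

def to_channels_first(images_hwc):
    """Convert [batch, height, width, 3] images to [batch, 3, height, width]."""
    images_hwc = np.asarray(images_hwc)
    if images_hwc.ndim != 4 or images_hwc.shape[-1] != 3:
        raise ValueError("expected an array of shape [batch, height, width, 3]")
    # Move the channel axis to position 1 and make the result contiguous
    return np.ascontiguousarray(images_hwc.transpose(0, 3, 1, 2))

rng = np.random.default_rng(0)
images_hwc = rng.integers(0, 256, size=(4, 224, 224, 3), dtype=np.uint8)
stimulus = to_channels_first(images_hwc)   # shape (4, 3, 224, 224)
```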
Parameters used in get_model_metadata
This function loads the encoding model’s metadata without having to load the model itself.
- model_id
  - Type: str
  - Required: Yes
  - Description: Unique identifier of the model to load.
  - Valid Values: fmri-nsd_fsaverage-alexnet_untrained
  - Example: "fmri-nsd_fsaverage-alexnet_untrained"
- subject
  - Type: int
  - Required: Yes
  - Description: Subject ID from the NSD dataset (1-8).
  - Valid Values: 1, 2, 3, 4, 5, 6, 7, 8
  - Example: 1
Performance
Accuracy Plots (AWS directory):
brain-encoding-response-generator/encoding_models/modality-fmri/train_dataset-nsd_fsaverage/model-alexnet_untrained/encoding_models_accuracy
Example Usage
import numpy as np
from berg import BERG

# Initialize BERG
berg = BERG(berg_dir="path/to/brain-encoding-response-generator")

# Load the model
model = berg.get_encoding_model(
    "fmri-nsd_fsaverage-alexnet_untrained",
    subject=1,
)

# Prepare the stimulus images
# Image shape should be [batch_size, 3 RGB channels, height, width]
# The upper bound of randint is exclusive, so 256 yields values in [0, 255]
stimulus = np.random.randint(0, 256, (100, 3, 256, 256))

# Generate the in silico neural responses using the encoding model previously loaded
responses = berg.encode(
    model,
    stimulus,
    show_progress=True,
)

# The in silico fMRI responses will be a tuple of numpy.ndarray of shape:
# ([batch_size, lh_vertices], [batch_size, rh_vertices])
# where:
# - lh_vertices is the number of selected left hemisphere (LH) vertices for which the in silico
#   fMRI responses are generated.
# - rh_vertices is the number of selected right hemisphere (RH) vertices for which the in silico
#   fMRI responses are generated.

# Generate in silico neural responses with metadata
responses, metadata = berg.encode(
    model,
    stimulus,
    return_metadata=True,
)

# Load the encoding model's metadata without having to load the model itself
metadata = berg.get_model_metadata(
    "fmri-nsd_fsaverage-alexnet_untrained",
    subject=1,
)
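Once responses and metadata are available, a typical next step is to summarize the responses within an ROI while keeping only reliably measured vertices. The sketch below uses random stand-in arrays so that it runs without BERG. It assumes that the metadata is a nested dictionary keyed as in the Metadata section, that ROI entries hold vertex indices, and that the responses cover all vertices of the hemisphere; none of this is confirmed by this page. The ncsnr threshold of 0.5 is an arbitrary illustrative value.

```python
import numpy as np

rng = np.random.default_rng(0)
n_images, n_vertices = 5, 1000   # toy size; a real hemisphere has 163,842 vertices

# Stand-ins for the left hemisphere responses and the metadata
lh_responses = rng.standard_normal((n_images, n_vertices))
metadata = {
    "fmri": {
        "lh_ncsnr": rng.uniform(0.0, 1.0, n_vertices),
        "lh_fsaverage_rois": {"V1v": np.arange(0, 50)},  # assumed vertex indices
    }
}

# Keep the ROI vertices whose ncsnr exceeds the threshold
roi_idx = metadata["fmri"]["lh_fsaverage_rois"]["V1v"]
ncsnr = metadata["fmri"]["lh_ncsnr"]
reliable = roi_idx[ncsnr[roi_idx] > 0.5]

# One summary value per image: the mean response over the retained vertices
roi_mean = lh_responses[:, reliable].mean(axis=1)
```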
References
Model building code: https://github.com/gifale95/BERG/tree/main/berg_creation_code
NSD paper (Allen et al., 2022): https://doi.org/10.1038/s41593-021-00962-x
NSD-synthetic paper (Gifford et al., 2025): https://doi.org/10.48550/arXiv.2503.06286
COCO dataset (Lin et al., 2014): https://cocodataset.org/#home
AlexNet (Krizhevsky et al., 2012): https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf