fmri-nsd_fsaverage-alexnet_untrained

Model Summary

Modality	fMRI
Training Dataset	Natural Scenes Dataset (NSD) (fsaverage surface space)
Species	Human
Stimuli	Images
Model Type	AlexNet (untrained)
Creator	Alessandro Gifford

Description

These encoding models consist of a linear mapping (fitted through linear regression) from the image features of an untrained AlexNet (Krizhevsky et al., 2012) onto fMRI responses. Prior to the mapping, the image features were reduced to 250 principal components using principal component analysis.
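
The pipeline described above (feature reduction to 250 principal components, then a linear regression onto the responses) can be sketched as follows. This is a minimal illustration on synthetic data, not the code used to build the models: the array sizes, the random features standing in for AlexNet activations, and the use of ordinary least squares with an intercept are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (all sizes are hypothetical and kept small for speed).
n_train, n_test, n_features, n_vertices = 400, 50, 1000, 20
n_components = 250  # number of principal components stated in the description

X_train = rng.standard_normal((n_train, n_features))  # stand-in image features
X_test = rng.standard_normal((n_test, n_features))
Y_train = rng.standard_normal((n_train, n_vertices))  # stand-in fMRI responses

# PCA fitted on the training features only, via SVD of the centered matrix.
mean = X_train.mean(axis=0)
_, _, Vt = np.linalg.svd(X_train - mean, full_matrices=False)
components = Vt[:n_components]  # (250, n_features)

Z_train = (X_train - mean) @ components.T  # (n_train, 250)
Z_test = (X_test - mean) @ components.T    # (n_test, 250)

# Least-squares fit with an intercept column: one weight vector per vertex.
A_train = np.column_stack([Z_train, np.ones(n_train)])
weights, *_ = np.linalg.lstsq(A_train, Y_train, rcond=None)

# Predict responses for held-out stimuli.
A_test = np.column_stack([Z_test, np.ones(n_test)])
Y_pred = A_test @ weights  # (n_test, n_vertices)
print(Y_pred.shape)
```

Because each column of `Y_train` is fitted independently, this corresponds to one linear model per vertex, as stated in the description.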

The encoding models were trained on the Natural Scenes Dataset (NSD) (Allen et al., 2022), which consists of 7T fMRI responses of 8 subjects to 73k natural scenes from the COCO dataset (Lin et al., 2014). One encoding model was trained for each NSD subject and for each fMRI vertex.

Preprocessing. The encoding models were trained on NSD data prepared in FreeSurfer’s fsaverage space, from the “betas_fithrf_GLMdenoise_RR” preprocessing version. Note that the NSD data were z-scored within each scan session; as a consequence, the in silico fMRI responses generated by the encoding models also live in z-scored space.
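
To make the session-wise z-scoring concrete, here is a minimal sketch on synthetic data. The function name, the trial-by-vertex layout, and the session labels are hypothetical; the exact normalization applied to NSD may differ in detail.

```python
import numpy as np

def zscore_per_session(betas, session_ids):
    """Z-score each vertex across trials, separately within each session."""
    betas = np.asarray(betas, dtype=float)
    session_ids = np.asarray(session_ids)
    out = np.empty_like(betas)
    for s in np.unique(session_ids):
        idx = session_ids == s
        mu = betas[idx].mean(axis=0)
        sd = betas[idx].std(axis=0)
        sd[sd == 0] = 1.0  # guard against vertices with constant responses
        out[idx] = (betas[idx] - mu) / sd
    return out

rng = np.random.default_rng(0)
betas = rng.normal(loc=5.0, scale=3.0, size=(60, 4))  # 60 trials, 4 vertices
session_ids = np.repeat([0, 1, 2], 20)                # 3 sessions of 20 trials

z = zscore_per_session(betas, session_ids)
```

After this step every vertex has zero mean and unit variance within each session, which is why the model outputs are expressed in standard-deviation units rather than in raw beta units.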

Model training partition. fMRI responses for up to 9,000 non-shared images (i.e., the images uniquely seen by each subject during the NSD experiment).

Model validation partition. fMRI responses for up to 485/1,000 shared images (i.e., the 485 shared images that not all subjects saw three times during the NSD experiment).

Model testing partition. fMRI responses for 515/1,000 shared images (i.e., the 515 images that each subject saw exactly three times during the NSD experiment). The models are additionally tested out-of-distribution on NSD-synthetic, the out-of-distribution component of NSD consisting of fMRI responses from the same 8 NSD subjects to 286 NSD-synthetic images.
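
The rule separating the validation and testing partitions can be illustrated as below. The repeat counts are synthetic and were constructed to reproduce the 515/485 split; the `repeats` array and its layout are hypothetical and are not part of the released metadata.

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_shared = 8, 1000

# Hypothetical repeat counts: rows are subjects, columns are shared images.
repeats = np.full((n_subjects, n_shared), 3)

# For 485 images, give one randomly chosen subject fewer than three repeats.
incomplete = rng.choice(n_shared, size=485, replace=False)
subject_rows = rng.integers(0, n_subjects, size=485)
repeats[subject_rows, incomplete] = rng.integers(0, 3, size=485)

# Testing images: every subject saw them exactly three times.
seen_three_times_by_all = (repeats == 3).all(axis=0)
test_idx = np.flatnonzero(seen_three_times_by_all)
# Validation images: the remaining shared images.
val_idx = np.flatnonzero(~seen_three_times_by_all)

print(test_idx.size, val_idx.size)
```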

Metadata

fmri

lh_ncsnr : (163842,) - Left hemisphere noise-ceiling signal-to-noise ratio per vertex (computed on NSD-core)

rh_ncsnr : (163842,) - Right hemisphere noise-ceiling signal-to-noise ratio per vertex (computed on NSD-core)

lh_ncsnr_nsdsynthetic : (163842,) - Left hemisphere noise-ceiling signal-to-noise ratio per vertex (computed on NSD-synthetic)

rh_ncsnr_nsdsynthetic : (163842,) - Right hemisphere noise-ceiling signal-to-noise ratio per vertex (computed on NSD-synthetic)

lh_fsaverage_roisdict - Left hemisphere ROI definitions on fsaverage surface
V1v : (710,) - Visual area 1 ventral

V1d : (828,) - Visual area 1 dorsal

V2v : (632,) - Visual area 2 ventral

V2d : (692,) - Visual area 2 dorsal

V3v : (567,) - Visual area 3 ventral

V3d : (669,) - Visual area 3 dorsal

hV4 : (531,) - Human V4 complex

EBA : (3231,) - Extrastriate body area

FBA-1 : (574,) - Fusiform body area 1

FBA-2 : (0,) - Fusiform body area 2

mTL-bodies : (0,) - Medial temporal lobe body-selective region

OFA : (432,) - Occipital face area

FFA-1 : (552,) - Fusiform face area 1

FFA-2 : (0,) - Fusiform face area 2

mTL-faces : (0,) - Medial temporal lobe face-selective region

aTL-faces : (329,) - Anterior temporal lobe face-selective region

OPA : (2021,) - Occipital place area

PPA : (1859,) - Parahippocampal place area

RSC : (1298,) - Retrosplenial complex

OWFA : (317,) - Occipital word form area

VWFA-1 : (1395,) - Visual word form area 1

VWFA-2 : (474,) - Visual word form area 2

mfs-words : (490,) - Mid-fusiform sulcus word-selective region

mTL-words : (475,) - Medial temporal lobe word-selective region

early : (5758,) - Early visual cortex (V1-V3)

midventral : (867,) - Mid-level ventral stream

midlateral : (1091,) - Mid-level lateral stream

midparietal : (1079,) - Mid-level parietal regions

ventral : (9680,) - Ventral visual stream

lateral : (10253,) - Lateral visual stream

parietal : (5176,) - Parietal regions

nsdgeneral : (18461,) - NSD general visual cortex mask

rh_fsaverage_roisdict - Right hemisphere ROI definitions on fsaverage surface
V1v : (444,) - Visual area 1 ventral

V1d : (991,) - Visual area 1 dorsal

V2v : (887,) - Visual area 2 ventral

V2d : (725,) - Visual area 2 dorsal

V3v : (682,) - Visual area 3 ventral

V3d : (535,) - Visual area 3 dorsal

hV4 : (765,) - Human V4 complex

EBA : (4421,) - Extrastriate body area

FBA-1 : (206,) - Fusiform body area 1

FBA-2 : (1234,) - Fusiform body area 2

mTL-bodies : (0,) - Medial temporal lobe body-selective region

OFA : (305,) - Occipital face area

FFA-1 : (330,) - Fusiform face area 1

FFA-2 : (1003,) - Fusiform face area 2

mTL-faces : (0,) - Medial temporal lobe face-selective region

aTL-faces : (283,) - Anterior temporal lobe face-selective region

OPA : (2849,) - Occipital place area

PPA : (1250,) - Parahippocampal place area

RSC : (1136,) - Retrosplenial complex

OWFA : (590,) - Occipital word form area

VWFA-1 : (397,) - Visual word form area 1

VWFA-2 : (649,) - Visual word form area 2

mfs-words : (0,) - Mid-fusiform sulcus word-selective region

mTL-words : (0,) - Medial temporal lobe word-selective region

early : (5634,) - Early visual cortex (V1-V3)

midventral : (1050,) - Mid-level ventral stream

midlateral : (1191,) - Mid-level lateral stream

midparietal : (1181,) - Mid-level parietal regions

ventral : (9393,) - Ventral visual stream

lateral : (10535,) - Lateral visual stream

parietal : (4818,) - Parietal regions

nsdgeneral : (19523,) - NSD general visual cortex mask

encoding_models

train_img_num : (9000,) - Image indices used for training

val_img_num : (485,) - Image indices used for validation

test_img_num : (515,) - Image indices used for testing

lh_correlation_nsdcore : (163842,) - Left hemisphere correlation on NSD core dataset

rh_correlation_nsdcore : (163842,) - Right hemisphere correlation on NSD core dataset

lh_r2_nsdcore : (163842,) - Left hemisphere R² on NSD core dataset

rh_r2_nsdcore : (163842,) - Right hemisphere R² on NSD core dataset

lh_noise_ceiling_nsdcore : (163842,) - Left hemisphere noise ceiling on NSD core dataset

rh_noise_ceiling_nsdcore : (163842,) - Right hemisphere noise ceiling on NSD core dataset

lh_explained_variance_nsdcore : (163842,) - Left hemisphere % explained variance on NSD core dataset

rh_explained_variance_nsdcore : (163842,) - Right hemisphere % explained variance on NSD core dataset

lh_correlation_nsdsynthetic : (163842,) - Left hemisphere correlation on NSD synthetic dataset

rh_correlation_nsdsynthetic : (163842,) - Right hemisphere correlation on NSD synthetic dataset

lh_r2_nsdsynthetic : (163842,) - Left hemisphere R² on NSD synthetic dataset

rh_r2_nsdsynthetic : (163842,) - Right hemisphere R² on NSD synthetic dataset

lh_noise_ceiling_nsdsynthetic : (163842,) - Left hemisphere noise ceiling on NSD synthetic dataset

rh_noise_ceiling_nsdsynthetic : (163842,) - Right hemisphere noise ceiling on NSD synthetic dataset

lh_explained_variance_nsdsynthetic : (163842,) - Left hemisphere % explained variance on NSD synthetic dataset

rh_explained_variance_nsdsynthetic : (163842,) - Right hemisphere % explained variance on NSD synthetic dataset
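
As an illustration of how the fields above might be combined, the sketch below averages a per-vertex accuracy metric within an ROI. It rests on two assumptions that the listing does not confirm: that the metadata is a nested dictionary keyed by the section names above (`fmri`, `encoding_models`), and that each ROI entry is an array of vertex indices into the 163,842-vertex hemisphere arrays. All values are synthetic.

```python
import numpy as np

n_vertices = 163842
rng = np.random.default_rng(0)

# Hypothetical metadata mimicking the layout listed above (synthetic values).
metadata = {
    "fmri": {
        "lh_fsaverage_roisdict": {
            "V1v": rng.choice(n_vertices, size=710, replace=False),
            "FBA-2": np.array([], dtype=int),  # empty ROI, shape (0,)
        },
    },
    "encoding_models": {
        "lh_correlation_nsdcore": rng.uniform(-0.1, 0.6, size=n_vertices),
    },
}

def roi_mean(metric, roi_indices):
    """Mean of a per-vertex metric within an ROI; NaN for empty ROIs."""
    if roi_indices.size == 0:
        return float("nan")
    return float(np.nanmean(metric[roi_indices]))

corr = metadata["encoding_models"]["lh_correlation_nsdcore"]
for name, idx in metadata["fmri"]["lh_fsaverage_roisdict"].items():
    print(name, idx.shape, roi_mean(corr, idx))
```

ROIs listed with shape `(0,)` contain no vertices in that hemisphere, so any per-ROI summary needs an explicit empty-ROI case like the one above.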

Input

Type	`numpy.ndarray`
Shape	`['batch_size', 3, 'height', 'width']`
Description	The input should be a batch of RGB images.
Constraints	Image values should be integers in range [0, 255]. Image dimensions (height, width) should be equal (square). Minimum recommended image size: 224×224 pixels.
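
A minimal sketch of preparing a batch that satisfies these constraints, assuming the images start as height × width × channel `uint8` arrays (the usual layout after loading with an image library). The helper name is hypothetical, and center-cropping to the smallest side across the batch is one possible choice; resizing would work equally well.

```python
import numpy as np

def to_model_input(images_hwc):
    """Center-crop HWC uint8 images to a common square size and stack them
    into an array of shape [batch_size, 3, size, size]."""
    size = min(min(img.shape[0], img.shape[1]) for img in images_hwc)
    batch = []
    for img in images_hwc:
        h, w = img.shape[:2]
        top, left = (h - size) // 2, (w - size) // 2
        crop = img[top:top + size, left:left + size, :3]  # keep RGB channels
        batch.append(np.transpose(crop, (2, 0, 1)))       # HWC -> CHW
    return np.stack(batch).astype(np.uint8)

rng = np.random.default_rng(0)
images = [
    rng.integers(0, 256, size=(300, 400, 3), dtype=np.uint8),
    rng.integers(0, 256, size=(256, 256, 3), dtype=np.uint8),
]
batch = to_model_input(images)
print(batch.shape, batch.dtype)
```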

Output

Type	`tuple of numpy.ndarray`
Shape	`([batch_size, lh_vertices], [batch_size, rh_vertices])`
Description	The output is a tuple containing the left hemisphere (LH) and right hemisphere (RH) in silico fMRI responses for the batch images.

Dimensions:

Name	Description
batch_size	Number of stimuli in the batch.
lh_vertices	Number of selected LH vertices for which the in silico fMRI responses are generated.
rh_vertices	Number of selected RH vertices for which the in silico fMRI responses are generated.

Parameters

Parameters used in `get_encoding_model`

This function loads the encoding model.

model_id	Type: str Required: Yes Description: Unique identifier of the model to load. Valid Values: fmri-nsd_fsaverage-alexnet_untrained Example: “fmri-nsd_fsaverage-alexnet_untrained”
subject	Type: int Required: Yes Description: Subject ID from the NSD dataset (1-8). Valid Values: 1, 2, 3, 4, 5, 6, 7, 8 Example: 1
selection	Type: dict Required: No Description: Specifies which outputs to include in the model responses. If not provided, fMRI responses are generated for all LH and RH fMRI vertices. Properties: roi Type: str Description: The region-of-interest (ROI) for which the in silico fMRI responses (of both hemispheres) are generated. Valid values: “V1d”, “V1v”, “V2d”, “V2v”, “V3d”, “V3v”, “hV4”, “OFA”, “FFA-1”, “FFA-2”, “mTL-faces”, “aTL-faces”, “OWFA”, “VWFA-1”, “VWFA-2”, “mfs-words”, “mTL-words”, “OPA”, “PPA”, “RSC”, “EBA”, “FBA-1”, “FBA-2”, “mTL-bodies”, “early”, “midventral”, “midlateral”, “midparietal”, “parietal”, “lateral”, “ventral”, “nsdgeneral” lh_vertices Type: numpy.ndarray Description: Binary one-hot encoded vector with ones indicating the left hemisphere (LH) vertices for which the in silico fMRI responses are generated. This vector must have exactly the same length as the number of LH fsaverage vertices (163,842). The vertices from the one-hot encoded vector are only selected if the “roi” key is not provided, or has value None. rh_vertices Type: numpy.ndarray Description: Binary one-hot encoded vector with ones indicating the right hemisphere (RH) vertices for which the in silico fMRI responses are generated. This vector must have exactly the same length as the number of RH fsaverage vertices (163,842). The vertices from the one-hot encoded vector are only selected if the “roi” key is not provided, or has value None.
device	Type: str Required: No Description: Device to run the model on. ‘auto’ will use CUDA if available, otherwise CPU. Valid Values: “cpu”, “cuda”, “auto” Example: “auto”
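
The two ways of filling the `selection` dictionary described above can be sketched as follows. The vertex indices are arbitrary placeholders and the integer dtype of the binary vectors is an assumption. The call to `get_encoding_model` is shown only as a comment so that the snippet runs without the BERG data directory.

```python
import numpy as np

n_fsaverage_vertices = 163842  # vertices per hemisphere, as stated above

# Option 1: select all vertices of a named ROI (both hemispheres).
selection_roi = {"roi": "V1v"}

# Option 2: select arbitrary vertices with binary vectors, one per hemisphere.
# These are only used when the "roi" key is absent or None.
lh_vertices = np.zeros(n_fsaverage_vertices, dtype=int)
rh_vertices = np.zeros(n_fsaverage_vertices, dtype=int)
lh_vertices[[10, 20, 30]] = 1  # placeholder LH vertex indices
rh_vertices[:100] = 1          # placeholder: first 100 RH vertices
selection_vertices = {"lh_vertices": lh_vertices, "rh_vertices": rh_vertices}

# Either dictionary would then be passed as the `selection` argument, e.g.:
# model = berg.get_encoding_model(
#     "fmri-nsd_fsaverage-alexnet_untrained",
#     subject=1,
#     selection=selection_roi,
# )
```

With the second option, the number of ones in each vector determines `lh_vertices` and `rh_vertices` in the output shape (3 and 100 in this sketch).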

Parameters used in `encode`

This function generates in silico neural responses using the encoding model previously loaded.

model	Type: BaseModelInterface Required: Yes Description: An instantiated and loaded encoding model.
stimulus	Type: numpy.ndarray Required: Yes Description: A batch of RGB images to be encoded. Images should be in integer format with values in the range [0, 255], and square dimensions (e.g. 224×224). Example: “An array of shape [100, 3, 224, 224] representing 100 RGB images.”
return_metadata	Type: bool Required: No Description: Whether to return the encoding model’s metadata together with the in silico neural responses. Example: True
show_progress	Type: bool Required: No Description: Whether to show a progress bar during encoding (for large batches). Example: True

Parameters used in `get_model_metadata`

This function loads the encoding model’s metadata without having to load the model itself.

model_id	Type: str Required: Yes Description: Unique identifier of the model to load. Valid Values: fmri-nsd_fsaverage-alexnet_untrained Example: “fmri-nsd_fsaverage-alexnet_untrained”
subject	Type: int Required: Yes Description: Subject ID from the NSD dataset (1-8). Valid Values: 1, 2, 3, 4, 5, 6, 7, 8 Example: 1

Performance

Accuracy Plots (AWS directory):

brain-encoding-response-generator/encoding_models/modality-fmri/train_dataset-nsd_fsaverage/model-alexnet_untrained/encoding_models_accuracy

Example Usage

import numpy as np
from berg import BERG

# Initialize BERG
berg = BERG(berg_dir="path/to/brain-encoding-response-generator")

# Load the model
model = berg.get_encoding_model(
    "fmri-nsd_fsaverage-alexnet_untrained",
    subject=1,
)

# Prepare the stimulus images
# Image shape should be [batch_size, 3 RGB channels, height, width]
# The upper bound of randint is exclusive, so 256 yields values in [0, 255]
stimulus = np.random.randint(0, 256, (100, 3, 256, 256))

# Generate the in silico neural responses using the encoding model previously loaded
responses = berg.encode(
    model,
    stimulus,
    show_progress=True
)

# The in silico fMRI responses will be a tuple of numpy.ndarray of shape:
# ([batch_size, lh_vertices], [batch_size, rh_vertices])
# where:
# - lh_vertices is the number of selected left hemisphere (LH) vertices for which the in silico
#   fMRI responses are generated.
# - rh_vertices is the number of selected right hemisphere (RH) vertices for which the in silico
#   fMRI responses are generated.

# Generate in silico neural responses with metadata
responses, metadata = berg.encode(
    model,
    stimulus,
    return_metadata=True
)

# Load the encoding model's metadata without having to load the model itself
metadata = berg.get_model_metadata(
    "fmri-nsd_fsaverage-alexnet_untrained",
    subject=1
)
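
Once responses have been generated, a common next step is to compare them with measured responses vertex by vertex. The sketch below computes a Pearson correlation per vertex on synthetic arrays shaped like one element of the output tuple. It is an illustration only: the function name is hypothetical, and the source does not state how the `correlation` fields in the metadata were computed.

```python
import numpy as np

def vertexwise_pearson(pred, target):
    """Pearson correlation across stimuli, computed independently per vertex.

    Both inputs have shape [n_stimuli, n_vertices]."""
    pred = pred - pred.mean(axis=0)
    target = target - target.mean(axis=0)
    num = (pred * target).sum(axis=0)
    den = np.sqrt((pred ** 2).sum(axis=0) * (target ** 2).sum(axis=0))
    return num / den

rng = np.random.default_rng(0)
# Synthetic stand-ins for one hemisphere: 100 stimuli, 50 selected vertices.
lh_pred = rng.standard_normal((100, 50))
lh_measured = lh_pred + rng.standard_normal((100, 50))  # noisy "measurements"

r = vertexwise_pearson(lh_pred, lh_measured)
print(r.shape)
```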

References

Model building code: https://github.com/gifale95/BERG/tree/main/berg_creation_code
NSD paper (Allen et al., 2022): https://doi.org/10.1038/s41593-021-00962-x
NSD-synthetic paper (Gifford et al., 2025): https://doi.org/10.48550/arXiv.2503.06286
COCO dataset (Lin et al., 2014): https://cocodataset.org/#home
AlexNet (Krizhevsky et al., 2012): https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf