fmri-cneuromod_algo2025-text2fmri

Model Summary

Modality

fMRI

Training Dataset

CNeuroMod (Algonauts 2025 challenge preparation)

Species

Human

Stimuli

Text

Model Type

Transformers

Creator

Shrey Dixit

Description

Text2fMRI offers a suite of lightweight encoding models, available through the Hugging Face collection ‘ShreyDixit/Text2fMRI’, designed to predict whole-brain fMRI responses solely from movie language transcripts.

Trained on the CNeuroMod dataset (Friends and Movie10), the same data used for the Algonauts 2025 Challenge, these models generate in silico neural responses to movies without requiring visual or audio input.

Multiple model configurations are available to suit different resource constraints. The smallest configuration has approximately 52M trainable parameters and uses a frozen 500M-parameter LLM (Qwen-2.5-0.5B) for feature extraction.

The model also provides utility functions to list the available Hugging Face model variants before instantiation (berg.get_model_variants()) and to render animated spatial visualizations of predicted activity (model.generate_glass_brain_animation()). The atlas files required for glass brain visualization are provided separately in the BERG directory.

Metadata

Note

Atlas files for glass brain visualization (Schaefer 1000-parcel MNI coordinates) are provided separately in the BERG directory and are not part of the per-subject metadata files.

roi_masks

Cont : (1000,) - Binary mask for Control/Frontoparietal network parcels

Default : (1000,) - Binary mask for Default Mode network parcels

DorsAttn : (1000,) - Binary mask for Dorsal Attention network parcels

Limbic : (1000,) - Binary mask for Limbic network parcels

SalVentAttn : (1000,) - Binary mask for Salience/Ventral Attention network parcels

SomMot : (1000,) - Binary mask for Somatomotor network parcels

Vis : (1000,) - Binary mask for Visual network parcels
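
Each mask can be used as a boolean index over the parcel axis of a prediction. The sketch below uses synthetic arrays in place of real metadata and predictions: the mask contents are made up, and how roi_masks is nested inside the metadata object returned by BERG is an assumption, not something the listing above confirms.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for predictions: 5 TRs over the 1000 Schaefer parcels.
responses = rng.standard_normal((5, 1000))

# Hypothetical mask: pretend the first 150 parcels belong to the Visual network.
vis_mask = np.zeros(1000, dtype=int)
vis_mask[:150] = 1
roi_masks = {"Vis": vis_mask}

# A binary mask becomes a boolean index over the parcel axis.
vis_responses = responses[:, roi_masks["Vis"].astype(bool)]

# Averaging over parcels gives one time course for the network.
vis_timecourse = vis_responses.mean(axis=1)

print(vis_responses.shape)   # (5, 150)
print(vis_timecourse.shape)  # (5,)
```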

Input

Type

list[str]

Description

A list of strings in which each string contains the text spoken during a
single fMRI repetition time (TR).

Example

["Hello, are you", "awake? Yes,"]
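
If a transcript is available as words with onset times, it can be grouped into one string per TR before encoding. The helper below is a minimal sketch, not part of BERG: words_to_tr_strings is a hypothetical name, the word onsets are invented, and the TR of 1.49 s is an assumption that should be checked against the dataset documentation. Whether the model accepts empty strings for silent TRs is also an assumption.

```python
def words_to_tr_strings(words, tr, n_trs):
    """Group (word, onset_seconds) pairs into one string per TR."""
    bins = [[] for _ in range(n_trs)]
    for word, onset in words:
        idx = int(onset // tr)
        if 0 <= idx < n_trs:
            bins[idx].append(word)
    # TRs with no speech become empty strings so the list length stays n_trs.
    return [" ".join(b) for b in bins]


# Hypothetical word-level transcript: (word, onset in seconds).
words = [("Hello,", 0.2), ("are", 0.7), ("you", 1.1),
         ("awake?", 1.6), ("Yes,", 2.4)]

stimulus = words_to_tr_strings(words, tr=1.49, n_trs=3)
print(stimulus)  # ['Hello, are you', 'awake? Yes,', '']
```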

Output

Type

torch.Tensor

Shape

['num_timepoints', 'num_rois']

Description

The predicted fMRI activity for the given stimulus.

Dimensions

num_timepoints: Number of TRs (timepoints) in the input stimulus.
num_rois: Number of Regions of Interest (1000 parcels from Schaefer 2018 atlas).

Parameters

Parameters used in get_encoding_model

This function loads the encoding model.

model_id

Type: str
Required: Yes
Description: Unique identifier of the model to load.
Valid Values: fmri-cneuromod_algo2025-text2fmri
Example: "fmri-cneuromod_algo2025-text2fmri"

subject

Type: int
Required: Yes
Description: The ID of the subject to generate predictions for.
Valid Values: 1, 2, 3, 5
Example: 1

device

Type: str
Required: No
Description: The computing device to use for inference.
Valid Values: "cpu", "cuda", "auto"
Example: "auto"

model_variant

Type: str
Required: No
Description: Hugging Face repository ID of a specific pretrained variant to load.
If provided, the model config associated with this variant is used and any
config argument passed by the user is ignored.
If None (default), the default configuration (Qwen-2.5-0.5B) is loaded.
Use model.get_pretrained_variants() on any loaded model to see all available options.
Example: "ShreyDixit/Text2fMRI-Qwen-2.5-0.5B"

selection

Type: dict
Required: No
Description: Optional filter to restrict the output to specific brain networks.

Properties:

roi
Type: list[str]
Description: Filter output by Schaefer 2018 (7-network) atlas labels.
Valid values: "Vis", "SomMot", "DorsAttn", "SalVentAttn", "Limbic", "Cont", "Default"
Example: ["Vis"]

voxel_index
Type: numpy.ndarray
Description: Binary mask vector indicating which voxels to include. For this model, each "voxel" is one of the 1000 atlas parcels.
Must have exactly the same length as the number of available voxels (1000).
Each position set to 1 indicates that the corresponding voxel should be included.
Example: [0, 0, '…', 1, 1, 0]
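
A full-length voxel_index vector can be built from a short list of positions. The following is a minimal sketch: the chosen positions are arbitrary, and treating them as 0-based indices into the parcel axis is an assumption.

```python
import numpy as np

n_parcels = 1000  # length required by the voxel_index description

# Hypothetical choice of parcels to keep (assumed 0-based positions).
keep = [3, 42, 998]

# Start from all zeros and set the selected positions to 1.
voxel_index = np.zeros(n_parcels, dtype=int)
voxel_index[keep] = 1

selection = {"voxel_index": voxel_index}

print(voxel_index.shape, int(voxel_index.sum()))  # (1000,) 3
```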

Parameters used in encode

This function generates in silico neural responses using the encoding model previously loaded.

model

Type: BaseModelInterface
Required: Yes
Description: An instantiated and loaded encoding model.

stimulus

Type: list[str]
Required: Yes
Description: A list of strings in which each string contains the text spoken during a
single fMRI repetition time (TR).
Example:
["Hello, are you", "awake? Yes,"]

low_mem_use

Type: bool
Required: No
Description: If True, sequentially loads/unloads the Feature Extractor and the Encoding Model
to minimize VRAM usage, at the cost of slower execution.
Example: True

return_metadata

Type: bool
Required: No
Description: Whether to return the encoding model’s metadata together with the in silico neural responses.
Example: True

show_progress

Type: bool
Required: No
Description: Whether to show a progress bar during encoding (for large batches).
Example: True

Parameters used in get_model_metadata

This function loads the encoding model’s metadata without having to load the model itself.

model_id

Type: str
Required: Yes
Description: Unique identifier of the model to load.
Valid Values: fmri-cneuromod_algo2025-text2fmri
Example: "fmri-cneuromod_algo2025-text2fmri"

subject

Type: int
Required: Yes
Description: The ID of the subject to generate predictions for.
Valid Values: 1, 2, 3, 5
Example: 1

Model-specific utility methods

get_model_variants()

Retrieve available pretrained variants for this model without instantiating it. This is called from the main BERG class.

model_id

Type: str
Required: Yes
Description: Unique identifier of the model to load.
Example: variants = berg.get_model_variants("fmri-cneuromod_algo2025-text2fmri")

generate_glass_brain_animation()

Generates and saves an animated glass brain GIF from the predicted responses. Called directly on the loaded model instance.

responses

Type: numpy.ndarray
Required: Yes
Description: Model predictions generated by the encode() function.

out_path

Type: str
Required: No
Default: brain_activation.gif
Description: Where to save the generated GIF.
Example: model.generate_glass_brain_animation(responses, out_path="activation.gif")

Performance

Metrics:

  • Performance Metrics: Available in Hugging Face Collection: ShreyDixit/text2fmri
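
This page does not state which metrics the collection reports. As an illustration only, parcel-wise Pearson correlation between predicted and measured responses is a common way to score encoding models. The sketch below runs on synthetic data, and parcelwise_pearson is a hypothetical helper, not a BERG function.

```python
import numpy as np


def parcelwise_pearson(pred, true):
    """Pearson r between matching columns of two (num_timepoints, num_rois) arrays."""
    pred_c = pred - pred.mean(axis=0)
    true_c = true - true.mean(axis=0)
    num = (pred_c * true_c).sum(axis=0)
    den = np.sqrt((pred_c ** 2).sum(axis=0) * (true_c ** 2).sum(axis=0))
    return num / den


rng = np.random.default_rng(0)
true = rng.standard_normal((200, 1000))          # synthetic "measured" responses
pred = true + rng.standard_normal((200, 1000))   # synthetic noisy predictions

r = parcelwise_pearson(pred, true)
print(r.shape)  # (1000,)
```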

Example Usage

from berg import BERG

# Initialize BERG
berg = BERG(berg_dir="path/to/brain-encoding-response-generator")

# Discover all model variants
variants = berg.get_model_variants("fmri-cneuromod_algo2025-text2fmri")

# Load the model
model = berg.get_encoding_model(
    "fmri-cneuromod_algo2025-text2fmri",
    subject=1,
    model_variant="ShreyDixit/Text2fMRI-Qwen-2.5-0.5B",
    selection={
        "roi": ["Vis"],
        # Placeholder: replace with a full binary vector of length 1000
        "voxel_index": [0, 0, '...', 1, 1, 0]
    }
)

# Prepare the stimulus (text/sentences)
stimulus = ["Hello, are you", "awake? Yes,"]

# Generate the in silico neural responses using the previously loaded encoding model
responses = berg.encode(
    model,
    stimulus,
    low_mem_use=True
)

# The in silico fMRI responses will be a torch.Tensor of shape:
# ['num_timepoints', 'num_rois']
# where:
# - num_timepoints: Number of TRs (timepoints) in the input stimulus.
# - num_rois: Number of Regions of Interest (1000 parcels from Schaefer 2018 atlas).

# Generate in silico neural responses with metadata
responses, metadata = berg.encode(
    model,
    stimulus,
    return_metadata=True
)

# Load the encoding model's metadata without having to load the model itself
metadata = berg.get_model_metadata(
    "fmri-cneuromod_algo2025-text2fmri",
    subject=1
)

# Generate a GIF from the responses
gif_path = model.generate_glass_brain_animation(
    responses=responses,
    out_path="brain_activation.gif"
)
