fmri-cneuromod_algo2025-text2fmri

Model Summary

Modality	fMRI
Training Dataset	CNeuroMod (Algonauts 2025 challenge preparation)
Species	Human
Stimuli	Text
Model Type	Transformers
Creator	Shrey Dixit

Description

Text2fMRI offers a suite of lightweight encoding models, available through the Hugging Face collection ‘ShreyDixit/Text2fMRI’, designed to predict whole-brain fMRI responses solely from movie language transcripts.

Trained on the CNeuroMod dataset (Friends and Movie10), the same data used for the Algonauts 2025 Challenge, these models generate in silico neural responses to movies without requiring visual or audio input.

Multiple model configurations are available to suit different resource constraints. The smallest configuration has approximately 52M trainable parameters and uses a frozen 500M-parameter LLM (Qwen-2.5-0.5B) for feature extraction.

The model also provides utility functions to list the available Hugging Face model variants before instantiation (berg.get_model_variants()) and to render an animated glass brain visualization of predicted activity (model.generate_glass_brain_animation()). The atlas files required for glass brain visualization are provided separately in the BERG directory.

Metadata

Note

Atlas files for glass brain visualization (Schaefer 1000-parcel MNI coordinates) are provided separately in the BERG directory and are not part of the per-subject metadata files.

roi_masks

Cont : (1000,) - Binary mask for Control/Frontoparietal network parcels

Default : (1000,) - Binary mask for Default Mode network parcels

DorsAttn : (1000,) - Binary mask for Dorsal Attention network parcels

Limbic : (1000,) - Binary mask for Limbic network parcels

SalVentAttn : (1000,) - Binary mask for Salience/Ventral Attention network parcels

SomMot : (1000,) - Binary mask for Somatomotor network parcels

Vis : (1000,) - Binary mask for Visual network parcels
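
As a minimal sketch of how these masks can be applied, the snippet below keeps only the parcels of one network and averages them into a single time course. The response array and the mask are synthetic stand-ins; `select_network` is a hypothetical helper, not part of BERG, and the parcel assignment is made up for illustration.

```python
import numpy as np

N_PARCELS = 1000  # Schaefer 1000-parcel atlas, as stated in this card

# Synthetic stand-ins: random predictions and a made-up "Vis" mask.
rng = np.random.default_rng(0)
responses = rng.standard_normal((5, N_PARCELS))  # (num_timepoints, num_rois)
vis_mask = np.zeros(N_PARCELS, dtype=int)
vis_mask[:100] = 1  # hypothetical parcel assignment, for illustration only


def select_network(responses: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Keep only the columns (parcels) where the binary mask is 1."""
    mask = np.asarray(mask).astype(bool)
    if mask.shape != (responses.shape[1],):
        raise ValueError("mask length must match the number of parcels")
    return responses[:, mask]


vis_responses = select_network(responses, vis_mask)
vis_mean_timecourse = vis_responses.mean(axis=1)  # one value per TR
print(vis_responses.shape, vis_mean_timecourse.shape)
```

With real data, the mask would be read from the `roi_masks` entry of the metadata rather than constructed by hand.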

Input

Type	`list[str]`
Description	A list of strings where each string corresponds to the text spoken during a single fMRI repetition time (TR).
Example	`["Hello, are you", "awake? Yes,"]`
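
If a transcript comes with word-level onsets rather than per-TR chunks, the input list can be built by binning words into TR windows. The sketch below is one possible way to do this. `words_to_tr_chunks` is a hypothetical helper, the word onsets are invented, and the TR of 1.49 s is an assumption about the CNeuroMod acquisition that should be checked against the dataset documentation. Whether the model accepts empty strings for silent TRs is also not stated in this card.

```python
from typing import List, Tuple


def words_to_tr_chunks(
    words: List[Tuple[str, float]], tr: float, n_trs: int
) -> List[str]:
    """Group (word, onset_in_seconds) pairs into one string per TR."""
    chunks: List[List[str]] = [[] for _ in range(n_trs)]
    for word, onset in words:
        idx = int(onset // tr)  # index of the TR window containing the onset
        if 0 <= idx < n_trs:    # drop words outside the scanned interval
            chunks[idx].append(word)
    return [" ".join(chunk) for chunk in chunks]


# Invented onsets, for illustration only.
words = [("Hello,", 0.2), ("are", 0.7), ("you", 1.1), ("awake?", 1.8), ("Yes,", 2.5)]
stimulus = words_to_tr_chunks(words, tr=1.49, n_trs=3)  # TR value is an assumption
print(stimulus)
```

TRs with no speech come out as empty strings, so the list length always equals the number of TRs.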

Output

Type	`torch.Tensor`
Shape	`['num_timepoints', 'num_rois']`
Description	The predicted fMRI activity for the given stimulus.
Dimensions	num_timepoints: Number of TRs (timepoints) in the input stimulus. num_rois: Number of Regions of Interest (1000 parcels from Schaefer 2018 atlas).

Parameters

Parameters used in `get_encoding_model`

This function loads the encoding model.

model_id	Type: str Required: Yes Description: Unique identifier of the model to load. Valid Values: fmri-cneuromod_algo2025-text2fmri Example: “fmri-cneuromod_algo2025-text2fmri”
subject	Type: int Required: Yes Description: The ID of the subject to generate predictions for. Valid Values: 1, 2, 3, 5 Example: 1
device	Type: str Required: No Description: The computing device to use for inference. Valid Values: “cpu”, “cuda”, “auto” Example: “auto”
model_variant	Type: str Required: No Description: Hugging Face repository ID of a specific pretrained variant to load. If provided, the model config associated with this variant is used and any user-passed config argument is ignored. If None (default), the default configuration (Qwen-2.5-0.5B) is loaded. Use model.get_pretrained_variants() on any loaded model to see all available options. Example: “ShreyDixit/Text2fMRI-Qwen-2.5-0.5B”
selection	Type: dict Required: No Description: Optional filter to restrict the output to specific brain networks or parcels. Properties: roi Type: list[str] Description: Filter output by Schaefer 2018 (7-network) atlas labels. Valid values: “Vis”, “SomMot”, “DorsAttn”, “SalVentAttn”, “Limbic”, “Cont”, “Default” Example: [‘Vis’] voxel_index Type: numpy.ndarray Description: Binary mask vector indicating which parcels to include. Must have exactly as many entries as there are parcels (1000). Each position set to 1 marks a parcel to include. Example: [0, 0, ‘…’, 1, 1, 0]
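
A `voxel_index` vector of the required length can be built from a list of parcel indices. The following is a minimal sketch under the assumption that parcels are indexed from zero; `make_voxel_index` is a hypothetical helper and the chosen indices are arbitrary.

```python
import numpy as np

N_PARCELS = 1000  # number of parcels stated in this card


def make_voxel_index(parcel_ids, n_parcels: int = N_PARCELS) -> np.ndarray:
    """Return a binary vector with 1 at each requested parcel index."""
    ids = np.asarray(parcel_ids, dtype=int)
    if ids.size and (ids.min() < 0 or ids.max() >= n_parcels):
        raise IndexError("parcel index out of range")
    mask = np.zeros(n_parcels, dtype=int)
    mask[ids] = 1
    return mask


# Arbitrary parcel indices, assuming zero-based indexing.
voxel_index = make_voxel_index([0, 1, 998])
selection = {"voxel_index": voxel_index}
print(voxel_index.shape, int(voxel_index.sum()))
```

The resulting dictionary has the form expected by the `selection` argument described above.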

Parameters used in `encode`

This function generates in silico neural responses using the encoding model previously loaded.

model	Type: BaseModelInterface Required: Yes Description: An instantiated and loaded encoding model.
stimulus	Type: list[str] Required: Yes Description: A list of strings where each string corresponds to the text spoken during a single fMRI repetition time (TR). Example: [“Hello, are you”, “awake? Yes,”]
low_mem_use	Type: bool Required: No Description: If True, sequentially loads/unloads the Feature Extractor and the Encoding Model to minimize VRAM usage, at the cost of slower execution. Example: True
return_metadata	Type: bool Required: No Description: Whether to return the encoding model’s metadata together with the in silico neural responses. Example: True
show_progress	Type: bool Required: No Description: Whether to show a progress bar during encoding (for large batches). Example: True

Parameters used in `get_model_metadata`

This function loads the encoding model’s metadata without having to load the model itself.

model_id	Type: str Required: Yes Description: Unique identifier of the model to load. Valid Values: fmri-cneuromod_algo2025-text2fmri Example: “fmri-cneuromod_algo2025-text2fmri”
subject	Type: int Required: Yes Description: The ID of the subject whose metadata to load. Valid Values: 1, 2, 3, 5 Example: 1

Model-specific utility methods

`get_model_variants()`

Retrieve available pretrained variants for this model without instantiating it. This is called from the main BERG class.

model_id

Type: str
Required: Yes
Description: Unique identifier of the model to load.

variants = berg.get_model_variants("fmri-cneuromod_algo2025-text2fmri")

`generate_glass_brain_animation()`

Generates and saves an animated glass brain GIF from the predicted responses. Called directly on the loaded model instance.

responses	Type: `numpy.ndarray` Required: Yes Description: Model predictions generated by the encode() function.
out_path	Type: `str` Required: No Default: brain_activation.gif Description: Where to save the generated GIF.

model.generate_glass_brain_animation(responses, out_path="activation.gif")
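
The Output section lists `torch.Tensor` as the prediction type, while `responses` is documented here as `numpy.ndarray`. If a conversion turns out to be necessary, a sketch like the following could be used before calling the method. `to_numpy` is a hypothetical helper, not part of BERG, and the demo input is made up.

```python
import numpy as np


def to_numpy(responses) -> np.ndarray:
    """Convert predictions to a 2D float32 NumPy array.

    Tensors are detected by their detach() method, so torch does not
    need to be imported for array or list inputs.
    """
    if hasattr(responses, "detach"):
        responses = responses.detach().cpu().numpy()
    arr = np.asarray(responses, dtype=np.float32)
    if arr.ndim != 2:
        raise ValueError("expected shape (num_timepoints, num_rois)")
    return arr


# Made-up values standing in for model predictions.
demo = to_numpy([[0.1, 0.2], [0.3, 0.4]])
print(demo.shape, demo.dtype)
```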

Performance

Metrics:

Performance metrics are available in the Hugging Face collection ShreyDixit/Text2fMRI.

Example Usage

from berg import BERG

# Initialize BERG
berg = BERG(berg_dir="path/to/brain-encoding-response-generator")

# Discover all model variants
variants = berg.get_model_variants("fmri-cneuromod_algo2025-text2fmri")

# Load the model
model = berg.get_encoding_model(
    "fmri-cneuromod_algo2025-text2fmri",
    subject=1,
    model_variant="ShreyDixit/Text2fMRI-Qwen-2.5-0.5B",
    selection={
        "roi": ["Vis"],
        # '...' is a placeholder: pass a full binary mask of length 1000
        "voxel_index": [0, 0, '...', 1, 1, 0]
    }
)

# Prepare the stimulus (text/sentences)
stimulus = ["Hello, are you", "awake? Yes,"]

# Generate the in silico neural responses with the loaded encoding model
responses = berg.encode(
    model,
    stimulus,
    low_mem_use=True
)

# The in silico fMRI responses will be a torch.Tensor of shape:
# ['num_timepoints', 'num_rois']
# where:
# - num_timepoints: Number of TRs (timepoints) in the input stimulus.
# - num_rois: Number of Regions of Interest (1000 parcels from Schaefer 2018 atlas).

# Generate in silico neural responses with metadata
responses, metadata = berg.encode(
    model,
    stimulus,
    return_metadata=True
)

# Load the encoding model's metadata without having to load the model itself
metadata = berg.get_model_metadata(
    "fmri-cneuromod_algo2025-text2fmri",
    subject=1
)

# Generate a GIF from the responses
gif_path = model.generate_glass_brain_animation(
    responses=responses,
    out_path="brain_activation.gif"
)

References

Course Materials: Dixit, S. (2026). Text2fMRI: Brain Encoding Models using LLMs (Course Materials) (v0.1.2). Zenodo. https://doi.org/10.5281/zenodo.18369862
Hugging Face model repository (Qwen-2.5-0.5B variant): https://huggingface.co/ShreyDixit/Text2fMRI-Qwen-2.5-0.5B
Algonauts 2025 challenge dataset: https://github.com/courtois-neuromod/algonauts_2025.competitors