fmri-cneuromod_algo2025-text2fmri
Model Summary
| Modality | fMRI |
|---|---|
| Training Dataset | CNeuroMod (Algonauts 2025 challenge preparation) |
| Species | Human |
| Stimuli | Text |
| Model Type | Transformers |
| Creator | Shrey Dixit |
Description
Text2fMRI offers a suite of lightweight encoding models, available through the Hugging Face collection ‘ShreyDixit/Text2fMRI’, designed to predict whole-brain fMRI responses solely from movie language transcripts.
Trained on the CNeuroMod dataset (Friends and Movie10), the same data used for the Algonauts 2025 Challenge, the model generates in silico neural responses to movies without requiring visual or audio input.
Multiple model configurations are available to suit different resource constraints. The smallest configuration has approximately 52M trainable parameters and uses a frozen 500M-parameter LLM (Qwen-2.5-0.5B) for feature extraction.
Additionally, this model includes specific utility functions to query available Hugging Face model variants prior to instantiation (berg.get_model_variants()) and to render animated spatial visualizations of predicted activity (model.generate_glass_brain_animation()). The atlas files required for glass brain visualization are provided separately in the BERG directory.
Metadata
Note
Atlas files for glass brain visualization (Schaefer 1000-parcel MNI coordinates) are provided separately in the BERG directory and are not part of the per-subject metadata files.
roi_masks
Cont : (1000,) - Binary mask for Control/Frontoparietal network parcels
Default : (1000,) - Binary mask for Default Mode network parcels
DorsAttn : (1000,) - Binary mask for Dorsal Attention network parcels
Limbic : (1000,) - Binary mask for Limbic network parcels
SalVentAttn : (1000,) - Binary mask for Salience/Ventral Attention network parcels
SomMot : (1000,) - Binary mask for Somatomotor network parcels
Vis : (1000,) - Binary mask for Visual network parcels
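As a minimal sketch of how one of these masks could be applied, the snippet below keeps only the parcels of a single network from an array of predicted responses. The `roi_masks` dictionary and the `responses` array are synthetic stand-ins built for illustration (assumptions, not real metadata or model output), and `select_network` is a hypothetical helper that is not part of BERG.

```python
import numpy as np

NUM_PARCELS = 1000  # Schaefer 2018 parcel count stated in this model card


def select_network(responses, roi_mask):
    """Keep only the columns (parcels) where the binary mask equals 1."""
    responses = np.asarray(responses)
    keep = np.asarray(roi_mask).astype(bool)
    if keep.shape != (responses.shape[1],):
        raise ValueError("mask length must match the number of parcels")
    return responses[:, keep]


# Synthetic stand-ins (assumptions): 5 timepoints, and a fake "Vis" mask
# that marks the first 120 parcels. Real masks come from the metadata.
rng = np.random.default_rng(0)
responses = rng.standard_normal((5, NUM_PARCELS))
vis_mask = np.zeros(NUM_PARCELS, dtype=int)
vis_mask[:120] = 1
roi_masks = {"Vis": vis_mask}

vis_responses = select_network(responses, roi_masks["Vis"])
print(vis_responses.shape)  # (5, 120)
```

The same helper would work for any of the seven network masks, since each is described as a length-1000 binary vector.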
Input
| Type | list[str] |
|---|---|
| Description | A list of strings where each string corresponds to the text spoken during a single fMRI Repetition Time (TR). |
| Example | ["Hello, are you", "awake? Yes,"] |
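One way such a per-TR list could be prepared from a word-level transcript is sketched below. The helper `words_to_tr_strings`, the word onsets, and the TR of 1.49 s are all assumptions made for illustration; use the TR of your own acquisition. Whether the model accepts empty strings for silent TRs is also an assumption to verify.

```python
def words_to_tr_strings(words, tr_seconds, num_trs):
    """Group (onset_seconds, word) pairs into one string per TR.

    A word is assigned to the TR in which its onset falls. TRs without
    any spoken word yield an empty string.
    """
    bins = [[] for _ in range(num_trs)]
    for onset, word in words:
        idx = int(onset // tr_seconds)
        if 0 <= idx < num_trs:
            bins[idx].append(word)
    return [" ".join(b) for b in bins]


# Hypothetical word onsets in seconds (illustrative values only)
words = [
    (0.2, "Hello,"),
    (0.6, "are"),
    (1.1, "you"),
    (1.6, "awake?"),
    (2.4, "Yes,"),
]

stimulus = words_to_tr_strings(words, tr_seconds=1.49, num_trs=3)
print(stimulus)  # ['Hello, are you', 'awake? Yes,', '']
```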
Output
| Type | torch.Tensor |
|---|---|
| Shape | ['num_timepoints', 'num_rois'] |
| Description | The predicted fMRI activity for the given stimulus. |
| Dimensions | num_timepoints: Number of TRs (timepoints) in the input stimulus. num_rois: Number of Regions of Interest (1000 parcels from Schaefer 2018 atlas). |
Parameters
Parameters used in get_encoding_model
This function loads the encoding model.
model_id |
Type: str
Required: Yes
Description: Unique identifier of the model to load.
Valid Values: fmri-cneuromod_algo2025-text2fmri
Example: “fmri-cneuromod_algo2025-text2fmri”
|
subject |
Type: int
Required: Yes
Description: The ID of the subject to generate predictions for.
Valid Values: 1, 2, 3, 5
Example: 1
|
device |
Type: str
Required: No
Description: The computing device to use for inference.
Valid Values: “cpu”, “cuda”, “auto”
Example: “auto”
|
model_variant |
Type: str
Required: No
Description: HuggingFace repository ID of a specific pretrained variant to load.
If provided, the model config associated with this variant is used and any
user-passed config argument is ignored.
If None (default), loads the default configuration (Qwen-2.5-0.5B).
Use model.get_pretrained_variants() on any loaded model to see all available options.
Example: “ShreyDixit/Text2fMRI-Qwen-2.5-0.5B”
|
selection |
Type: dict
Required: No
Description: Optional filter to restrict the output to specific brain networks.
Properties:
roi
Type: list[str]
Description: Filter output by Schaefer 2018 (7-network) atlas labels.
Valid values: "Vis", "SomMot", "DorsAttn", "SalVentAttn", "Limbic", "Cont", "Default"
Example: ["Vis"]
voxel_index
Type: numpy.ndarray
Description: Binary mask vector indicating which voxels to include (for this model, the 1000 Schaefer parcels).
Must have exactly the same length as the number of available voxels (1000).
Each position set to 1 indicates that voxel should be included.
Example: [0, 0, '...', 1, 1, 0]
|
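The `voxel_index` vector described above could be built from a list of parcel positions, as in the following sketch. The helper `make_voxel_index` is hypothetical (not part of BERG), and the 0-based positions are an assumption about how parcels are ordered.

```python
import numpy as np

NUM_PARCELS = 1000  # number of available voxels stated above


def make_voxel_index(parcel_ids, num_parcels=NUM_PARCELS):
    """Return a binary vector with 1 at each requested parcel position."""
    parcel_ids = np.asarray(parcel_ids, dtype=int)
    if parcel_ids.size and (parcel_ids.min() < 0 or parcel_ids.max() >= num_parcels):
        raise ValueError("parcel ids must lie in [0, num_parcels)")
    voxel_index = np.zeros(num_parcels, dtype=int)
    voxel_index[parcel_ids] = 1
    return voxel_index


# Illustrative positions only; choose the parcels relevant to your analysis.
selection = {"voxel_index": make_voxel_index([3, 4, 998])}
print(len(selection["voxel_index"]), int(selection["voxel_index"].sum()))  # 1000 3
```

The resulting dictionary has the shape expected by the `selection` argument; how `roi` and `voxel_index` interact when both are given is not specified here.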
Parameters used in encode
This function generates in silico neural responses using the encoding model previously loaded.
model |
Type: BaseModelInterface
Required: Yes
Description: An instantiated and loaded encoding model.
|
stimulus |
Type: list[str]
Required: Yes
Description: A list of strings where each string corresponds to the text spoken during a
single fMRI Repetition Time (TR).
Example:
["Hello, are you", "awake? Yes,"]
|
low_mem_use |
Type: bool
Required: No
Description: If True, sequentially loads/unloads the Feature Extractor and the Encoding Model
to minimize VRAM usage, at the cost of slower execution.
Example: True
|
return_metadata |
Type: bool
Required: No
Description: Whether to return the encoding model’s metadata together with the in silico neural responses.
Example: True
|
show_progress |
Type: bool
Required: No
Description: Whether to show a progress bar during encoding (for large batches).
Example: True
|
Parameters used in get_model_metadata
This function loads the encoding model’s metadata without having to load the model itself.
model_id |
Type: str
Required: Yes
Description: Unique identifier of the model to load.
Valid Values: fmri-cneuromod_algo2025-text2fmri
Example: “fmri-cneuromod_algo2025-text2fmri”
|
subject |
Type: int
Required: Yes
Description: The ID of the subject to generate predictions for.
Valid Values: 1, 2, 3, 5
Example: 1
|
Model-specific utility methods
get_model_variants()
Retrieve available pretrained variants for this model without instantiating it. This is called from the main BERG class.
model_id |
Type: str
Required: Yes
Description: Unique identifier of the model to load.
|
variants = berg.get_model_variants("fmri-cneuromod_algo2025-text2fmri")
generate_glass_brain_animation()
Generates and saves an animated glass brain GIF from the predicted responses. Called directly on the loaded model instance.
responses |
Type: numpy.ndarray
Required: Yes
Description: Model predictions generated by the encode() function.
|
out_path |
Type: str
Required: No
Default: brain_activation.gif
Description: Where to save the generated GIF.
|
model.generate_glass_brain_animation(responses, out_path="activation.gif")
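The Example Usage section describes the output of encode() as a torch.Tensor, while this method lists numpy.ndarray as the type of `responses`. Whether the method converts internally is not stated, so a defensive conversion such as the sketch below may help. The helper `to_numpy` is hypothetical (not part of BERG) and relies only on duck typing, so it does not require torch to be installed.

```python
import numpy as np


def to_numpy(responses):
    """Convert array-like predictions to a numpy.ndarray.

    Objects exposing detach() (as torch tensors do) are detached and
    moved to CPU memory first; anything else goes through np.asarray.
    """
    if hasattr(responses, "detach"):
        responses = responses.detach().cpu().numpy()
    return np.asarray(responses)


# Demo with a plain nested list standing in for model predictions
demo = to_numpy([[0.1, 0.2], [0.3, 0.4]])
print(type(demo).__name__, demo.shape)  # ndarray (2, 2)
```

The converted array could then be passed as the `responses` argument.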
Performance
Performance metrics are available in the Hugging Face collection ShreyDixit/text2fmri.
Example Usage
```python
from berg import BERG

# Initialize BERG
berg = BERG(berg_dir="path/to/brain-encoding-response-generator")

# Discover all model variants
variants = berg.get_model_variants("fmri-cneuromod_algo2025-text2fmri")

# Load the model
model = berg.get_encoding_model(
    "fmri-cneuromod_algo2025-text2fmri",
    subject=1,
    model_variant="ShreyDixit/Text2fMRI-Qwen-2.5-0.5B",
    selection={
        "roi": ["Vis"],
        "voxel_index": [0, 0, '...', 1, 1, 0]
    }
)

# Prepare the stimulus (text/sentences)
stimulus = ["Hello, are you", "awake? Yes,"]

# Generate the in silico neural responses using the encoding model previously loaded
responses = berg.encode(
    model,
    stimulus,
    low_mem_use=True
)

# The in silico fMRI responses will be a torch.Tensor of shape:
# ['num_timepoints', 'num_rois']
# where:
# - num_timepoints: Number of TRs (timepoints) in the input stimulus.
# - num_rois: Number of Regions of Interest (1000 parcels from Schaefer 2018 atlas).

# Generate in silico neural responses with metadata
responses, metadata = berg.encode(
    model,
    stimulus,
    return_metadata=True
)

# Load the encoding model's metadata without having to load the model itself
metadata = berg.get_model_metadata(
    "fmri-cneuromod_algo2025-text2fmri",
    subject=1
)

# Generate a GIF from the responses
gif_path = model.generate_glass_brain_animation(
    responses=responses,
    out_path="brain_activation.gif"
)
```
References
Course Materials: Dixit, S. (2026). Text2fMRI: Brain Encoding Models using LLMs (Course Materials) (v0.1.2). Zenodo. https://doi.org/10.5281/zenodo.18369862
Hugging Face Collection: https://huggingface.co/ShreyDixit/Text2fMRI-Qwen-2.5-0.5B
Algonauts 2025 challenge dataset: https://github.com/courtois-neuromod/algonauts_2025.competitors