ecog-zada2025-gpt2_xl
Model Summary
| Modality | ECoG |
|---|---|
| Training Dataset | Zada et al. (2025) |
| Species | Human |
| Stimuli | Text (natural speech transcript) |
| Model Type | GPT2-XL |
| Creator | Zaid Zada |
Description
This encoding model consists of a linear mapping (through ridge regression) of GPT-2 XL (Radford et al., 2019) contextual word embeddings onto intracranial electrocorticographic (ECoG) high-gamma band activity during natural speech comprehension. Prior to mapping onto neural responses, word embeddings are extracted from layer 24 (1,600-dimensional) of GPT-2 XL, a 48-layer autoregressive language model with 1.5 billion parameters. When a word is split into multiple sub-word tokens by the GPT-2 tokenizer, the token embeddings are averaged to produce a single word-level embedding.
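The sub-word averaging step can be illustrated with a minimal NumPy sketch. It assumes the token-level hidden states have already been extracted, and that a `word_ids` sequence (a hypothetical name) maps each token to the index of the word it belongs to. It is not the project's extraction code.

```python
import numpy as np

def average_subword_embeddings(token_embeddings, word_ids):
    """Average token-level embeddings into one embedding per word.

    token_embeddings: (n_tokens, dim) array of hidden states.
    word_ids: length-n_tokens sequence with the word index of each token
              (assumed to cover every word at least once).
    """
    token_embeddings = np.asarray(token_embeddings, dtype=float)
    word_ids = np.asarray(word_ids)
    n_words = int(word_ids.max()) + 1
    sums = np.zeros((n_words, token_embeddings.shape[1]))
    np.add.at(sums, word_ids, token_embeddings)  # accumulate tokens per word
    counts = np.bincount(word_ids, minlength=n_words)[:, None]
    return sums / counts

# Toy example: 4 tokens forming 3 words; word 1 is split into two tokens.
tokens = np.array([[1.0, 1.0], [2.0, 0.0], [4.0, 2.0], [5.0, 5.0]])
word_ids = [0, 1, 1, 2]
word_embeddings = average_subword_embeddings(tokens, word_ids)
print(word_embeddings.shape)  # (3, 2)
```

In the real model each row would be a 1,600-dimensional layer-24 hidden state rather than a 2-dimensional toy vector.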
The encoding models were trained on the Podcast ECoG dataset (Zada et al., Scientific Data 2025), consisting of 9 human participants with a total of 1,330 intracranial electrodes listening to a 30-minute audio podcast (~5,100 words).
Neural data. The preprocessed high-gamma band power (70–150 Hz) was extracted following the dataset’s official pipeline. The continuous signal was downsampled to 32 Hz, then epoched from -2.0 s to +2.0 s relative to each word onset, producing 129 time lags per electrode. Each lag represents the temporal offset from word onset (e.g., lag +0.3 s captures neural activity 300 ms after the word began playing). No noise ceiling is available for this dataset because each word is heard exactly once, precluding estimation of trial-to-trial variability.
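The lag axis follows directly from the sampling rate and epoch window: 4 s at 32 Hz, with both endpoints included, gives 129 samples. The sketch below rebuilds that axis and looks up the lag closest to a requested time. The helper name `nearest_lag_index` is illustrative and not part of BERG.

```python
import numpy as np

sfreq = 32.0            # Hz, after downsampling
tmin, tmax = -2.0, 2.0  # epoch window in seconds

# Both endpoints are included, hence the +1.
n_lags = int(round((tmax - tmin) * sfreq)) + 1
times = tmin + np.arange(n_lags) / sfreq

def nearest_lag_index(t):
    """Index of the lag closest to time t (seconds relative to word onset)."""
    return int(np.argmin(np.abs(times - t)))

idx = nearest_lag_index(0.3)
print(n_lags, idx, times[idx])
```

Because 0.3 s does not fall exactly on the 32 Hz grid, the nearest available lag is used instead.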
Model training partition. All available word epochs (~5,100 per subject) were used for training. Independent encoding models were trained for each subject. Features and neural data were standardized prior to fitting. The regularization hyperparameter was selected via 5-fold inner cross-validation from 10 log-spaced alpha values between 10^1 and 10^10.
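As a rough illustration of this fitting procedure, the sketch below standardizes synthetic data, builds the alpha grid with `np.logspace(1, 10, 10)`, and picks alpha by 5-fold cross-validation using a closed-form ridge solution. The selection criterion (mean squared error), the contiguous fold layout, and all function names are assumptions made for the example. The actual training script may differ.

```python
import numpy as np

def zscore(a):
    """Standardize each column to zero mean and unit variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0)

def ridge_fit(X, Y, alpha):
    """Closed-form ridge weights: (X'X + alpha*I)^-1 X'Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def select_alpha(X, Y, alphas, n_folds=5):
    """Pick the alpha with the lowest mean held-out MSE (assumed criterion)."""
    all_idx = np.arange(len(X))
    folds = np.array_split(all_idx, n_folds)  # contiguous folds
    scores = []
    for alpha in alphas:
        mse = 0.0
        for test_idx in folds:
            train_idx = np.setdiff1d(all_idx, test_idx)
            W = ridge_fit(X[train_idx], Y[train_idx], alpha)
            mse += np.mean((Y[test_idx] - X[test_idx] @ W) ** 2)
        scores.append(mse / n_folds)
    return alphas[int(np.argmin(scores))]

# Synthetic stand-in for (word embeddings, neural responses).
rng = np.random.default_rng(0)
X = zscore(rng.standard_normal((200, 20)))
Y = zscore(X @ rng.standard_normal((20, 3)) + rng.standard_normal((200, 3)))

alphas = np.logspace(1, 10, 10)  # 10^1, 10^2, ..., 10^10
best_alpha = select_alpha(X, Y, alphas)
W = ridge_fit(X, Y, best_alpha)
print(best_alpha, W.shape)
```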
Model evaluation. Encoding accuracy was evaluated using 2-fold cross-validation matching the paper’s methodology: models were trained on each half of the podcast and tested on the other, and the two fold correlations (Pearson’s r per electrode × lag) were averaged. Encoding performance varies substantially across subjects due to differences in electrode placement. Subjects 01, 06, and 09 show the strongest encoding (best electrodes r ≈ 0.30–0.35), while subjects 05 and 07 show near-chance performance.
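The split-half evaluation can be sketched as follows, on synthetic data. Each column of `Y` stands for one electrode × lag target, the ridge penalty is fixed for brevity, and the function names are illustrative. This is a sketch of the scheme described above, not the evaluation code itself.

```python
import numpy as np

def columnwise_pearson(a, b):
    """Pearson's r between matching columns of a and b."""
    a = a - a.mean(axis=0)
    b = b - b.mean(axis=0)
    denom = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    return (a * b).sum(axis=0) / denom

def two_fold_accuracy(X, Y, alpha):
    """Train on one half, test on the other, then average r over both folds."""
    half = len(X) // 2
    first, second = np.arange(half), np.arange(half, len(X))
    eye = np.eye(X.shape[1])
    scores = []
    for train, test in [(first, second), (second, first)]:
        W = np.linalg.solve(X[train].T @ X[train] + alpha * eye,
                            X[train].T @ Y[train])
        scores.append(columnwise_pearson(X[test] @ W, Y[test]))
    return np.mean(scores, axis=0)

rng = np.random.default_rng(0)
X = rng.standard_normal((400, 10))
Y = X @ rng.standard_normal((10, 6)) + rng.standard_normal((400, 6))
r = two_fold_accuracy(X, Y, alpha=10.0)
print(r.shape)  # (6,)
```

Splitting at the midpoint keeps each half temporally contiguous, which matches training on one half of the podcast and testing on the other.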
Encoding accuracy. The temporal accuracy plots average across all electrodes, many of which lie outside the language network, resulting in modest peak correlations (~0.03–0.05). The original paper reports accuracy around 12% but used only a subset of electrodes, selected by statistical significance. To approximate the paper’s electrode selection, we additionally provide plots for the top 10% and top 30% of electrodes ranked by peak correlation. For electrode-level detail, refer to the spatial maps. Note that the released model is trained on all data, whereas the reported accuracy comes from 2-fold CV in which each fold is trained on only half the data, so the released model is expected to perform slightly better than the reported accuracy.
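A ranking of this kind can be reproduced from a `correlation_results` array of shape (n_electrodes, 129). The sketch below uses a synthetic array in place of real metadata, and the helper `top_fraction_mask` is a hypothetical name. It keeps the electrodes with the highest peak correlation across lags and averages their accuracy curves.

```python
import numpy as np

def top_fraction_mask(correlation_results, fraction):
    """Binary mask marking the top `fraction` of electrodes by peak r."""
    peak = correlation_results.max(axis=1)  # peak r across lags
    n_keep = max(1, int(round(fraction * len(peak))))
    keep = np.argsort(peak)[::-1][:n_keep]  # indices of the best electrodes
    mask = np.zeros(len(peak), dtype=int)
    mask[keep] = 1
    return mask

# Synthetic stand-in for metadata["encoding_model"]["correlation_results"].
rng = np.random.default_rng(0)
corr = rng.uniform(-0.05, 0.3, size=(50, 129))

mask = top_fraction_mask(corr, 0.10)
mean_curve = corr[mask.astype(bool)].mean(axis=0)  # accuracy per lag
print(mask.sum(), mean_curve.shape)  # 5 (129,)
```

The resulting mask has the same 0/1 layout as the `electrode_index` entry of the `selection` argument.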
Output. Each encoding model predicts time-resolved high-gamma responses for all electrodes (or user-specified subsets) across 129 time lags for each input word.
Metadata
ecog
subject_id : str - Subject identifier
n_electrodes : int - Number of electrodes (varies by subject)
n_lags : int - Number of time lags (129)
sfreq : float - Sampling frequency (32 Hz)
tmin : float - Epoch start (-2.0 s)
tmax : float - Epoch end (+2.0 s)
times : (129,) - Time points in seconds
ch_names : (n_electrodes,) - Electrode names
ch_coords : (n_electrodes, 3) - Electrode MNI coordinates
encoding_model
correlation_results : (n_electrodes, 129) - 2-fold CV encoding accuracy (Pearson’s r)
Input
| Type | list[str] |
|---|---|
| Description | The input should be a list of words (strings). Context is built from all preceding words in the list. |
| Constraints | |
| Example | ["The", "quick", "brown", "fox", "jumped"] |
Output
| Type | numpy.ndarray |
|---|---|
| Shape | (n_words, n_electrodes, n_lags) |
| Description | The output is a 3D array containing in silico high-gamma ECoG responses. |
| Dimensions | n_words: Number of words in the input. n_electrodes: Number of electrodes (subject-dependent, or filtered by selection). n_lags: Number of time lags relative to word onset. |
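To make the array layout concrete, the sketch below indexes a synthetic array with the same three axes. It assumes no lag selection was applied, so the last axis spans all 129 lags from -2.0 s to +2.0 s. The random array only stands in for real model output.

```python
import numpy as np

# Synthetic stand-in for model output: 5 words, 235 electrodes, 129 lags.
rng = np.random.default_rng(0)
responses = rng.standard_normal((5, 235, 129))

# Lag axis in seconds (assumes the full, unfiltered set of lags).
times = -2.0 + np.arange(129) / 32.0

# Average response per electrode and lag across the input words.
word_avg = responses.mean(axis=0)  # (n_electrodes, n_lags)

# Mean response in the 0 to 0.5 s window after word onset, per word and electrode.
window = (times >= 0.0) & (times <= 0.5)
windowed = responses[:, :, window].mean(axis=-1)  # (n_words, n_electrodes)
print(word_avg.shape, windowed.shape)
```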
Parameters
Parameters used in get_encoding_model
This function loads the encoding model.
model_id |
Type: str
Required: Yes
Description: Unique identifier of the model to load.
Valid Values: ecog-zada2025-gpt2_xl
Example: “ecog-zada2025-gpt2_xl”
|
subject |
Type: str
Required: Yes
Description: Subject ID from the Podcast ECoG dataset.
Valid Values: “01”, “02”, “03”, “04”, “05”, “06”, “07”, “08”, “09”
Example: “03”
|
selection |
Type: dict
Required: No
Description: Specifies which outputs to include in the model responses.
Can include specific electrodes and/or lags (timepoints). If not provided,
responses are generated for all electrodes and time lags.
Properties:
electrode_index
Type: numpy.ndarray
Description: Binary mask (vector of 0s and 1s) indicating which electrodes to include; positions set to 1 are kept.
Length must match the number of electrodes for the selected subject:
- Subject 01: 99 electrodes
- Subject 02: 90 electrodes
- Subject 03: 235 electrodes
- Subject 04: 143 electrodes
- Subject 05: 159 electrodes
- Subject 06: 166 electrodes
- Subject 07: 116 electrodes
- Subject 08: 72 electrodes
- Subject 09: 188 electrodes
Example: [0, 0, 1, 1, 0, ‘…’, 1]
lags
Type: numpy.ndarray
Description: Binary mask (vector of 0s and 1s) indicating which time lags (timepoints) to include.
Must have exactly 129 elements.
Each position set to 1 indicates that the corresponding time lag is included.
Example: [0, 0, ‘…’, 1, 1, 0]
|
device |
Type: str
Required: No
Description: Device to run the model on. ‘auto’ will use CUDA if available, otherwise CPU.
Valid Values: “cpu”, “cuda”, “auto”
Example: “auto”
|
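The two masks in `selection` can be built with NumPy. The sketch below uses subject 03 (235 electrodes) and keeps the lags between 0 s and +0.5 s. The chosen electrode positions and the time window are arbitrary illustrations, and the lag axis is assumed to run from -2.0 s to +2.0 s at 32 Hz as described in the metadata.

```python
import numpy as np

n_electrodes = 235  # subject 03, per the list above
n_lags = 129
sfreq, tmin = 32.0, -2.0

# Electrode mask: 1 keeps an electrode, 0 drops it.
electrode_index = np.zeros(n_electrodes, dtype=int)
electrode_index[[2, 3, 10]] = 1  # hypothetical electrode positions

# Lag mask: keep lags from word onset up to +0.5 s.
times = tmin + np.arange(n_lags) / sfreq
lags = ((times >= 0.0) & (times <= 0.5)).astype(int)

selection = {"electrode_index": electrode_index, "lags": lags}
print(electrode_index.sum(), lags.sum())  # 3 17
```

The resulting dictionary can then be passed as the `selection` argument of `get_encoding_model`.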
Parameters used in encode
This function generates in silico neural responses using the encoding model previously loaded.
model |
Type: BaseModelInterface
Required: Yes
Description: An instantiated and loaded encoding model.
|
stimulus |
Type: list[str]
Required: Yes
Description: A list of words (strings) to encode. Context is built from the full preceding word sequence.
Example: [“The”, “quick”, “brown”, “fox”, “jumped”]
|
return_metadata |
Type: bool
Required: No
Description: Whether to return the encoding model’s metadata together with the in silico neural responses.
Example: True
|
show_progress |
Type: bool
Required: No
Description: Whether to show a progress bar during encoding.
Example: True
|
Parameters used in get_model_metadata
This function loads the encoding model’s metadata without having to load the model itself.
model_id |
Type: str
Required: Yes
Description: Unique identifier of the model to load.
Valid Values: ecog-zada2025-gpt2_xl
Example: “ecog-zada2025-gpt2_xl”
|
subject |
Type: str
Required: Yes
Description: Subject ID from the Podcast ECoG dataset.
Valid Values: “01”, “02”, “03”, “04”, “05”, “06”, “07”, “08”, “09”
Example: “03”
|
Performance
Accuracy Plots (AWS directory):
brain-encoding-response-generator/encoding_models/modality-ecog/train_dataset-zada2025/model-gpt2_xl/encoding_models_accuracy
Example Usage
import numpy as np
from berg import BERG

# Initialize BERG
berg = BERG(berg_dir="path/to/brain-encoding-response-generator")

# Build binary masks for the outputs to keep
# (subject 03 has 235 electrodes; every subject has 129 lags)
electrode_index = np.zeros(235, dtype=int)
electrode_index[[2, 3]] = 1  # keep the third and fourth electrodes (illustrative choice)
lags = np.zeros(129, dtype=int)
lags[64:81] = 1  # keep lags from 0 s to +0.5 s (illustrative choice)

# Load the model
model = berg.get_encoding_model(
    "ecog-zada2025-gpt2_xl",
    subject="03",
    selection={
        "electrode_index": electrode_index,
        "lags": lags
    }
)

# Prepare the stimulus (a list of words)
stimulus = ["The", "quick", "brown", "fox", "jumped"]

# Generate the in silico neural responses using the encoding model previously loaded
responses = berg.encode(
    model,
    stimulus,
    show_progress=True
)
# The in silico ECoG responses will be a numpy.ndarray of shape:
# (n_words, n_electrodes, n_lags)
# where:
# - n_words: Number of words in the input.
# - n_electrodes: Number of electrodes (subject-dependent, or filtered by selection).
# - n_lags: Number of time lags relative to word onset.

# Generate in silico neural responses with metadata
responses, metadata = berg.encode(
    model,
    stimulus,
    return_metadata=True
)

# Load the encoding model's metadata without having to load the model itself
metadata = berg.get_model_metadata(
    "ecog-zada2025-gpt2_xl",
    subject="03"
)
References
Model building code: https://github.com/gifale95/BERG/tree/main/berg_creation_code/02_train_encoding_models/train_dataset-zada2025/train_encoding.py
Podcast ECoG Paper (Zada et al., 2025): https://doi.org/10.1038/s41597-025-05462-2
Podcast ECoG Data (Zada et al., 2025): https://openneuro.org/datasets/ds005574
GPT-2 (Radford et al., 2019): https://cdn.openai.com/better-language-models/language_models_are_unsupervised_multitask_learners.pdf