Running pcarpet
Here we will demonstrate how to run the pcarpet pipeline on some example preprocessed fMRI data.
Inputs and initialization
pcarpet
needs 3 paths to be specified:
fMRI file: a 4d nifti file containing the preprocessed fMRI data
Mask file: a 3d nifti file containing a binary mask that defines a single region-of-interest (e.g. cortex)
Output folder: for storing the outputs generated by pcarpet
The fMRI and Mask files need to be in the same space, with identical x-y-z dimensions. For example, they can both be in the native anatomical space of the subject, or in a template space to which the data have been resampled.
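A quick way to catch mismatched inputs early is to compare the spatial grids before creating the Dataset. The helper below is hypothetical (not part of pcarpet) and only checks the x-y-z dimensions:

```python
import numpy as np

def check_inputs(fmri_shape, mask_shape):
    """Verify that the mask matches the fMRI volume's spatial (x, y, z) grid.

    Hypothetical helper: fmri_shape is a 4-tuple (x, y, z, t),
    mask_shape is a 3-tuple (x, y, z).
    """
    if len(fmri_shape) != 4 or len(mask_shape) != 3:
        raise ValueError('fMRI data must be 4d and the mask 3d')
    if tuple(fmri_shape[:3]) != tuple(mask_shape):
        raise ValueError(f'spatial grids differ: {fmri_shape[:3]} vs {mask_shape}')
    return True

# Shapes matching the example data used in this tutorial
check_inputs((80, 33, 80, 600), (80, 33, 80))
```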
[1]:
import os
import pcarpet
[2]:
# Folder containing example data
example_folder = '/home/niko/MRI/pcarpet_example/macaque'
# 1. Path to preprocessed fMRI file
func = os.path.join(example_folder, 'func_preproc.nii.gz')
# 2. Path to a cortical mask
cortex_mask = os.path.join(example_folder, 'cortex_mask.nii.gz')
# 3. Path to a folder for storing the outputs
output_folder = os.path.join(example_folder, 'outputs')
To initialize pcarpet, we first create a Dataset
object, using the three paths from above.
[3]:
MyData = pcarpet.Dataset(func, cortex_mask, output_folder)
Initialized Dataset object:
fMRI file: /home/niko/MRI/pcarpet_example/macaque/func_preproc.nii.gz
Mask file: /home/niko/MRI/pcarpet_example/macaque/cortex_mask.nii.gz
Output directory: /home/niko/MRI/pcarpet_example/macaque/outputs
Once the Dataset
object is created, we have two ways of running the pcarpet
pipeline on it:
Running the entire pipeline at once
Running the pipeline step-by-step
Below, we will first demonstrate the convenient first option. We will then go through the step-by-step option, which may be useful for debugging and for understanding the various inputs and outputs of the pipeline.
Option 1: Running the entire pcarpet pipeline at once
We can directly call the run_pcarpet
method of the Dataset
object, which will execute the pipeline with default options.
[4]:
MyData.run_pcarpet()
Reading data...
fMRI data read: dimensions (80, 33, 80, 600)
Mask read: dimensions (80, 33, 80)
fMRI data reshaped to voxels x time (211200, 600).
21589 voxels retained after masking.
Carpet matrix created with shape (21589, 600).
Carpet normalized to zero-mean unit-variance.
Carpet reordered.
PCA fit to carpet and results saved.
First 5 PCs correlated with carpet.
Out of these, 3 sign-flipped.
First 5 PCs correlated with fMRI data.
TR of 2.000 seconds read from fMRI header
Visual report generated and saved as fPCs_carpet_corr_report.
We can see what the default options were, since they are stored in a dictionary called used_options
.
[5]:
ops = MyData.used_options
for key, val in ops.items():
    print(f'{key}: {val}')
tSNR_thresh: 15.0
reorder_carpet: True
save_carpet: False
save_pca_scores: False
ncomp: 5
flip_sign: True
TR: auto
Any of the above default options can be overridden by explicitly passing them as arguments to the run_pcarpet
call. For example:
MyData.run_pcarpet(tSNR_thresh=10, reorder_carpet=False)
The meaning of each of these options will become clear in the step-by-step guide that follows.
Option 2: Running pcarpet step-by-step
Step 1: Importing the data
The first function, import_data
, uses the nibabel
package to import the fMRI data and the Mask into python as numpy arrays, and stores them as object attributes data
and mask
, respectively. It also stores other attributes, like the x-y-z-t dimensions of the fMRI data, the nifti header and the affine matrix.
[6]:
MyData.import_data()
Reading data...
fMRI data read: dimensions (80, 33, 80, 600)
Mask read: dimensions (80, 33, 80)
Step 2: Getting the carpet
The second function, get_carpet
, generates a ‘carpet’ from the fMRI data and the mask. A carpet is a 2d matrix, shaped voxels x time, which contains the normalized (z-score) BOLD-fMRI signal from within the mask.
The fMRI data is first reshaped to 2d (voxels x time). Voxels outside the mask, as well as voxels below a specified temporal signal-to-noise ratio threshold tSNR_thresh
(default = 15), are discarded. The retained voxel time-series are normalized through z-scoring (subtract the mean and divide by the standard deviation along the time dimension).
By default, the rows of the carpet matrix (voxels) are ordered according to their (decreasing) correlation with the global (mean across voxels) signal. The reordering helps to highlight widespread signal fluctuations. It can be turned off by setting the reorder_carpet
argument to False
.
The carpet matrix is stored as a carpet
attribute of the Dataset
object. Optionally, it can be written to the output folder as a carpet.npy
file (can be large), by setting the save_carpet
argument to True
.
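For intuition, the carpet construction can be sketched with plain numpy on synthetic data. This is a simplified illustration of the steps described above (tSNR masking, z-scoring, reordering by correlation with the global signal), not pcarpet's actual implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
data2d = rng.normal(loc=100.0, scale=5.0, size=(50, 120))  # toy data, voxels x time
in_mask = np.ones(50, dtype=bool)  # pretend every voxel lies inside the ROI

# tSNR = temporal mean / temporal std; discard low-tSNR voxels (threshold 15)
tsnr = data2d.mean(axis=1) / data2d.std(axis=1)
carpet = data2d[in_mask & (tsnr >= 15.0)]

# z-score each voxel time-series (zero mean, unit variance)
carpet = (carpet - carpet.mean(axis=1, keepdims=True)) / carpet.std(axis=1, keepdims=True)

# reorder rows by decreasing Pearson correlation with the global (mean) signal
gs = carpet.mean(axis=0)
r = np.array([np.corrcoef(v, gs)[0, 1] for v in carpet])
carpet = carpet[np.argsort(-r)]
```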
[7]:
MyData.get_carpet(tSNR_thresh=15.0,
                  reorder_carpet=True, save_carpet=True)
fMRI data reshaped to voxels x time (211200, 600).
21589 voxels retained after masking.
Carpet matrix created with shape (21589, 600).
Carpet normalized to zero-mean unit-variance.
Carpet reordered.
Carpet saved as 'carpet.npy'.
Step 3: Fit PCA to carpet
The third function, fit_pca2carpet
, fits PCA to the carpet
matrix and saves the principal components (PCs), the explained variance ratios, and optionally the PCA scores (the PCA-transformed carpet).
The PCs are stored as a pca_comps
attribute of the Dataset
object and are also written to the output folder as a PCs.npy
file.
The explained variance ratios per PC are stored as an expl_var
attribute of the Dataset
object and are also written to the output folder as a PCA_expl_var.npy
file.
If the save_pca_scores
option is set to True
, the PCA-transformed data will be written to the output folder as a PCA_scores.npy
file (will be large).
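For intuition, the PCA step can be reproduced with a plain numpy SVD on a toy carpet. This is a sketch of the decomposition, not pcarpet's internals; note how the shapes mirror the outputs above (PCs are time x components, scores are voxels x components):

```python
import numpy as np

rng = np.random.default_rng(1)
carpet = rng.standard_normal((100, 60))  # toy z-scored carpet, voxels x time

# PCA via SVD of the mean-centered matrix (voxels are samples, time-points are features)
X = carpet - carpet.mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
pcs = Vt.T                      # time x components: each column is one PC time-series
scores = U * S                  # voxels x components: the PCA-transformed carpet
expl_var = S**2 / np.sum(S**2)  # explained variance ratio per PC
```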
[8]:
MyData.fit_pca2carpet(save_pca_scores=True)
PCA fit to carpet and results saved.
Step 4: Correlate PCs with carpet
The fourth function, correlate_with_carpet
, correlates the first ncomp
(defaults to 5) principal components (first PCs = fPCs) with all carpet voxel time-series. The fPCs are written to the output folder as fPCs.csv
. The correlation matrix is written to the output folder as fPCs_carpet_corr.npy
.
If the flip_sign
option is True
(enabled by default), an fPC (and its correlation values) will be sign-flipped when the median of its original correlation with carpet voxels is negative. This forces the sign of the fPC to match the sign of the BOLD signal for most voxels. The sign-flipped fPCs are only used for downstream analysis and visualization. If any flips occur, new versions of the two above output files are written: fPCs_flipped.csv
and
fPCs_carpet_corr_flipped.npy
.
A report table of the above analysis is written to the output folder as fPCs_carpet_corr_report.csv
. For each fPC, the table reports the explained variance ratio, the original median value of the correlation across carpet voxels, and whether sign-flipping was done for downstream analysis.
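The sign-flipping rule can be sketched as follows. maybe_flip is a hypothetical stand-alone helper, not a pcarpet function; it mirrors the flip_sign behaviour described above:

```python
import numpy as np

def maybe_flip(fpc, carpet):
    """Sign-flip an fPC if the median of its Pearson correlations with the
    carpet voxel time-series is negative (simplified sketch of flip_sign)."""
    r = np.array([np.corrcoef(voxel, fpc)[0, 1] for voxel in carpet])
    flipped = bool(np.median(r) < 0)
    return (-fpc if flipped else fpc), flipped

rng = np.random.default_rng(2)
fpc = rng.standard_normal(50)
# toy carpet whose voxels are anti-correlated with the fPC -> flip expected
carpet = -fpc + 0.1 * rng.standard_normal((30, 50))
fpc_out, was_flipped = maybe_flip(fpc, carpet)
```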
[9]:
MyData.correlate_with_carpet(ncomp=5, flip_sign=True)
First 5 PCs correlated with carpet.
Out of these, 3 sign-flipped.
Step 5: Correlate PCs with fMRI
The fifth function, correlate_with_fmri
, correlates the retained (and possibly sign-flipped) first ncomp
PCs (fPCs) with the original 4d fMRI dataset. The resulting Pearson’s correlation maps are written to the output folder as a 4d NIFTI file (3d space + ncomp
) named fPCs_fMRI_corr.nii.gz
. This file contains ncomp
3d correlation maps, which represent the spatial distribution of each fPC across the brain.
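Once voxel time-series and fPCs are z-scored, all the voxel-wise Pearson correlations can be obtained with a single matrix product and reshaped back into 3d maps. The sketch below uses toy data and is not pcarpet code:

```python
import numpy as np

rng = np.random.default_rng(3)
fmri = rng.standard_normal((4, 3, 2, 100))  # toy 4d data, shape (x, y, z, t)
fpcs = rng.standard_normal((100, 5))        # time x ncomp matrix of fPCs

x, y, z, t = fmri.shape
vox = fmri.reshape(-1, t)  # voxels x time

# z-score both sides; one matrix product then yields all Pearson r values
vz = (vox - vox.mean(axis=1, keepdims=True)) / vox.std(axis=1, keepdims=True)
fz = (fpcs - fpcs.mean(axis=0)) / fpcs.std(axis=0)
corr = (vz @ fz) / t                        # voxels x ncomp

corr_maps = corr.reshape(x, y, z, -1)       # 4d: one 3d correlation map per fPC
```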
[10]:
MyData.correlate_with_fmri()
First 5 PCs correlated with fMRI data.
Step 6: Visualize results
The sixth and final function, plot_report
, generates a visual report of the results, including the carpet plot, the first ncomp
PCs (fPCs), their correlation with the carpet, and their explained variance ratios. The plot image, named fPCs_carpet_corr_report,
is written to the output folder in both ‘.png’ (raster) and ‘.svg’ (vector) formats.
To correctly display the time-scale, the function attempts to read the repetition time TR
from the nifti header (TR
=’auto’). This can be overridden by explicitly passing a TR
value (in seconds) as an argument.
[11]:
MyData.plot_report(TR='auto')
TR of 2.000 seconds read from fMRI header
Visual report generated and saved as fPCs_carpet_corr_report.
Outputs
Here we will list the outputs of the pcarpet
pipeline that are written to the pre-specified output_folder
.
Default outputs
The following outputs are always generated:
PCs.npy: 2d numpy array containing all Principal Components (PCs)
PCA_expl_var.npy: 1d numpy array with explained variance ratios per PC
fPCs.csv: comma-separated table containing the first ncomp PCs (fPCs) as columns
fPCs_carpet_corr.npy: 2d numpy array containing the correlation (r) values between fPCs and carpet voxel time-series
fPCs_carpet_corr_report.csv: comma-separated table containing the following columns:
PC: PC1, PC2, …
expl_var: explained variance ratio
carpet_r_median: median r value across carpet voxels
sign_flipped: boolean, True if flip_sign = True and if carpet_r_median < 0
fPCs_carpet_corr_report.png: visual report in raster format
fPCs_carpet_corr_report.svg: visual report in vector format
fPCs_fMRI_corr.nii.gz: 4d nifti file, correlation maps between each (potentially sign-flipped) fPC and the entire fMRI scan. The file contains ncomp 3d maps.
Conditional outputs
If flip_sign = True
and if at least one sign flip occurs, new versions of outputs 3 and 4 are generated:
fPCs_flipped.csv
fPCs_carpet_corr_flipped.npy
Optional outputs
The following outputs are generated only when explicitly specified. Their file size can be large, depending on the fMRI and mask file sizes.
carpet.npy (saved if save_carpet = True): 2d numpy array containing the carpet matrix
PCA_scores.npy (saved if save_pca_scores = True): 2d numpy array containing the PCA scores, i.e. the PCA-transformed carpet data
Importing outputs into python
Importing .npy files with numpy
[12]:
# importing .npy files with numpy
import numpy as np
carpet = np.load(os.path.join(output_folder, 'carpet.npy'))
print(f'Carpet, shape = {carpet.shape}')
PCs = np.load(os.path.join(output_folder, 'PCs.npy'))
print(f'PCs, shape = {PCs.shape}')
PCA_scores = np.load(os.path.join(output_folder, 'PCA_scores.npy'))
print(f'PCA scores, shape = {PCA_scores.shape}')
expl_var = np.load(os.path.join(output_folder, 'PCA_expl_var.npy'))
print(f'Explained variance ratio, shape = {expl_var.shape}')
fPCs_carpet_corr = np.load(os.path.join(output_folder, 'fPCs_carpet_corr.npy'))
print(f'fPCs_carpet_corr, shape = {fPCs_carpet_corr.shape}')
Carpet, shape = (21589, 600)
PCs, shape = (600, 600)
PCA scores, shape = (21589, 600)
Explained variance ratio, shape = (600,)
fPCs_carpet_corr, shape = (21589, 5)
Importing .csv files with pandas
[13]:
import pandas as pd
fPCs = pd.read_csv(os.path.join(output_folder, 'fPCs.csv'))
fPCs.head()
[13]:
PC1 | PC2 | PC3 | PC4 | PC5 | |
---|---|---|---|---|---|
0 | -0.068611 | 0.020185 | -0.062595 | -0.091971 | 0.049780 |
1 | -0.071512 | 0.005875 | -0.038379 | -0.091214 | 0.092050 |
2 | -0.055034 | 0.019877 | 0.017327 | -0.053867 | 0.101125 |
3 | -0.026231 | 0.039495 | 0.061318 | -0.019174 | 0.076513 |
4 | -0.000228 | 0.039751 | 0.059199 | -0.009998 | 0.040623 |
[14]:
report = pd.read_csv(os.path.join(output_folder, 'fPCs_carpet_corr_report.csv'))
report.head()
[14]:
PC | expl_var | carpet_r_median | sign_flipped | |
---|---|---|---|---|
0 | PC1 | 0.108726 | -0.392726 | True |
1 | PC2 | 0.029700 | -0.005034 | True |
2 | PC3 | 0.026533 | -0.030266 | True |
3 | PC4 | 0.017705 | 0.015283 | False |
4 | PC5 | 0.014326 | 0.018219 | False |