Running pcarpet

Here we will demonstrate how to run the pcarpet pipeline on some example preprocessed fMRI data.

Inputs and initialization

pcarpet needs 3 paths to be specified:

  1. fMRI file: a 4d nifti file containing the preprocessed fMRI data

  2. Mask file: a 3d nifti file, containing a binary mask that defines a single region-of-interest (e.g. cortex)

  3. Output folder: for storing the outputs generated by pcarpet

The fMRI and Mask files need to be in the same space, with identical x-y-z dimensions. For example, they can both be in the native anatomical space of the subject, or in a template space to which the data have been resampled.

[1]:
import os
import pcarpet
[2]:
# Folder containing example data
example_folder = '/home/niko/MRI/pcarpet_example/macaque'

# 1. Path to preprocessed fMRI file
func = os.path.join(example_folder, 'func_preproc.nii.gz')
# 2. Path to a cortical mask
cortex_mask = os.path.join(example_folder, 'cortex_mask.nii.gz')
# 3. Path to a folder for storing the outputs
output_folder = os.path.join(example_folder, 'outputs')
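
Before initializing pcarpet, we can verify that the fMRI and Mask files indeed share the same x-y-z dimensions. This optional sanity check is not part of pcarpet; it is a minimal sketch using the nibabel package (which pcarpet itself relies on for reading the data):

import nibabel as nib

# The spatial (first three) dimensions of the 4d fMRI file must match the 3d mask
func_dims = nib.load(func).shape[:3]
mask_dims = nib.load(cortex_mask).shape
assert func_dims == mask_dims, 'fMRI and Mask dimensions differ!'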

To initialize pcarpet, we first create a Dataset object, using the three paths from above:

[3]:
MyData = pcarpet.Dataset(func, cortex_mask, output_folder)

Initialized Dataset object:
        fMRI file: /home/niko/MRI/pcarpet_example/macaque/func_preproc.nii.gz
        Mask file: /home/niko/MRI/pcarpet_example/macaque/cortex_mask.nii.gz
        Output directory: /home/niko/MRI/pcarpet_example/macaque/outputs

Once the Dataset object is created, we have two ways of running the pcarpet pipeline on it:

  1. Running the entire pipeline at once

  2. Running the pipeline step-by-step

Below, we will first demonstrate the convenient first option. We will then go through the step-by-step option, which may be useful for debugging and for understanding the various inputs and outputs of the pipeline.

Option 1: Running the entire pcarpet pipeline at once

We can directly call the run_pcarpet method of the Dataset object, which will execute the pipeline with default options.

[4]:
MyData.run_pcarpet()
Reading data...
        fMRI data read: dimensions (80, 33, 80, 600)
        Mask read: dimensions (80, 33, 80)
fMRI data reshaped to voxels x time (211200, 600).
21589 voxels retained after masking.
Carpet matrix created with shape (21589, 600).
Carpet normalized to zero-mean unit-variance.
Carpet reordered.
PCA fit to carpet and results saved.
First 5 PCs correlated with carpet.
Out of these, 3 sign-flipped.
First 5 PCs correlated with fMRI data.
TR of 2.000 seconds read from fMRI header
Visual report generated and saved as fPCs_carpet_corr_report.

We can see what the default options were, since they are stored in a dictionary called used_options.

[5]:
ops = MyData.used_options
for key, val in ops.items():
    print(f'{key}: {val}')
tSNR_thresh: 15.0
reorder_carpet: True
save_carpet: False
save_pca_scores: False
ncomp: 5
flip_sign: True
TR: auto

Any of the above default options can be overridden by explicitly passing them as arguments to the run_pcarpet call. For example:

MyData.run_pcarpet(tSNR_thresh=10, reorder_carpet=False)

The meaning of each of these options will become clear in the step-by-step guide that follows.

Option 2: Running pcarpet step-by-step

Step 1: Importing the data

The first function, import_data, uses the nibabel package to import the fMRI data and the Mask into Python as numpy arrays, storing them as object attributes data and mask, respectively. It also stores other attributes, such as the x-y-z-t dimensions of the fMRI data, the nifti header, and the affine matrix.

[6]:
MyData.import_data()
Reading data...
        fMRI data read: dimensions (80, 33, 80, 600)
        Mask read: dimensions (80, 33, 80)
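
After import, the arrays are directly available for inspection via the data and mask attributes described above:

print(MyData.data.shape)   # (80, 33, 80, 600)
print(MyData.mask.shape)   # (80, 33, 80)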

Step 2: Getting the carpet

The second function, get_carpet, generates a ‘carpet’ from the fMRI data and the mask. A carpet is a 2d matrix, shaped voxels x time, which contains the normalized (z-score) BOLD-fMRI signal from within the mask.

The fMRI data are first reshaped to 2d (voxels x time). Voxels outside the mask, as well as voxels below a specified temporal signal-to-noise ratio threshold tSNR_thresh (default = 15), are discarded. The retained voxel time-series are then normalized through z-scoring (subtract the mean and divide by the standard deviation along the time dimension).

By default, the rows of the carpet matrix (voxels) are ordered according to their (decreasing) correlation with the global (mean across voxels) signal. The reordering helps to highlight widespread signal fluctuations. It can be turned off by setting the reorder_carpet argument to False.

The carpet matrix is stored as a carpet attribute of the Dataset object. Optionally, it can be written to the output folder as a carpet.npy file (can be large), by setting the save_carpet argument to True.
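
To make these operations concrete, here is a minimal numpy sketch of the tSNR-thresholding, z-scoring, and reordering logic described above, using placeholder data (this illustrates the idea and is not pcarpet’s actual implementation):

import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 600)) + 100          # placeholder voxels x time data
tsnr = X.mean(axis=1) / X.std(axis=1)               # temporal SNR per voxel
X = X[tsnr >= 15.0]                                 # discard low-tSNR voxels
Z = (X - X.mean(axis=1, keepdims=True)) / X.std(axis=1, keepdims=True)
gs = Z.mean(axis=0)                                 # global (mean across voxels) signal
r = Z @ ((gs - gs.mean()) / gs.std()) / Z.shape[1]  # correlation with global signal
carpet = Z[np.argsort(-r)]                          # reorder rows by decreasing r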

[7]:
MyData.get_carpet(tSNR_thresh=15.0,
                  reorder_carpet=True, save_carpet=True)
fMRI data reshaped to voxels x time (211200, 600).
21589 voxels retained after masking.
Carpet matrix created with shape (21589, 600).
Carpet normalized to zero-mean unit-variance.
Carpet reordered.
Carpet saved as 'carpet.npy'.

Step 3: Fit PCA to carpet

The third function, fit_pca2carpet, fits PCA to the carpet matrix and saves the principal components (PCs), the explained variance ratios, and optionally the PCA scores (the PCA-transformed carpet).

The PCs are stored as a pca_comps attribute of the Dataset object and are also written to the output folder as a PCs.npy file.

The explained variance ratios per PC are stored as an expl_var attribute of the Dataset object and are also written to the output folder as a PCA_expl_var.npy file.

If the save_pca_scores option is set to True, the PCA-transformed data will be written to the output folder as a PCA_scores.npy file (will be large).
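
For reference, the PCA step corresponds to the following scikit-learn computation (a sketch assuming a standard PCA; pcarpet’s internal implementation may differ):

from sklearn.decomposition import PCA

pca = PCA()
scores = pca.fit_transform(MyData.carpet)    # PCA scores (voxels x PCs)
PCs = pca.components_                        # each row is a PC time-series
expl_var = pca.explained_variance_ratio_     # one ratio per PC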

[8]:
MyData.fit_pca2carpet(save_pca_scores=True)
PCA fit to carpet and results saved.

Step 4: Correlate PCs with carpet

The fourth function, correlate_with_carpet, correlates the first ncomp (defaults to 5) principal components (first PCs = fPCs) with all carpet voxel time-series. The fPCs are written to the output folder as fPCs.csv. The correlation matrix is written to the output folder as fPCs_carpet_corr.npy.

If the flip_sign option is True (enabled by default), an fPC (and its correlation values) will be sign-flipped when the median of its original correlation with carpet voxels is negative. This ensures that the sign of the fPC matches the sign of the BOLD signal for most voxels. The sign-flipped fPCs are only used for downstream analysis and visualization. If any flips occur, new versions of the two above output files are written: fPCs_flipped.csv and fPCs_carpet_corr_flipped.npy.

A report table of the above analysis is written to the output folder as fPCs_carpet_corr_report.csv. For each fPC, the table reports the explained variance ratio, the original median value of the correlation across carpet voxels, and whether sign-flipping was done for downstream analysis.
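
The sign-flip rule itself amounts to a few lines of numpy (an illustrative sketch with placeholder variables, not pcarpet’s code):

import numpy as np

rng = np.random.default_rng(0)
fpc = rng.standard_normal(600)         # one fPC (placeholder time-series)
r = rng.uniform(-1, 1, size=21589)     # its correlations with the carpet voxels
if np.median(r) < 0:                   # flip when the median correlation is negative
    fpc, r = -fpc, -r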

[9]:
MyData.correlate_with_carpet(ncomp=5, flip_sign=True)
First 5 PCs correlated with carpet.
Out of these, 3 sign-flipped.

Step 5: Correlate PCs with fMRI

The fifth function, correlate_with_fmri, correlates the retained (and possibly sign-flipped) first ncomp PCs (fPCs) with the original 4d fMRI dataset. The resulting Pearson’s correlation maps are written to the output folder as a 4d nifti file (3d space + ncomp) named fPCs_fMRI_corr.nii.gz. This file contains ncomp 3d correlation maps, which represent the spatial distribution of each fPC across the brain.
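
Conceptually, each map holds the voxel-wise Pearson correlation between one fPC and the 4d data, as in the following numpy sketch (placeholder data and a placeholder fPC; pcarpet’s internals may differ):

import numpy as np

rng = np.random.default_rng(0)
data = rng.standard_normal((10, 10, 10, 600))    # small placeholder 4d volume
pc = rng.standard_normal(600)                    # placeholder fPC time-series
flat = data.reshape(-1, data.shape[-1])          # voxels x time
flat_z = (flat - flat.mean(1, keepdims=True)) / flat.std(1, keepdims=True)
pc_z = (pc - pc.mean()) / pc.std()
rmap = (flat_z @ pc_z) / pc_z.size               # Pearson r per voxel
rmap = rmap.reshape(data.shape[:3])              # back to a 3d map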

[10]:
MyData.correlate_with_fmri()
First 5 PCs correlated with fMRI data.

Step 6: Visualize results

The sixth and final function, plot_report, generates a visual report of the results, including the carpet plot, the first ncomp PCs (fPCs), their correlation with the carpet, and their explained variance ratios. The plot image, named fPCs_carpet_corr_report, is written to the output folder in both ‘.png’ (raster) and ‘.svg’ (vector) formats.

To correctly display the time-scale, the function attempts to read the repetition time TR from the nifti header (TR=’auto’). This can be overridden by explicitly passing a TR value (in seconds) as an argument.
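
For example:

MyData.plot_report(TR=2.0)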

[11]:
MyData.plot_report(TR='auto')
TR of 2.000 seconds read from fMRI header
Visual report generated and saved as fPCs_carpet_corr_report.

Outputs

Here we will list the outputs of the pcarpet pipeline that are written to the pre-specified output_folder.

Default outputs

The following outputs are always generated:

  1. PCs.npy: 2d numpy array containing all Principal Components (PCs)

  2. PCA_expl_var.npy: 1d numpy array with explained variance ratios per PC

  3. fPCs.csv: comma-separated table containing the first ncomp PCs (fPCs) as columns

  4. fPCs_carpet_corr.npy: 2d numpy array containing the correlation (r) values between fPCs and carpet voxel time-series

  5. fPCs_carpet_corr_report.csv: comma-separated table containing the following columns:

  • PC: PC1, PC2, …

  • expl_var: explained variance ratio

  • carpet_r_median: median r value across carpet voxels

  • sign_flipped: boolean, True if flip_sign = True and carpet_r_median < 0

  6. fPCs_carpet_corr_report.png: visual report in raster format

  7. fPCs_carpet_corr_report.svg: visual report in vector format

  8. fPCs_fMRI_corr.nii.gz: 4d nifti file with the correlation maps between each (potentially sign-flipped) fPC and the entire fMRI scan. The file contains ncomp 3d maps.

Conditional outputs

If flip_sign = True and at least one sign flip occurs, new versions of outputs 3 and 4 are generated:

  1. fPCs_flipped.csv

  2. fPCs_carpet_corr_flipped.npy

Optional outputs

The following outputs are generated only when explicitly specified. Their file size can be large, depending on the fMRI and mask file sizes.

  1. carpet.npy (saved if save_carpet = True): 2d numpy array containing the carpet matrix

  2. PCA_scores.npy (saved if save_pca_scores = True): 2d numpy array containing PCA scores, i.e. the PCA-transformed carpet data

Importing outputs into Python

Importing .npy files with numpy

[12]:
# importing .npy files with numpy
import numpy as np

carpet = np.load(os.path.join(output_folder, 'carpet.npy'))
print(f'Carpet, shape = {carpet.shape}')

PCs = np.load(os.path.join(output_folder, 'PCs.npy'))
print(f'PCs, shape = {PCs.shape}')

PCA_scores = np.load(os.path.join(output_folder, 'PCA_scores.npy'))
print(f'PCA scores, shape = {PCA_scores.shape}')

expl_var = np.load(os.path.join(output_folder, 'PCA_expl_var.npy'))
print(f'Explained variance ratio, shape = {expl_var.shape}')

fPCs_carpet_corr = np.load(os.path.join(output_folder, 'fPCs_carpet_corr.npy'))
print(f'fPCs_carpet_corr, shape = {fPCs_carpet_corr.shape}')
Carpet, shape = (21589, 600)
PCs, shape = (600, 600)
PCA scores, shape = (21589, 600)
Explained variance ratio, shape = (600,)
fPCs_carpet_corr, shape = (21589, 5)

Importing .csv files with pandas

[13]:
import pandas as pd

fPCs = pd.read_csv(os.path.join(output_folder, 'fPCs.csv'))
fPCs.head()
[13]:
PC1 PC2 PC3 PC4 PC5
0 -0.068611 0.020185 -0.062595 -0.091971 0.049780
1 -0.071512 0.005875 -0.038379 -0.091214 0.092050
2 -0.055034 0.019877 0.017327 -0.053867 0.101125
3 -0.026231 0.039495 0.061318 -0.019174 0.076513
4 -0.000228 0.039751 0.059199 -0.009998 0.040623
[14]:
report = pd.read_csv(os.path.join(output_folder, 'fPCs_carpet_corr_report.csv'))
report.head()
[14]:
PC expl_var carpet_r_median sign_flipped
0 PC1 0.108726 -0.392726 True
1 PC2 0.029700 -0.005034 True
2 PC3 0.026533 -0.030266 True
3 PC4 0.017705 0.015283 False
4 PC5 0.014326 0.018219 False
