This notebook teaches how to use the BIDS Archive, BIDS Incremental, and BIDS Run classes.

# Pre-Run Setup


## Activating the RT-Cloud Conda Environment in Jupyter

After following the Anaconda (conda) setup instructions in the `README` file, activate the `rtcloud` kernel in this notebook. First, make sure the `rtcloud` conda environment was active in the terminal when you ran the `jupyter notebook` command to launch this notebook. Then go to Kernel -> Change Kernel and select the kernel with `rtcloud` in its name. If the kernel isn't in the list, execute the following command, restart the notebook server, and try again:
```
python -m ipykernel install --user --name=rtcloud
```
After changing kernels, the kernel name in the upper right-hand corner of the notebook should look similar to:
```
Python [conda env:rtcloud]
```
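
If the kernel name is ambiguous, one way to double-check from inside the notebook is to inspect the running interpreter. The following is a minimal sketch; `describeInterpreter` is a hypothetical helper, not part of RT-Cloud. With the expected kernel, the printed paths would normally contain `rtcloud`, although `CONDA_DEFAULT_ENV` may be unset depending on how the kernel was launched.

```python
import os
import sys


def describeInterpreter():
    """Return basic facts about the Python interpreter running this kernel."""
    return {
        # Path of the Python executable backing the current kernel
        'executable': sys.executable,
        # Root directory of the active environment
        'prefix': sys.prefix,
        # Set by `conda activate`; None if the kernel was launched another way
        'condaEnv': os.environ.get('CONDA_DEFAULT_ENV'),
    }


info = describeInterpreter()
for key, value in info.items():
    print(key + ':', value)
```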

In [None]:
# Initial Setup (run this before executing later code cells): Imports and Constants
""" Add rtCommon to the path """
import os
import sys

currPath = os.path.dirname(os.path.realpath(os.getcwd())) # docs
rootPath = os.path.dirname(currPath) # project root
sys.path.append(rootPath)

import io
import json
import pickle
import shutil
import subprocess
import tempfile

import pandas as pd

from rtCommon.bidsArchive import BidsArchive
from rtCommon.bidsCommon import getDicomMetadata, loadBidsEntities
from rtCommon.bidsIncremental import BidsIncremental
from rtCommon.bidsRun import BidsRun
from rtCommon.errors import MissingMetadataError, MetadataMismatchError
from rtCommon.imageHandling import convertDicomFileToNifti, readDicomFromFile, readNifti

TARGET_DIR = 'dataset'
TEMP_NIFTI_NAME = 'temp.nii'
DICOM_PATH = 'tests/test_input/001_000013_000005.dcm'

# Preliminaries

There are a few terms that are important to understand before working through this BIDS Archive and BIDS Incremental tutorial.


## Understanding BIDS Entities

BIDS Entities, referred to later as just 'entities', are used to create the file names of files in a BIDS Archive and describe the file they name. You may already be familiar with common ones, like `subject`, `task`, and `run`. An example of a file name containing BIDS Entities is `sub-01_task-languageproduction_run-01_bold.nii.gz`, which contains the `subject`, `task`, and `run` entities.

Most entities are used in a key-value form (`key-value`, e.g., `sub-01`) and have their name and value present wherever they are used. They have three main representations. 

1. **Entity**: One word, all lowercase. A summary of the entity. (e.g., 'ceagent', 'subject')
2. **Full Name**: Up to several words, fully describes the entity. (e.g., 'Contrast Enhancing Agent', 'Subject')
3. **File Name Key**: A few characters, used in file names. (e.g., 'ce', 'sub')

A few entities and their multiple representations are shown in the table below:

| Entity | Full Name | File Name Key |
| --- | --- | --- |
| ceagent | Contrast Enhancing Agent | ce |
| subject | Subject | sub |
| session | Session | ses |
| run | Run | run |

As of this writing, the other valid entities for BIDS and BIDS Derivatives are listed in the BIDS Standard and in JSON files in the PyBIDS GitHub repository (https://github.com/bids-standard/pybids/tree/master/bids/layout/config). An easy way to view all valid entities in RT-Cloud, which are loaded from PyBIDS, is to view the dictionary returned by the following:

In [None]:
entities = loadBidsEntities()
print("Entities:", entities.keys())

There are also a few entities with only one representation, which aren't used in a key-value form. Examples include `datatype` (e.g., 'func' or 'anat'), `extension` (e.g., '.nii', '.nii.gz', '.json'), and `suffix` (e.g., 'bold'). See the table below for how these can appear in file naming and archive organization.

Together, these entities provide a unique and consistent way to name files and organize the BIDS dataset.
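
To make the naming scheme concrete, here is a minimal sketch of splitting a file name into its parts. `parseBidsFileName` and the example file name are hypothetical and for illustration only; this is not how RT-Cloud or PyBIDS parse names, and it assumes a well-formed name in which the last underscore-separated part is the suffix.

```python
def parseBidsFileName(fileName):
    """Split a BIDS-style file name into key-value pairs, suffix, and extension."""
    # The extension starts at the first '.', so '.nii.gz' stays in one piece
    stem, dot, rest = fileName.partition('.')
    extension = dot + rest

    # Underscores separate the parts; the last part is the suffix (e.g., 'bold')
    *pairs, suffix = stem.split('_')

    keyValues = {}
    for pair in pairs:
        # Each remaining part has the form '<file name key>-<value>'
        key, _, value = pair.partition('-')
        keyValues[key] = value
    return keyValues, suffix, extension


keyValues, suffix, extension = parseBidsFileName('sub-02_ses-01_task-rest_run-03_bold.nii.gz')
print(keyValues)  # {'sub': '02', 'ses': '01', 'task': 'rest', 'run': '03'}
print(suffix)     # bold
print(extension)  # .nii.gz
```

Note that the dictionary keys are file name keys (`sub`, `ses`), not entity names (`subject`, `session`).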

#### Exercise: What entities are present in the path `sub-01/func/sub-01_task-languageproduction_run-01_bold.nii.gz`, and what are the entity values? 

##### Answer: 
| Entity Name | Value |
|---          | ---   |
| subject | 01 |
| datatype | func |
| task | languageproduction |
| run | 01 |
| suffix | bold |
| extension | .nii.gz|

# BIDS Archive: Opening Existing Dataset

Objective: Learn how to create a BIDS Archive pointing to a specific dataset on disk.

Procedure:
1. Download a small, sample dataset from OpenNeuro to use with `BidsArchive`.
2. Open the dataset using `BidsArchive` and print out some summary data about it.

In [None]:
# https://openneuro.org/datasets/ds002014/versions/1.0.1/download -- relatively small dataset (<40MB)
shutil.rmtree(TARGET_DIR, ignore_errors=True)
command = 'aws s3 sync --no-sign-request s3://openneuro.org/ds002014 ' + TARGET_DIR
command = command.split(' ')
if subprocess.call(command) == 0:
    print("Dataset successfully downloaded")
else:
    print("Error in calling download command")
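
Splitting the command string on spaces works here because `TARGET_DIR` contains no spaces. As a hedged alternative, the argument list can be built directly, which keeps a directory name with spaces as a single argument. `buildSyncCommand` is a hypothetical helper and the directory name below is made up; the command itself is not executed in this sketch.

```python
def buildSyncCommand(datasetId, targetDir):
    """Build the argument list for the anonymous OpenNeuro S3 download."""
    return [
        'aws', 's3', 'sync', '--no-sign-request',
        's3://openneuro.org/' + datasetId,
        # Passed as one list element, so spaces in the path are preserved
        targetDir,
    ]


command = buildSyncCommand('ds002014', 'my dataset')
print(command)
```

The resulting list can be passed to `subprocess.call` in the same way as the split string above.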

In [None]:
""" Open downloaded dataset """
archive = BidsArchive(TARGET_DIR)
print('Archive: ', archive)

# BIDS Archive: Querying Dataset

Objective: Learn how to extract information and files from the `BidsArchive`.

Procedure:

1. Search for images in the dataset.
2. Search for sidecar metadata for the images in the dataset.

In [None]:
# The values of any BIDS entity can be listed using a get<Entity>s() method (e.g., getSubjects(), getRuns(), getTasks())
print('Dataset info: Subjects: {subjects} | Runs: {runs} | Tasks: {tasks}\n'
      .format(subjects=archive.getSubjects(), runs=archive.getRuns(), tasks=archive.getTasks()))

# Arguments can be passed as keywords or using a dictionary with equivalent results
entityDict = {'subject': archive.getSubjects()[0], 'run': archive.getRuns()[0]}
imagesUsingDict = archive.getImages(**entityDict)
imagesUsingKeywords = archive.getImages(subject=archive.getSubjects()[0], run=archive.getRuns()[0])
assert imagesUsingDict == imagesUsingKeywords

print('Number of image files associated with Subject {}, Run {}: {}'.format(
    entityDict['subject'], entityDict['run'], len(imagesUsingDict)))

# Get all images from the functional runs
images = archive.getImages(datatype='func')
print('Number of functional images: {}'.format(len(images)))

# Anatomical images can be retrieved too
images = archive.getImages(datatype='anat')
print('Number of anatomical images: {}'.format(len(images)))

In [None]:
# No images are returned if matches aren't found
subjectName='invalidSubject'
images = archive.getImages(subject=subjectName)
print('Number of image files associated with Subject "{}": {}'.format(subjectName, len(images)))

Now that we've seen how to get images from an archive, we'll look at how to get metadata for images we've retrieved from the archive.

To get metadata for an image, the path to the image file is required. Every `BIDSImageFile` returned from `getImages` has a `path` property you can use to obtain this path.

In [None]:
# Get all image files, then create a dictionary mapping each image file's path to its metadata dictionary
imageFiles = archive.getImages()
metadata = {i.path: archive.getSidecarMetadata(i.path) for i in imageFiles}
for path, metaDict in metadata.items():
    print('Metadata for:', path, "is:\n", json.dumps(metaDict, indent=4, sort_keys=True), "\n")

The last piece of data we'll see how to get from an archive is the events file corresponding to a particular scanning run.

In [None]:
# Event files to get can be filtered by entities, as with 
# getImages and getSidecarMetadata
events = archive.getEvents(subject='01', 
                           task='languageproduction', run=1)

# All event files can be retrieved by specifying no entities
events = archive.getEvents()

# Event files are returned as BIDSDataFile objects
# See the PyBids documentation for more information on those
eventsFile = events[0]
print('Events file type: ', type(eventsFile))

# One method of the BIDSDataFile object returns
# a Pandas data frame of the events file
eventsDF = eventsFile.get_df()

print("Sample data: \n", eventsDF.head())
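
For reference, a BIDS events file is a tab-separated table with at least `onset` and `duration` columns. The sketch below reads such a table using only the standard library; the file contents are made up for illustration and do not come from the downloaded dataset.

```python
import csv
import io

# Hypothetical contents of an events file; real files come from the dataset
eventsText = (
    'onset\tduration\ttrial_type\n'
    '0.0\t5.0\trest\n'
    '5.0\t10.0\ttask\n'
    '15.0\t5.0\trest\n'
)

# Parse the tab-separated text into one dictionary per row
rows = list(csv.DictReader(io.StringIO(eventsText), delimiter='\t'))

# Collect the onset times of every 'task' trial
taskOnsets = [float(row['onset']) for row in rows if row['trial_type'] == 'task']
print(taskOnsets)  # [5.0]
```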

# BIDS Archive: Getting & Appending Scanning Runs

One of the most important things a `BIDS Archive` enables in the context of RT-Cloud is working with `BIDS Runs`. Using `getBidsRun`, you can get all the image data and metadata from a particular scanning run in a `BIDS Archive`, packaged into a `BIDS Run`. The opposite operation is `appendBidsRun`, which appends the data you have accumulated in a `BIDS Run` to an existing archive or to a new one.

For example, if you have a complete dataset that you want to test a new real-time experiment on, you can use `getBidsRun` to iterate over your entire dataset. From the run, you can stream the BIDS Incrementals in the run to RT-Cloud and the new experimental script you want to try out. Then, when you're running your new experiment in RT-Cloud for real, as `BIDS Incremental` files are streamed from the scanner to your script, you can create a new `BIDS Archive` in an empty folder on the computer running your script, build up your run as it happens in a `BidsRun`, and then add it to your archive all at once by calling `appendBidsRun` when your scanning run completes.
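
The replay half of this workflow can be sketched as a small loop. `replayRun` and `FakeRun` are hypothetical names used only to keep the example self-contained; the loop relies solely on `numIncrementals` and `getIncremental`, so a `BidsRun` returned by `getBidsRun` should be usable in place of the stand-in.

```python
class FakeRun:
    """Hypothetical stand-in exposing the two BidsRun methods used below."""

    def __init__(self, incrementals):
        self._incrementals = list(incrementals)

    def numIncrementals(self):
        return len(self._incrementals)

    def getIncremental(self, index):
        return self._incrementals[index]


def replayRun(run, handleIncremental):
    """Pass each incremental in the run, in order, to handleIncremental."""
    for i in range(run.numIncrementals()):
        handleIncremental(run.getIncremental(i))


replayed = []
replayRun(FakeRun(['volume-0', 'volume-1', 'volume-2']), replayed.append)
print(replayed)  # ['volume-0', 'volume-1', 'volume-2']
```

In a real experiment, `handleIncremental` would be whatever function sends the incremental to your experiment script.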

In [None]:
# Setup
td = tempfile.TemporaryDirectory()
print('Temporary directory path:', td.name)

In [None]:
# Get a run
entityDict = {'subject': '01', 'task': 'languageproduction'}
firstSubjectLangRun = archive.getBidsRun(**entityDict)
    
# If we append this run to an empty archive, the two archives will have the same data

# Create new archive
newArchive = BidsArchive(td.name)
newArchive.appendBidsRun(firstSubjectLangRun)

# Compare runs
assert firstSubjectLangRun == newArchive.getBidsRun(**entityDict)

# We can also build up runs from incrementals -- here, we'll fake a new run by modifying
# the metadata to be for subject #2
subjectTwoRun = BidsRun()
for i in range(firstSubjectLangRun.numIncrementals()):
    incremental = firstSubjectLangRun.getIncremental(i)
    incremental.setMetadataField('subject', '02')
    subjectTwoRun.appendIncremental(incremental)
    
assert '02' not in newArchive.getSubjects()
newArchive.appendBidsRun(subjectTwoRun)
assert '02' in newArchive.getSubjects()

In [None]:
# Cleanup
td.cleanup()

# BIDS Incremental: Creating Incremental

A `BIDS Incremental` has two primary components:
1. A NIfTI image
2. A metadata dictionary storing information about the image.

It also has a few other components that are used when the `BIDS Incremental` is written to disk, and that you may use for other purposes. Those are:
1. The dataset description dictionary, which becomes the `dataset_description.json` in a BIDS Archive.
2. The README string, which becomes the `README` file in a BIDS archive.
3. The events dataframe, which becomes the `<file name entities>_events.tsv` file in a BIDS archive.

To create a `BIDS Incremental`, only the image and the metadata dictionary are needed, and default versions of the other components are created if the `BIDS Incremental` is written to disk.

When reading from a BIDS-compliant dataset, all metadata
should already be present, and using BIDS Archive methods
to read the image and metadata is sufficient to create the
incremental.

In [None]:
# Get the NIfTI image
imageFile = archive.getImages(subject='01', run=1)[0]
image = imageFile.get_image()

# Get the metadata for the image
metadata = archive.getSidecarMetadata(imageFile, includeEntities=True)

# Create the BIDS Incremental
incremental = BidsIncremental(image, metadata)
print('Created Incremental: ', incremental)

If converting from a DICOM image, sometimes extra work is needed to obtain all the metadata needed to create a valid `BIDS Incremental`. This is because a `BIDS Incremental` is a fully valid BIDS dataset itself, which has slightly different metadata than a DICOM.

While RT-Cloud does its best to extract all possible metadata needed for BIDS from the DICOM image's metadata (e.g., it automatically extracts any BIDS entities from the DICOM's `ProtocolName` metadata field), sometimes you will have to manually specify fields for your experiment. The following example shows how these fields sometimes must be added by the user of the system.
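
As an illustration of the idea only, the sketch below pulls `key-value` pairs out of a protocol name with a regular expression. This is not RT-Cloud's implementation; `extractEntities` and the example protocol name are assumptions made for this example.

```python
import re


def extractEntities(protocolName):
    """Return the '<key>-<value>' pairs found in a protocol name."""
    # A pair starts at the beginning of the string or after an underscore;
    # keys are letters and values are letters or digits
    pattern = r'(?:^|_)([a-zA-Z]+)-([a-zA-Z0-9]+)'
    return dict(re.findall(pattern, protocolName))


found = extractEntities('func_ses-01_task-story_run-01')
print(found)  # {'ses': '01', 'task': 'story', 'run': '01'}
```

A protocol name like this one carries no `sub-` pair, which is one way a field such as `subject` can end up missing and need to be supplied by hand.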

In [None]:
# Setup
td = tempfile.TemporaryDirectory()
print('Temporary directory path:', td.name)

In [None]:
TEMP_NIFTI_PATH = os.path.join(td.name, TEMP_NIFTI_NAME)
dicomPath = os.path.join(rootPath, DICOM_PATH)
convertDicomFileToNifti(dicomPath, TEMP_NIFTI_PATH)
image = readNifti(TEMP_NIFTI_PATH)

dicomMetadata = getDicomMetadata(readDicomFromFile(dicomPath))

try:
    incremental = BidsIncremental(image, dicomMetadata)
except MissingMetadataError as e:
    print("-------- Metadata required for BIDS, unable to be extracted from DICOM --------")
    print(e)
    print("----------------")
    # We can see that 'subject', 'suffix', and 'datatype' were not able to be 
    # extracted from the DICOM's metadata. 
    # This implies RT-Cloud was able to extract the other required fields 
    # (task, RepetitionTime, and EchoTime).
    # Therefore, we'll only have to manually provide 'subject', 'suffix', 
    # and 'datatype' based on our knowledge of the experiment.
    
# Here, we'll pretend the subject is the 1st subject, the imaging methodology
# was fMRI BOLD, and the datatype is func, representing a functional run
dicomMetadata.update({'subject': '01', 'suffix': 'bold', 'datatype': 'func'})

# Now, the incremental's creation will succeed
incremental = BidsIncremental(image, dicomMetadata)

print('Created Incremental:', incremental)
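
Before constructing an incremental, it can help to check which required fields are still missing. The sketch below uses the six required fields named in the comments above; `findMissingFields` is a hypothetical helper, and the actual validation performed by `BidsIncremental` may differ.

```python
# Required fields as named in the tutorial comments above
REQUIRED_FIELDS = ['subject', 'task', 'suffix', 'datatype', 'RepetitionTime', 'EchoTime']


def findMissingFields(metadata, required=REQUIRED_FIELDS):
    """Return the required fields that are absent or None in the metadata."""
    return [field for field in required if metadata.get(field) is None]


# Hypothetical metadata resembling what might be extracted from a DICOM
exampleMetadata = {'task': 'story', 'RepetitionTime': 1.5, 'EchoTime': 0.03}
missing = findMissingFields(exampleMetadata)
print(missing)  # ['subject', 'suffix', 'datatype']
```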

In [None]:
# Cleanup
td.cleanup()

# BIDS Incremental: Querying Incremental

A `BIDS Incremental` is the basic unit of data transfer in RT-Cloud, and your scripts will often interact directly with an Incremental and the data within it. This part of the tutorial will show you how to obtain different parts of the Incremental's data.

### Querying Metadata

In [None]:
# Getting, setting, and removing metadata
fields = ['subject', 'task', 'RepetitionTime', 'ProtocolName']
oldValues = {key: incremental.getMetadataField(key) for key in fields}

print('-------- Getting Fields --------')
for field in fields:
    print(field + ': ' + str(incremental.getMetadataField(field)))
    
print('\n-------- After Setting Fields --------')
for field in fields:
    incremental.setMetadataField(field, 'test')
for field in fields:
    print(field + ': ' + str(incremental.getMetadataField(field)))
    
print('\n-------- Removing Fields --------')
for field in fields:
    # Note that required fields can only be changed, not removed
    try:
        incremental.removeMetadataField(field)
    except RuntimeError as e:
        print(str(e))
for field in fields:
    try:
        print(field + ': ' + str(incremental.getMetadataField(field)))
    except KeyError as e:
        print(str(e))
        
# Restore original values
for key, value in oldValues.items():
    incremental.setMetadataField(key, value)

In [None]:
print('\n-------- Full Metadata Dictionary --------')
print(incremental.getImageMetadata())

### Querying Image and Image-Related Properties

In addition to these methods, there are several properties that help extract particular entities or data having to do with the NIfTI image contained within the Incremental.

In [None]:
# Entities
print('Suffix:', incremental.getSuffix())
print('Datatype:', incremental.getDatatype())
print('BIDS Entities:', incremental.getEntities())

In [None]:
# Image properties
print('Image dimensions:', incremental.getImageDimensions())
print('\nImage header:', incremental.getImageHeader())
print('\nImage data:', incremental.getImageData())

### Querying BIDS Archive-Related Properties

Because each `BIDS Incremental` can also be made into a fully valid, on-disk BIDS Archive, there are also a variety of properties in the `BIDS Incremental` about how its data would be represented on disk in folders and files.

When a `BIDS Archive` is created from a `BIDS Incremental`, several files are created in the archive, including the `README`, the events file, and the `dataset_description.json`. These have default values, but can all be manually modified.

In [None]:
# Copy the dictionary so the in-place edits below don't alter the saved values
old_values = {'readme': incremental.readme, 'datasetMetadata': dict(incremental.datasetMetadata), 'events': incremental.events}

print('-------- Default Property Values --------')
print('\nREADME:', f'"{incremental.readme}"')
print('\nSource Dictionary for dataset_description.json:', incremental.datasetMetadata)
print('\nEvents File:', incremental.events)

# modify the properties
incremental.readme = 'Tutorial Dataset'

incremental.datasetMetadata['Name'] = 'Tutorial Dataset'
incremental.datasetMetadata['Authors'] = ["Your Name", "Your Collaborator's Name"]

incremental.events = pd.DataFrame({'onset': 0.0, 'duration': 5.0, 'response_time':1.0}, index=[0])

print('-------- Properties Post-Change --------')
print('\nREADME:', f'"{incremental.readme}"')
print('\nSource Dictionary for dataset_description.json:', incremental.datasetMetadata)
print('\nEvents File:', incremental.events)

# restore previous values
incremental.readme = old_values['readme']
incremental.datasetMetadata = old_values['datasetMetadata']
incremental.events = old_values['events']

The file paths and names that will be created for the data in the archive can also be queried without actually writing the archive to disk.

In [None]:
print('\n-------- Directory Names and Paths --------')
print('Dataset directory name:', incremental.getDatasetName())
print('Data directory path:', incremental.getDataDirPath())

print('\n-------- File Names --------')
print('Image file name:', incremental.getImageFileName())
print('Metadata file name:', incremental.getMetadataFileName())
print('Events file name:', incremental.getEventsFileName())

print('\n-------- File Paths --------')
print('Image file path:', incremental.getImageFilePath())
print('Metadata file path:', incremental.getMetadataFilePath())
print('Events file path:', incremental.getEventsFilePath())
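
These paths follow the usual BIDS layout of `sub-<label>/<datatype>/<file name>`. The sketch below assembles such a path from entity values; `buildBidsImagePath` is a hypothetical helper covering only the entities shown, and the exact strings returned by the methods above may differ (for example, when a session is present).

```python
import posixpath


def buildBidsImagePath(subject, task, run, suffix, datatype, extension='.nii'):
    """Assemble a relative BIDS-style image path from entity values."""
    # File name: key-value pairs joined by underscores, then suffix and extension
    fileName = 'sub-{}_task-{}_run-{}_{}{}'.format(subject, task, run, suffix, extension)
    # Directory layout: subject folder, then datatype folder
    return posixpath.join('sub-' + subject, datatype, fileName)


imagePath = buildBidsImagePath('01', 'languageproduction', '01', 'bold', 'func', '.nii.gz')
print(imagePath)  # sub-01/func/sub-01_task-languageproduction_run-01_bold.nii.gz
```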

# BIDS Incremental: Writing to Disk

One of the key features of a `BIDS Incremental` is that it is also a valid, single-image `BIDS Archive`. Thus, a `BIDS Incremental` can be written out to an archive on disk and navigated on the file system.

In [None]:
# Setup
td = tempfile.TemporaryDirectory()
print('Temporary directory path:', td.name)

In [None]:
incremental.writeToDisk(td.name)

archiveFromIncremental = BidsArchive(td.name)
print('Archive:', archiveFromIncremental)
print('\nBIDS Files in Archive from Incremental:', archiveFromIncremental.get())

In [None]:
# Cleanup
td.cleanup()

# BIDS Incremental: Sending Over a Network

`BIDS Incrementals` are designed for transfer from one computer to another, often from the fMRI scanner room computer to the cloud for data processing. The process of preparing the Incremental for sending and unpacking it on the other side is quite simple, using the Python `pickle` module.

In this example, just the packing/unpacking process will be shown -- to actually send it over the network, pass the serialized object to your data transfer library of choice.

In [None]:
# Serialize object
pickledBuf = pickle.dumps(incremental)

# Deserialize object
unpickled = pickle.loads(pickledBuf)

# Compare equality
assert unpickled == incremental
print('Unpickled:', unpickled, '\nIncremental:', incremental)
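
To show where the pickled bytes might go next, here is a minimal sketch of sending a pickled object over a socket with a length prefix, so the receiver knows how many bytes to read. It is one possible approach, not RT-Cloud's transfer mechanism; `sendMessage`, `recvExactly`, and `recvMessage` are hypothetical helpers, and a plain dictionary stands in for the incremental. Because `pickle.loads` can execute arbitrary code, only unpickle data from a source you trust.

```python
import pickle
import socket
import struct


def sendMessage(sock, obj):
    """Pickle obj and send it with a 4-byte big-endian length prefix."""
    payload = pickle.dumps(obj)
    sock.sendall(struct.pack('>I', len(payload)) + payload)


def recvExactly(sock, numBytes):
    """Read exactly numBytes from the socket, or raise if it closes early."""
    chunks = []
    remaining = numBytes
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError('Socket closed before the full message arrived')
        chunks.append(chunk)
        remaining -= len(chunk)
    return b''.join(chunks)


def recvMessage(sock):
    """Read one length-prefixed message and unpickle it."""
    (length,) = struct.unpack('>I', recvExactly(sock, 4))
    return pickle.loads(recvExactly(sock, length))


# A connected pair of local sockets stands in for a real network connection
sender, receiver = socket.socketpair()
try:
    original = {'subject': '01', 'task': 'languageproduction', 'volume': [0, 1, 2]}
    sendMessage(sender, original)
    received = recvMessage(receiver)
finally:
    sender.close()
    receiver.close()

print(received == original)  # True
```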