Working with the Template

This section of the tutorial guides you through the development steps of an image processing and analysis project using the IPA Project Template. We will start by copying the template to a new project directory and then build up this example-project step-by-step. The point of this tutorial is to show you how to work with the template and specifically how to separate workflow development from applying the workflow to data.

Attention

Follow the installation instructions for pixi and copier before you continue here.

Issues & Feedback

In case you encounter any issues or have questions regarding our project template, please open an issue.

Copy the Template

First we need to create an instance of the template by running the copier command:

pixi x copier copy git+https://github.com/fmi-faim/ipa-project-template example-project

You will be asked a series of questions:

What is your first and last name?
Your first and last name will be put into the pixi.toml file as author.
What is your email address?
Your email address will be put into the pixi.toml file.
What is your project name?
Answer Example Project.
What is your organization name?
This information will be put into the pixi.toml file.
Do you want to include the most common Python packages?
Answer y.
Do you want to include Napari in your project?
Answer y.
Do you want to include Nextflow in your project?
Answer n. We will add nextflow later.
Do you want to add installation and initialization scripts for Unix systems?
Answer n.
Do you want to add a config and run demo?
Answer n.

Next, we need to initialize a git repository for the project to enable versioning:

git init
git add .
git commit -m "Initial commit."
git branch -M main

Now, you can build the documentation page with:

pixi run build_docs

The very last setup step is to get some raw data. Please follow the instructions in data.

Congratulations!

You have created your first project from the IPA Project Template!

Develop Image Processing and Analysis Routine

Now it is time to prototype the image processing and analysis routine. The most interactive way to do this is with jupyter lab. You can start jupyter lab with the following command:

pixi run jupyter

This will open jupyter lab in your browser, and you should navigate to the sandbox. The sandbox is our prototyping place. Any results created by code in the sandbox are not expected to be reproducible. The reasoning is that we first want to explore what might work before we focus on reproducible results.

We want to develop a workflow which

segments the nuclei,
measures the mean intensity of each nucleus, and
saves the result to csv.

Feel free to develop your own workflow or get inspired by our jupyter notebook.
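
To make the three steps concrete, here is a minimal sketch that uses only the Python standard library on a tiny synthetic image. It is not the notebook referenced above: the image, the threshold, and the function names are assumptions made for illustration, and a real notebook would typically rely on dedicated image analysis libraries instead of a hand-written flood fill.

```python
import csv
import io
from collections import deque

# Tiny synthetic "image" (assumption): two bright blobs on a dark background.
IMAGE = [
    [0, 0, 0, 0, 0, 0],
    [0, 9, 8, 0, 0, 0],
    [0, 7, 9, 0, 0, 0],
    [0, 0, 0, 0, 5, 6],
    [0, 0, 0, 0, 4, 5],
]


def segment(image, threshold):
    """Label 4-connected regions of pixels above the threshold (0 = background)."""
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]
    next_label = 0
    for y in range(height):
        for x in range(width):
            if image[y][x] <= threshold or labels[y][x]:
                continue
            # Found an unlabelled foreground pixel: flood-fill its region.
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (
                        0 <= ny < height
                        and 0 <= nx < width
                        and image[ny][nx] > threshold
                        and not labels[ny][nx]
                    ):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels


def measure(image, labels):
    """Return {label: mean intensity} for every labelled region."""
    sums, counts = {}, {}
    for image_row, label_row in zip(image, labels):
        for value, label in zip(image_row, label_row):
            if label:
                sums[label] = sums.get(label, 0) + value
                counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}


def to_csv(measurements):
    """Serialize the measurements as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["label", "mean_intensity"])
    for label, mean in sorted(measurements.items()):
        writer.writerow([label, mean])
    return buffer.getvalue()


labels = segment(IMAGE, threshold=2)
measurements = measure(IMAGE, labels)
print(to_csv(measurements))
```

Keeping segmentation, measurement, and export in separate functions already in the sandbox makes the later split into processing steps easier.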

Commit your changes

Now would be a good time to lint your code and commit your changes into the git repository.

pixi run lint
git add .
git commit -m "First draft"

Here is a nice git tutorial you might want to look at to get started.

Convert Routine to Reproducible Processing Steps

When you are done developing your image processing and analysis routine, it is time to convert it to a Python script. The Python script must be written such that we can call it with a config file and apply it to an arbitrary raw data directory. We want to split up the processing routine into multiple sub-steps, where each step saves intermediate processing results to processed_data.

Converting a jupyter notebook into a series of standalone processing steps is an important step towards making our research code reproducible. At this point we want to think carefully about how to split our jupyter notebook into a series of individual processing scripts. For this example-project we suggest splitting the processing into two steps:

segment
measure

Info

One could keep everything in a single big processing step. However, by splitting the code into sub-steps, we can recover from intermediate results in case a processing step fails.
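
The benefit of saving intermediate results can be sketched as follows. This is an illustration only: the file names, the JSON format, and the two functions are assumptions and do not reflect the template's actual implementation.

```python
import json
import tempfile
from pathlib import Path


def segment_step(raw_values, processed_dir):
    """Step 1: threshold the values and save the mask as an intermediate result."""
    mask = [1 if value > 2 else 0 for value in raw_values]
    (processed_dir / "mask.json").write_text(json.dumps(mask))


def measure_step(raw_values, processed_dir):
    """Step 2: load the saved mask and compute the mean foreground value."""
    mask = json.loads((processed_dir / "mask.json").read_text())
    foreground = [v for v, m in zip(raw_values, mask) if m]
    result = {"mean_intensity": sum(foreground) / len(foreground)}
    (processed_dir / "measurements.json").write_text(json.dumps(result))
    return result


with tempfile.TemporaryDirectory() as tmp:
    processed_dir = Path(tmp) / "processed_data"
    processed_dir.mkdir()
    raw_values = [0, 5, 7, 1, 9]  # stand-in for raw image data

    segment_step(raw_values, processed_dir)
    # If measure_step fails or its code changes, it can be rerun on its own,
    # because the mask from step 1 is already on disk.
    first = measure_step(raw_values, processed_dir)
    second = measure_step(raw_values, processed_dir)
```

Because the second step reads only what the first step wrote to disk, rerunning it does not require repeating the segmentation.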

To get started you can run the copier command again:

Danger

Make sure you are in the parent directory of the example-project when you call the copier command.

pixi x copier copy git+https://github.com/fmi-faim/ipa-project-template example-project

This time answer y to the Do you want to add a config and run demo? question. This will add the following to your template:

source
└── s01_demo
  ├── __init__.py
  ├── config.py
  └── run.py

Implement s01_segment

Rename s01_demo to s01_segment. Next, we need to decide which parameters to expose to the user. In our workflow we want to configure three parameters:

raw_data_dir: The directory where the raw tiff files are stored.
suffix: Suffix of the tiff files. Sometimes it is .TIF and sometimes it is .tif.
output_dir: Where to store the segmentation masks.

Edit the config.py to match our implementation.
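
As a rough idea of what a config holding these three parameters could look like, here is a standard-library sketch. The class name SegmentConfig, the JSON file format, and the save/load helpers are assumptions for illustration; the config.py shipped with the template may use a different base class and file format.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class SegmentConfig:  # hypothetical name, not taken from the template
    raw_data_dir: str  # directory where the raw tiff files are stored
    suffix: str        # file suffix, e.g. ".tif" or ".TIF"
    output_dir: str    # where to store the segmentation masks

    def save(self, path):
        """Write the config to a JSON file."""
        Path(path).write_text(json.dumps(asdict(self), indent=2))

    @classmethod
    def load(cls, path):
        """Read a config back from a JSON file."""
        return cls(**json.loads(Path(path).read_text()))


with tempfile.TemporaryDirectory() as tmp:
    config_path = Path(tmp) / "segment_config.json"
    config = SegmentConfig(
        raw_data_dir="raw_data",
        suffix=".tif",
        output_dir="processed_data/s01_segment",
    )
    config.save(config_path)
    loaded = SegmentConfig.load(config_path)
```

Storing the parameters in a file next to the results records exactly which settings produced them.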

We added a prompt() function to the AcquisitionConfig, which we can use to ask the user for input. Now we only need to add a pixi-task which will call the script. We do this by adding this line to our pixi.toml file.

Uncommitted changes

If you make your pixi task depend on the source_status task that comes with the template, it warns you about any uncommitted changes. This helps you to always commit your changes and run on a clean, reproducible state of your repository.
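
The idea behind such a check can be sketched in a few lines of Python. This is not the template's source_status implementation, which may work differently; the function names are hypothetical, and the sketch relies only on `git status --porcelain`, which prints one line per changed or untracked path and nothing for a clean working tree.

```python
import subprocess


def has_uncommitted_changes(porcelain_output):
    """True if `git status --porcelain` reported at least one changed or untracked path."""
    return any(line.strip() for line in porcelain_output.splitlines())


def warn_if_dirty(repo_dir="."):
    """Run git in repo_dir and print a warning when the working tree is not clean."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    dirty = has_uncommitted_changes(result.stdout)
    if dirty:
        print("WARNING: uncommitted changes; results may not be reproducible.")
    return dirty
```

Calling warn_if_dirty() inside a git repository before a processing step gives a similar kind of warning.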

Now that we can create config files, we will add the processing code to run.py. Edit the file until it matches our implementation.

Note

This script will not work at the moment, because we are using the MeasureConfig in this line. However, this file does not exist yet.
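
Independent of the reference implementation, the general shape of a config-driven processing script can be sketched as follows. Everything here is an assumption for illustration: the --config flag, the JSON config format, and the placeholder output files are not taken from the template, and the actual segmentation is replaced by a stub.

```python
import argparse
import json
import tempfile
from pathlib import Path


def run(config):
    """Process every file in raw_data_dir that ends with the configured suffix."""
    raw_data_dir = Path(config["raw_data_dir"])
    output_dir = Path(config["output_dir"])
    output_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for raw_file in sorted(raw_data_dir.iterdir()):
        if not raw_file.name.endswith(config["suffix"]):
            continue  # the suffix match is case-sensitive: ".tif" skips ".TIF"
        # Placeholder for the real work: read the image, segment it, save the mask.
        out_file = output_dir / f"{raw_file.stem}_mask.txt"
        out_file.write_text(f"mask for {raw_file.name}\n")
        written.append(out_file.name)
    return written


def main(argv=None):
    parser = argparse.ArgumentParser(description="Sketch of a config-driven processing step.")
    parser.add_argument("--config", required=True, help="Path to a JSON config file.")
    args = parser.parse_args(argv)
    return run(json.loads(Path(args.config).read_text()))


# Demo on a throwaway directory so the sketch can be tried without real data.
with tempfile.TemporaryDirectory() as tmp:
    raw = Path(tmp) / "raw_data"
    raw.mkdir()
    (raw / "a.tif").write_text("")
    (raw / "b.TIF").write_text("")
    config_path = Path(tmp) / "config.json"
    config_path.write_text(json.dumps({
        "raw_data_dir": str(raw),
        "suffix": ".tif",
        "output_dir": str(Path(tmp) / "processed_data"),
    }))
    outputs = main(["--config", str(config_path)])
```

Because all paths come from the config file, the same script can be pointed at any raw data directory without code changes.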

Implement s02_measure

Create a new directory in source called s02_measure and add all the files from our reference implementation.

Finally, we want to add the pixi-tasks to run the two processing steps. Edit your pixi.toml file and add these two lines.

Congratulations!

You have converted your jupyter notebook to reproducible processing steps! To run them, follow the step-by-step tutorial.

Commit your changes

Don't forget to commit your changes into the git repository.

Bundle Processing Steps into a Workflow

So far, you can run your processing steps one by one. However, this can be automated, and especially for larger projects it makes sense to explore workflow orchestrators, which take care of running one step after another. Such orchestrators can also resume processing after a failure or scale up processing to run in parallel on high-performance computing clusters or in the cloud. In our case we will use nextflow. Unfortunately, nextflow is not available on Windows.

To get started you can run the copier command again:

Danger

Make sure you are in the parent directory of the example-project when you call the copier command.

pixi x copier copy git+https://github.com/fmi-faim/ipa-project-template example-project

This time answer y to the Do you want to include Nextflow in your project? question.

This will add nextflow.config and workflow.nf to the source directory. Edit these files until they match our reference implementation and verify that the pixi.toml file matches our pixi.toml file.

Congratulations!

You have added a nextflow workflow to your example-project! To run it, follow the workflow tutorial.

Add Visualization Notebook

It is good practice to provide some form of visualization of your processing results. We do this by adding a third step, s03_visualization, to our source directory. This directory contains the visualization_utils.py file and the Visualize_Results.ipynb notebook. You can copy these files from our reference implementation. Additionally, you need to add these two lines to your pixi.toml. Now you are ready to run the visualization as described here.

Finalize Documentation

Finally, we want to create all the necessary documentation for this project. Run the following command to get a live view of the website:

pixi run show_docs

Now you can edit the markdown files in docs, and the website will be re-rendered whenever a change is detected.

Congratulations!

You finished the IPA Project Template tutorial!
Don't forget to always commit your changes.