conditional workflows #15

Open · wants to merge 1 commit into base: `main`
105 changes: 105 additions & 0 deletions README.md
@@ -0,0 +1,105 @@
# Common Workflow Language (CWL) Workflows

A CWL feature-extraction workflow for imaging datasets.

## Workflow Steps:

Create a [Conda](https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#activating-an-environment) environment with Python `>=3.9,<3.12`.

#### 1. Install polus-plugins.

- Clone the image-tools repository:
  `git clone https://github.com/camilovelezr/image-tools.git ../`
- `cd image-tools`
- Create a local `hd2` branch tracking the remote branch:
  `git checkout -b hd2 remotes/origin/hd2`
- `pip install .`

#### 2. Install workflow-inference-compiler.
- Clone the workflow-inference-compiler repository:
  `git clone https://github.com/camilovelezr/workflow-inference-compiler.git ../`
- `cd workflow-inference-compiler`
- Create a local `hd2` branch tracking the remote branch:
  `git checkout -b hd2 remotes/origin/hd2`
- `pip install -e ".[all]"`

#### 3. Install image-workflows.
- `cd image-workflows`
- `poetry install`

#### Note:
Ensure that [Docker Desktop](https://www.docker.com/products/docker-desktop/) is running in the background. To verify that it is operational, run:
`docker run -d -p 80:80 docker/getting-started`
This launches the `docker/getting-started` container in detached mode (`-d` flag), exposing port 80 on your local machine (`-p 80:80`). It is a simple way to confirm that Docker Desktop is functioning correctly.

## Details
This workflow integrates eight distinct plugins: it retrieves data from the [Broad Bioimage Benchmark Collection](https://bbbc.broadinstitute.org/), renames files, corrects uneven illumination, segments nuclear objects, and culminates in extracting features from the identified objects.

The plugins employed in the workflow are:
1. [bbbc-download-plugin](https://github.com/saketprem/polus-plugins/tree/bbbc_download/utils/bbbc-download-plugin)
2. [file-renaming-tool](https://github.com/PolusAI/image-tools/tree/master/formats/file-renaming-tool)
3. [ome-converter-tool](https://github.com/PolusAI/image-tools/tree/master/formats/ome-converter-tool)
4. [basic-flatfield-estimation-tool](https://github.com/PolusAI/image-tools/tree/master/regression/basic-flatfield-estimation-tool)
5. [apply-flatfield-tool](https://github.com/PolusAI/image-tools/tree/master/transforms/images/apply-flatfield-tool)
6. [kaggle-nuclei-segmentation](https://github.com/hamshkhawar/image-tools/tree/kaggle-nuclei_seg/segmentation/kaggle-nuclei-segmentation)
7. [polus-ftl-label-plugin](https://github.com/hamshkhawar/image-tools/tree/kaggle-nuclei_seg/transforms/images/polus-ftl-label-plugin)
8. [nyxus-plugin](https://github.com/PolusAI/image-tools/tree/kaggle-nuclei_seg/features/nyxus-plugin)

## Execute CWL workflows
Two different CWL workflows can be executed for specific datasets:
1. segmentation
2. analysis

The segmentation workflow uses plugins `1 to 7`, while the analysis workflow uses plugins `1 to 8`.
To run a workflow on a new dataset, use a sample YAML file to supply the parameter values. Save this YAML file in the desired subdirectory of the `configuration` folder with the name `dataset.yml`.

To run a workflow without background correction, set `background_correction` to `false`; the workflow will then skip steps `4 and 5`.
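This conditional step selection can be sketched as follows; the step names below are illustrative placeholders, not the actual plugin classes used by the compiler:

```python
# Illustrative sketch of how the workflow type and the
# background_correction flag determine which steps run.
STEPS = [
    "bbbc_download",         # 1
    "file_renaming",         # 2
    "ome_converter",         # 3
    "basic_flatfield",       # 4 (background correction)
    "apply_flatfield",       # 5 (background correction)
    "nuclei_segmentation",   # 6
    "ftl_label",             # 7
    "nyxus",                 # 8 (analysis only)
]


def select_steps(workflow: str, background_correction: bool) -> list[str]:
    """Return the ordered list of steps for the chosen workflow."""
    # The analysis workflow uses all eight steps; segmentation stops at step 7.
    steps = list(STEPS) if workflow == "analysis" else STEPS[:7]
    if not background_correction:
        # Skip the flatfield estimation/application steps (4 and 5).
        steps = [s for s in steps if s not in ("basic_flatfield", "apply_flatfield")]
    return steps
```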

`python -m polus.image.workflows --name="BBBC001" --workflow=analysis`

A directory named `outputs` is generated, containing the CLTs for each plugin and the YAML files; all step outputs are stored within its `outdir` subdirectory.
```
outputs
├── experiment
│   ├── cwl_adapters
│   ├── experiment.cwl
│   └── experiment.yml
└── outdir
    └── experiment
        ├── step 1 BbbcDownload
        │   └── outDir
        │       └── bbbc.outDir
        │           └── BBBC
        │               └── BBBC039
        │                   └── raw
        │                       ├── Ground_Truth
        │                       │   ├── masks
        │                       │   └── metadata
        │                       └── Images
        │                           └── images
        ├── step 2 FileRenaming
        │   └── outDir
        │       └── rename.outDir
        ├── step 3 OmeConverter
        │   └── outDir
        │       └── ome_converter.outDir
        ├── step 4 BasicFlatfieldEstimation
        │   └── outDir
        │       └── estimate_flatfield.outDir
        ├── step 5 ApplyFlatfield
        │   └── outDir
        │       └── apply_flatfield.outDir
        ├── step 6 KaggleNucleiSegmentation
        │   └── outDir
        │       └── kaggle_nuclei_segmentation.outDir
        ├── step 7 FtlLabel
        │   └── outDir
        │       └── ftl_plugin.outDir
        └── step 8 NyxusPlugin
            └── outDir
                └── nyxus_plugin.outDir
```
#### Note:
Steps 7 and 8 are executed only for the `analysis` workflow.
Empty file added configuration/__init__.py
Empty file.
14 changes: 14 additions & 0 deletions configuration/analysis/BBBC001.yml
@@ -0,0 +1,14 @@
---
name: BBBC001
file_pattern: /.*/.*/.*/Images/.*/.*_{row:c}{col:dd}f{f:dd}d{channel:d}.tif
out_file_pattern: x{row:dd}_y{col:dd}_p{f:dd}_c{channel:d}.tif
image_pattern: x{x:dd}_y{y:dd}_p{p:dd}_c{c:d}.ome.tif
seg_pattern: x{x:dd}_y{y:dd}_p{p:dd}_c0.ome.tif
ff_pattern: "x00_y03_p0\\(0-5\\)_c{c:d}_flatfield.ome.tif"
df_pattern: "x00_y03_p0\\(0-5\\)_c{c:d}_darkfield.ome.tif"
group_by: c
map_directory: false
features: ALL
file_extension: pandas
background_correction: false

13 changes: 13 additions & 0 deletions configuration/analysis/BBBC039.yml
@@ -0,0 +1,13 @@
---
name: BBBC039
file_pattern: /.*/.*/.*/Images/.*/.*_{row:c}{col:dd}_s{s:d}_w{channel:d}.*.tif
out_file_pattern: x{row:dd}_y{col:dd}_p{s:dd}_c{channel:d}.tif
image_pattern: x{x:dd}_y{y:dd}_p{p:dd}_c{c:d}.ome.tif
seg_pattern: x{x:dd}_y{y:dd}_p{p:dd}_c1.ome.tif
ff_pattern: "x\\(00-15\\)_y\\(01-24\\)_p0\\(1-9\\)_c{c:d}_flatfield.ome.tif"
df_pattern: "x\\(00-15\\)_y\\(01-24\\)_p0\\(1-9\\)_c{c:d}_darkfield.ome.tif"
group_by: c
map_directory: false
features: "ALL_INTENSITY"
file_extension: pandas
background_correction: false
Empty file.
13 changes: 13 additions & 0 deletions configuration/analysis/sample.yml
@@ -0,0 +1,13 @@
---
name:
file_pattern:
out_file_pattern:
image_pattern:
seg_pattern:
ff_pattern:
df_pattern:
group_by:
map_directory:
features:
file_extension:
background_correction:
11 changes: 11 additions & 0 deletions configuration/segmentation/BBBC001.yml
@@ -0,0 +1,11 @@
---
name: BBBC001
file_pattern: /.*/.*/.*/Images/.*/.*_{row:c}{col:dd}f{f:dd}d{channel:d}.tif
out_file_pattern: x{row:dd}_y{col:dd}_p{f:dd}_c{channel:d}.tif
image_pattern: x{x:dd}_y{y:dd}_p{p:dd}_c{c:d}.ome.tif
seg_pattern: x{x:dd}_y{y:dd}_p{p:dd}_c0.ome.tif
ff_pattern: "x00_y03_p0\\(0-5\\)_c{c:d}_flatfield.ome.tif"
df_pattern: "x00_y03_p0\\(0-5\\)_c{c:d}_darkfield.ome.tif"
group_by: c
map_directory: false
background_correction: false
11 changes: 11 additions & 0 deletions configuration/segmentation/BBBC039.yml
@@ -0,0 +1,11 @@
---
name: BBBC039
file_pattern: /.*/.*/.*/Images/.*/.*_{row:c}{col:dd}_s{s:d}_w{channel:d}.*.tif
out_file_pattern: x{row:dd}_y{col:dd}_p{s:dd}_c{channel:d}.tif
image_pattern: x{x:dd}_y{y:dd}_p{p:dd}_c{c:d}.ome.tif
seg_pattern: x{x:dd}_y{y:dd}_p{p:dd}_c1.ome.tif
ff_pattern: "x\\(00-15\\)_y\\(01-24\\)_p0\\(1-9\\)_c{c:d}_flatfield.ome.tif"
df_pattern: "x\\(00-15\\)_y\\(01-24\\)_p0\\(1-9\\)_c{c:d}_darkfield.ome.tif"
group_by: c
map_directory: false
background_correction: false
Empty file.
12 changes: 12 additions & 0 deletions configuration/segmentation/sample.yml
@@ -0,0 +1,12 @@
---
name:
file_pattern:
out_file_pattern:
image_pattern:
seg_pattern:
ff_pattern:
df_pattern:
group_by:
map_directory:
features:
file_extension:
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
File renamed without changes.
38 changes: 38 additions & 0 deletions pyproject.toml
@@ -0,0 +1,38 @@
[tool.poetry]
name = "polus-image-workflows"
version = "0.1.1-dev1"
description = "Build and execute pipelines of polus plugins on Compute."
authors = ["Hamdah Shafqat Abbasi <[email protected]>"]
readme = "README.md"
packages = [{include = "polus", from = "src"}]

[tool.poetry.dependencies]
python = ">=3.9,<3.12"
typer = "^0.9.0"
pyyaml = "^6.0.1"
pydantic = "^2.6.1"
cwl-utils = "0.31"
toil = "^5.12"
polus-plugins = {path = "../image-tools", develop = true}
workflow-inference-compiler = {path = "../workflow-inference-compiler", develop = true}

[tool.poetry.group.dev.dependencies]
jupyter = "^1.0.0"
nbconvert = "^7.11.0"
pytest = "^7.4.4"
bump2version = "^1.0.1"
pre-commit = "^3.3.3"
black = "^23.3.0"
ruff = "^0.0.274"
mypy = "^1.4.0"
pytest-xdist = "^3.3.1"
pytest-sugar = "^0.9.7"

[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"

[tool.pytest.ini_options]
addopts = [
"--import-mode=importlib",
]
Empty file.
65 changes: 65 additions & 0 deletions src/polus/image/workflows/__main__.py
@@ -0,0 +1,65 @@
"""CWL Workflow."""
import logging
from pathlib import Path

import typer

from polus.image.workflows.utils import LoadYaml
from workflows.cwl_analysis import CWLAnalysisWorkflow
from workflows.cwl_nuclear_segmentation import CWLSegmentationWorkflow

app = typer.Typer()

# Initialize the logger
logging.basicConfig(
    format="%(asctime)s - %(name)-8s - %(levelname)-8s - %(message)s",
    datefmt="%d-%b-%y %H:%M:%S",
)
logger = logging.getLogger("WIC Python API")
logger.setLevel(logging.INFO)


@app.command()
def main(
    name: str = typer.Option(
        ...,
        "--name",
        "-n",
        help="Name of an imaging dataset from the Broad Bioimage Benchmark Collection (https://bbbc.broadinstitute.org/image_sets)",
    ),
    workflow: str = typer.Option(
        ...,
        "--workflow",
        "-w",
        help="Name of the CWL workflow",
    ),
) -> None:
    """Execute a CWL workflow."""
    logger.info(f"name = {name}")
    logger.info(f"workflow = {workflow}")

    # Resolve the dataset configuration relative to the repository root.
    config_path = Path(__file__).parents[4].joinpath(f"configuration/{workflow}/{name}.yml")
    logger.info(f"config_path = {config_path}")

    model = LoadYaml(workflow=workflow, config_path=config_path)
    params = model.parse_yaml()

    if workflow == "analysis":
        logger.info(f"Executing the {workflow} workflow")
        model = CWLAnalysisWorkflow(**params)
        model.workflow()
    elif workflow == "segmentation":
        logger.info(f"Executing the {workflow} workflow")
        model = CWLSegmentationWorkflow(**params)
        model.workflow()

    logger.info("Completed CWL workflow")


if __name__ == "__main__":
    app()
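The config-path resolution used in `main` (climbing from the module file up to the repository root with `pathlib`) can be sketched in isolation; the paths below are illustrative, not the real install location:

```python
from pathlib import Path

# Illustrative layout: <repo>/src/polus/image/workflows/__main__.py
module_file = Path("/repo/src/polus/image/workflows/__main__.py")

# parents[0] is .../workflows, parents[1] is .../image, and so on,
# so parents[4] is the repository root that holds `configuration/`.
repo_root = module_file.parents[4]
config_path = repo_root / "configuration" / "analysis" / "BBBC001.yml"
```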
68 changes: 68 additions & 0 deletions src/polus/image/workflows/utils.py
@@ -0,0 +1,68 @@
"""Utilities for loading and validating dataset configurations."""
from pathlib import Path
from typing import Dict
from typing import Union

import pydantic
import yaml

GITHUB_TAG = "https://raw.githubusercontent.com"


ANALYSIS_KEYS = [
    "name", "file_pattern", "out_file_pattern", "image_pattern", "seg_pattern",
    "ff_pattern", "df_pattern", "group_by", "map_directory", "features",
    "file_extension", "background_correction",
]
SEG_KEYS = [
    "name", "file_pattern", "out_file_pattern", "image_pattern", "seg_pattern",
    "ff_pattern", "df_pattern", "group_by", "map_directory",
    "background_correction",
]


class DataModel(pydantic.BaseModel):
    data: Dict[str, Dict[str, Union[str, bool]]]


class LoadYaml(pydantic.BaseModel):
    """Validation of a dataset YAML configuration."""

    workflow: str
    config_path: Union[str, Path]

    @pydantic.validator("config_path", pre=True)
    @classmethod
    def validate_path(cls, value: Union[str, Path]) -> Union[str, Path]:
        """Validate that the configuration path exists."""
        if not Path(value).exists():
            msg = f"{value} does not exist! Please check it again."
            raise ValueError(msg)
        if isinstance(value, str):
            return Path(value)
        return value

    @pydantic.validator("workflow", pre=True)
    @classmethod
    def validate_workflow_name(cls, value: str) -> str:
        """Validate the workflow name."""
        if value not in ["analysis", "segmentation", "visualization"]:
            msg = "Please choose a valid workflow name, i.e. analysis, segmentation, or visualization."
            raise ValueError(msg)
        return value

    def parse_yaml(self) -> Dict[str, Union[str, bool]]:
        """Parse the YAML configuration file for a dataset."""
        with open(self.config_path, "r") as f:
            data = yaml.safe_load(f)

        # A parameter left blank in the YAML file is loaded as None.
        if any(value is None for value in data.values()):
            msg = "Not all parameters are defined! Please check them again."
            raise ValueError(msg)

        if self.workflow == "analysis":
            if data["background_correction"] is True and list(data.keys()) != ANALYSIS_KEYS:
                msg = "Please check the parameters again for the analysis workflow!"
                raise ValueError(msg)

        if self.workflow == "segmentation":
            if data["background_correction"] is True and list(data.keys()) != SEG_KEYS:
                msg = "Please check the parameters again for the segmentation workflow!"
                raise ValueError(msg)
        return data
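The missing-parameter check in `parse_yaml` (a blank YAML value loads as `None`) can be exercised in isolation; `check_missing` is a hypothetical helper name introduced here for illustration, not part of the module:

```python
from typing import Any, Dict, List


def check_missing(data: Dict[str, Any]) -> List[str]:
    """Return the names of parameters whose value was left blank (None)."""
    # Only None counts as missing; falsy-but-defined values like False or "" are valid.
    return [key for key, value in data.items() if value is None]


# A config with one blank parameter, as yaml.safe_load would produce it.
config = {"name": "BBBC001", "group_by": "c", "features": None}
missing = check_missing(config)
```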
Empty file added workflows/__init__.py
Empty file.