flowserv.model.workflow.step module

Definitions for the different types of steps in a serial workflow. At this point we distinguish three types of workflow steps:

flowserv.model.workflow.step.CodeStep (also available under the alias flowserv.model.workflow.step.FunctionStep), flowserv.model.workflow.step.ContainerStep, and flowserv.model.workflow.step.NotebookStep.

A flowserv.model.workflow.step.CodeStep is used to execute a given function within the workflow context. The code is executed within the same thread and environment as the flowserv engine. Code steps are intended for minor actions (e.g., copying of files or reading results from previous workflow steps). For these actions it would cause too much overhead to create an external Python script that is run as a subprocess or a Docker container image.

A flowserv.model.workflow.step.ContainerStep is a workflow step that is executed in a separate container-like environment. The environment can either be a subprocess with specific environment variable settings or a Docker container.

class flowserv.model.workflow.step.CodeStep(identifier: str, func: Callable, arg: Optional[str] = None, varnames: Optional[Dict] = None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None)

Bases: flowserv.model.workflow.step.WorkflowStep

Workflow step that executes a given Python function.

The function is evaluated using the current state of the workflow arguments. If the executed function returns a result, the returned object can be added to the arguments. That is, the argument dictionary is updated and the added object is available to the following workflow steps.

exec(context: Dict)

Execute workflow step using the given arguments.

The given set of input arguments may be modified by the return value of the evaluated function.

Parameters

context (dict) – Mapping of parameter names to their current value in the workflow execution state. These are the global variables in the execution context.
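The way a code step reads from and writes to the context can be illustrated with a minimal, self-contained sketch. It does not use flowserv. The helper run_code_step is hypothetical; its parameters only mirror the func, arg, and varnames arguments of the constructor, and the name-matching logic is an assumption about how such a step could behave, not a copy of the library code.

```python
import inspect
from typing import Callable, Dict, Optional


def run_code_step(
    func: Callable,
    context: Dict,
    arg: Optional[str] = None,
    varnames: Optional[Dict] = None,
) -> None:
    """Hypothetical stand-in for a code step's exec(); not flowserv code."""
    varnames = varnames or {}
    kwargs = {}
    # Assumption: each function parameter is looked up in the context,
    # optionally under a different name given by the varnames mapping.
    for param in inspect.signature(func).parameters:
        source = varnames.get(param, param)
        if source in context:
            kwargs[param] = context[source]
    result = func(**kwargs)
    # Assumption: the result is stored in the context only if a target
    # name was given.
    if arg is not None:
        context[arg] = result


def count_lines(text: str) -> int:
    """Example function to run as a step (illustrative only)."""
    return len(text.splitlines())


context = {"report": "a\nb\nc"}
run_code_step(count_lines, context, arg="line_count", varnames={"text": "report"})
```

After the call, the context holds the function result under the key line_count in addition to the original report entry, so a later step could read it by that name.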

class flowserv.model.workflow.step.ContainerStep(identifier: str, image: str, commands: Optional[List[str]] = None, env: Optional[Dict] = None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None)

Bases: flowserv.model.workflow.step.WorkflowStep

Workflow step that is executed in a container environment. Contains a reference to the container identifier and a list of command line statements that are executed in a given environment.

add(cmd: str) flowserv.model.workflow.step.ContainerStep

Append a given command line statement to the list of commands in the workflow step.

Returns a reference to the object itself.

Parameters

cmd (string) – Command line statement

Return type

flowserv.model.workflow.step.ContainerStep
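Because add returns the object itself, calls can be chained. The following sketch shows the pattern with a hypothetical stand-in class named CommandList; it is not the flowserv class, and the image name and command strings are placeholders chosen for illustration.

```python
from typing import List, Optional


class CommandList:
    """Hypothetical stand-in that mimics the add() pattern; not flowserv code."""

    def __init__(self, image: str, commands: Optional[List[str]] = None):
        self.image = image
        # Copy the list so the caller's list is not modified by add().
        self.commands = list(commands) if commands is not None else []

    def add(self, cmd: str) -> "CommandList":
        self.commands.append(cmd)
        # Returning self is what allows calls to be chained.
        return self


step = CommandList(image="python:3.9")
result = step.add("python prepare.py").add("python analyze.py")
```

Here result and step refer to the same object, and the commands are kept in the order in which they were added.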

flowserv.model.workflow.step.FunctionStep

alias of flowserv.model.workflow.step.CodeStep

class flowserv.model.workflow.step.NotebookStep(identifier: str, notebook: str, output: Optional[str] = None, requirements: Optional[List[str]] = None, params: Optional[List[str]] = None, varnames: Optional[Dict] = None, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None)

Bases: flowserv.model.workflow.step.WorkflowStep

cli_command(context: Dict) str

Get the command to run the notebook using papermill from the command line.

This method is used when running a notebook inside a Docker container.

Parameters

context (dict) – Mapping of parameter names to their current value in the workflow execution state. These are the global variables in the execution context.
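As a rough illustration of what such a command line could look like, the sketch below assembles a papermill invocation from a context dictionary. The helper papermill_command and the notebook and parameter names are hypothetical, and which context values are forwarded is an assumption. The only external fact relied on is that the papermill command line takes an input path, an output path, and -p NAME VALUE pairs for parameters.

```python
import shlex
from typing import Dict, List


def papermill_command(
    notebook: str, output: str, params: List[str], context: Dict
) -> str:
    """Hypothetical helper; not the flowserv implementation."""
    parts = ["papermill", notebook, output]
    for name in params:
        # Assumption: only parameters present in the context are passed on.
        if name in context:
            parts.extend(["-p", name, str(context[name])])
    # Quote every token so values with spaces survive the shell.
    return " ".join(shlex.quote(p) for p in parts)


cmd = papermill_command(
    notebook="analysis.ipynb",
    output="analysis.out.ipynb",
    params=["threshold", "label"],
    context={"threshold": 0.5, "label": "run one", "unused": 1},
)
```

Quoting each token with shlex.quote keeps values that contain spaces intact when the resulting string is passed to a shell inside the container.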

exec(context: Dict, rundir: str)

Execute the notebook using papermill in the given workflow context.

Parameters
  • context (dict) – Mapping of parameter names to their current value in the workflow execution state. These are the global variables in the execution context.

  • rundir (string) – Directory for the workflow run that contains all the run files.

class flowserv.model.workflow.step.WorkflowStep(identifier: str, step_type: int, inputs: Optional[List[str]] = None, outputs: Optional[List[str]] = None)

Bases: object

Base class for the different types of steps (actor) in a serial workflow.

We distinguish several workflow steps including steps that are executed in a container-like environment and steps that directly execute Python code.

The aim of this base class is to provide functions to distinguish between the different types of steps and to maintain properties that are common to all steps.

Each step in a serial workflow has a unique identifier (name) and optional lists of input files and output files. All files are specified as relative path expressions (keys).

is_code_step() bool

True if the workflow step is of type flowserv.model.workflow.step.CodeStep.

Return type

bool

is_container_step() bool

True if the workflow step is of type flowserv.model.workflow.step.ContainerStep.

Return type

bool

is_notebook_step() bool

True if the workflow step is of type flowserv.model.workflow.step.NotebookStep.

Return type

bool

property name: str

Synonym for the step identifier.

Return type

string

flowserv.model.workflow.step.output_notebook(name: str, input: str) str

Generate name for output notebook.

If an output name is given, it is returned as is. Otherwise, the output name is derived from the name of the input notebook by replacing the suffix .ipynb with .out.ipynb. If the input notebook name does not end with .ipynb, the suffix .out.ipynb is appended to it.

Parameters
  • name (string) – User-provided name for the output notebook. This value may be None.

  • input (string) – Name of the input notebook.

Return type

string
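The naming rule described above can be restated as a short, self-contained sketch. The function output_notebook_name is a hypothetical re-implementation written from the description only, not the flowserv source; the second parameter is called input_name here to avoid shadowing the Python built-in input.

```python
from typing import Optional


def output_notebook_name(name: Optional[str], input_name: str) -> str:
    """Hypothetical restatement of the naming rule; not flowserv code."""
    # A user-provided name always wins.
    if name is not None:
        return name
    suffix = ".ipynb"
    if input_name.endswith(suffix):
        # Replace the trailing .ipynb with .out.ipynb.
        return input_name[: -len(suffix)] + ".out.ipynb"
    # No .ipynb suffix: append .out.ipynb to the full name.
    return input_name + ".out.ipynb"


derived = output_notebook_name(None, "analysis.ipynb")
appended = output_notebook_name(None, "analysis")
explicit = output_notebook_name("result.ipynb", "analysis.ipynb")
```

The three calls cover the three cases in the description: a derived name for an input ending in .ipynb, an appended suffix for an input without it, and a user-provided name that is returned unchanged.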