Using work blocks and order of function calls per task instance on the accelerator

Based on the characteristics of an application, you can use single-use work blocks or multi-use work blocks to efficiently implement data partitioning on the accelerators.

For a task that can be partitioned into N work blocks, the following describes how the different types of work blocks can be used, and the order of function calls per task instance, based on a single instance of the task on a single accelerator:
  1. Task instance initialization (this is done by the ALF runtime)
  2. Conditional execute: alf_accel_task_context_setup is only called if the task has context. The runtime calls it when the initial task context data has been loaded to the accelerator and before any work blocks are processed.
  3. For each work block WB(k):
    1. If there are pending context merges, go to Step 4.
    2. For each iteration i < N of a multi-use work block (N is the total number of iterations):
      1. alf_accel_input_list_prepare(WB(k), i, N): Called only when the task requires accelerator data partitioning.
      2. alf_accel_comp_kernel(WB(k), i, N): The computational kernel is always called.
      3. alf_accel_output_list_prepare(WB(k), i, N): Called only when the task requires accelerator data partitioning.
  4. Conditional execute: alf_accel_task_context_merge is only called when the context of another unloaded task instance is to be merged into the current instance.
    1. If there are pending work blocks, go to Step 3.
  5. Write out task context.
  6. Unload the image, or remain pending for the next scheduling.
    1. If a new task instance is created, go to Step 2.
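
The sequence above can be illustrated with a small host-side simulation. This is only a sketch and does not use the ALF API: run_task_instance, record, record_call, and the trace buffer are hypothetical names introduced here, and the entries "setup", "in", "kernel", and "out" stand in for alf_accel_task_context_setup, alf_accel_input_list_prepare, alf_accel_comp_kernel, and alf_accel_output_list_prepare. It assumes no context merges are pending and leaves out the runtime-internal steps 1, 4, 5, and 6. It shows one strictly sequential ordering that is consistent with the list; the actual runtime may order calls for different work blocks differently, within the rules given for step 3.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical trace buffer recording the simulated call order. */
static char trace[1024];

/* Append one entry to the trace, separated by a single space. */
static void record(const char *entry)
{
    assert(strlen(trace) + strlen(entry) + 2 < sizeof trace);
    if (trace[0] != '\0')
        strcat(trace, " ");
    strcat(trace, entry);
}

/* Record a per-iteration call as name(k,i): work block k, iteration i. */
static void record_call(const char *name, unsigned k, unsigned i)
{
    char entry[32];
    snprintf(entry, sizeof entry, "%s(%u,%u)", name, k, i);
    record(entry);
}

/*
 * Simulate steps 2 and 3 for one task instance.
 * iterations[k] is the iteration count of work block k
 * (1 for a single-use work block, more than 1 for a multi-use one).
 */
const char *run_task_instance(int has_context, int needs_partition,
                              const unsigned *iterations, unsigned num_wb)
{
    trace[0] = '\0';

    if (has_context)
        record("setup");                     /* step 2: only with context */

    for (unsigned k = 0; k < num_wb; k++) {  /* step 3: each work block */
        unsigned n = iterations[k];
        for (unsigned i = 0; i < n; i++) {
            if (needs_partition)
                record_call("in", k, i);     /* input list prepare */
            record_call("kernel", k, i);     /* kernel is always called */
            if (needs_partition)
                record_call("out", k, i);    /* output list prepare */
        }
    }
    return trace;
}
```

For example, calling run_task_instance(1, 1, iters, 2) with iters = {1, 2} models a task with context and accelerator data partitioning that processes one single-use work block followed by one two-iteration multi-use work block. Passing 0 for both flags leaves only the kernel calls in the trace.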
For step 3, the calling order of the three function calls is defined by the following rules: