Task types

Each task has a type that defines the load it injects into the simulated platform, and each type has its own properties specifying the task's execution behavior. This page summarizes all task types and their properties and explains how ElastiSim distributes the simulated load across resources.

Task payloads & distribution patterns

Every task carries a load that is simulated on the platform; ElastiSim calls this load the task’s payload. The unit of a payload depends on the task type (e.g., FLOPS for compute tasks or bytes for communication). Payloads are always defined using a single number and distributed among participating resources following a payload distribution pattern.

ElastiSim defines two types of distribution patterns: vector and matrix. While vector distribution patterns consider a one-dimensional distribution (e.g., FLOPS per compute node), matrix distribution patterns define communication matrices.

Vector distribution patterns

Pattern	Description
`all_ranks`	The payload is evenly distributed among all resources
`root_only`	Only the first resource performs the specified payload
`even_ranks`	The payload is evenly distributed among all even-numbered resources
`odd_ranks`	The payload is evenly distributed among all odd-numbered resources
`uniform`	All assigned resources perform the specified payload without any distribution
`vector`	An explicit vector defining the payload for each participating resource (only applicable to rigid jobs)

uniform is the one exception among the distribution patterns: it does not distribute the workload but describes the payload per resource. It is syntactic sugar for the pattern all_ranks with the performance model <payload size> * num_nodes or <payload size> * num_gpus.
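
The patterns in the table above can be sketched in a few lines of Python. This is an illustration only, not ElastiSim's implementation: the function name `distribute_vector` is hypothetical, and the sketch assumes that ranks are numbered from 0 (so rank 0 counts as even and as the "first" resource). The explicit `vector` pattern is left out because it needs no computation.

```python
def distribute_vector(payload, pattern, num_resources):
    """Return the per-resource payload for a vector distribution pattern.

    Assumption: ranks are numbered from 0, so rank 0 is even and is the root.
    """
    ranks = range(num_resources)
    if pattern == "uniform":
        # No distribution: every resource performs the full payload.
        return [float(payload)] * num_resources
    if pattern == "all_ranks":
        selected = list(ranks)
    elif pattern == "root_only":
        selected = [0]
    elif pattern == "even_ranks":
        selected = [r for r in ranks if r % 2 == 0]
    elif pattern == "odd_ranks":
        selected = [r for r in ranks if r % 2 == 1]
    else:
        raise ValueError(f"unsupported pattern: {pattern}")
    if not selected:
        return [0.0] * num_resources
    # Split the single payload number evenly among the selected ranks.
    share = payload / len(selected)
    return [share if r in selected else 0.0 for r in ranks]


if __name__ == "__main__":
    for name in ("all_ranks", "root_only", "even_ranks", "odd_ranks", "uniform"):
        print(name, distribute_vector(8e11, name, 4))
```

Under these assumptions, `distribute_vector(p, "uniform", n)` yields the same list as `distribute_vector(p * n, "all_ranks", n)`, which mirrors the syntactic-sugar relationship described above.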

Matrix distribution patterns

Pattern	Description
`all_to_all`	Each resource communicates bi-directionally with every other resource
`gather`	The first resource receives uni-directionally from all remaining resources
`scatter`	The first resource sends uni-directionally to all remaining resources
`master_worker`	The first resource bi-directionally communicates with all remaining resources
`ring`	Each resource communicates bi-directionally with its direct neighbors
`ring_clockwise`	Each resource communicates uni-directionally with its right neighbor
`ring_counter_clockwise`	Each resource communicates uni-directionally with its left neighbor
`matrix`	An explicit matrix defining the payload for each possible pair of resources (defined as a vector with the dimension #resources × #resources, only applicable to rigid jobs)
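
To make the communication structure of these patterns concrete, the following Python sketch lists which directed (sender, receiver) pairs each pattern implies. It is a hedged illustration rather than ElastiSim's code: the function name `communication_pairs` is hypothetical, ranks are assumed to start at 0, and the "right" neighbor of rank i is assumed to be (i + 1) mod n. The sketch deliberately does not model how the payload bytes are split across pairs, since this page does not specify that.

```python
def communication_pairs(pattern, n):
    """Return the set of directed (sender, receiver) pairs for a pattern.

    Assumptions: ranks start at 0, rank 0 is the "first" resource, and the
    right neighbor of rank i is (i + 1) % n.
    """
    others = range(1, n)
    if pattern == "all_to_all":
        return {(i, j) for i in range(n) for j in range(n) if i != j}
    if pattern == "gather":
        return {(i, 0) for i in others}
    if pattern == "scatter":
        return {(0, i) for i in others}
    if pattern == "master_worker":
        # Bi-directional: the union of scatter and gather.
        return {(0, i) for i in others} | {(i, 0) for i in others}
    # A ring needs at least two resources to have any neighbor.
    right = {(i, (i + 1) % n) for i in range(n) if n > 1}
    left = {(i, (i - 1) % n) for i in range(n) if n > 1}
    if pattern == "ring_clockwise":
        return right
    if pattern == "ring_counter_clockwise":
        return left
    if pattern == "ring":
        return right | left
    raise ValueError(f"unsupported pattern: {pattern}")


if __name__ == "__main__":
    print(sorted(communication_pairs("ring_clockwise", 4)))
```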

CPU computation & communication task

ElastiSim divides computational tasks into two parts: computation and communication. The reasoning behind this twofold structure is to allow overlapping and coupling of computation and communication. Each assigned node computes the load based on its computational capabilities and communicates using the links defined by the underlying topology. However, users are not required to specify both payloads. Specifying only a computation or only a communication payload is valid; ElastiSim then simulates only the specified payload. Setting the type property to cpu makes the following properties available:

Property	Description	Value type	Default value	Mandatory
`flops`	Computational load of the task	integer (FLOPS)	-	Yes, if `bytes` is not specified
`computation_pattern`	Payload distribution pattern of the computational load	vector distribution pattern	-	Yes, if `flops` is specified
`bytes`	Communication load of the task	integer (bytes)	-	Yes, if `flops` is not specified
`communication_pattern`	Payload distribution pattern of the communication load	matrix distribution pattern	-	Yes, if `bytes` is specified
`coupled`	Whether computation and communication are strictly coupled (i.e., bound by the slowest resource among all participating nodes)	bool	false	No

Example

{
  "type": "cpu",
  "name": "CPU compute & communication",
  "flops": 8e11,
  "computation_pattern": "uniform",
  "bytes": 5e10,
  "communication_pattern": "all_to_all",
  "coupled": true
}
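
The "Mandatory" column of the table above encodes a few conditional rules. The following Python sketch checks them for a task given as a dictionary. The function name `validate_cpu_task` is hypothetical; this is not a validator shipped with ElastiSim, and it checks only the presence of properties, not their values.

```python
def validate_cpu_task(task):
    """Return a list of rule violations for a task of type "cpu".

    The rules follow the Mandatory column of the property table.
    """
    errors = []
    has_flops = "flops" in task
    has_bytes = "bytes" in task
    if not has_flops and not has_bytes:
        errors.append("at least one of 'flops' or 'bytes' is required")
    if has_flops and "computation_pattern" not in task:
        errors.append("'computation_pattern' is required when 'flops' is set")
    if has_bytes and "communication_pattern" not in task:
        errors.append("'communication_pattern' is required when 'bytes' is set")
    return errors


if __name__ == "__main__":
    compute_only = {"type": "cpu", "flops": 8e11, "computation_pattern": "uniform"}
    missing_pattern = {"type": "cpu", "bytes": 5e10}
    print(validate_cpu_task(compute_only))
    print(validate_cpu_task(missing_pattern))
```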

GPU computation & communication task

Analogous to CPU tasks, GPU tasks also comprise computation and communication. However, as compute nodes can be equipped with multiple GPUs, the communication among GPUs takes place using intra- or inter-node communication. Depending on the platform topology, ElastiSim automatically utilizes the correct links. Setting the type property to gpu makes the following properties available:

Property	Description	Value type	Default value	Mandatory
`flops`	Computational load of the task	integer (FLOPS)	-	Yes, if `bytes` is not specified
`computation_pattern`	Payload distribution pattern of the computational load	vector distribution pattern	-	Yes, if `flops` is specified
`bytes`	Communication load of the task	integer (bytes)	-	Yes, if `flops` is not specified
`communication_pattern`	Payload distribution pattern of the communication load	matrix distribution pattern	-	Yes, if `bytes` is specified

Example

{
  "type": "gpu",
  "name": "GPU compute & communication",
  "flops": 8e12,
  "computation_pattern": "all_ranks",
  "bytes": 7e10,
  "communication_pattern": "ring_clockwise"
}

I/O tasks

All I/O tasks follow the same structure and define the operation (read or write) and the target of the operation (PFS or node-local burst buffer).

`type`	Description
`pfs_read`	Read operation targeting the PFS
`pfs_write`	Write operation targeting the PFS
`bb_read`	Read operation targeting burst buffers
`bb_write`	Write operation targeting burst buffers

In contrast to compute tasks, I/O tasks support asynchronous execution. They provide the following properties:

Property	Description	Value type	Default value	Mandatory
`bytes`	I/O size	integer (bytes)	-	Yes
`pattern`	Payload distribution pattern of the I/O size	vector distribution pattern	-	Yes
`async`	Whether the operation is executed asynchronously	bool	false	No

Example

{
  "type": "pfs_write",
  "name": "PFS write",
  "bytes": 5e11,
  "pattern": "all_ranks"
}

Delay tasks

Delay tasks are generic tasks that occupy the compute node for a given amount of time, which is useful whenever computation, communication, or I/O tasks cannot appropriately model the application. ElastiSim has two flavors of delay tasks, representing either an idling or a busy-wait activity. While idle tasks occupy compute nodes without resource utilization, busy-wait tasks fully utilize the compute capabilities. Setting the type property to either idle or busy_wait introduces the following property:

Property	Description	Value type	Default value	Mandatory
`delay`	Period of time to occupy resources	integer (seconds)	-	Yes

Example

{
  "type": "busy_wait",
  "name": "Busy wait",
  "delay": 720,
  "pattern": "uniform"
}

Task sequences

Task sequences are simple containers that are especially useful for the repeated execution of a specific sequence of tasks. Setting the type property to sequence defines a task sequence and makes the following property available:

Property	Description	Value type	Default value	Mandatory
`tasks`	Array of tasks	array	-	Yes

ElastiSim defines sequences recursively, allowing them to be nested.

Example

{
  "type": "sequence",
  "iterations": 12,
  "tasks": [
    {
      "type": "cpu",
      "flops": 8e10,
      "computation_pattern": "uniform"
    },
    {
      "type": "pfs_write",
      "name": "PFS write",
      "bytes": 6e10,
      "pattern": "all_ranks"
    }
  ]
}
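
Because sequences can be nested, it can help to see how a nested definition unrolls into the order of executed tasks. The Python sketch below does this under two assumptions that this page does not state explicitly: that `iterations` repeats the task it is set on, and that it defaults to 1 when omitted. The function name `expand` is hypothetical.

```python
def expand(task):
    """Flatten a (possibly nested) task into the ordered list of task types.

    Assumption: "iterations" repeats a task and defaults to 1 when omitted.
    """
    iterations = task.get("iterations", 1)
    if task["type"] == "sequence":
        # Recurse into the children, keeping their order.
        once = [t for child in task["tasks"] for t in expand(child)]
    else:
        once = [task["type"]]
    return once * iterations


if __name__ == "__main__":
    sequence = {
        "type": "sequence",
        "iterations": 12,
        "tasks": [
            {"type": "cpu", "flops": 8e10, "computation_pattern": "uniform"},
            {"type": "pfs_write", "bytes": 6e10, "pattern": "all_ranks"},
        ],
    }
    order = expand(sequence)
    print(len(order), order[:4])
```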

Resource contention

All tasks in ElastiSim (except idle) utilize resources. While compute capabilities can be exclusively available to jobs if oversubscription is disabled (see Configuration), network communication depends on the underlying platform topology. The simulation engine evenly distributes the bandwidth of shared links when they are utilized by multiple jobs or overlapping asynchronous I/O tasks. If oversubscription is enabled, jobs can also share computational resources. While CPUs are shared evenly and immediately with the execution of a new task, GPUs (and intra-node links) are utilized exclusively following a first-come, first-served policy.

As busy_wait tasks utilize the compute capabilities of a node, multiple jobs oversubscribing the same node with a busy_wait (or even cpu) task will compete for resources (e.g., two busy_wait tasks of 15 minutes will take 30 minutes to finish).
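
The arithmetic behind this example can be reproduced with a small fair-sharing calculation. The Python sketch below is a simplified model, not ElastiSim's simulation engine: the function name `fair_share_finish_times` is hypothetical, and it assumes that all tasks start at the same time on a single node whose CPU capacity is split evenly among the tasks that are still running.

```python
def fair_share_finish_times(work_seconds):
    """Finish times for tasks that start together and share one node evenly.

    work_seconds[i] is how long task i would take with the node to itself.
    """
    remaining = sorted((w, i) for i, w in enumerate(work_seconds))
    finish = [0.0] * len(work_seconds)
    now = 0.0
    done_work = 0.0  # work already completed by every still-active task
    active = len(remaining)
    for work, index in remaining:
        # Each active task progresses at rate 1/active, so finishing the
        # outstanding work takes `active` times as long as it would alone.
        now += (work - done_work) * active
        done_work = work
        finish[index] = now
        active -= 1
    return finish


if __name__ == "__main__":
    # Two busy_wait tasks of 15 minutes (900 s) each on the same node.
    print(fair_share_finish_times([900, 900]))
```

With two 900-second tasks, both finish after 1800 seconds (30 minutes) in this model, matching the example above. With unequal tasks, such as 600 and 900 seconds, the shorter one finishes at 1200 seconds and the longer one at 1500 seconds, because the remaining task gets the full node once the other completes.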