Application model
Application models have a strict hierarchical structure and follow a building-block approach comprising phases and tasks. While phases represent stages within an application, tasks represent the load executed on the simulated platform.
To allow dynamic reconfigurations during runtime, ElastiSim introduces scheduling points. Malleable applications reaching a scheduling point can adapt to new resources if the scheduler has decided to reconfigure the job. Thus, dividing applications into phases and tasks allows for modeling malleable (and evolving) workloads. ElastiSim includes scheduling points by default at the beginning of each phase (unless specified otherwise).
Scheduling decisions not taken at a job’s scheduling point (e.g., periodic invocation) will be applied when the job reaches its next scheduling point. If a job has no further scheduling point, the reconfiguration remains unapplied.
Phases
Users define phases by specifying the following properties using the JSON format:
Property | Description | Value type | Default value | Mandatory |
---|---|---|---|---|
phases | Array containing all phases (top-level structure) | array | - | Yes |
iterations | Number of iterations (i.e., repetitions) of the phase (has to be inferable at start time) | integer or string (see Performance models) | 0 | No |
scheduling_point | Whether a scheduling point is included before (each iteration of) the phase | bool | true | No |
evolving_request | Evolving request specifying the number of nodes to issue before (each iteration of) the phase (unspecified implies no evolving request) | integer or string (see Performance models) | unspecified | No |
final_scheduling_point | ||||
barrier | Whether there is a barrier at the beginning of the phase (only considered when there is no corresponding scheduling point or evolving request) | bool | true | No |
tasks | Array holding the tasks of the phase | array | - | Yes |
By definition, applications do not hold scheduling points or evolving requests before the first phase (for phases specified in the phases
array).
ElastiSim clips evolving requests by default to stay in the possible range of configurations (i.e., [num_nodes_min
, num_nodes_max
], see Configuration). Applications consider evolving requests only if the number of requested nodes differs from the number of assigned nodes. In those cases, invoking the scheduler with an evolving request is mandatory. For adaptive jobs, if a phase specifies an evolving request and a scheduling point, the evolving request has a higher priority and will take precedence over the scheduling point.
Reconfiguration penalty
Reconfiguring applications during runtime can introduce additional overhead while the application adapts to its new configuration (e.g., data redistribution). ElastiSim defines that overhead as the reconfiguration penalty and provides two particular phases executed that can reflect such overhead. The on reconfiguration phase runs on all assigned resources on each reconfiguration. In contrast, the on expansion phase runs only on newly assigned resources. As applications also might have an initialization phase, ElastiSim provides an additional on initialization phase executed only on the first configuration of a job to facilitate the modeling of all stages during application execution.
Phase | Description |
---|---|
on_init | Executed only on the first job configuration (i.e., once in total) |
on_reconfiguration | Executed on each reconfiguration on all resources |
on_expansion | Executed on each reconfiguration only on newly assigned resources |
Tasks
Analogously, tasks are defined with the following properties:
Property | Description | Value type | Default value | Mandatory |
---|---|---|---|---|
type | Task type (see Task types) | string | - | Yes |
name | Name of the task (only relevant in log messages) | string | None | No |
iterations | Number of iterations (i.e., repetitions) of the task | integer or string (see Performance models) | 0 | No |
synchronized | Whether all resources (i.e., compute nodes) synchronize before executing the task (similar to an MPI_Barrier() before execution) | bool | false | No |
Example application model
An example of an application model with ten repetitions of a compute (CPU) and PFS write task.
See Task types for detailed task descriptions.
{
"on_init": {
"tasks": [
{
"type": "pfs_read",
"name": "Read model",
"bytes": "model_size",
"pattern": "root_only"
},
{
"type": "cpu",
"name": "Scatter model",
"bytes": "model_size",
"communication_pattern": "scatter"
}
]
},
"on_reconfiguration": {
"tasks": [
{
"type": "pfs_read",
"name": "Read model",
"bytes": "model_size",
"pattern": "root_only"
},
{
"type": "cpu",
"name": "Scatter model",
"bytes": "model_size",
"communication_pattern": "scatter"
}
]
},
"phases": [
{
"iterations": "iterations",
"evolving_request": "num_nodes % 2 == 0 ? num_nodes + 1 : num_nodes - 1",
"tasks": [
{
"type": "cpu",
"name": "Compute & communicate",
"flops": "flops/num_nodes^alpha",
"computation_pattern": "uniform",
"bytes": "communication_size",
"communication_pattern": "all_to_all",
"coupled": true
},
{
"type": "pfs_write",
"name": "Checkpoint",
"bytes": "checkpoint_size",
"pattern": "all_ranks"
}
]
}
]
}