Jobs¶
The job system is how Eyrie executes Deployment Books. Every lifecycle action (deploy, pause, backup, etc.) creates a job that serializes the book’s steps and runs them sequentially.
Job Model¶
eyrie.deployment.job is the runtime representation of a Deployment Book
execution. Key fields:
book_id — the Deployment Book being executed.
deployment_id / deployment_group_id / cluster_id — the scoped resources this job operates on.
state — draft → scheduled → running → done / error / cancel.
step_ids — serialized copies of the book’s steps, created at job creation time.
current_step_id — the step currently being executed.
Job Lifecycle¶
Draft — the job record is created.
Scheduled — the job is enqueued via OCA queue_job. The date_scheduled timestamp is recorded.
Running — the job runner picks up the job and begins executing steps in sequence. date_running is set.
Done — all steps completed successfully. date_done is set.
Error — a step raised a non-retryable error.
Cancel — the job was manually cancelled.
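The lifecycle above can be read as a small state machine. The following is a minimal standalone sketch, not Eyrie's actual model code: JobSketch and TRANSITIONS are hypothetical names, and the assumptions that a job can be cancelled from any non-terminal state and that done, error and cancel are terminal are illustrative only. The state names and the date_scheduled, date_running and date_done fields come from the text.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Allowed transitions, read off the lifecycle list.
# Assumption: cancel is reachable from every non-terminal state.
TRANSITIONS = {
    "draft": {"scheduled", "cancel"},
    "scheduled": {"running", "cancel"},
    "running": {"done", "error", "cancel"},
    "done": set(),
    "error": set(),
    "cancel": set(),
}

# Timestamp field recorded when a state is entered (field names from the text).
TIMESTAMP_FIELDS = {
    "scheduled": "date_scheduled",
    "running": "date_running",
    "done": "date_done",
}


@dataclass
class JobSketch:
    """Hypothetical stand-in for an eyrie.deployment.job record."""

    state: str = "draft"
    date_scheduled: Optional[datetime] = None
    date_running: Optional[datetime] = None
    date_done: Optional[datetime] = None

    def transition(self, new_state: str) -> None:
        """Move to new_state, rejecting transitions not listed above."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state} -> {new_state}")
        self.state = new_state
        stamp = TIMESTAMP_FIELDS.get(new_state)
        if stamp:
            setattr(self, stamp, datetime.now())


# Walk a job through the successful path.
job = JobSketch()
for state in ("scheduled", "running", "done"):
    job.transition(state)
```

In this sketch, an attempt to move a finished job back to running raises ValueError, which mirrors the one-way progression described in the list.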
Step Serialization¶
When a job is created, Eyrie copies the book’s steps into
eyrie.deployment.job.step records. Each step’s
python_serialize_job_step_data is evaluated at this point, injecting
computed data into the step’s context. This means the job is a snapshot — later
changes to the book do not affect in-flight jobs.
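To illustrate the snapshot behaviour, here is a minimal sketch in plain Python. BookStep, JobStep and serialize_steps are hypothetical names, and the serialize_data callable is only a stand-in for python_serialize_job_step_data; how Eyrie actually evaluates that field is not shown here.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class BookStep:
    """Hypothetical stand-in for a step defined on a Deployment Book."""

    name: str
    context: Dict[str, Any] = field(default_factory=dict)
    # Stand-in for python_serialize_job_step_data: returns data to inject.
    serialize_data: Callable[[], Dict[str, Any]] = dict


@dataclass
class JobStep:
    """Hypothetical stand-in for an eyrie.deployment.job.step record."""

    name: str
    context: Dict[str, Any]


def serialize_steps(book_steps: List[BookStep]) -> List[JobStep]:
    """Copy each book step and inject its computed data exactly once."""
    job_steps = []
    for step in book_steps:
        context = copy.deepcopy(step.context)  # detach from the book
        context.update(step.serialize_data())  # evaluated at job creation
        job_steps.append(JobStep(name=step.name, context=context))
    return job_steps


book = [BookStep("pull_image", {"tag": "1.0"}, lambda: {"host": "node-1"})]
job_steps = serialize_steps(book)

# Editing the book afterwards leaves the job's copy untouched.
book[0].context["tag"] = "2.0"
book[0].name = "renamed"
```

After the edits to the book, job_steps[0] still carries the name pull_image and the tag 1.0, which is the property that protects in-flight jobs.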
Retry Mechanism¶
Steps can signal that they should be retried by raising a
RetryableJobError (from OCA queue_job) or returning a
JobStepResultRetryable. The job runner re-enqueues the step and tries
again after a delay.
Non-retryable failures raise FailedJobError or return
JobStepResultFailed, which moves the job to the error state.
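The two signalling styles (raising or returning) can be sketched as follows. This is a standalone illustration under stated assumptions: the four classes are local stand-ins that only reuse the names from the text (the real exceptions come from OCA queue_job, and the result classes from Eyrie), classify and run_step are hypothetical helpers, and the max_retries limit and the rule that exhausted retries become an error are assumptions. A real runner would re-enqueue the step with a delay instead of looping in-process.

```python
class RetryableJobError(Exception):
    """Local stand-in for the queue_job exception of the same name."""


class FailedJobError(Exception):
    """Local stand-in for the queue_job exception of the same name."""


class JobStepResultRetryable:
    """Local stand-in: returned by a step that wants another attempt."""


class JobStepResultFailed:
    """Local stand-in: returned by a step that failed for good."""


def classify(step):
    """Run one step and map its outcome to 'ok', 'retry' or 'error'."""
    try:
        result = step()
    except RetryableJobError:
        return "retry"
    except FailedJobError:
        return "error"
    if isinstance(result, JobStepResultRetryable):
        return "retry"
    if isinstance(result, JobStepResultFailed):
        return "error"
    return "ok"


def run_step(step, max_retries=3):
    """Attempt a step up to max_retries times; return (outcome, attempts)."""
    for attempt in range(1, max_retries + 1):
        outcome = classify(step)
        if outcome != "retry":
            return outcome, attempt
        # Placeholder: a real runner would re-enqueue with a delay here.
    return "error", max_retries  # assumption: exhausted retries become an error


# A step that is not ready on its first two attempts.
attempts = {"n": 0}


def flaky_step():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RetryableJobError("service not ready yet")
    return None


outcome = run_step(flaky_step)  # ('ok', 3)
```

Treating the exception and the return value identically keeps the runner logic in one place, whichever style a step author prefers.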
Integration with OCA queue_job¶
Eyrie delegates job scheduling and execution to the OCA queue_job module.
This provides:
Configurable concurrency channels.
Automatic retries with exponential backoff.
Dead-letter handling for permanently failed jobs.
Database-level locking to prevent duplicate execution.
Estimated Durations and Timeouts¶
Each step has an estimated_duration (seconds) and an
estimated_duration_over_stop multiplier (default: 10×). If a step runs
longer than estimated_duration × estimated_duration_over_stop, it is
considered timed out. Historical averages from previous job runs override the
static estimate during serialization.
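As a worked example of the timeout rule, the sketch below computes the limit for one step. The helper names are hypothetical, and using a plain arithmetic mean of past durations is an assumption about how the historical average is formed; estimated_duration and the 10× default multiplier come from the text.

```python
from statistics import mean
from typing import Sequence


def effective_estimate(static_estimate: float, history: Sequence[float]) -> float:
    """Use the historical average when previous runs exist, else the static value."""
    return mean(history) if history else static_estimate


def timeout_seconds(estimated_duration: float, over_stop: float = 10.0) -> float:
    """Limit = estimated_duration x estimated_duration_over_stop."""
    return estimated_duration * over_stop


def is_timed_out(elapsed: float, estimated_duration: float, over_stop: float = 10.0) -> bool:
    """A step times out only when it runs strictly longer than the limit."""
    return elapsed > timeout_seconds(estimated_duration, over_stop)


# Static estimate of 30 s, but three earlier runs took 40, 50 and 60 s.
estimate = effective_estimate(30, [40, 50, 60])  # historical average: 50 s
limit = timeout_seconds(estimate)                # 50 x 10 = 500 s
```

With these numbers a step still running after 501 seconds would be treated as timed out, while one at exactly 500 seconds would not.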
Bus Notifications¶
When a job or step changes state, Eyrie sends a bus notification
(eyrie.deploy_event) so that the portal UI can update in real time. Users
see job progress, step completion, and errors without refreshing the page.
Jobs with should_notify = True (inherited from the book’s
should_notify flag) also trigger user-facing notifications.