3.7.8. Retry Failed Steps
In this chapter, you’ll learn how to configure steps to allow retry on failure.
What is a Step Retry?
A step retry is a mechanism that allows a step to be retried automatically when it fails. This is useful for handling transient errors, such as network issues or temporary unavailability of a service.
By default, when a step fails, the workflow execution stops, and the workflow is marked as failed.
You can instead configure the workflow engine to retry the step automatically a specified number of times before marking the workflow as failed. This can help improve the reliability and resilience of your workflows.
You can also configure the interval between retries, allowing you to wait for a certain period before attempting the step again. This is useful when the failure is due to a temporary issue that may resolve itself after some time.
For example, if a step captures a payment, you may want to retry it daily until the payment is successful or the maximum number of retries is reached.
Configure a Step’s Retry
By default, when an error occurs in a step, the step and the workflow fail, and the execution stops.
You can configure the step to retry on failure. The createStep function can accept a configuration object instead of the step’s name as its first parameter.
For example:
```ts
import {
  createStep,
  createWorkflow,
  WorkflowResponse,
} from "@medusajs/framework/workflows-sdk"

const step1 = createStep(
  {
    name: "step-1",
    // Retry the step up to two times after its first failure
    maxRetries: 2,
  },
  async () => {
    console.log("Executing step 1")

    throw new Error("Oops! Something happened.")
  }
)

const myWorkflow = createWorkflow(
  "hello-world",
  function () {
    const str1 = step1()

    return new WorkflowResponse({
      message: str1,
    })
  }
)

export default myWorkflow
```
The step’s configuration object accepts a maxRetries property, which is a number indicating how many times a step can be retried when it fails.
When you execute the above workflow, the message Executing step 1 is logged three times in the terminal. The first line indicates the first time the step was executed, and the next two lines indicate the times the step was retried. After that, the step and workflow fail.
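The relationship between maxRetries and the total number of executions can be modeled in plain TypeScript. The sketch below does not use Medusa's workflow engine: executeWithRetries and RetryOutcome are hypothetical names introduced only to illustrate that one initial execution plus maxRetries retries gives three executions when maxRetries is 2.

```ts
// Hypothetical model of retry counting; not part of Medusa's API.
type RetryOutcome = {
  executions: number
  failed: boolean
  lastError?: unknown
}

function executeWithRetries(
  maxRetries: number,
  invoke: () => void
): RetryOutcome {
  let executions = 0
  let lastError: unknown

  // One initial execution, then up to maxRetries further attempts.
  while (executions <= maxRetries) {
    executions++
    try {
      invoke()
      return { executions, failed: false }
    } catch (error) {
      // Remember the error and try again if attempts remain.
      lastError = error
    }
  }

  return { executions, failed: true, lastError }
}

// A step that always throws runs three times when maxRetries is 2.
const outcome = executeWithRetries(2, () => {
  throw new Error("Oops! Something happened.")
})
console.log(outcome.executions, outcome.failed)
```

A step that succeeds on a retry stops the loop early, so the workflow would continue instead of failing.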
Disable Automatic Retries
By default, a step configured with the maxRetries property will be retried automatically when it fails.
You can disable automatic retries by setting the autoRetry property to false in the step's configuration object. Then, when the step fails, its status will be set to temporary failure, and you'll need to manually retry it using the Workflow Engine Module Service.
For example, to disable automatic retries:
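A minimal sketch of such a configuration follows. The property names (name, maxRetries, autoRetry) come from this chapter; StepRetryOptions is a local stand-in type, assumed here only so the snippet compiles without the Medusa packages, and the createStep call is shown as a comment.

```ts
// Local stand-in type (an assumption for illustration, not Medusa's own type).
type StepRetryOptions = {
  name: string
  maxRetries?: number
  autoRetry?: boolean
}

const step1Options: StepRetryOptions = {
  name: "step-1",
  maxRetries: 2,
  // Don't retry automatically; the step waits for a manual retry instead.
  autoRetry: false,
}

// In a Medusa project, pass the object as createStep's first parameter:
// const step1 = createStep(step1Options, async () => { /* step logic */ })
```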
This step will not be retried automatically when it fails. Instead, you'll need to manually retry it, as explained in the Manually Retry a Step section.
Step Retry Intervals
By default, a step is retried immediately after it fails. To specify a wait time before a step is retried, pass a retryInterval property to the step's configuration object. Its value is the number of seconds to wait before retrying the step.
For example:
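The sketch below shows a configuration object with a retry interval. The property names are taken from this chapter; StepIntervalOptions is a local stand-in type assumed for illustration, and the createStep call appears only as a comment so the snippet runs without the Medusa packages.

```ts
// Local stand-in type (an assumption for illustration, not Medusa's own type).
type StepIntervalOptions = {
  name: string
  maxRetries?: number
  retryInterval?: number // seconds to wait before each retry
}

const intervalOptions: StepIntervalOptions = {
  name: "step-1",
  maxRetries: 2,
  retryInterval: 2, // wait 2 seconds before retrying
}

// In a Medusa project, pass the object as createStep's first parameter:
// const step1 = createStep(intervalOptions, async () => { /* step logic */ })
```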
In this example, if the step fails, it will be retried after two seconds.
Maximum Retry Interval
The retryInterval property's maximum value is Number.MAX_SAFE_INTEGER, so you can make a step wait a very long time before it is retried.
For example, to retry a step after a day:
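As a sketch, the interval can be derived from a named constant so the unit stays obvious. SECONDS_PER_DAY and dailyRetryOptions are illustrative names, not part of Medusa's API, and the createStep call is shown as a comment so the snippet runs on its own.

```ts
// One day expressed in seconds: 24 hours * 60 minutes * 60 seconds.
const SECONDS_PER_DAY = 24 * 60 * 60

// Options object using the property names described in this chapter.
const dailyRetryOptions = {
  name: "step-1",
  maxRetries: 2,
  retryInterval: SECONDS_PER_DAY, // 86400 seconds
}

// In a Medusa project, pass the object as createStep's first parameter:
// const step1 = createStep(dailyRetryOptions, async () => { /* step logic */ })
```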
In this example, if the step fails, it will be retried after 86400 seconds (one day).
Interval Changes Workflow to Long-Running
By setting retryInterval on a step, a workflow that uses that step becomes a long-running workflow that runs asynchronously in the background. This is useful when creating workflows that may fail and should run for a long time until they succeed, such as waiting for a payment to be captured or a shipment to be delivered.
However, since the long-running workflow runs in the background, you won't receive its result or errors immediately when you execute the workflow.
Instead, you must subscribe to the workflow's execution using the Workflow Engine Module's service. Learn more about this in the Long-Running Workflows chapter.
Manually Retry a Step
In some cases, you may need to manually retry a step. For example:
- If a step's autoRetry property is set to false.
- If the machine running the Medusa application in development or worker mode dies or shuts down.
- If the step takes longer than expected to complete.
To retry a step manually, resolve the Workflow Engine Module's service from the Medusa container and call its retryStep method.
For example, to do it in an API route:
```ts
import { MedusaRequest, MedusaResponse } from "@medusajs/framework/http"
import { TransactionHandlerType } from "@medusajs/framework/utils"

export async function GET(
  req: MedusaRequest,
  res: MedusaResponse
) {
  // Resolve the Workflow Engine Module's service from the Medusa container
  const workflowEngine = req.scope.resolve("workflows")

  // Identify the step to retry by its workflow, transaction, and step IDs
  workflowEngine.retryStep({
    idempotencyKey: {
      action: TransactionHandlerType.INVOKE,
      transactionId: req.validatedQuery.transaction_id as string,
      stepId: "step-1",
      workflowId: "hello-world",
    },
  })

  res.json({ message: "Step retry initiated" })
}
```
When you send a request to the API route, the workflow execution will resume, retrying the specified step.
Learn more about the retryStep method in the Workflow Engine Module Service reference.