Fusebit Tasks provide support for long-running, asynchronous code execution. Tasks are useful in three situations:

  1. Long execution times. When the logic you need to execute exceeds the reasonable lifetime of an HTTP request, Tasks provide a model for running code for up to 14 minutes. For example, you can process large CSV files using Tasks. Once scheduled, the status of a Task and its result can be queried asynchronously.
  2. Delayed execution. When you want to execute logic at some point in the future, you can schedule a Task to be executed up to 24 hours in advance. For example, you can use delayed Tasks to schedule a retry of a call to an external service that was rejected with an HTTP 429 response.
  3. Throttling. Fusebit can guarantee that no more than a specified number of Task instances will execute at the same time. For example, you can use throttled Tasks to limit the number of concurrent calls to an external API. You can also limit the number of concurrent calls to your own system in response to an external webhook.

📘

Fusebit CLI Version

To manipulate Fusebit Tasks using the Fusebit CLI, you need to have @fusebit/[email protected] or later installed.

Implementing a Task

Fusebit Tasks are regular Fusebit Functions configured to execute asynchronously. They are created using the Fusebit HTTP - Core APIs, specifically the PUT Function API.

The simplest Task implementation looks as follows:

module.exports = async (ctx) => {
  if (ctx.method === 'TASK') {
    // Invocation of an asynchronous task.
    // Task result is captured for asynchronous retrieval for up to 24 hours.
    return { status: 200, body: { arbitrary: 'content' } };
  }
  else {
    // Regular HTTP request
    return { status: 200, body: { arbitrary: 'content' } };
  }
};

Asynchronous execution of a Fusebit Function will always have the value of the ctx.method parameter set to TASK, which can be used in code to differentiate between these two execution modes.

For a Fusebit Function to execute asynchronously, the URL of the call to the function must match one of the routes of the function designated as Task scheduling routes in the function specification. This configuration is done as part of the routes element of the function specification supplied in the body of the PUT Function API as follows:

{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js above...}"
    }
  },
  "compute": {
    "timeout": 840 // execution time limit in seconds (max 840)
  },
  "routes": [
    {
      "path": "/task/importCsv",
      "task": {}
    },
    {
      "path": "/task/sendEmail",
      "task": {}
    }
  ]
}

Every element of the routes array defines one route of the function. A call to the function with a URL that matches the path property of a route will cause the function to execute asynchronously only if the matching route also contains the task element. The next section describes the matching rules in detail.
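
As a minimal sketch, the specification above could be assembled in code before it is sent in the body of the PUT Function API. The helper name buildTaskFunctionSpec is hypothetical and not part of Fusebit; it only mirrors the structure shown above and enforces the 840-second limit on compute.timeout.

```javascript
// Hypothetical helper (not part of Fusebit): builds a function specification
// with one Task scheduling route per supplied path.
const MAX_TIMEOUT_SECONDS = 840; // maximum compute.timeout stated in this document

function buildTaskFunctionSpec(indexJs, taskPaths, timeoutSeconds = MAX_TIMEOUT_SECONDS) {
  if (!Number.isInteger(timeoutSeconds) || timeoutSeconds < 1 || timeoutSeconds > MAX_TIMEOUT_SECONDS) {
    throw new RangeError(`compute.timeout must be an integer between 1 and ${MAX_TIMEOUT_SECONDS}`);
  }
  return {
    nodejs: { files: { 'index.js': indexJs } },
    compute: { timeout: timeoutSeconds },
    // The "task" element is what marks a route as a Task scheduling route.
    routes: taskPaths.map((path) => ({ path, task: {} })),
  };
}

const spec = buildTaskFunctionSpec(
  'module.exports = async (ctx) => ({ status: 200, body: { method: ctx.method } });',
  ['/task/importCsv', '/task/sendEmail']
);
console.log(JSON.stringify(spec, null, 2));
```

Because the result is a plain object, JSON.stringify produces valid JSON, without the explanatory comments used in the listing above.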

Task Execution

Task execution is triggered with a Task scheduling request to the Fusebit Function. Just like a regular, synchronous function execution, the Task scheduling request is an HTTP request to the function, but it must meet several criteria:

  • The verb of the request must be POST.
  • The path property of one of the routes specified in the routes element of the function specification must prefix-match the relative URL of the request (routes are evaluated in order and the first matching route is used).
  • The matching route must specify the task property in fusebit.json.

If all of the conditions above are met, the system will schedule the function to process the request asynchronously as a Task rather than run it synchronously with the current HTTP request. To asynchronously execute a Task, the system will:

  • Enforce security and throttling constraints.
  • Schedule a new Task for asynchronous execution by capturing the request body, headers, and query parameters.
  • Immediately respond to the caller with an HTTP 202 response. The response will also contain the Location HTTP response header with a URL the caller can use to query the status of the Task.
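
The three scheduling criteria listed earlier in this section can be sketched as a small matching function. This is an illustration only, not Fusebit's implementation: findTaskRoute is a hypothetical name, and it assumes that "prefix-match" means a plain string prefix test on the path relative to the function's base URL, with the query string ignored.

```javascript
// Sketch of the Task scheduling criteria; not Fusebit's actual implementation.
// Assumption: prefix matching is a plain string prefix test on the relative
// path, ignoring the query string.
function findTaskRoute(routes, method, relativeUrl) {
  if (method !== 'POST') return undefined; // only POST can schedule a Task
  const path = relativeUrl.split('?')[0];
  // Routes are evaluated in order and the first matching route is used.
  const match = routes.find((route) => path.startsWith(route.path));
  // The request becomes a Task only if that route has a "task" element.
  return match && match.task ? match : undefined;
}

const routes = [
  { path: '/task/importCsv', task: {} },
  { path: '/public' }, // hypothetical route without a "task" element
];

console.log(findTaskRoute(routes, 'POST', '/task/importCsv?a=b')); // the first route
console.log(findTaskRoute(routes, 'GET', '/task/importCsv'));      // undefined
console.log(findTaskRoute(routes, 'POST', '/public/info'));        // undefined
```

A request for which the function returns undefined would run synchronously as a regular HTTP request.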

Scheduling a Task over HTTP

Let's say the base URL of a Fusebit Function is https://api.us-west-1.fusebit.io/v1/run/sub-ed9d9341ea356841/boundary1/function1, and the path of a Task scheduling route in the function specification is /task/task1. The following request will schedule the asynchronous execution of the Task:

curl -X POST \
  "https://api.us-west-1.fusebit.io/v1/run/sub-ed9d9341ea356841/boundary1/function1/task/task1?a=b" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {accessToken}" \
  -H "MyHeader: MyValue" \
  -d '{"some":"payload"}'

A few things to note about this request:

  • It is an HTTP POST request.
  • The URL matches one of the Task scheduling routes in the function specification.
  • The security requirements of the function are satisfied with the attached access token.
  • Request query parameters, headers, and body will be propagated to the asynchronous Task.

This is an example of a response the system will generate:

HTTP 202 Accepted
Location: https://api.us-west-1.fusebit.io/v1/account/acc-124a0b2e6a1043d4/subscription/sub-ed9d9341ea356841/boundary/boundary1/function/function1/task/tsk-2ec6c8dba6134772
Content-Type: application/json

{
  "accountId": "{accountId}",
  "subscriptionId": "{subscriptionId}",
  "boundaryId": "{boundaryId}",
  "functionId": "{functionId}",
  "taskId": "{taskId}",
  "status": "pending|running|completed|error",
  "notBefore": "{date}", // optional, if the Task was scheduled for the future
  "transitions": {
    "pending": "{date}"
  },
  "location": "{url}" // The URL to query Task status
}

Note that the Location response header (which matches the location property of the response body) contains a URL that can be used to query the status of the Task execution.
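
The same scheduling request can be made from JavaScript. The sketch below assumes a runtime with a global fetch (Node.js 18 or later, or a browser); scheduleTaskOverHttp is a hypothetical helper name, and the optional fetchImpl parameter exists only so that the HTTP call can be replaced in tests.

```javascript
// Sketch of a client that schedules a Task over HTTP.
// Assumes a global fetch; scheduleTaskOverHttp is a hypothetical name.
async function scheduleTaskOverHttp(taskUrl, accessToken, payload, fetchImpl = fetch) {
  const response = await fetchImpl(taskUrl, {
    method: 'POST', // Task scheduling requests must use POST
    headers: {
      'Content-Type': 'application/json',
      Authorization: `Bearer ${accessToken}`,
    },
    body: JSON.stringify(payload),
  });
  if (response.status !== 202) {
    // For example 403 (security requirements not met) or 429 (throttled).
    throw new Error(`Task was not scheduled: HTTP ${response.status}`);
  }
  const task = await response.json();
  // The Location header and the "location" body property carry the same URL.
  return {
    taskId: task.taskId,
    statusUrl: response.headers.get('location') || task.location,
  };
}
```

A caller would pass the full URL of the Task scheduling route, for example the base URL of the function followed by /task/task1, and keep the returned statusUrl to query the Task later.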

Within the function code, input parameters of a task are exposed on the ctx object as follows:

module.exports = async (ctx) => {
  if (ctx.method === 'TASK') {
    // ctx.query is {a: "b"}
    // ctx.body is {some: "payload"}
    // ctx.headers contains { MyHeader: "MyValue" }
    return { status: 200, body: { arbitrary: 'content' } };
  }
  else {
    // ...
  }
};

Other possible responses to the Task scheduling request include:

  • HTTP 403 - when the security requirements are not met.
  • HTTP 429 - when specific throttling limits are exceeded.

The example above shows how to schedule a Task for immediate execution. It is also possible to request that the Task execute in the future, up to 24 hours in advance. This is done by attaching the fusebit-task-not-before HTTP request header as follows:

curl -X POST \
  "https://api.us-west-1.fusebit.io/v1/run/sub-ed9d9341ea356841/boundary1/function1/task/task1?a=b" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {accessToken}" \
  -H "fusebit-task-not-before: {EPOCH time}" \
  -d '{"some":"payload"}'

The value of the fusebit-task-not-before header is expressed in EPOCH time.
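
A header value for a given delay could be computed as in the sketch below. It assumes the header carries whole seconds since the Unix epoch; this page says only "EPOCH time", so confirm the unit before relying on it. The helper name notBeforeHeaderValue is hypothetical.

```javascript
// Assumption: the header value is whole seconds since the Unix epoch.
// The documentation states only "EPOCH time"; verify the unit before use.
const MAX_DELAY_SECONDS = 24 * 60 * 60; // Tasks can be delayed by up to 24 hours

function notBeforeHeaderValue(delaySeconds, nowMs = Date.now()) {
  if (delaySeconds < 0 || delaySeconds > MAX_DELAY_SECONDS) {
    throw new RangeError(`delay must be between 0 and ${MAX_DELAY_SECONDS} seconds`);
  }
  return String(Math.floor(nowMs / 1000 + delaySeconds));
}

// Delay execution by 15 minutes, measured from a fixed reference time.
const referenceMs = Date.UTC(2024, 0, 1, 0, 0, 0);
console.log(notBeforeHeaderValue(15 * 60, referenceMs)); // prints 1704068100
```

The range check rejects delays beyond the 24-hour window before any request is sent.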

Scheduling a Task Programmatically

The programming model of a Fusebit Function allows for easy scheduling of new tasks from within the function code itself:

module.exports = async (ctx) => {
  let task = await ctx.fusebit.scheduleTask({
    path: '/task/sendEmail', // required
    accessToken: 'ey...', // optional
    query: {}, // optional
    headers: {}, // optional
    body: {}, // optional, but hardly makes sense without
    notBefore: new Date(Date.now() + 60 * 1000), // optional, absolute time as Date instance
    notBeforeRelative: 1244, // optional, seconds from now
  });
  // ...
};

The convenience method ctx.fusebit.scheduleTask makes a task scheduling request back to the same function, using the supplied parameters as follows:

  • The path is appended to the ctx.baseUrl and must be prefix-matched by one of the routes of the function specification.
  • The accessToken, if specified, is attached as the bearer token in the Authorization header, and must satisfy the security requirements of the function. If the access token is not specified, the ctx.fusebit.functionAccessToken must be present and it will be used instead.
  • The query, headers, and body are all passed verbatim to the Task scheduling request.
  • The optional notBefore and notBeforeRelative are used to delay the execution of the Task by up to 24 hours. notBeforeRelative specifies the delay in seconds from now; notBefore specifies an absolute time as a Date instance.

A successful response from the scheduleTask method is an object with the following properties:

{
  "accountId": "{accountId}",
  "subscriptionId": "{subscriptionId}",
  "boundaryId": "{boundaryId}",
  "functionId": "{functionId}",
  "taskId": "{taskId}",
  "status": "pending|running|completed|error",
  "notBefore": "{date}", // optional, if the Task was scheduled for the future
  "transitions": {
    "pending": "{date}"
  },
  "location": "{url}" // The URL to poll for Task status (see Querying the Task Status)
}
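
As an illustration of the retry scenario mentioned in the introduction, the sketch below reschedules work as a delayed Task when an external service responds with HTTP 429. The function callExternalService is a hypothetical stand-in that always simulates a throttled response, and the /task/sendEmail path is assumed to be a Task scheduling route of the function. In a Fusebit Function the handler would be assigned to module.exports.

```javascript
// Hypothetical stand-in for a real API call; always simulates throttling.
async function callExternalService(payload) {
  return { status: 429, retryAfterSeconds: 60 };
}

// In a Fusebit Function: module.exports = handler;
async function handler(ctx) {
  const result = await callExternalService(ctx.body);
  if (result.status === 429) {
    // Run the same payload later as a Task on the /task/sendEmail route.
    const task = await ctx.fusebit.scheduleTask({
      path: '/task/sendEmail',
      body: ctx.body,
      notBeforeRelative: result.retryAfterSeconds, // seconds from now
    });
    return { status: 202, body: { retryTaskId: task.taskId, location: task.location } };
  }
  return { status: 200, body: { delivered: true } };
}
```

A production handler would likely also cap the number of retries, for example by carrying an attempt counter in the body, so that a persistently throttled call does not reschedule itself indefinitely.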

Querying the Task Status

The status of the scheduled Task can be queried using the URL provided in the HTTP 202 response to the Task scheduling request. The result of the Task execution is kept for up to 24 hours after Task completion.

To query Task status, make an HTTP GET request to the URL from the HTTP 202 response to the Task scheduling request, and attach a valid access token with function:schedule permission on the /account/{{accountId}}/subscription/{{subscriptionId}}/boundary/{{boundaryId}}/function/{{functionId}}/ resource:

curl https://api.us-west-1.fusebit.io/v1/account/acc-124a0b2e6a1043d4/subscription/sub-ed9d9341ea356841/boundary/boundary1/function/function1/task/tsk-2ec6c8dba6134772 \
  -H "Authorization: Bearer {accessToken}"

The HTTP 200 response body describes the current status of the Task:

{
  accountId: '{id}',
  subscriptionId: '{id}',
  boundaryId: '{id}',
  functionId: '{id}',
  taskId: '{id}',
  notBefore: '{date}', // only present if Task execution is delayed
  status: 'pending', // one of: pending, running, error, completed
  transitions: {
    pending: '{date}',
    running: '{date}',
    error: '{date}',
    completed: '{date}'
  },
  output: { // only present if status is completed
    response: { // response from the user Lambda function
      status: {number},
      body: any,
      spans: [...],
      logs: [...],
    },
    meta: { // metadata added by Fusebit
      source: "function", // which layer the response was generated by
      metrics: {
        lambda: {
          duration: {number},
          memory: {number}
        }
      }
    }
  },
  error: { // only present if error was generated at Fusebit layer
    status: 500,
    message: '{message}'
  },
  location: '{url-of-the-get-task-endpoint}'
}

Notable properties:

  • status is the main property describing the status of the Task. All newly scheduled Tasks start in the pending state. The running state means the Task is currently executing. The completed state means the Task completed execution (however, the execution may have failed - check the output property for details). The error state indicates an execution error at the Fusebit infrastructure level - check the error property for more information.
  • output describes the result of Task execution and is only present when status is completed. The presence of this element does not mean the Task was successful. You must check the output.response to make this determination.
  • output.response contains the status code, body, and headers returned from the code of the Task, as well as the logs written to stdout, which are captured in the output.response.logs array.
  • output.meta contains execution statistics, including Task execution duration and memory used.
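
The distinction between the error and completed states can be sketched as a small classifier over the status document. The name classifyTask is hypothetical, and treating a response status below 400 as success is an assumption; this page says only that output.response must be inspected.

```javascript
// Sketch: classify a Task status document.
// Assumption: a function response status below 400 counts as success.
function classifyTask(task) {
  switch (task.status) {
    case 'pending':
    case 'running':
      return { done: false, ok: false, reason: `Task is ${task.status}` };
    case 'error':
      // Failure at the Fusebit infrastructure level; details are in task.error.
      return { done: true, ok: false, reason: task.error ? task.error.message : 'unknown error' };
    case 'completed': {
      // The Task ran to completion, but the function code may still have failed.
      const status = task.output.response.status;
      return { done: true, ok: status < 400, reason: `function returned HTTP ${status}` };
    }
    default:
      throw new Error(`Unexpected Task status: ${task.status}`);
  }
}

console.log(classifyTask({ status: 'running' }));
console.log(classifyTask({ status: 'completed', output: { response: { status: 500, body: {} } } }));
```

A polling client would repeat the GET request until done is true and then branch on ok.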

Securing Tasks

You can specify authentication and authorization requirements for Task scheduling requests. Fusebit will automatically enforce these requirements before scheduling a Task for execution, ensuring the validation happens early while any time-sensitive credentials in the request are still fresh. If the requirements are not met, the request is rejected with an HTTP 403 response, and the Task is not scheduled for execution.

Security requirements for a Task scheduling route can be provided as part of the routes property of the function specification:

{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js...}"
    }
  },
  "routes": [
    {
      "path": "/task/sendEmail",
      "security": {
        "authentication": "none|optional|required",
        "authorization": [
          { "action": "{action1}", "resource": "resource1" },
          { "action": "{action2}", "resource": "resource2" }
        ]
      },
      "task": {}
    }
  ]
}

The security property of a route specifies the authentication and authorization requirements for the Task scheduling route. The property has the same structure and behavior as the function specification's corresponding top-level security property.

If you don't specify the security property at all on a Task scheduling route, the route is secure by default using the following security settings:

{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js...}"
    }
  },
  "routes": [
    {
      "path": "/task/sendEmail",
      "security": {
        "authentication": "required",
        "authorization": [
          {
            "action": "function:schedule",
            "resource": "/account/{{accountId}}/subscription/{{subscriptionId}}/boundary/{{boundaryId}}/function/{{functionId}}/"
          }
        ]
      },
      "task": {}
    }
  ]
}

If you want to allow anonymous callers to schedule Tasks, you must explicitly opt out of security by setting the authentication requirement to none:

{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js...}"
    }
  },
  "routes": [
    {
      "path": "/task/sendEmail",
      "security": {
        "authentication": "none"
      },
      "task": {}
    }
  ]
}

📘

The security element, just like the top-level security element of the function specification, also allows you to specify the functionPermissions property that controls the issuance and permissions of the ctx.fusebit.functionAccessToken.

Throttling Tasks

Fusebit can enforce two throttling limits for asynchronous Task execution:

  • Maximum Running - the maximum number of concurrently executing Tasks. All Tasks scheduled while the limit is exhausted are queued up for execution later, when the number of running Tasks falls below the limit.
  • Maximum Pending - the maximum number of Tasks pending execution (a sum of Tasks scheduled for future execution and those throttled by the Maximum Running limit). If another scheduling request is made while the number of pending Tasks exceeds the Maximum Pending, it is rejected with an HTTP 429 response.

Each limit is enforced independently at the level of an individual Task scheduling route.

📘

The current implementation of the Maximum Pending limit is soft: the system may take up to a minute to reach a consistent state after a spike of Task scheduling requests. Do not rely on this mechanism if you need accuracy; it currently provides only resilience-in-depth.

Both throttling limits can be specified using the task property of a route in the function specification:

{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js...}"
    }
  },
  "routes": [
    {
      "path": "/task/sendEmail",
      "task": {
        "maxPending": 1000,
        "maxRunning": 10
      }
    }
  ]
}

If you don't specify the maxPending limit, the number of pending requests remains unlimited (the default).

If you don't specify the maxRunning limit, the default is 10. If you set maxRunning to 0, there is no explicit throttling of the number of concurrently executing Tasks.
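
The interplay of the two limits and their defaults can be illustrated with a simplified model of one Task scheduling route. This is not Fusebit's implementation: the names resolveLimits and admit are hypothetical, delayed Tasks are not modeled separately, and the sketch assumes a request is rejected once the number of pending Tasks has reached maxPending.

```javascript
// Simplified model of the throttling limits for one Task scheduling route.
// Assumption: a request is rejected once pending Tasks have reached maxPending.
function resolveLimits(task = {}) {
  let maxRunning = 10; // default when maxRunning is not specified
  if (task.maxRunning === 0) {
    maxRunning = Infinity; // 0 disables the concurrency limit
  } else if (task.maxRunning !== undefined) {
    maxRunning = task.maxRunning;
  }
  // maxPending is unlimited when not specified.
  const maxPending = task.maxPending === undefined ? Infinity : task.maxPending;
  return { maxPending, maxRunning };
}

function admit(task, counts) {
  const { maxPending, maxRunning } = resolveLimits(task);
  if (counts.pending >= maxPending) return 'reject-429';
  return counts.running < maxRunning ? 'run' : 'queue';
}

const limits = { maxPending: 1000, maxRunning: 10 };
console.log(admit(limits, { running: 3, pending: 0 }));     // run
console.log(admit(limits, { running: 10, pending: 5 }));    // queue
console.log(admit(limits, { running: 10, pending: 1000 })); // reject-429
```

Because the real Maximum Pending limit is soft, actual behavior near the limit may differ from this model.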

Getting Task Statistics

You can obtain statistics on the number of Tasks pending execution for all Task scheduling routes of a Fusebit Function by calling the GET Function API with the include=task query parameter:

curl "https://api.us-west-1.fusebit.io/v1/account/acc-9d9341ea356841ed/subscription/sub-ed9d9341ea356841/boundary/boundary1/function/function1?include=task" \
  -H "Authorization: Bearer {accessToken}"

The response will list all Task scheduling routes of the function with their configuration as well as the stats property indicating the approximate number of pending Tasks:

{
  "routes": [
    {
      "path": "/task/t1",
      "security": {
        "authentication": "none"
      },
      "task": {
        "maxPending": 100,
        "maxRunning": 1,
        "stats": {
          "availableCount": 100,
          "delayedCount": 500,
          "pendingCount": 600
        }
      }
    }
  ]
}

The properties are as follows:

  • availableCount - number of Tasks pending execution throttled by the maxRunning setting.
  • delayedCount - number of Tasks scheduled for future execution.
  • pendingCount - sum of the two above.

📘

The values are just approximate and can take up to one minute to reach consistency after a spike of task scheduling requests.
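
A response like the one above could be reduced to a per-route summary, as in the following sketch. The helper summarizeTaskStats is hypothetical; it reads only the properties shown in the sample response and, because the counts are approximate, its output is best treated as an indication rather than an exact figure.

```javascript
// Sketch: summarize the stats of each Task scheduling route in a
// GET Function response requested with include=task.
function summarizeTaskStats(functionSpec) {
  return (functionSpec.routes || [])
    .filter((route) => route.task && route.task.stats)
    .map((route) => {
      const { availableCount, delayedCount, pendingCount } = route.task.stats;
      return {
        path: route.path,
        throttled: availableCount, // waiting because of the maxRunning limit
        delayed: delayedCount,     // scheduled for future execution
        pending: pendingCount,     // sum of the two counts above
        // Only meaningful when maxPending is set on the route.
        overMaxPending:
          route.task.maxPending !== undefined && pendingCount > route.task.maxPending,
      };
    });
}

const sampleResponse = {
  routes: [
    {
      path: '/task/t1',
      task: {
        maxPending: 100,
        maxRunning: 1,
        stats: { availableCount: 100, delayedCount: 500, pendingCount: 600 },
      },
    },
  ],
};
console.log(summarizeTaskStats(sampleResponse));
```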

Task Limits

Tasks have the following limits:

  • The maximum execution time of a Task is 14 minutes (840 seconds) or the number of seconds specified in the compute.timeout property of the function specification, whichever is less.
  • The maximum payload of a Task scheduling request (body, headers, and query parameters combined) is 200KB.
  • The maximum size of the Task result (the result value returned from the function code plus any logs generated) is 400KB.