Tasks
Fusebit Tasks provide support for long-running, asynchronous code execution. Tasks are useful in three situations:
- Long execution times. When the logic you need to execute exceeds the reasonable lifetime of an HTTP request, Tasks provide a model for running code for up to 14 minutes. For example, you can process large CSV files using Tasks. Once scheduled, the status of a Task and its result can be queried asynchronously.
- Delayed execution. When you want to execute logic at some point in the future, you can schedule a Task to be executed up to 24 hours in advance. For example, you can use delayed Tasks to schedule a retry of a call to an external service that was rejected with an HTTP 429 response.
- Throttling. Fusebit can guarantee that no more than a specified number of Task instances will execute at the same time. For example, you can use throttled Tasks to limit the number of concurrent calls to an external API. You can also limit the number of concurrent calls to your own system in response to an external webhook.
Fusebit CLI Version
To manipulate Fusebit Tasks using the Fusebit CLI, you need to have @fusebit/[email protected] or later installed.
Implementing a Task
Fusebit Tasks are regular Fusebit Functions configured to execute asynchronously. They are created using the Fusebit HTTP - Core APIs, specifically the PUT Function API.
The simplest Task implementation looks as follows:
module.exports = async (ctx) => {
  if (ctx.method === 'TASK') {
    // Invocation of an asynchronous task.
    // Task result is captured for asynchronous retrieval for up to 24 hours.
    return { status: 200, body: { arbitrary: 'content' } };
  } else {
    // Regular HTTP request
    return { status: 200, body: { arbitrary: 'content' } };
  }
};
Asynchronous execution of a Fusebit Function always has the ctx.method parameter set to TASK, which the code can use to differentiate between the two execution modes.
For a Fusebit Function to execute asynchronously, the URL of the call to the function must match one of the routes designated as Task scheduling routes in the function specification. This configuration is done as part of the routes element of the function specification supplied in the body of the PUT Function API as follows:
{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js above...}"
    }
  },
  "compute": {
    "timeout": 840 // execution time limit in seconds (max 840)
  },
  "routes": [
    {
      "path": "/task/importCsv",
      "task": {}
    },
    {
      "path": "/task/sendEmail",
      "task": {}
    }
  ]
}
Every element of the routes array defines one Task scheduling route for the function. A call to the function with a URL that matches the path property of a route will cause the function to execute asynchronously if the matching route also contains the task element. More on this in the next section.
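As a rough sketch of how such a specification might be assembled in code, the helper below builds the request body for the PUT Function API from a list of Task paths. The helper name buildTaskFunctionSpec and its parameters are hypothetical and not part of Fusebit; only the nodejs, compute.timeout, and routes properties come from the specification shown above.

```javascript
// Hypothetical helper (not part of Fusebit): builds a function specification
// in which every supplied path becomes a Task scheduling route.
const MAX_TIMEOUT_SECONDS = 840; // maximum stated in this document

function buildTaskFunctionSpec(indexJs, taskPaths, timeoutSeconds = MAX_TIMEOUT_SECONDS) {
  if (!Number.isInteger(timeoutSeconds) || timeoutSeconds < 1 || timeoutSeconds > MAX_TIMEOUT_SECONDS) {
    throw new RangeError(`timeout must be an integer between 1 and ${MAX_TIMEOUT_SECONDS} seconds`);
  }
  return {
    nodejs: { files: { 'index.js': indexJs } },
    compute: { timeout: timeoutSeconds },
    // An empty "task" element marks the route as a Task scheduling route.
    routes: taskPaths.map((path) => ({ path, task: {} })),
  };
}

const spec = buildTaskFunctionSpec(
  'module.exports = async (ctx) => ({ status: 200, body: { method: ctx.method } });',
  ['/task/importCsv', '/task/sendEmail']
);
console.log(JSON.stringify(spec, null, 2));
```

The resulting object can be serialized with JSON.stringify and sent as the body of the PUT Function API request.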
Task Execution
Task execution is triggered with a Task scheduling request to the Fusebit Function. Just like a regular, synchronous function execution, the Task scheduling request is an HTTP request to the function, but it must meet several criteria:
- The verb of the request must be POST.
- The path property of one of the routes specified in the routes element of the function specification must prefix-match the relative URL of the request (routes are evaluated in order and the first matching route is used).
- The matching route must specify the task property in fusebit.json.
If all of the conditions above are met, the system will schedule the function to process the request asynchronously as a Task rather than run it synchronously with the current HTTP request. To asynchronously execute a Task, the system will:
- Enforce security and throttling constraints.
- Schedule a new Task for asynchronous execution by capturing the request body, headers, and query parameters.
- Immediately respond to the caller with an HTTP 202 response. The response will also contain the Location HTTP response header with a URL the caller can use to query the status of the Task.
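The criteria above can be modeled with a small function. This is only an illustrative sketch, not Fusebit's implementation: the name resolveExecutionMode is hypothetical, and it assumes that prefix matching is a plain string comparison against the relative URL without its query string.

```javascript
// Illustrative model of the routing rules described above; this is not
// Fusebit's implementation. `relativeUrl` is the part of the request URL
// after the function's base URL, without the query string.
// ASSUMPTION: "prefix-match" is a plain string prefix comparison.
function resolveExecutionMode(method, relativeUrl, routes) {
  // Routes are evaluated in order and the first prefix match wins.
  const route = routes.find((r) => relativeUrl.startsWith(r.path));
  const isTask = method === 'POST' && route !== undefined && route.task !== undefined;
  return { mode: isTask ? 'task' : 'sync', route };
}

const routes = [
  { path: '/task/importCsv', task: {} },
  { path: '/task', security: { authentication: 'none' } }, // no "task" element
];

// POST to a Task scheduling route: scheduled asynchronously.
console.log(resolveExecutionMode('POST', '/task/importCsv/2024', routes).mode);
// Wrong verb: handled synchronously.
console.log(resolveExecutionMode('GET', '/task/importCsv', routes).mode);
// First matching route has no "task" element: handled synchronously.
console.log(resolveExecutionMode('POST', '/task/other', routes).mode);
```

Because the first matching route wins, more specific paths should be listed before more general ones.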
Scheduling a Task over HTTP
Let's say the base URL of a Fusebit Function is https://api.us-west-1.fusebit.io/v1/run/sub-ed9d9341ea356841/boundary1/function1, and the path of a Task scheduling route in the function specification is /task/task1. The following request will schedule the asynchronous execution of the Task:
curl -X POST \
  "https://api.us-west-1.fusebit.io/v1/run/sub-ed9d9341ea356841/boundary1/function1/task/task1?a=b" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {accessToken}" \
  -H "MyHeader: MyValue" \
  -d '{"some":"payload"}'
A few things to note about this request:
- It is an HTTP POST request.
- The URL matches one of the Task scheduling routes in the function specification.
- The security requirements of the function are satisfied with the attached access token.
- Request query parameters, headers, and body will be propagated to the asynchronous Task.
This is an example of a response the system will generate:
HTTP 202 Accepted
Location: https://api.us-west-1.fusebit.io/v1/account/acc-124a0b2e6a1043d4/subscription/sub-ed9d9341ea356841/boundary/boundary1/function/function1/task/tsk-2ec6c8dba6134772
Content-Type: application/json
{
  "accountId": "{accountId}",
  "subscriptionId": "{subscriptionId}",
  "boundaryId": "{boundaryId}",
  "functionId": "{functionId}",
  "taskId": "{taskId}",
  "status": "pending|running|completed|error",
  "notBefore": "{date}", // optional, if the Task was scheduled for the future
  "transitions": {
    "pending": "{date}"
  },
  "location": "{url}" // The URL to query Task status
}
Note that the Location response header (which matches the location property of the response body) contains a URL that can be used to query the status of the Task execution.
Within the function code, input parameters of a Task are exposed on the ctx object as follows:
module.exports = async (ctx) => {
  if (ctx.method === 'TASK') {
    // ctx.query is {a: "b"}
    // ctx.body is {some: "payload"}
    // ctx.headers contains { MyHeader: "MyValue" }
    return { status: 200, body: { arbitrary: 'content' } };
  } else {
    // ...
  }
};
Other possible responses to the Task scheduling request include:
- HTTP 403 - when the security requirements are not met.
- HTTP 429 - when specific throttling limits are exceeded.
The example above shows how to schedule a Task for immediate execution. It is also possible to request that the Task execute in the future, up to 24 hours in advance. This is done by attaching the fusebit-task-not-before HTTP request header as follows:
curl -X POST \
  "https://api.us-west-1.fusebit.io/v1/run/sub-ed9d9341ea356841/boundary1/function1/task/task1?a=b" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer {accessToken}" \
  -H "fusebit-task-not-before: {EPOCH time}" \
  -d '{"some":"payload"}'
The value of the fusebit-task-not-before header is expressed in EPOCH time.
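A minimal sketch of computing the header value for a given delay follows. The helper name notBeforeHeader is hypothetical. It also assumes the header expects epoch time in whole seconds; this document only says "EPOCH time", so verify the unit before relying on it.

```javascript
// ASSUMPTION: the header value is epoch time in whole seconds. The document
// only says "EPOCH time"; verify the unit before relying on this sketch.
const MAX_DELAY_SECONDS = 24 * 60 * 60; // Tasks can be delayed up to 24 hours

// Hypothetical helper: returns the request header that delays a Task.
function notBeforeHeader(delaySeconds, nowMs = Date.now()) {
  if (!Number.isFinite(delaySeconds) || delaySeconds < 0 || delaySeconds > MAX_DELAY_SECONDS) {
    throw new RangeError(`delay must be between 0 and ${MAX_DELAY_SECONDS} seconds`);
  }
  const epochSeconds = Math.floor(nowMs / 1000) + Math.ceil(delaySeconds);
  return { 'fusebit-task-not-before': String(epochSeconds) };
}

// Delay a Task by 5 minutes from the current time.
console.log(notBeforeHeader(300));
```

The returned object can be merged into the headers of the Task scheduling request. Rejecting delays above 24 hours on the client side avoids a round trip for a request the platform would not accept.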
Scheduling a Task Programmatically
The programming model of a Fusebit Function allows for easy scheduling of new tasks from within the function code itself:
module.exports = async (ctx) => {
  let task = await ctx.fusebit.scheduleTask({
    path: '/task/sendEmail', // required
    accessToken: 'ey...', // optional
    query: {}, // optional
    headers: {}, // optional
    body: {}, // optional, but hardly makes sense without
    notBefore: Date, // optional, absolute time as Date instance
    notBeforeRelative: 1244, // optional, seconds from now
  });
  // ...
};
The convenience method ctx.fusebit.scheduleTask makes a Task scheduling request back to the same function, using the supplied parameters as follows:
- The path is appended to the ctx.baseUrl and must be prefix-matched by one of the routes of the function specification.
- The accessToken, if specified, is attached as the bearer token in the Authorization header, and must satisfy the security requirements of the function. If the access token is not specified, the ctx.fusebit.functionAccessToken must be present and it will be used instead.
- The query, headers, and body are all passed verbatim to the Task scheduling request.
- The optional notBefore and notBeforeRelative are used to delay the execution of the Task by up to 24 hours. notBeforeRelative specifies the delay as a number of seconds from now. notBefore specifies an absolute time as a Date instance.
A successful response from the scheduleTask method is an object with the following properties:
{
  "accountId": "{accountId}",
  "subscriptionId": "{subscriptionId}",
  "boundaryId": "{boundaryId}",
  "functionId": "{functionId}",
  "taskId": "{taskId}",
  "status": "pending|running|completed|error",
  "notBefore": "{date}", // optional, if the Task was scheduled for the future
  "transitions": {
    "pending": "{date}"
  },
  "location": "{url}" // The URL to poll for Task status (see Querying the Task Status)
}
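To illustrate the retry use case mentioned in the introduction, here is a hedged sketch of a Task that reschedules itself when an external service responds with HTTP 429. The createHandler factory, the callExternalApi function, and the shape of its result ({ status, retryAfter }) are hypothetical; only ctx.method, ctx.body, and ctx.fusebit.scheduleTask with its path, body, and notBeforeRelative options come from this document. Since no accessToken is passed, the sketch relies on ctx.fusebit.functionAccessToken being present.

```javascript
// Sketch of the "retry after HTTP 429" use case. `callExternalApi` is a
// hypothetical function supplied by you; it resolves to { status, retryAfter }.
const MAX_DELAY_SECONDS = 24 * 60 * 60; // Tasks can be delayed up to 24 hours

function createHandler(callExternalApi) {
  return async (ctx) => {
    if (ctx.method !== 'TASK') {
      return { status: 404, body: { message: 'Only Task invocations are handled' } };
    }
    const result = await callExternalApi(ctx.body);
    if (result.status !== 429) {
      return { status: result.status, body: { retried: false } };
    }
    // Rejected by the external service: schedule the same work again later.
    // Falls back to 60 seconds when no usable retryAfter value is provided.
    const delay = Math.min(Math.max(Number(result.retryAfter) || 60, 1), MAX_DELAY_SECONDS);
    const task = await ctx.fusebit.scheduleTask({
      path: '/task/sendEmail',
      body: ctx.body,
      notBeforeRelative: delay,
    });
    return { status: 200, body: { retried: true, retryTaskId: task.taskId } };
  };
}

// Local demonstration with a fake ctx; in Fusebit the platform supplies ctx,
// and the function would export createHandler(yourApiCall) via module.exports.
const scheduled = [];
const fakeCtx = {
  method: 'TASK',
  body: { messageId: 42 },
  fusebit: {
    scheduleTask: async (options) => {
      scheduled.push(options);
      return { taskId: 'tsk-demo', status: 'pending' };
    },
  },
};
const demo = createHandler(async () => ({ status: 429, retryAfter: '120' }))(fakeCtx);
demo.then((response) => console.log(response, scheduled));
```

Passing the external call in as a parameter keeps the handler easy to exercise locally without network access.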
Querying the Task Status
The status of the scheduled Task can be queried using the URL provided in the HTTP 202 response to the Task scheduling request. The result of the Task execution is kept for up to 24 hours after Task completion.
To query Task status, make an HTTP GET request to the URL from the HTTP 202 response to the Task scheduling request, and attach a valid access token with function:schedule permission on the /account/{{accountId}}/subscription/{{subscriptionId}}/boundary/{{boundaryId}}/function/{{functionId}}/ resource:
curl https://api.us-west-1.fusebit.io/v1/account/acc-124a0b2e6a1043d4/subscription/sub-ed9d9341ea356841/boundary/boundary1/function/function1/task/tsk-2ec6c8dba6134772 \
  -H "Authorization: Bearer {accessToken}"
The HTTP 200 response body describes the current status of the Task:
{
  accountId: '{id}',
  subscriptionId: '{id}',
  boundaryId: '{id}',
  functionId: '{id}',
  taskId: '{id}',
  notBefore: '{date}', // only present if Task execution is delayed
  status: 'pending', // one of: pending, running, error, completed
  transitions: {
    pending: '{date}',
    running: '{date}',
    error: '{date}',
    completed: '{date}'
  },
  output: { // only present if status is completed
    response: { // response from the user Lambda function
      status: {number},
      body: any,
      spans: [...],
      logs: [...],
    },
    meta: { // metadata added by Fusebit
      source: "function", // which layer the response was generated by
      metrics: {
        lambda: {
          duration: {number},
          memory: {number}
        }
      }
    }
  },
  error: { // only present if error was generated at Fusebit layer
    status: 500,
    message: '{message}'
  },
  location: '{url-of-the-get-task-endpoint}'
}
Notable properties:
- status is the main property describing the status of the Task. All newly scheduled Tasks start in the pending state. The running state means the Task is currently executing. The completed state means the Task completed execution (however, the execution may have failed - check the output property for details). The error state indicates an execution error at the Fusebit infrastructure level - check the error property for more information.
- output describes the result of Task execution and is only present when status is completed. The presence of this element does not mean the Task was successful. You must check output.response to make this determination.
- output.response contains the status code, body, and headers returned from the function code, as well as the logs generated to stdout in the output.response.logs array.
- output.meta contains execution statistics, including Task execution duration and memory used.
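The distinction between completed and successful can be captured in a small helper. The sketch below is illustrative only: taskOutcome and waitForTask are hypothetical names, getTask stands for a caller-supplied function that performs the HTTP GET described above, and treating a function response status below 400 as success is an assumption rather than a documented rule.

```javascript
// Illustrative helper (not a Fusebit API): turns a Task status document into
// one of 'waiting', 'succeeded', or 'failed'.
// ASSUMPTION: a response status below 400 from the function code means success.
function taskOutcome(task) {
  switch (task.status) {
    case 'pending':
    case 'running':
      return { state: 'waiting' };
    case 'error':
      // Failure at the Fusebit infrastructure level.
      return { state: 'failed', reason: task.error ? task.error.message : 'unknown error' };
    case 'completed': {
      // "completed" only means the code ran; inspect output.response.
      const response = (task.output && task.output.response) || {};
      return typeof response.status === 'number' && response.status < 400
        ? { state: 'succeeded', body: response.body }
        : { state: 'failed', reason: `function returned HTTP ${response.status}` };
    }
    default:
      throw new Error(`Unexpected Task status: ${task.status}`);
  }
}

// Polls until the Task leaves the pending/running states. `getTask` is a
// caller-supplied async function that performs the HTTP GET on the location URL.
async function waitForTask(getTask, { intervalMs = 5000, maxAttempts = 60 } = {}) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const outcome = taskOutcome(await getTask());
    if (outcome.state !== 'waiting') return outcome;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error('Timed out waiting for the Task to finish');
}

// A completed Task whose code returned HTTP 500 is reported as failed.
console.log(taskOutcome({ status: 'completed', output: { response: { status: 500, body: {} } } }));
```

Keeping the classification separate from the polling loop makes it possible to reuse taskOutcome wherever a Task status document is available.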
Securing Tasks
You can specify authentication and authorization requirements for Task scheduling requests. Fusebit will automatically enforce these requirements before scheduling a Task for execution, ensuring the validation happens early while any time-sensitive credentials in the request are still fresh. If the requirements are not met, the request is rejected with an HTTP 403 response, and the Task is not scheduled for execution.
Security requirements for a Task scheduling route can be provided as part of the routes property of the function specification:
{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js...}"
    }
  },
  "routes": [
    {
      "path": "/task/sendEmail",
      "security": {
        "authentication": "none|optional|required",
        "authorization": [
          { "action": "{action1}", "resource": "resource1" },
          { "action": "{action2}", "resource": "resource2" }
        ]
      },
      "task": {}
    }
  ]
}
The security property of a route specifies the authentication and authorization requirements for the Task scheduling route. The property has the same structure and behavior as the function specification's corresponding top-level security property.
If you don't specify the security property at all on a Task scheduling route, the route is secure by default using the following security settings:
{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js...}"
    }
  },
  "routes": [
    {
      "path": "/task/sendEmail",
      "security": {
        "authentication": "required",
        "authorization": [
          {
            "action": "function:schedule",
            "resource": "/account/{{accountId}}/subscription/{{subscriptionId}}/boundary/{{boundaryId}}/function/{{functionId}}/"
          }
        ]
      },
      "task": {}
    }
  ]
}
If you want to allow anonymous callers to schedule Tasks, you must explicitly opt out of security by setting the authentication requirement to none:
{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js...}"
    }
  },
  "routes": [
    {
      "path": "/task/sendEmail",
      "security": {
        "authentication": "none"
      },
      "task": {}
    }
  ]
}
The security element, just like the top-level security element of the function specification, also allows you to specify the functionPermissions property that controls the issuance and permissions of the ctx.fusebit.functionAccessToken.
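Because the default applies silently when security is omitted, it can help to audit a function specification before deploying it. The sketch below is an illustrative pre-deployment check, not a Fusebit API: effectiveRouteSecurity and anonymousTaskRoutes are hypothetical names, and the default they apply is copied from the settings described in this section.

```javascript
// Illustrative helper (not a Fusebit API): returns the security settings that
// apply to a Task scheduling route, filling in the documented default when the
// route does not specify "security".
function effectiveRouteSecurity(route) {
  if (route.security !== undefined) return route.security;
  return {
    authentication: 'required',
    authorization: [
      {
        action: 'function:schedule',
        resource:
          '/account/{{accountId}}/subscription/{{subscriptionId}}' +
          '/boundary/{{boundaryId}}/function/{{functionId}}/',
      },
    ],
  };
}

// Lists the Task scheduling routes that anonymous callers can use.
function anonymousTaskRoutes(spec) {
  return (spec.routes || [])
    .filter((route) => route.task !== undefined)
    .filter((route) => effectiveRouteSecurity(route).authentication === 'none')
    .map((route) => route.path);
}

const exampleSpec = {
  routes: [
    { path: '/task/sendEmail', task: {} }, // secure by default
    { path: '/task/webhook', security: { authentication: 'none' }, task: {} },
  ],
};
console.log(anonymousTaskRoutes(exampleSpec));
```

A check like this could run in a deployment script to flag routes that were opened to anonymous callers unintentionally.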
Throttling Tasks
Fusebit can enforce two throttling limits for asynchronous Task execution:
- Maximum Running - the maximum number of concurrently executing Tasks. All Tasks scheduled while the limit is exhausted are queued up for execution later, when the number of running Tasks falls below the limit.
- Maximum Pending - the maximum number of Tasks pending execution (a sum of Tasks scheduled for future execution and those throttled by the Maximum Running limit). If another scheduling request is made while the number of pending Tasks exceeds the Maximum Pending, it is rejected with an HTTP 429 response.
Each limit is enforced independently at the level of an individual Task scheduling route.
The current implementation of the Maximum Pending limit is soft: the system may sometimes take up to a minute to reach a consistent state after a spike of Task scheduling requests. Do not rely on this mechanism if you need accuracy; it currently provides only resilience-in-depth.
Both throttling limits can be specified using the task property of a route in the function specification:
{
  "nodejs": {
    "files": {
      "index.js": "{...content of the index.js...}"
    }
  },
  "routes": [
    {
      "path": "/task/sendEmail",
      "task": {
        "maxPending": 1000,
        "maxRunning": 10
      }
    }
  ]
}
If you don't specify the maxPending limit, the number of pending requests remains unlimited (the default).
If you don't specify the maxRunning limit, the default is 10. If you set maxRunning to 0, there is no explicit throttling of the number of concurrently executing Tasks.
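The interaction of the two limits and their defaults can be summarized as a simplified decision model. This is a sketch for reasoning about configuration, not Fusebit's implementation: admitTask is a hypothetical name, and it assumes a request is rejected once the pending count has reached maxPending (the text says "exceeds", so the exact boundary is a guess). It also ignores the eventual consistency of the counters.

```javascript
// Illustrative model of the two limits; not Fusebit's implementation.
// ASSUMPTION: a request is rejected once the pending count has reached
// maxPending (the document says "exceeds", so the boundary is a guess).
function admitTask(taskConfig, counts, { delayed = false } = {}) {
  // maxRunning defaults to 10; maxPending is unlimited when not specified.
  const maxRunning = taskConfig.maxRunning === undefined ? 10 : taskConfig.maxRunning;
  const maxPending = taskConfig.maxPending;

  if (maxPending !== undefined && counts.pending >= maxPending) {
    return 'rejected (HTTP 429)';
  }
  // maxRunning of 0 disables throttling of concurrent executions.
  const hasCapacity = maxRunning === 0 || counts.running < maxRunning;
  return !delayed && hasCapacity ? 'runs now' : 'queued as pending';
}

const config = { maxPending: 1000, maxRunning: 10 };
console.log(admitTask(config, { running: 3, pending: 0 }));
console.log(admitTask(config, { running: 10, pending: 42 }));
console.log(admitTask(config, { running: 10, pending: 1000 }));
```

Note that Tasks scheduled for the future count as pending even when running capacity is available, which is why the delayed flag forces the queued result.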
Getting Task Statistics
You can obtain statistics on the number of Tasks pending execution for all Task scheduling routes of a Fusebit Function using the GET Function API, and attaching the include=task query parameter:
curl "https://api.us-west-1.fusebit.io/v1/account/acc-9d9341ea356841ed/subscription/sub-ed9d9341ea356841/boundary/boundary1/function/function1?include=task" \
  -H "Authorization: Bearer {accessToken}"
The response will list all Task scheduling routes of the function with their configuration as well as the stats property indicating the approximate number of pending Tasks:
{
  "routes": [
    {
      "path": "/task/t1",
      "security": {
        "authentication": "none"
      },
      "task": {
        "maxPending": 100,
        "maxRunning": 1,
        "stats": {
          "availableCount": 100,
          "delayedCount": 500,
          "pendingCount": 600
        }
      }
    }
  ]
}
The properties are as follows:
- availableCount - number of Tasks pending execution throttled by the maxRunning setting.
- delayedCount - number of Tasks scheduled for future execution.
- pendingCount - sum of the two above.
The values are approximate and can take up to one minute to reach consistency after a spike of Task scheduling requests.
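As a small sketch of how a monitoring script might consume these statistics, the helper below reduces the response of the GET Function API to one summary entry per Task scheduling route. The name summarizeTaskStats and the remaining field are hypothetical; the input shape follows the response shown above.

```javascript
// Illustrative helper: summarizes the Task statistics from a response to the
// GET Function API called with include=task.
function summarizeTaskStats(functionSpec) {
  return (functionSpec.routes || [])
    .filter((route) => route.task && route.task.stats)
    .map(({ path, task }) => ({
      path,
      pending: task.stats.pendingCount,
      // null when maxPending is not set, which means the limit is unlimited
      remaining:
        task.maxPending === undefined
          ? null
          : Math.max(task.maxPending - task.stats.pendingCount, 0),
    }));
}

const response = {
  routes: [
    {
      path: '/task/t1',
      security: { authentication: 'none' },
      task: {
        maxPending: 100,
        maxRunning: 1,
        stats: { availableCount: 100, delayedCount: 500, pendingCount: 600 },
      },
    },
  ],
};
console.log(summarizeTaskStats(response));
```

Since the counts are approximate, a summary like this is better suited to dashboards and alerts than to exact admission decisions.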
Task Limits
Tasks have the following limits:
- The maximum execution time of a Task is 14 minutes (840 seconds) or the number of seconds specified in the compute.timeout property of the function specification, whichever is less.
- The maximum payload of a Task scheduling request (body, headers, and query parameters combined) is 200KB.
- The maximum size of the Task result (the result value returned from the function code plus any logs generated) is 400KB.
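A caller can check a request against these limits before scheduling. The sketch below is a conservative pre-check under stated assumptions: 1KB is taken as 1024 bytes, and the payload size is estimated from the JSON serialization of body, headers, and query. This document does not say how Fusebit measures the payload, so the real limit may be applied differently. All function names are hypothetical.

```javascript
// ASSUMPTIONS: 1KB = 1024 bytes, and the payload size is estimated from the
// JSON serialization of body, headers and query. The document does not say
// how Fusebit measures the payload, so this is only a rough pre-check.
const MAX_PAYLOAD_BYTES = 200 * 1024;
const MAX_TIMEOUT_SECONDS = 840;

function payloadBytes({ body = {}, headers = {}, query = {} }) {
  return Buffer.byteLength(JSON.stringify({ body, headers, query }), 'utf8');
}

function fitsPayloadLimit(request) {
  return payloadBytes(request) <= MAX_PAYLOAD_BYTES;
}

// Effective execution limit: 840 seconds or the configured compute.timeout,
// whichever is less.
function taskTimeoutSeconds(configuredTimeout) {
  return Math.min(configuredTimeout, MAX_TIMEOUT_SECONDS);
}

console.log(fitsPayloadLimit({ body: { some: 'payload' }, query: { a: 'b' } }));
console.log(fitsPayloadLimit({ body: { data: 'x'.repeat(200 * 1024) } }));
console.log(taskTimeoutSeconds(3600));
```

For inputs near the limit, such as large CSV files, one option is to pass a reference to the data in the request body and have the Task fetch the data itself.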