Moving Beyond Polling to Async APIs

APIs commonly require a client to send a request to the server to receive new data; that is, the client requests data from the server. This client-server interaction style is common to the REST-based APIs we build today. 

If a client wishes to know when something new is available, it must periodically send a request to the API to check for data modifications. This pattern is known as polling, and it is a common solution for clients that need to become aware of new data or be notified of backend events. 

However, polling isn’t an ideal solution, as it is complex, wasteful, and delivers a poor user experience. Let’s examine these challenges and how to overcome them by introducing an alternative interaction style that pushes data from the server to the client using HTTP.

Challenge 1: Polling is (often) complex

Polling can be implemented by the client on top of just about any API that uses a request-response style. Since REST APIs fit this model, particularly those that apply the CRUD resource pattern, it is straightforward for a client to add polling support by calling a GET method periodically to look for new data. 

Typical API designs that support polling by default may look like this example:

GET /tasks HTTP/1.1
Accept: application/json

HTTP/1.1 200 OK
Content-Type: application/json

[
  { "name": "Task 1", … },
  …
]

Seems simple, right? Just make a request to GET /tasks every 5 seconds to see if anything has changed. Unfortunately, it isn't that simple. Several things could go wrong:

  1. The API sends responses back with default, non-optimal sorting, e.g. oldest-to-newest. We must then request all entries to find out if anything new is available, often requiring us to keep a list of the IDs we already know about.
  2. Rate limiting prevents us from making requests often enough to meet the solution's needs.
  3. The data offered by the API doesn't provide enough detail for the client to determine whether a specific event has occurred, such as an update to a resource.

Of course, there are a few ways to mitigate some of these issues, including the use of ETags with conditional requests to check for changes or lightweight endpoints that can check for changes since a given timestamp. 
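To make the ETag mitigation concrete, here is a minimal sketch in Python of one polling cycle using a conditional GET. The fetch function is injected and hypothetical (it returns status, headers, and body), so the logic stays independent of any particular HTTP client library:

```python
def poll_once(fetch, etag):
    """One polling cycle using a conditional GET.

    fetch(headers) -> (status, response_headers, body) is a hypothetical
    HTTP client; a real one would call GET /tasks via urllib or similar.
    Returns (new_etag, body); body is None when nothing has changed.
    """
    headers = {"Accept": "application/json"}
    if etag:
        headers["If-None-Match"] = etag       # make the request conditional
    status, response_headers, body = fetch(headers)
    if status == 304:                         # 304 Not Modified: no new data
        return etag, None
    return response_headers.get("ETag"), body
```

Each cycle carries forward the ETag from the previous response, so unchanged data costs only a 304 status line rather than a full payload.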

While polling is often a default solution, it isn’t always the best. Sometimes it requires writing considerable code to work around the limitations of an API that wasn’t designed for polling.

Challenge 2: Polling is wasteful

According to a 2013 survey, Zapier found that polling is wasteful: only 1.5% of its polling API requests returned new data. This means that 98.5% of API requests sent back data that hadn't changed! 

Zapier notes that this waste isn't experienced by the app alone: 

Polling is the process of repeatedly hitting the same endpoint looking for new data. We don't like doing this (it's wasteful), vendors don't like us doing it (again, it's wasteful) and users dislike it (they have to wait a maximum interval to trigger on new data). 

Let’s explore an example of an API whose endpoint response payload totals 100KB per request. If one consumer polls the API every 30 seconds, the total data transferred in a 24-hour period is 281MB (2 requests per min * 1440 min * 100KB per request) – per unique consumer. If we have 100 consumers simultaneously polling, this results in 28GB per day! This calculation doesn’t include the computational and I/O load on our infrastructure, or network transmission costs into/out of cloud infrastructure.
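The arithmetic above can be reproduced in a few lines of Python (using binary units, where 1 MB = 1024 KB, which matches the 281 MB figure):

```python
# Reproducing the polling bandwidth estimate from the paragraph above.
requests_per_day = 2 * 1440                 # one request every 30 seconds
payload_kb = 100                            # 100 KB per response
per_consumer_mb = requests_per_day * payload_kb / 1024
fleet_gb = per_consumer_mb * 100 / 1024     # 100 simultaneous consumers

print(per_consumer_mb)       # 281.25 MB per consumer per day
print(round(fleet_gb, 1))    # 27.5 GB per day across the fleet
```

The per-fleet figure works out to roughly 27.5 GB, which the article rounds up to 28 GB per day.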

Challenge 3: Polling delivers a poor user experience

In the Zapier quote above, it was stated that polling results in a poor user experience because of the lack of real-time notification (“users dislike it”). If you can poll an API, at best, every 30 seconds, the user will only see data changes at that interval. Anything that happens in between goes unnoticed until the next poll. That's not ideal for web or mobile users who demand near real-time notification.

What if we could push data to the API client?

The ideal situation is to have our servers inform the API client when new data or events are available. However, we can’t do this with a traditional request-response interaction style common with HTTP. We have to find a way to allow servers to push data to the client. Enter async APIs.

Async APIs are an API interaction style that allows the server to inform the consumer when something has changed. Let's look at some popular technologies that help us create async APIs, along with how they work and when (and when not) to use them. 

Async APIs using server-sent events 

Server-sent events, or SSE for short, is based on the EventSource browser interface standardised as part of HTML5 by the W3C. It defines the use of HTTP to support longer-lived connections to allow servers to push data back to the client. These incoming messages are often designed as events that include associated data.

While SSE was originally designed to support pushing data to a web app, it is becoming a more popular way to push data to API clients while avoiding the challenges of polling.

How does SSE work?

SSE uses a standard HTTP connection, but holds onto the connection for a longer period of time rather than disconnecting immediately. This connection allows servers to push data back to the client when it becomes available:

Diagram: how SSE works (source: https://medium.com/conectric-networks/a-look-at-server-sent-events-54a77f8d6ff7)

The specification outlines a few options for the format of the data coming back, allowing for event names, comments, single or multi-line text-based data, and event identifiers. Below is an example of how we might stream back new tasks as they are created:

GET /tasks/event-stream HTTP/1.1 
Accept: text/event-stream
Cache-Control: no-cache
...additional request headers...

HTTP/1.1 200 OK
Content-Type: text/event-stream
Date: Tue, 18 Dec 2018 08:56:53 GMT
...additional response headers...

: this is a comment, useful for developers debugging an SSE connection

event: task_created
data: {"id": "12345"}
id: 12345

event: task_created
data: {"id": "6789"}
id: 6789

Pretty simple and quite flexible. The format for the data field may be any text-based content, from simple data points to single-line JSON payloads. We are also able to support multi-line payloads if we wish, through the use of multiple data: prefixed lines. We can even offer a mixture of event types over the same connection, rather than requiring a connection per event type. 
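As a sketch of how a client might consume this format, here is a minimal Python parser for the fields shown above (event, data, id, and : comments). Real clients would typically rely on the browser's EventSource interface or an SSE client library rather than hand-rolling this:

```python
def parse_sse(stream_text):
    """Minimal parser for the text/event-stream format: a blank line
    dispatches the accumulated event; each line is 'field: value'
    (event, data, id) or a ':'-prefixed comment."""
    events, current = [], {"data": []}
    for line in stream_text.splitlines():
        if line.startswith(":"):          # comment line: ignore
            continue
        if line == "":                    # blank line: dispatch the event
            if current["data"]:
                events.append({
                    "event": current.get("event", "message"),
                    "data": "\n".join(current["data"]),
                    "id": current.get("id"),
                })
            current = {"data": []}
            continue
        field, _, value = line.partition(":")
        value = value.lstrip(" ")
        if field == "data":
            current["data"].append(value)  # multiple data: lines accumulate
        else:
            current[field] = value
    return events
```

Note how multiple data: lines for one event are joined with newlines, matching the multi-line support described above.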

Notice the identifiers for each event? Clients may use these to recover missed events if they go offline. The specification supports the Last-Event-ID HTTP header, provided by the client when re-establishing the connection. The server will start the event stream immediately with the events following the last event ID, sending any events missed while offline. Simple, yet powerful!
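A reconnecting client simply resends the last identifier it processed. Here is a minimal sketch of the headers involved (the header names follow the SSE specification; everything else is illustrative):

```python
def reconnect_headers(last_event_id):
    """Build request headers for (re-)establishing an SSE connection.
    When we have a last-seen event id, the server can resume the stream
    immediately after it, replaying anything we missed while offline."""
    headers = {"Accept": "text/event-stream", "Cache-Control": "no-cache"}
    if last_event_id:
        headers["Last-Event-ID"] = last_event_id
    return headers
```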

SSE supports several use cases:

  1. State change notifications to a front-end application, such as a browser or mobile app, to keep a user interface in sync with the latest server-side state
  2. Business events over HTTP, without requiring access to an internal message broker such as RabbitMQ or Kafka
  3. Streaming long-running queries or complex aggregations as results become available, rather than waiting for all the results to be obtained and pushed to the client at once

However, SSE does have a few cases where it may not be a fit:

  1. Your API gateway isn’t capable of handling long-running connections or has a brief timeout period (e.g. less than 30 seconds). While this isn’t a show-stopper, it will require the client to reconnect more often.
  2. You are targeting browsers that do not support the EventSource interface.
  3. You need bi-directional communication between client and server. Here, you may wish to explore WebSockets, as SSE supports server push only.

The W3C specification is easy to read and offers several examples. If you are interested in learning more about how SSE works, consult the specification, along with Mozilla's compatibility page if you wish to ensure you are able to target compatible browsers. You may also wish to check out Simon Prickett's article on Medium, “A Look at Server-Sent Events”, for a deeper look at how SSE works.

Server-to-server async APIs using webhooks

Webhooks are web-based callbacks that allow API servers to notify interested apps when an event has occurred. Unlike traditional callbacks, which occur within the same codebase, webhooks occur over the web, using an HTTP POST. The term webhooks was coined by Jeff Lindsay in 2007. 

The power of GitHub webhooks

GitHub has offered webhooks for many years. Anytime a developer pushes a commit to their GitHub-hosted repository, it will issue webhook-based callbacks to any registered subscribers. These callbacks are often used to drive automated build processes as part of a full CI/CD pipeline.

Here is how a typical GitHub webhook notification flow works:

  1. A developer pushes new code to a GitHub-hosted repository (public or private)
  2. GitHub publishes the event to their own private “Webhook Relay” service. This service is responsible for notifying any and all registered webhooks for the specific repo
  3. The “Webhook Relay” then publishes the event to each registered webhook subscriber using the “Webhook Relay agent”, which acts on behalf of the webhook subscription to ensure that the registered webhook is notified successfully. The agent is also responsible for retries and failure notifications if the webhook is unavailable or returns an error.
  4. The registered webhook receives the event from the “Webhook Relay agent” – in this case, Jenkins. Jenkins receives the event details, then responds with a 200 OK back to the agent. At this point, GitHub’s work is done
  5. Separately, Jenkins executes a fresh build process, first by fetching the latest code from the GitHub-hosted repo based on the commit details provided within the webhook callback. GitHub has no idea this is happening – all it knows is that it performed a callback to a specific URL with the details of the event
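The relay agent's retry responsibility described in step 3 can be sketched as a small delivery loop with exponential backoff. The post function is injected and hypothetical (it returns an HTTP status code), and the attempt counts are illustrative:

```python
import time

def deliver_with_retries(post, url, payload, attempts=3, backoff=1.0):
    """Sketch of a webhook delivery agent: retry failed callbacks with
    exponential backoff, and report failure if the subscriber never
    acknowledges with 200 OK.

    post(url, payload) -> HTTP status code is an injected, hypothetical client.
    """
    for attempt in range(attempts):
        try:
            if post(url, payload) == 200:
                return True                      # subscriber acknowledged
        except OSError:
            pass                                 # network error: count as a failed attempt
        time.sleep(backoff * (2 ** attempt))     # back off: 1s, 2s, 4s, ...
    return False                                 # give up; surface a failure notification
```

A production agent would also dead-letter repeatedly failing deliveries and notify the subscription owner, as GitHub's relay does.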


Prior to GitHub's webhooks, CI software had to be installed on the server hosting the repository. By externalising these concerns outside of GitHub's servers, GitHub has opened up a whole new marketplace for hosted, SaaS-based CI/CD tooling from vendors such as CircleCI, Shippable, and Travis CI – all with a single webhook, and without any prior involvement or knowledge by GitHub!

How do webhooks work?

The API server makes a POST call to a URL that is provided by the system wishing to receive the callbacks.

For example, an interested subscriber may register to receive new task event notifications to an endpoint we created, e.g. POST /callbacks/new-tasks. The API server then sends a POST request with the details of the event to this URL. 

An important note about webhooks: the subscribing app must provide the endpoint necessary to receive the webhook callback, and it must be accessible to the API. Thus, this style of interaction isn’t appropriate for mobile apps, browser apps, or private apps hosted inside a firewall as they cannot offer a publicly accessible endpoint to receive the POST callback. However, it will work between private APIs and private apps on the same internal network as long as the appropriate ingress rules are present. 
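As a sketch of the subscriber's side, here is the core logic a POST /callbacks/new-tasks endpoint might run on each delivery. The payload shape follows the lean-event examples later in this article; the function and field names are illustrative:

```python
import json

def handle_new_task_callback(body_bytes):
    """Process the body of a webhook POST delivery.
    Returns (http_status, task_id). Acknowledge quickly and queue any
    heavy processing, so the sender's agent doesn't time out and retry."""
    try:
        event = json.loads(body_bytes)
    except ValueError:
        return 400, None                  # malformed payload: reject
    task_id = event.get("event", {}).get("resourceId")
    if task_id is None:
        return 422, None                  # well-formed, but missing the resource id
    return 200, task_id                   # acknowledged; process asynchronously
```

Returning 200 promptly matters: senders such as GitHub's relay treat slow or failing responses as delivery failures and retry.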

Choosing between lean and rich event payloads

One of the common questions that emerges for teams designing async APIs is around the size of the event payloads. There are two common options: lean events and rich events (sometimes referred to as “thin” and “fat” events, respectively). 

Lean events send the minimal amount of information necessary for the client to know what happened. This may include the type of event and the ID of the resource the event occurred on. For example:

{
  "event": {
    "eventType": "task.created",
    "resourceType": "task",
    "resourceId": "abc123"
  }
}


Lean events are useful when you want to ensure that the notification event is minimal in size and details. This forces the callback endpoint to fetch the latest details about the resources involved in the event using a request/response style API interaction. 

On the other hand, rich events send all of the related information for the event, including any resource representations that may be useful for the callback to process. For example:

{
  "event": {
    "eventType": "task.created",
    "resourceType": "task",
    "resourceDetails": {
      "id": "abc123",
      "name": "My Task",
      ...
    }
  }
}


Rich events are useful when the callback processor needs to know everything about the event to take the appropriate action. Perhaps they need to have an exact snapshot of resource state at the time of the event. In this case, they should not be required to make an API call to fetch the latest details – especially if those details may have changed since the event was published. 
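This trade-off shows up directly in the callback processor: with a lean event it must fetch the current state, while a rich event can be used as-is. A minimal sketch, where fetch_resource is a hypothetical injected API client:

```python
def process_event(event, fetch_resource):
    """Resolve the resource state for a webhook event payload.

    Rich events embed a snapshot under resourceDetails; lean events carry
    only a resourceId, forcing a request/response round trip.
    fetch_resource(resource_id) -> dict is a hypothetical API client.
    """
    if "resourceDetails" in event:                 # rich event: snapshot included
        return event["resourceDetails"]
    return fetch_resource(event["resourceId"])     # lean event: fetch latest state
```

Note that the lean path returns the latest state, which may differ from the state at the time of the event, which is exactly the distinction discussed above.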

In either case, the event may include hypermedia links to inform subscribers where they can go to obtain more information. For example:

{
  "event": {
    "eventType": "task.created",
    "resourceType": "task",
    "resourceId": "abc123",
    "_links": [
      { "href":"/tasks/abc123", "rel":"self" },
      { "href":"/tasks", "rel":"list" }
    ]
  }
}


Capturing your async API definitions

Capturing the definition of your webhook callbacks in the OpenAPI Specification had long been a challenge, often met with frustration – until now. Starting with OpenAPI Specification v3, support for callbacks has been added, allowing you to define the endpoint for subscribing to webhooks along with the definition of the webhook callback itself. You can also read an interview with OAS creator Tony Tam and watch a video on how OAS v3 supports webhooks.
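As a hedged sketch of what this looks like (the paths, names, and schemas below are illustrative, not from any real API), an OAS v3 document can declare the subscription endpoint and its callback together:

```yaml
openapi: 3.0.0
info:
  title: Tasks API
  version: '1.0.0'
paths:
  /subscriptions:
    post:
      summary: Register a webhook callback URL
      requestBody:
        content:
          application/json:
            schema:
              type: object
              properties:
                callbackUrl: { type: string, format: uri }
      responses:
        '201': { description: Subscription created }
      callbacks:
        taskCreated:
          # OAS runtime expression: the URL supplied in the subscription request
          '{$request.body#/callbackUrl}':
            post:
              requestBody:
                content:
                  application/json:
                    schema:
                      type: object
                      properties:
                        eventType: { type: string }
                        resourceId: { type: string }
              responses:
                '200': { description: Callback received }
```

The runtime expression keying the callback tells tooling that the delivery URL comes from the subscriber's own request body.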

For those still using OpenAPI Specification v2, these improvements aren't available. API provider teams have three options: upgrade to OAS v3, add callback support through custom extensions while remaining on OAS v2, or capture their webhooks using the newer AsyncAPI specification. 

The AsyncAPI specification is a standard that is gaining ground for capturing all channels that offer event notification. From message brokers to SSE and Kafka streams, this standard is becoming popular as a one-stop location to define your message formats and the message-driven protocols available. It is important to note that this specification isn't related to OAS, but it was inspired by it and strives to follow a similar format to ease adoption. 

A word about data streaming

Async APIs push messages and events from an API server to an active subscriber over an HTTP connection. Server-sent events (SSE) and webhooks are two popular options for supporting this style, but there are many others. 

Unlike async APIs, data streaming focuses on raw data and is often delivered through a technology such as Apache Kafka or Apache Pulsar. Data streaming provides continuous, ordered delivery of atomic messages that represent state change in data.

Since Apache Kafka has popularised data streaming, there has been some confusion about whether to use async APIs or data streaming for messaging and event notification. I have recommended the following guidelines as a starting point to determine which option is the right fit:

  1. When sharing data for governance or when needing to analyse near real-time data as it becomes available, use data streaming
  2. When sharing events or data change notifications, use webhooks when subscribers will be other systems capable of exposing an HTTP POST operation to receive the callback
  3. When sharing events or data change notifications for external partner consumption, but the subscriber cannot expose an HTTP POST endpoint, use SSE (or a related technology) to offer an async API interaction style

In short: webhooks are for system-to-system callbacks via HTTP POST; API streaming with SSE is best for a resource-oriented event model that supports message and event publications alongside the additional behaviour offered by REST-based APIs; data streaming is for continuous data analytics or driving data governance for internal consumption.
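The guidelines above can be condensed into a small decision helper. Treat it as a starting point rather than a rule, exactly as the guidelines themselves are framed:

```python
def delivery_channel(subscriber_can_expose_endpoint, needs_data_analytics):
    """Starting-point heuristic for choosing between data streaming,
    webhooks, and SSE, following the three guidelines above."""
    if needs_data_analytics:
        return "data streaming"   # e.g. Kafka/Pulsar: continuous, ordered data
    if subscriber_can_expose_endpoint:
        return "webhooks"         # system-to-system HTTP POST callbacks
    return "SSE"                  # server push over HTTP, no inbound endpoint needed
```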

Wrap-up: Expanding your platform with Async APIs

Polling is inefficient for both the client, which has to keep checking for changes, and the server, which must handle the additional API requests that result. Async APIs are useful when you need to push events from your backend system to a browser, mobile app, or another application over HTTP. The next time you design an API, ask yourself this question: “How could API consumers use event notifications to integrate with my API in a more efficient and robust way?”