Excessive Data Exposure occurs when an API returns sensitive data in its responses, such as email addresses or other personally identifiable information, which an attacker can then use. The API may expect the client to filter out such data so that it’s not presented to the end user, but this does not prevent it from being read during transport, exploited by a malicious client, or even inadvertently revealed by a benign client.
To avoid excessive data exposure, API providers shouldn’t rely on the API client to filter sensitive data. Instead, sensitive data should either not be returned by the API, or returned in a redacted form. However, there may be cases where sensitive data has to be sent. In such situations, the team developing the API should seriously consider how to appropriately handle the data.
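As an illustration of filtering at the origin, the following minimal sketch builds a response from an allow-list of fields and masks an email address before returning it. The `User` model, the `PUBLIC_FIELDS` policy and the masking rule are assumptions made for this example, not part of any particular framework.

```python
from dataclasses import dataclass


@dataclass
class User:
    # Hypothetical internal model; it holds more than a client should see.
    id: int
    display_name: str
    email: str
    password_hash: str


# Assumed policy: the only fields that are returned by default.
PUBLIC_FIELDS = ("id", "display_name")


def mask_email(email: str) -> str:
    """Keep the first character of the local part and the domain; mask the rest."""
    local, _, domain = email.partition("@")
    if not local or not domain:
        return "***"
    return f"{local[0]}***@{domain}"


def to_response(user: User, include_masked_email: bool = False) -> dict:
    """Build the response body from an allow-list rather than from the whole model."""
    body = {name: getattr(user, name) for name in PUBLIC_FIELDS}
    if include_masked_email:
        # Redacted form: enough to recognise the address, not enough to reuse it.
        body["email"] = mask_email(user.email)
    return body
```

Because the response is assembled from an allow-list, a field added to the model later is not exposed unless someone deliberately adds it to `PUBLIC_FIELDS`.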
|Threat Agents/Attack Vectors|Security Weakness|Impacts|
|---|---|---|
|API Specific : Exploitability 3|Prevalence 2 : Detectability 2|Technical 2 : Business Specific|
|Exploitation of Excessive Data Exposure is simple, and is usually performed by sniffing the traffic to analyze the API responses, looking for sensitive data exposure that should not be returned to the user.|APIs rely on clients to perform the data filtering. Since APIs are used as data sources, sometimes developers try to implement them in a generic way without thinking about the sensitivity of the exposed data. Automatic tools usually can’t detect this type of vulnerability because it’s hard to differentiate between legitimate data returned from the API, and sensitive data that should not be returned without a deep understanding of the application.|Excessive Data Exposure commonly leads to exposure of sensitive data.|
Source: OWASP Excessive Data Exposure
The problem of excessive data exposure is best solved at the point of origin, rather than by API management (APIM). APIs should not unnecessarily expose sensitive data in the first place. However, as an intermediary between an API client and server, an API Gateway can assist with solving the problem:
- Data transformation: API Gateways can transform data as it passes through them. This approach can be used to obscure, redact or entirely remove sensitive data prior to the API client receiving it.
- GraphQL: GraphQL uses the term over-fetching to represent when a client receives data it does not require. In this context, this could be sensitive data which is part of a GraphQL schema. An API Gateway can prevent sensitive data from being returned to a client by removing it from the schema, or by restricting access to the field to clients which are authorised to access it.
- Schema validation: To prevent sensitive data from escaping the API server, the API Gateway can validate responses it receives against a schema, blocking any which don’t comply.
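The data transformation and schema validation approaches above can be sketched generically as two steps applied to an upstream JSON response. This is an illustrative sketch only, not how any specific gateway implements these features; the paths in `REMOVE_PATHS` and `MASK_PATHS`, the `ALLOWED_SHAPE` structure and the 502 status code are all assumptions chosen for the example.

```python
import json

# Hypothetical policy: dotted paths to remove from, or mask in, every response.
REMOVE_PATHS = ["user.password_hash", "user.ssn"]
MASK_PATHS = ["user.email"]

# Hypothetical response schema: the only keys a response is allowed to contain.
ALLOWED_SHAPE = {"user": {"id": None, "display_name": None, "email": None}}


def _parent_and_leaf(doc, path):
    """Return the dict that holds the last path segment, or None if it is absent."""
    *parents, leaf = path.split(".")
    node = doc
    for key in parents:
        if not isinstance(node, dict) or key not in node:
            return None, leaf
        node = node[key]
    return (node if isinstance(node, dict) else None), leaf


def transform(body: str) -> str:
    """Data transformation: remove or mask sensitive fields before the client sees them."""
    doc = json.loads(body)
    for path in REMOVE_PATHS:
        parent, leaf = _parent_and_leaf(doc, path)
        if parent is not None:
            parent.pop(leaf, None)
    for path in MASK_PATHS:
        parent, leaf = _parent_and_leaf(doc, path)
        if parent is not None and leaf in parent:
            parent[leaf] = "[REDACTED]"
    return json.dumps(doc)


def unexpected_keys(doc: dict, shape: dict, prefix: str = "") -> list:
    """Return dotted paths of keys present in doc but missing from the allowed shape."""
    found = []
    for key, value in doc.items():
        path = f"{prefix}{key}"
        if key not in shape:
            found.append(path)
        elif isinstance(shape[key], dict) and isinstance(value, dict):
            found.extend(unexpected_keys(value, shape[key], path + "."))
    return found


def validate_or_block(body: str):
    """Schema validation: pass compliant responses through, block everything else."""
    doc = json.loads(body)
    if not isinstance(doc, dict) or unexpected_keys(doc, ALLOWED_SHAPE):
        return 502, json.dumps({"error": "response blocked by schema validation"})
    return 200, body
```

The two steps are complementary: transformation cleans up fields that are known to be sensitive, while validation catches fields that nobody anticipated, such as a new column that an upstream service starts returning.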
Tyk has body transformation plugins which can be used to remove sensitive data from the response. This is compatible with responses which use JSON or XML data formats.
Tyk’s GraphQL engine allows it to act as a GraphQL server. Included in this is the ability to define the schema which clients can use to request data. When sensitive fields are removed from this schema, Tyk validates each incoming GraphQL query against it and rejects queries that request those fields. For a more nuanced approach, it’s also possible to use field-based permissions, which provide authorisation at a field level.
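The idea behind schema-based and field-level restrictions can be sketched in a few lines. This is a conceptual illustration and not Tyk’s implementation or configuration format: it assumes the query has already been parsed into a type name and a list of requested fields, and the `SCHEMA`, `FIELD_ROLES` and role names are hypothetical.

```python
# Hypothetical exposed schema: type name -> fields that clients may query.
# Sensitive fields such as a password hash are simply not listed.
SCHEMA = {"User": {"id", "displayName", "email"}}

# Hypothetical field-level rules: (type, field) -> roles allowed to read it.
# Fields without an entry are readable by any role.
FIELD_ROLES = {("User", "email"): {"admin", "support"}}


def check_selection(type_name, requested_fields, role):
    """Return a list of error messages; an empty list means the selection is allowed."""
    errors = []
    exposed = SCHEMA.get(type_name, set())
    for field in requested_fields:
        allowed_roles = FIELD_ROLES.get((type_name, field))
        if field not in exposed:
            # The field is not part of the schema, so the query fails validation.
            errors.append(f'Cannot query field "{field}" on type "{type_name}"')
        elif allowed_roles is not None and role not in allowed_roles:
            # The field exists, but this caller is not authorised to read it.
            errors.append(f'Role "{role}" may not read "{type_name}.{field}"')
    return errors
```

For example, `check_selection("User", ["id", "email"], "customer")` returns one error for `email`, while the same selection with the `admin` role returns an empty list.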
Tyk’s Universal Data Graph enables REST API endpoints to be added to GraphQL API schemas, enabling control over which fields can be queried.