Tyk Pump - Ship Analytics Data to Persistent Datastore

Introduction

Traffic analytics are captured by the Gateway nodes and then temporarily stored in Redis. The Tyk Pump is responsible for moving those analytics into a persistent data store, such as MongoDB, where the traffic can be analyzed.

What is the Tyk Pump?

The Tyk Pump is our open source analytics purger that moves the data generated by your Tyk nodes to any back-end. It is primarily used to display your analytics data in the Tyk Dashboard.

Note

The Tyk Pump is not currently configurable in our Tyk Cloud solution.

Tyk Pump Data Flow

Here’s the architecture depending on your deployment model:

Tyk Enterprise Pump Architecture

Tyk Open Source Pump Architecture

Tyk Pump is both extensible and flexible, meaning it is possible to configure Tyk Pump to send data to multiple different backends at the same time, as depicted by Pump Backends (i) and (ii) (MongoDB and Elasticsearch respectively) in Figure 1. Tyk Pump is scalable, both horizontally and vertically, as indicated by Instances “1”, “2”, and “n”. Additionally, it is possible to apply filters that dictate what analytics go where; please see the docs on sharded analytics configuration here.

Configuration and Scaling of Tyk Pump

Figure 1: An architecture diagram illustrating horizontal scaling of “n” Instances of Tyk-Pump each with two different backends.
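
To make this concrete, below is a minimal pump.conf sketch of the Figure 1 setup, with one Pump instance writing to two backends at once (the connection URLs are illustrative placeholders):

{
  "pumps": {
    "mongo": {
      "type": "mongo",
      "meta": {
        "collection_name": "tyk_analytics",
        "mongo_url": "mongodb://mongo:27017/tyk_analytics"
      }
    },
    "elasticsearch": {
      "type": "elasticsearch",
      "meta": {
        "index_name": "tyk_analytics",
        "elasticsearch_url": "http://elasticsearch:9200"
      }
    }
  }
}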

Other Supported Backend Services

We list our supported backends here.

Configuring your Tyk Pump

See Tyk Pump Configuration for more details on setting up your Tyk Pump.

Tyk Pump can be horizontally scaled without causing duplicate data. Please see the following table for the supported permutations of Tyk Pump scaling.

| Supported | Summary                                        |
|-----------|------------------------------------------------|
| ✅        | Single Pump Instance, Single Backend           |
| ✅        | Single Pump Instance, Multiple Backend(s)      |
| ✅        | Multiple Pump Instances, Same Backend(s)       |
| ✅        | Multiple Pump Instances, Different Backend(s)  |

Getting Started

Tyk Pump Configuration

The Tyk Pump is our Open Source analytics purger that moves the data generated by your Tyk nodes to any back-end. By moving the analytics into your supported database, it allows the Tyk Dashboard to display traffic analytics across all your Tyk Gateways.

Tyk Dashboard

MongoDB

The Tyk Dashboard uses the mongo-pump-aggregate collection to display analytics. This is different from the standard mongo pump, which stores individual analytics items in MongoDB. The aggregate functionality was built to be fast, as querying raw analytics is expensive in large data sets. See Pump Dashboard Config for more details.

SQL

Note

Tyk no longer supports SQLite as of Tyk 5.7.0. To avoid disruption, please transition to PostgreSQL, MongoDB, or one of the listed compatible alternatives.

In v4.0 of the Tyk Dashboard, we added support for the following SQL platforms:

  • PostgreSQL
  • SQLite

Within your Dashboard configuration file (tyk-analytics.conf) there is now a storage section.

{
  ...
  "storage": {
    "main":{},
    "analytics":{},
    "logs":{},
    "uptime": {}
  }
}
Field description
  • main - Main storage (APIs, Policies, Users, User Groups, etc.)
  • analytics - Analytics storage (used to display all the charts and for all analytics screens)
  • logs - Logs storage (log browser page)
  • uptime - uptime tests analytics data
Common settings

For every storage section, you must populate the following fields:

{
...
  "storage": {
    ...
    "main": {
      "type": "postgres",
      "connection_string": "user=root password=admin database=tyk-demo-db host=tyk-db port=5432",
    }
  }
}
  • type - use this field to define your SQL platform (currently SQLite or PostgreSQL are supported)
  • connection_string - the specific connection settings for your platform

The pump needed for storing log data in the database is very similar to other pumps, as is the storage setting in your Tyk Dashboard config. It just requires the sql name and database-specific configuration options.

SQL example

"sql": {
  "name": "sql",
  "meta": {
    "type": "postgres",
    "connection_string": "user=laurentiughiur password=test123 database=tyk-demo-db host=127.0.0.1 port=5432"
  }
},

Capping analytics data

Tyk Gateways can generate a lot of analytics data. Be sure to read about capping your Dashboard analytics

Omitting the configuration file

From Tyk Pump 1.5.1+, you can configure an environment variable to omit the configuration file with the TYK_PMP_OMITCONFIGFILE variable. This is specially useful when using Docker, since by default, the Tyk Pump has a default configuration file with pre-loaded pumps.
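
For example, a minimal environment-only setup might look like this (a sketch reusing the CSV pump variables shown later in this document):

TYK_PMP_OMITCONFIGFILE=true
TYK_PMP_PUMPS_CSV_TYPE=csv
TYK_PMP_PUMPS_CSV_META_CSVDIR=./your_directory_here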

Sharding analytics to different data sinks

In a multi-organization deployment, each organization, team, or environment might have their preferred analytics tooling. This capability allows the Tyk Pump to send analytics for different organizations or various APIs to different destinations. E.g. Org A can send their analytics to MongoDB + DataDog while Org B can send their analytics to DataDog + expose the Prometheus metrics endpoint.

Configuring the sharded analytics

You can achieve the sharding by setting both an allowlist1 and a blocklist2, meaning that some data sinks can receive information for all orgs, whereas other data sinks will not receive a certain organization’s analytics if it was blocklisted.

This feature makes use of the field called filters, which can be defined per pump. This is its structure:

"filters":{
  "api_ids":[],
  "org_ids":[],
  "skip_api_ids":[],
  "skip_org_ids":[]
     }
  • api_ids and org_ids work as an allowlist (APIs and orgs whose analytics records we want to send).
  • skip_api_ids and skip_org_ids work as a blocklist (APIs and orgs whose analytics records we want to filter out and not send).

A blocklist2 always takes priority over an allowlist1.

An example of configuration would be:

"csv": {
 "type": "csv",
 "filters": {
   "org_ids": ["org1","org2"]
 },
 "meta": {
   "csv_dir": "./bar"
 }
},
"elasticsearch": {
 "type": "elasticsearch",
 "filters": {
   "skip_api_ids": ["api_id_1"],
   },
 "meta": {
   "index_name": "tyk_analytics",
   "elasticsearch_url": "https://elasticurl:9243",
   "enable_sniffing": false,
   "document_type": "tyk_analytics",
   "rolling_index": false,
   "extended_stats": false,
   "version": "6"
 }
}

With this configuration, all the analytics records related to org1 or org2 will go to the csv backend, and all records except those for api_id_1 will go to the elasticsearch backend.

Setup Dashboard Analytics

To enable Dashboard Analytics, you would need to configure Tyk Pump to send analytic data to the Dashboard storage MongoDB / SQL.

These are the different pumps that handle different kinds of analytic data.

| Analytics                   | Activities Graph     | Log Browser          | Uptime Analytics |
|-----------------------------|----------------------|----------------------|------------------|
| Mongo (Multi organization)  | Mongo Aggregate Pump | Mongo Selective Pump | Uptime Pump      |
| Mongo (Single organization) | Mongo Aggregate Pump | Mongo Pump           | Uptime Pump      |
| SQL                         | SQL Aggregate Pump   | SQL Pump             | Uptime Pump      |

See below for details about these pumps, their configs, matching collections and the relevant dashboard settings to view this data.

MongoDB

Mongo Pump

mongo Pump simply saves all individual requests across every organization to a collection called tyk_analytics. Each request will be stored as a single document.

Pump Config
{
  ...
  "pumps": { 
    "mongo": {
      "type": "mongo",
      "meta": {
        "collection_name": "tyk_analytics",
        "mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}"
      }
    }
  }
}
Capping

This collection should be capped due to the number of individual documents. This is especially important if the detailed_recording in the Gateway is turned on which means that the Gateway records the full payload of the request and response.

Omitting Indexes

From Pump 1.6+, the default index creation behavior of the Mongo Pumps has changed and the new configuration option omit_index_creation is available. This option is applicable to the following Pumps: Mongo Pump, Mongo Aggregate Pump and Mongo Selective Pump.

The behavior now depends upon the value of omit_index_creation and the Pump in use, as follows:

  • If omit_index_creation is set to true, tyk-pump will not create any indexes (for Mongo pumps).
  • If omit_index_creation is set to false (default) and you are using DocumentDB, tyk-pump will create the Mongo indexes.
  • If omit_index_creation is set to false (default) and you are using MongoDB, the behavior of tyk-pump depends upon whether the collection already exists:
    • If the collection exists, tyk-pump will not create the indexes again.
    • If the collection does not already exist, tyk-pump will create the indexes.
Dashboard Setting

In API Usage Data > Log Browser screen you will see all the individual requests that the Gateway has recorded and saved in tyk_analytics collection using the mongo pump.

Because you have the option to store and display analytics of every organization or separately per organization, you need to configure the Tyk Dashboard with the matching setting according to the way you set the pump to store the data in MongoDB. The field use_sharded_analytics controls the collection that the dashboard will query.

  • If use_sharded_analytics: false - the dashboard will query the collection tyk_analytics that mongo pump populated
  • If use_sharded_analytics: true - the dashboard will query the collection that mongo-pump-selective pump populated
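
For example, a minimal tyk-analytics.conf fragment for the plain mongo pump would be (a sketch showing only the relevant field):

{
  ...
  "use_sharded_analytics": false
}
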
Mongo Aggregate Pump

mongo-pump-aggregate pump stores data in a collection called z_tyk_analyticz_aggregate_{ORG ID}.

Pump Config
{
  ...
  "pumps": {
    "mongo-pump-aggregate": {
      "name": "mongo-pump-aggregate",
      "meta": {
        "mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}",
        "use_mixed_collection": true
      }
    }
  }
}
  • use_mixed_collection: true - will store analytics to both your organization defined collections z_tyk_analyticz_aggregate_{ORG ID} and your org-less tyk_analytics_aggregates collection.
  • use_mixed_collection: false - your pump will only store analytics to your org defined collection.

tyk_analytics_aggregates collection is used to query analytics across your whole Tyk setup. This can be used, for example, by a superuser role that is not attached to an organization. When set to true, you also need to set use_sharded_analytics to true in your Dashboard config.

Dashboard Setting

This pump supplies the data for the following subcategories of API Usage Data:

  • Activity by API screen
  • Activity by Key screen
  • Errors screen

As with the regular analytics, because Tyk gives you the option to store and display aggregated analytics across all organizations or separately per organization, you need to configure the Tyk Dashboard with the setting that matches the way you set the pump to store the data in MongoDB; otherwise, you won’t see the data in the Dashboard.

  1. The enable_aggregate_lookups: true field must be set in the Dashboard configuration file, in order for the Dashboard to query and display the aggregated data that mongo-pump-aggregate saved to MongoDB.
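
A matching tyk-analytics.conf fragment would look like this (a sketch showing only the relevant field):

{
  ...
  "enable_aggregate_lookups": true
}
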
Capping

As only a minimal number of documents are stored, you don’t need to worry about capping this collection. The documents contain aggregated info for an individual API, such as total requests, errors, tags and more.

High Traffic Environment Settings

If you have a high traffic environment and you want to skip some aggregations, to avoid overloading Mongo and/or to reduce the size of aggregation documents, you can do so using the ignore_aggregations configuration option. The possible values are:

  • APIID
  • Errors
  • Versions
  • APIKeys
  • OauthIDs
  • Geo
  • Tags
  • Endpoints
  • KeyEndpoint
  • OauthEndpoint
  • ApiEndpoint

For example, if you want to ignore the API Keys aggregations:

pump.conf:

{
  ...
  "pumps": {
    "mongo-pump-aggregate": {
      "name": "mongo-pump-aggregate",
      "meta": {
        "mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}",
        "use_mixed_collection": true,
        "ignore_aggregations": ["APIKeys"]
      }
    }
  }
}

Unique Aggregation Points

In case you set your API definition in the Tyk Gateway to tag unique headers (like request_id or timestamp), this collection can grow a lot, since aggregation of unique values simply creates a record/document for every single value with a counter of 1. To mitigate this, avoid tagging unique headers as the first option. If you can’t change the API definition quickly, you can add the tag to the ignore list "ignore_aggregations": ["request_id"]. This ensures that Tyk Pump does not aggregate per request_id.
Also, if you are not sure what’s causing the growth of the collection, you can also set time capping on these collections and monitor them.

Mongo Selective Pump

mongo-pump-selective pump stores individual requests per organization in collections called z_tyk_analyticz_{ORG ID}. Similar to the regular mongo pump, each request will be stored as a single document.

Pump Config

{
  ...
  "pumps": {
    "mongo-pump-selective": {
      "name": "mongo-pump-selective",
      "meta": {
        "mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}",
        "use_mixed_collection": true
      }
    }
  }
}
Capping

This collection should be capped due to the number of individual documents.

Dashboard Setting

As with the regular analytics, if you are using the Selective pump, you need to set use_sharded_keys: true in the dashboard config file so it will query z_tyk_analyticz_{ORG ID} collections to populate the Log Browser.

Uptime Tests Analytics
Pump Configuration
"uptime_pump_config": {
    "collection_name": "tyk_uptime_analytics",
    "mongo_url": "mongodb://tyk-mongo:27017/tyk_analytics",
  },
Tyk Dashboard Configuration
{
  ...
  "storage" : {
    ...
    "uptime": {
      "type": "postgres",
      "connection_string": "user=root password=admin database=tyk-demo-db host=tyk-db port=5432"
    }
  }
}
Tyk Gateway Setting

To enable the Uptime Pump, set enable_uptime_analytics to true in your Gateway configuration.
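
For example, in tyk.conf this flag typically lives in the uptime tests section (a sketch; surrounding fields omitted):

{
  ...
  "uptime_tests": {
    "config": {
      "enable_uptime_analytics": true
    }
  }
}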

SQL

When using one of our supported SQL platforms, Tyk offers 3 types of SQL pumps:

  1. Aggregated Analytics: sql_aggregate
  2. Raw Logs Analytics: sql
  3. Uptime Tests Analytics

In a production environment, we recommend sharding. You can configure your analytics in the following ways:

  • Sharding raw logs
  • Sharding aggregated analytics
  • Sharding uptime tests
SQL Pump

While aggregated analytics offer a decent amount of detail, there are use cases when you’d like to have access to all request details in your analytics. For that you can generate analytics based on raw logs. This is especially helpful because, once you have all the analytics generated from raw logs stored in your SQL database, you can build your own custom metrics, charts etc. outside of your Tyk Dashboard, which may bring more value to your product.

The pump needed for storing log data in the database is very similar to other pumps as well as the storage setting in the Tyk Dashboard config. It just requires the SQL name and database-specific configuration options.

SQL Pump Configuration

For storing logs into the tyk_analytics database table.

"sql": {
  "name": "sql",
  "meta": {
    "type": "postgres",
    "connection_string": "host=localhost port=5432 user=admin dbname=postgres_test password=test",
    "table_sharding": false
  }
}

type - The supported types are sqlite and postgres.

connection_string - Specifies the connection string to the database. For example, for sqlite it will be the path/name of the database, and for postgres, specifying the host, port, user, password, and dbname.

log_level - Specifies the SQL log verbosity. The possible values are: info, error and warning. By default, the value is silent, which means that it won’t log any SQL query.

table_sharding - Specifies if all the analytics records are going to be stored in one table or in multiple tables (one per day). By default, it is set to false.

If table_sharding is false, all the records are going to be stored in the tyk_analytics table. If set to true, daily records are stored in a tyk_analytics_YYYYMMDD date formatted table.

Dashboard Setting

In the API Usage Data > Log Browser screen you will see all the individual requests that the Gateway has recorded and saved in the tyk_analytics table using the sql pump.

Make sure you have configured the dashboard with your SQL database connection settings:

{
  ...
  "storage" : {
    ...
    "analytics": {
      "type": "postgres",
      "connection_string": "user=root password=admin host=tyk-db database=tyk-demo-db port=5432",
    }
  }
}
SQL Aggregate Pump

This is the default option offered by Tyk because it stores the most important analytics details, which will satisfy the needs of most of our clients. This saves database space and makes reporting faster while consuming fewer resources.

SQL Aggregate Pump Configuration

For storing logs into the tyk_aggregated database table.

"sql_aggregate": {
  "name": "sql_aggregate",
  "meta": {
    "type": "postgres",
    "connection_string": "host=localhost port=5432 user=admin dbname=postgres_test password=test",
    "table_sharding": true
  }
}

type - The supported types are sqlite and postgres.

connection_string - Specifies the connection string to the database. For example, for sqlite it will be the path/name of the database, and for postgres, specifying the host, port, user, password, and dbname.

log_level - Specifies the SQL log verbosity. The possible values are: info, error, and warning. By default, the value is silent, which means that it won’t log any SQL query.

track_all_paths - Specifies if it should store aggregated data for all the endpoints. By default, it is set to false, which means that it only stores aggregated data for tracked endpoints.

ignore_tag_prefix_list - Specifies prefixes of tags that should be ignored.

table_sharding - Specifies if all the analytics records are going to be stored in one table or in multiple tables (one per day). By default, it is set to false.

If table_sharding is false, all the records are going to be stored in the tyk_aggregated table. If set to true, daily records are stored in a tyk_aggregated_YYYYMMDD date formatted table.

Dashboard Setting

This pump supplies the data for the following subcategories of API Usage Data:

  • Activity by API screen
  • Activity by Key screen
  • Errors screen

As with the regular analytics, because Tyk gives you the option to store and display aggregated analytics across all organizations or separately per organization, you need to configure the Tyk Dashboard with the setting that matches the way you set the pump to store the data in SQL; otherwise, you won’t see the data in the Dashboard.

  1. The enable_aggregate_lookups: true field must be set in the Dashboard configuration file, in order for the Dashboard to query and display the aggregated data that sql-aggregate saved to the database.

  2. Make sure you have configured the dashboard with your SQL database connection settings:

{
  ...
  "storage": {
    ...
    "analytics": {
      "type": "postgres",
      "connection_string": "user=root password=admin host=tyk-db database=tyk-demo-db port=5432",
    }
  }
}
SQL Uptime Pump

In an uptime_pump_config section, you can configure a SQL uptime pump. To do that, you need to add the field uptime_type with the value sql.

"uptime_pump_config": {
  "uptime_type": "sql",
  "type": "postgres",
  "connection_string": "host=sql_host port=sql_port user=sql_usr dbname=dbname password=sql_pw",
  "table_sharding": false
},

type - The supported types are sqlite and postgres.

connection_string - Specifies the connection string to the database. For example, for sqlite it will be the path/name of the database, and for postgres, specifying the host, port, user, password, and dbname.

table_sharding - Specifies if all the analytics records will be stored in one table or multiple tables (one per day). By default, it is set to false.

If table_sharding is false, all the records will be stored in the tyk_analytics table. If set to true, daily records are stored in a tyk_analytics_YYYYMMDD date formatted table.

Tyk Dashboard Configuration

You need to set enable_aggregate_lookups to false.

Then add your SQL database connection settings:

{
  ...
  "storage" : {
    ...
    "analytics": {
      "type": "postgres",
      "connection_string": "user=root password=admin host=tyk-db database=tyk-demo-db port=5432"
    }
  }
}
Uptime Tests Analytics
Tyk Pump Configuration

For storing uptime test analytics in the database.

"uptime_pump_config": {
  "uptime_type": "sql",
  "type": "postgres",
  "connection_string": "host=sql_host port=sql_port user=sql_usr database=tyk-demo-db password=sql_pw",
},
Tyk Dashboard Configuration
{
  ...
  "storage" : {
    ...
    "uptime": {
      "type": "postgres",
      "connection_string": "user=root password=admin database=tyk-demo-db host=tyk-db port=5432"
    }
  }
}
Tyk Gateway Setting

To enable the Uptime Pump, set enable_uptime_analytics to true in your Gateway configuration.

Sharding

In a production environment, we recommend the following setup.

By default, all logs/analytics are stored in one database table, which makes CRUD operations on the dataset slow and less performant as the table grows significantly.

To improve the data maintenance processes, since querying or removing data from one single table is slow, we have added a new option (table_sharding) so that the data can be stored daily (one table of data per day). This automatically makes querying or removing sets of data easier, whether dropping whole tables to remove logs/analytics, or reading multiple tables based on the selected period.

Tyk Pump Configuration
"sql": {
  ...
  "meta": {
    ...
    "table_sharding": true
  }
},
"sql_aggregate" : {
  ...
  "meta": {
    ...
    "table_sharding": true
  }
},
"uptime_pump_config": {
  ...
  "table_sharding": true
},
Tyk Dashboard Configuration
  "storage": {
    "main": {
      ...
      "table_sharding": true
    },
    "analytics": {
      ...
      "table_sharding": true
    },
    "logs": {
      ...
      "table_sharding": true
    },
    "uptime": {
      ...
      "table_sharding": true
    }
  },

Graph Pump setup

MongoDB

Starting with version 1.7.0 of Tyk Pump and version 4.3.0 of Tyk Gateway, it is possible to configure the Graph MongoDB Pump. Once configured, the pump enables support for GraphQL-specific metrics. The GraphQL-specific metrics currently supported include (more to be added in future versions):

  • Types Requested.
  • Fields requested for each type.
  • Error Information (not limited to HTTP status codes).
Setting up Graph MongoDB Pump
  1. Set enable_analytics to true in your tyk.conf.
  2. Enable Detailed recording by setting enable_detailed_recording in your tyk.conf to true. This is needed so that the GraphQL information can be parsed from the request body and response.

Note

This will enable detailed recording globally, across all APIs. This means that the behavior of individual APIs that have this configuration parameter set will be overridden. The Gateway must be restarted after updating this configuration parameter.

  3. Set up your Mongo collection_name.
  4. Add your Graph MongoDB Pump configuration to the list of pumps in your pump.conf (pump configuration file).

Sample setup:

{
  ...
  "pumps": {
    ...
    "mongo-graph": {
      "meta": {
        "collection_name": "tyk_graph_analytics",
        "mongo_url": "mongodb://mongo/tyk_graph_analytics"
      }
    }
  }
}
Current limitations

The Graph MongoDB Pump is being improved upon regularly, and as such there are a few things to note about its current behavior:

  • Size of your records - due to the detailed recording needed for this Pump to function correctly, it is important to note that your records, and consequently your MongoDB storage, could increase in size rather quickly.
  • Subgraph requests are not recorded - requests to Tyk-controlled subgraphs from supergraphs in a federation setup are currently not recorded by the Graph MongoDB Pump; only the supergraph requests are handled.
  • UDG requests are recorded, but subsequent requests to data sources are currently ignored.
  • Currently, Graph MongoDB Pump data cannot be used in the Tyk Dashboard; the data is only stored for recording purposes at the moment and can be exported to external tools for further analysis.

SQL

Starting with version 1.8.0 of Tyk Pump and version 5.0.0 of the Tyk Gateway, it is possible to export GraphQL analytics to an SQL database.

Setting up Graph SQL Pump

The Graph SQL pump currently includes information (per request) like:

  • Types Requested
  • Fields requested for each type
  • Error Information
  • Root Operations Requested.

Setup steps include:

  1. Set enable_analytics to true in your tyk.conf.
  2. Enable Detailed recording by setting enable_detailed_recording in your tyk.conf to true. This is needed so that the GraphQL information can be parsed from the request body and response.

Note

This will enable detailed recording globally, across all APIs. This means that the behavior of individual APIs that have this configuration parameter set will be overridden. The Gateway must be restarted after updating this configuration parameter.

  3. Configure your pump.conf using this sample configuration:
"sql-graph": {
      "meta": {
        "type": "postgres",
        "table_name": "tyk_analytics_graph",
        "connection_string": "host=localhost user=postgres password=password dbname=postgres",
        "table_sharding": false
      }
},

The Graph SQL pump currently supports postgres, sqlite and mysql databases. The table_name refers to the table that will be created in the case of unsharded setups, and the prefix that will be used for sharded setups, e.g. tyk_analytics_graph_20230327.

The Graph SQL pump currently has the same limitations as the Graph Mongo Pump.

Setting up Graph SQL Aggregate Pump

The sql-graph-aggregate pump can be configured similarly to the Graph SQL pump:

 "sql-graph-aggregate": {
    "meta": {
    "type": "postgres",
    "connection_string": "host=localhost port=5432 user=postgres dbname=postgres password=password",
    "table_sharding": false
  }
}

External Data Stores

The Tyk Pump component takes all of the analytics in Tyk and moves the data from the Gateway into your Dashboard. It is possible to set it up to send the analytics data it finds to other data stores. Currently we support the data stores listed below.

See the Tyk Pump Configuration for more details.

CSV

Tyk Pump can be configured to create or modify a CSV file to track API Analytics.

JSON / Conf file

Add the following configuration fields to the pumps section within your pump.conf file:

{
  "csv": 
  {
    "type": "csv",
    "meta": {
      "csv_dir": "./your_directory_here"
    }
  }
}

Environment variables

TYK_PMP_PUMPS_CSV_TYPE=csv
TYK_PMP_PUMPS_CSV_META_CSVDIR=./your_directory_here

Datadog

The Tyk Pump can be configured to send your API traffic analytics to Datadog, with which you can build dashboards with various metrics based on your API traffic in Tyk.

Datadog dashboard example

We created a default Tyk dashboard canvas to give our users an easier starting point. You can find it in the Datadog portal, under the Dashboards > Lists section (https://app.datadoghq.com/dashboard/lists), and it is called Tyk Analytics Canvas. To use this dashboard you will need to make sure that your Datadog agent deployment has the following tag env:tyk-demo-env and that your Tyk Pump configuration has dogstatsd.meta.namespace set to pump. You can also import it from Datadog’s official GitHub repo and change those values in the dashboard itself to visualize your analytics data as it flows into Datadog.

Sample Datadog dashboard

Prerequisites

How it works

When running the Datadog Agent, DogStatsD gets the request_time metric from your Tyk Pump in real time, per request, so you can understand the usage of your APIs and get the flexibility of aggregating by various parameters such as date, version, returned code, method etc.

Tyk Pump configuration

Below is a sample DogStatsD section from a Tyk pump.conf file:

"dogstatsd": {
  "type": "dogstatsd",
  "meta": {
    "address": "dd-agent:8126",
    "namespace": "tyk",
    "async_uds": true,
    "async_uds_write_timeout_seconds": 2,
    "buffered": true,
    "buffered_max_messages": 32,
    "sample_rate": 0.9999999999,
    "tags": [
      "method",
      "response_code",
      "api_version",
      "api_name",
      "api_id",
      "org_id",
      "tracked",
      "path",
      "oauth_id"
    ]
  }
},
Field descriptions
  • address: address of the datadog agent including host & port
  • namespace: prefix for your metrics to datadog
  • async_uds: Enable async UDS over UDP
  • async_uds_write_timeout_seconds: Integer write timeout in seconds if async_uds: true
  • buffered: Enable buffering of messages
  • buffered_max_messages: Max messages in single datagram if buffered: true. Default 16
  • sample_rate: default 1 which equates to 100% of requests. To sample at 50%, set to 0.5
  • tags: List of tags to be added to the metric. The possible options are listed in the example above

If no tags are specified, the fallback behavior is to use these tags:

  • path
  • method
  • response_code
  • api_version
  • api_name
  • api_id
  • org_id
  • tracked
  • oauth_id

Note that this configuration can generate significant data due to the unbound nature of the path tag.
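
If that cardinality becomes a problem, one mitigation is to simply omit path from the tags list; a trimmed sketch of the same pump section:

"dogstatsd": {
  "type": "dogstatsd",
  "meta": {
    "address": "dd-agent:8126",
    "namespace": "tyk",
    "tags": [
      "method",
      "response_code",
      "api_name",
      "api_id",
      "org_id"
    ]
  }
},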

On startup, you should see the loaded configs when initialising the DogStatsD pump:

[May 10 15:23:44]  INFO dogstatsd: initializing pump
[May 10 15:23:44]  INFO dogstatsd: namespace: pump.
[May 10 15:23:44]  INFO dogstatsd: sample_rate: 50%
[May 10 15:23:44]  INFO dogstatsd: buffered: true, max_messages: 32
[May 10 15:23:44]  INFO dogstatsd: async_uds: true, write_timeout: 2s

Elasticsearch

Elasticsearch is a highly scalable and distributed search engine that is designed to handle large amounts of data.

JSON / Conf

Add the following configuration fields to the pumps section within your pump.conf file:

{
  "pumps": {
      "elasticsearch": {
        "type": "elasticsearch",
        "meta": {
          "index_name": "tyk_analytics",
          "elasticsearch_url": "http://localhost:9200",
          "enable_sniffing": false,
          "document_type": "tyk_analytics",
          "rolling_index": false,
          "extended_stats": false,
          "version": "6"
        }
      }
    }
}

Configuration fields

  • index_name: The name of the index that all the analytics data will be placed in. Defaults to tyk_analytics
  • elasticsearch_url: If sniffing is disabled, the URL that all data will be sent to. Defaults to http://localhost:9200
  • enable_sniffing: If sniffing is enabled, the elasticsearch_url will be used to make a request to get a list of all the nodes in the cluster, the returned addresses will then be used. Defaults to false
  • document_type: The type of the document that is created in Elasticsearch. Defaults to tyk_analytics
  • rolling_index: Appends the date to the end of the index name, so each days data is split into a different index name. For example, tyk_analytics-2016.02.28. Defaults to false.
  • extended_stats: If set to true, it will include the following additional fields: Raw Request, Raw Response and User Agent.
  • version: Specifies the ES version. Use 3 for ES 3.X, 5 for ES 5.X, 6 for ES 6.X, 7 for ES 7.X. Defaults to 3.
  • disable_bulk: Disable batch writing. Defaults to false.
  • bulk_config: Batch writing trigger configuration. The options are OR’d with each other:
    • workers: Number of workers. Defaults to 1.
    • flush_interval: Specifies the time in seconds to flush the data and send it to ES. Default is disabled.
    • bulk_actions: Specifies the number of requests needed to flush the data and send it to ES. Defaults to 1000 requests. If needed, it can be disabled with -1.
    • bulk_size: Specifies the size (in bytes) needed to flush the data and send it to ES. Defaults to 5MB. Can be disabled with -1.
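
For illustration, here is a sketch of an Elasticsearch pump section combining these batch-writing options (the values are arbitrary examples, not recommendations):

"elasticsearch": {
  "type": "elasticsearch",
  "meta": {
    "index_name": "tyk_analytics",
    "elasticsearch_url": "http://localhost:9200",
    "disable_bulk": false,
    "bulk_config": {
      "workers": 2,
      "flush_interval": 60,
      "bulk_actions": 500,
      "bulk_size": 5242880
    }
  }
}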

Environment variables

TYK_PMP_PUMPS_ELASTICSEARCH_TYPE=elasticsearch
TYK_PMP_PUMPS_ELASTICSEARCH_META_INDEXNAME=tyk_analytics
TYK_PMP_PUMPS_ELASTICSEARCH_META_ELASTICSEARCHURL=http://localhost:9200
TYK_PMP_PUMPS_ELASTICSEARCH_META_ENABLESNIFFING=false
TYK_PMP_PUMPS_ELASTICSEARCH_META_DOCUMENTTYPE=tyk_analytics
TYK_PMP_PUMPS_ELASTICSEARCH_META_ROLLINGINDEX=false
TYK_PMP_PUMPS_ELASTICSEARCH_META_EXTENDEDSTATISTICS=false
TYK_PMP_PUMPS_ELASTICSEARCH_META_VERSION=5
TYK_PMP_PUMPS_ELASTICSEARCH_META_BULKCONFIG_WORKERS=2
TYK_PMP_PUMPS_ELASTICSEARCH_META_BULKCONFIG_FLUSHINTERVAL=60

Moesif

This is a step by step guide to setting up the Moesif API Analytics and Monetization platform to understand customer API usage and set up usage-based billing.

We also have a blog post which highlights how Tyk and Moesif work together.

The assumptions are that you have Docker installed and Tyk Self-Managed already running. See the Tyk Pump Configuration for more details.

Overview

With the Moesif Tyk plugin, your API logs are sent to Moesif asynchronously to provide analytics on customer API usage along with your API payloads like JSON and XML. This plugin also enables you to monetize your API with billing meters and provide a self-service onboarding experience. Moesif also collects information such as the authenticated user (AliasId or OAuthId) to identify customers using your API. An overview of how Moesif and Tyk work together is available here.

Steps for Configuration

  1. Get a Moesif Application Id

    Go to www.moesif.com and sign up for a free account. Application Ids are write-only API keys specific to an application in Moesif such as “Development” or “Production”. You can always create more applications in Moesif.

  2. Enable Moesif backend in Tyk Pump

    Add Moesif as an analytics backend, along with the Moesif Application Id you obtained in the last step, to your Tyk Pump Configuration.

JSON / Conf File

{
    "pumps": {
        "moesif": {
            "name": "moesif",
            "meta": {
                "application_id": "Your Moesif Application Id"
            }
        }
    }
}

Env Variables:

TYK_PMP_PUMPS_MOESIF_TYPE=moesif
TYK_PMP_PUMPS_MOESIF_META_APPLICATIONID=your_moesif_application_id
  3. Ensure analytics is enabled

If you want to log HTTP headers and body, ensure the detailed analytics recording flag is set to true in your Tyk Gateway conf:

JSON / Conf File

{
    "enable_analytics" : true,
    "analytics_config": {
      "enable_detailed_recording": true
    }
}

Env Variables:

TYK_GW_ENABLEANALYTICS=true
TYK_GW_ANALYTICSCONFIG_ENABLEDETAILEDRECORDING=true

Note

This will enable detailed recording globally, across all APIs. This means that the behavior of individual APIs that have this configuration parameter set will be overridden. The Gateway must be restarted after updating this configuration parameter.

  4. Restart Tyk Pump to pick up the Moesif config

Once your config changes are done, you need to restart your Tyk Pump and Tyk Gateway instances (if you’ve modified the Tyk Gateway config). If you are running Tyk Pump in Docker:

$ docker restart tyk-pump

  5. PROFIT!

You can now make a few API calls and verify they show up in Moesif.

$ curl localhost:8080

Step5

The Moesif Tyk integration automatically maps a Tyk Token Alias to a user id in Moesif. With a Moesif SDK, you can store additional customer demographics to break down API usage by customer email, company industry, and more.

Configuration options

The Tyk Pump for Moesif has a few configuration options that can be set in your pump configuration:

| Parameter | Required | Description | Environment Variable |
|-----------|----------|-------------|----------------------|
| application_id | required | Moesif Application Id. Multiple Tyk api_id’s will be logged under the same app id. | TYK_PMP_PUMPS_MOESIF_META_APPLICATIONID |
| request_header_masks | optional | Mask a specific request header field. Type: String Array [] string | TYK_PMP_PUMPS_MOESIF_META_REQUESTHEADERMASKS |
| request_body_masks | optional | Mask a specific request body field. Type: String Array [] string | TYK_PMP_PUMPS_MOESIF_META_REQUESTBODYMASKS |
| response_header_masks | optional | Mask a specific response header field. Type: String Array [] string | TYK_PMP_PUMPS_MOESIF_META_RESPONSEHEADERMASKS |
| response_body_masks | optional | Mask a specific response body field. Type: String Array [] string | TYK_PMP_PUMPS_MOESIF_META_RESPONSEBODYMASKS |
| disable_capture_request_body | optional | Disable logging of request body. Type: Boolean. Default value is false. | TYK_PMP_PUMPS_MOESIF_META_DISABLECAPTUREREQUESTBODY |
| disable_capture_response_body | optional | Disable logging of response body. Type: Boolean. Default value is false. | TYK_PMP_PUMPS_MOESIF_META_DISABLECAPTURERESPONSEBODY |
| user_id_header | optional | Field name to identify User from a request or response header. Type: String. Default maps to the token alias. | TYK_PMP_PUMPS_MOESIF_META_USERIDHEADER |
| company_id_header | optional | Field name to identify Company (Account) from a request or response header. Type: String | TYK_PMP_PUMPS_MOESIF_META_COMPANYIDHEADER |

Identifying users

By default, the plugin will collect the authenticated user (AliasId or OAuthId) to identify the customer. This can be overridden by setting the user_id_header to a header that contains your API user/consumer id, such as X-Consumer-Id. You can also set the company_id_header, which contains the company to link the user to. See the Moesif docs on identifying customers.
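
For example, a sketch of a moesif pump section using these overrides (X-Consumer-Id is mentioned above; X-Company-Id is a hypothetical header name used for illustration):

"moesif": {
  "name": "moesif",
  "meta": {
    "application_id": "Your Moesif Application Id",
    "user_id_header": "X-Consumer-Id",
    "company_id_header": "X-Company-Id"
  }
}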

Splunk

This is a step by step guide to setting up Splunk to receive logs from the Tyk Pump.

The assumptions are that you have Docker installed and Tyk Pro Self-Managed already running.

Steps for Configuration

  1. Run Splunk using Docker

    Assuming you have Docker installed locally, run the following from a terminal:

    $ docker run \
    -p 8000:8000 \
    -p 8088:8088 \
    -v splunk-data:/opt/splunk/var \
    -v splunk-data:/opt/splunk/etc \
    -e SPLUNK_START_ARGS=--accept-license \
    -e SPLUNK_PASSWORD=mypassword \
    splunk/splunk:latest
    
  2. Setup a collector in Splunk

    A) Visit http://localhost:8000 and log into the Splunk Dashboard using the username admin and the password we set in the Docker run command, mypassword

    B) Create a new Data input

    Step1

    C) Select HTTP Event Collector -> Add New

    Step2

    D) Set the name to “tyk” and then leave everything else as default

    Step2b

    Grab your token at the end page:

    Step3

  3. Add the Splunk bit to pump.conf

    Edit your pump’s pump.conf and add this bit to the pumps section, inserting the token from the previous step into the collector_token field:

    {
        "pumps": {
            "splunk": {
                "type": "splunk",
                "meta": {
                    "collector_token": "<token>",
                    "collector_url": "https://localhost:8088/services/collector/event",
                    "ssl_insecure_skip_verify": true
                }
            }
        }
    }
    
    Note

    Make sure that the localhost value matches with your setup. Head on over to our community forum (https://community.tyk.io/) to ask for help if you are stuck here.
    
  4. Restart Tyk Pump to pick up the Splunk config

    If you are running Tyk Pump in Docker:

    $ docker restart tyk-pump

  5. PROFIT!

    Let’s make a few API calls against Tyk, and see if they flow into Splunk

    $ curl localhost:8080/loan-service-api/
    
    {
        "error": "Key not authorized"
    }
    

    Success:

    Step4

Logzio

Logz.io is a cloud-based log management and analytics platform that provides log management built on Elasticsearch, Logstash and Kibana.

JSON / Conf file

Add the following configuration fields to the pumps section within your pump.conf file:

{
  "pumps"
  {
    "logzio": {
        "type": "logzio",
        "meta": {
          "token": "<YOUR-LOGZ.IO-TOKEN>"
        }
    }
  }
}

Environment variables

TYK_PMP_PUMPS_LOGZIO_TYPE=logzio
TYK_PMP_PUMPS_LOGZIO_META_TOKEN="{YOUR-LOGZIO-TOKEN}"

Advanced configuration fields

  • meta.url: Use if you do not want to use the default Logz.io URL, for example when using a proxy. The default url is https://listener.logz.io:8071.
  • meta.queue_dir: The directory for the queue.
  • meta.drain_duration: This sets the drain duration (when to flush logs on the disk). The default value is 3s.
  • meta.disk_threshold: Set the disk queue threshold. Once the threshold is crossed the sender will not enqueue the received logs. The default value is 98 (percentage of disk).
  • meta.check_disk_space: Set the sender to check if it crosses the maximum allowed disk usage. The default value is true.
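
Putting these together, a sketch of a logzio pump section with the advanced fields set (the values shown are the documented defaults, except the queue directory, which is an arbitrary example path):

"logzio": {
  "type": "logzio",
  "meta": {
    "token": "<YOUR-LOGZ.IO-TOKEN>",
    "url": "https://listener.logz.io:8071",
    "queue_dir": "/tmp/logzio-queue",
    "drain_duration": "3s",
    "disk_threshold": 98,
    "check_disk_space": true
  }
}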

Tyk Analytics Record Fields

Below is a detailed list of each field contained within our Tyk Analytics Record that is sent from Tyk Pump.

Method

Request method.

Example: GET, POST.

Host

Request Host header.

Remarks: Includes host and optional port number of the server to which the request was sent. Example: tyk.io, or tyk.io:8080 if port is included.

Path

Request path.

Remarks: Displayed in decoded form.
Example: /foo/bar for /foo%2Fbar or /foo/bar.

RawPath

Request path.

Remarks: Original request path without changes, just decoded.
Example: /foo/bar for /foo%2Fbar or /foo/bar.

ContentLength

Request Content-Length header.

Remarks: The number of bytes in the request body.
Example: 10 for request body 0123456789.

UserAgent

Request User-Agent header.

Example: curl/7.86.0.

Day

Request day.

Remarks: Based on TimeStamp field.
Example: 16 for 2022-11-16T03:01:54Z.

Month

Request month.

Remarks: Based on TimeStamp field.
Example: 11 for 2022-11-16T03:01:54Z.

Year

Request year.

Remarks: Based on TimeStamp field. Example: 2022 for 2022-11-16T03:01:54Z.

Hour

Request hour.

Remarks: Based on TimeStamp field.
Example: 3 for 2022-11-16T03:01:54Z.

ResponseCode

Response code.

Remarks: Only contains the integer element of the response code. Can be generated by either the gateway or upstream server, depending on how the request is handled.
Example: 200 for 200 OK.

APIKey

Request authentication key.

Remarks: Authentication key, as provided in the request. If no API key is provided then the gateway will substitute a default value.
Example: Unhashed auth_key, hashed 6129dc1e8b64c6b4, or 00000000 if no authentication provided.

TimeStamp

Request timestamp.

Remarks: Generated by the gateway, based on the time it receives the request from the client.
Example: 2022-11-16T03:01:54.648+00:00.

APIVersion

Version of API Definition requested.

Remarks: Based on the version configuration of the requested API definition. If the API is unversioned then the value is “Not Versioned”.
Example: Could be an alphanumeric value such as 1 or b. Is Not Versioned if not versioned.

APIName

Name of API Definition requested.

Example: Foo API.

APIID

Id of API Definition requested.

Example: 727dad853a8a45f64ab981154d1ffdad.

OrgID

Organization Id of API Definition requested.

Example: 5e9d9544a1dcd60001d0ed20.

OauthID

Id of OAuth client.

Remarks: Value is empty string if not using OAuth, or OAuth client not present.
Example: my-oauth-client-id.

RequestTime

Duration of upstream roundtrip.

Remarks: Equal to value of Latency.Total field. Example: 3 for a 3ms roundtrip.

RawRequest

Raw HTTP request.

Remarks: Base64 encoded copy of the request sent from the gateway to the upstream server.
Example: R0VUIC9nZXQgSFRUUC8xLjEKSG9zdDogdHlrLmlv.

RawResponse

Raw HTTP response.

Remarks: Base64 encoded copy of the response sent from the gateway to the client.
Example: SFRUUC8xLjEgMjAwIE9LCkNvbnRlbnQtTGVuZ3RoOiAxOQpEYXRlOiBXZWQsIDE2IE5vdiAyMDIyIDA2OjIxOjE2IEdNVApTZXJ2ZXI6IGd1bmljb3JuLzE5LjkuMAoKewogICJmb28iOiAiYmFyIgp9Cg==.

IPAddress

Client IP address.

Remarks: Taken from either X-Real-IP or X-Forwarded-For request headers, if set. Otherwise, determined by gateway based on request.
Example: 172.18.0.1.

Geo

Client geolocation data.

Remarks: Calculated using MaxMind database, based on client IP address.
Example: {"country":{"isocode":"SG"},"city":{"geonameid":0,"names":{}},"location":{"latitude":0,"longitude":0,"timezone":""}}.

Network

Network statistics.

Remarks: Not currently used.

Latency

Latency statistics

Remarks: Contains two fields; upstream is the roundtrip duration between the gateway sending the request to the upstream server and it receiving a response. total is the upstream value plus additional gateway-side functionality such as processing analytics data.
Example: {"total":3,"upstream":3}.

Note

We record the round trip time of the call from the gateway’s reverse proxy. So what you get is the sum of leaving Tyk -> upstream -> response received back at Tyk.

Tags

Session context tags.

Remarks: Can contain many tags which refer to many things, such as the gateway, API key, organization, API definition etc.
Example: ["key-00000000","org-5e9d9544a1dcd60001d0ed20","api-accbdd1b89e84ec97f4f16d4e3197d5c"].

Alias

Session alias.

Remarks: Alias of the context authenticated identity. Blank if no alias set or request is unauthenticated.
Example: my-key-alias.

TrackPath

Tracked endpoint flag.

Remarks: Value is true if the requested endpoint is configured to be tracked, otherwise false.
Example: true or false.

ExpireAt

Future expiry date.

Remarks: Can be used to implement automated data expiry, if supported by storage.
Example: 2022-11-23T07:26:25.762+00:00.

Monitor your APIs with Prometheus

Your Tyk Pump can expose Prometheus metrics for the requests served by your Tyk Gateway. This is helpful if you want to track how often your APIs are being called and how they are performing. Tyk collects latency data of how long your services take to respond to requests, how often your services are being called and what status code they return.

We have created a demo project in GitHub if you want to see this setup in action.

Prerequisites

  • A Tyk installation (either Self-Managed or Open Source Gateway)
  • Tyk Pump 1.6 or higher

Configure Tyk Pump to expose Prometheus metrics

Prometheus collects metrics from targets by scraping metrics HTTP endpoints. To expose Tyk’s metrics in the Prometheus format, you need to add the following lines to your Tyk Pump configuration file pump.conf:

Host

"prometheus": {
 "type": "prometheus",
 "meta": {
   "listen_address": "<tyk-pump>:9090",
   "path": "/metrics",
   "custom_metrics":[
     {
         "name":"tyk_http_requests_total",
         "description":"Total of API requests",
         "metric_type":"counter",
         "labels":["response_code","api_name","method","api_key","alias","path"]
     },
     {
         "name":"tyk_http_latency",
         "description":"Latency of API requests",
         "metric_type":"histogram",
         "labels":["type","response_code","api_name","method","api_key","alias","path"]
     }
 ]
 }
}

Replace <tyk-pump> with your host name or IP address.

Docker

"prometheus": {
 "type": "prometheus",
 "meta": {
   "listen_address": ":9090",
   "path": "/metrics",
   "custom_metrics":[
     {
         "name":"tyk_http_requests_total",
         "description":"Total of API requests",
         "metric_type":"counter",
         "labels":["response_code","api_name","method","api_key","alias","path"]
     },
     {
         "name":"tyk_http_latency",
         "description":"Latency of API requests",
         "metric_type":"histogram",
         "labels":["type","response_code","api_name","method","api_key","alias","path"]
     }
 ]
 }
}

Port 9090 also needs to be exposed by Docker in addition to the port used for health checks (here 8083), e.g. with Docker Compose:

tyk-pump:
   image: tykio/tyk-pump-docker-pub:${PUMP_VERSION}
   ports:
   - 8083:8083
   - 9090:9090

Restart your Pump to apply the configuration change.

Verify that the metrics are being exposed by calling the metrics endpoint http://<tyk-pump>:9090/metrics from your browser.

Configure Prometheus to scrape the metrics endpoint

Prometheus is configured via a configuration file where you can define the metrics endpoint Prometheus will scrape periodically.

Here’s an example configuration scraping Tyk Pump metric endpoints:

Host

global:
 scrape_interval:     15s
 evaluation_interval: 15s
 
scrape_configs:
 - job_name: tyk
   static_configs:
     - targets: ['tyk-pump:9090']

Docker

global:
 scrape_interval:     15s
 evaluation_interval: 15s
 
scrape_configs:
 - job_name: tyk
   static_configs:
     - targets: ['host.docker.internal:9090']
  1. Restart your Prometheus instance after any configuration change.
  2. In Prometheus, under Status > Targets, you can see that Prometheus is able to scrape the metrics successfully: the state is UP.

Prometheus status

Exploring your metrics in Grafana

Before trying out, make sure to generate traffic by calling your APIs. You will find a couple of useful queries in our Tyk Pump GitHub repo based on the metrics exposed by Tyk. These will demonstrate which metric types are exported and how you can customize them.

You also need to make sure that Grafana is connected to your Prometheus server. This can be configured under Configuration > Data sources.

Grafana Configuration with Prometheus

Useful queries

Here are some useful queries to help you monitor the health of your APIs:

Upstream time across all services

Tyk collects latency data of how long your upstream services take to respond to requests. This data can be used to configure an alert if the latency goes beyond a certain threshold. This query calculates the 95th percentile of the total request latency of all the upstream services. To run the query:

histogram_quantile(0.95, sum(rate(tyk_http_latency_bucket[1m])) by (le))

Upstream Time Query output

Upstream time per API

This query calculates the 90th percentile of the request latency of upstream services for the selected API. To run this query:

histogram_quantile(0.90, sum(rate(tyk_http_latency_bucket{api_name="<api name>"}[1m])) by (le,api_name))

Replace <api name> with the name of your API for this query.

Request rate

Track the request rate of your services:

sum (rate(tyk_http_requests_total[1m]))

Request Rate per API

Track the request rate of your services for the selected API:

sum (rate(tyk_http_requests_total{api_name="<api name>"}[1m]))

Replace <api name> with the name of your API for this query.

Error Rates

Track the error rate your services are serving:

sum (rate(tyk_http_requests_total{response_code =~"5.."}[1m]))

Error rates per API

Track the error rate your services are serving for the selected API:

sum (rate(tyk_http_requests_total{response_code =~"5..", api_name="<api name>"}[1m]))

Replace <api name> with the name of your API for this query.

Setup Prometheus Pump

We’ll show you how to set up Tyk Pump for Prometheus Service Discovery.

pump-prometheus

Integrate with Prometheus using Prometheus Operator

Steps for Configuration:

  1. Setup Prometheus

    Using the prometheus-community/kube-prometheus-stack chart

    In this example, we use kube-prometheus-stack, which installs a collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.

    helm install prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring --create-namespace
    

    This is a useful stack where you can get Prometheus, the Prometheus Operator, and Grafana all deployed and configured in one go.

  2. Install Tyk Pump with PodMonitor

    If you have the Prometheus Operator enabled on the cluster, it will look for PodMonitor or ServiceMonitor resources and scrape from the specified port. The only thing you would need to modify here is the helm release name for the Prometheus Operator.

    You can also customize the Prometheus Custom Metrics based on your analytics needs. We are using tyk_http_requests_total and tyk_http_latency, described here, for illustration:

    NAMESPACE=tyk-oss
    APISecret=foo
    REDIS_BITNAMI_CHART_VERSION=19.0.2
    PromOperator_Release=prometheus-stack
    Prometheus_Custom_Metrics='[{"name":"tyk_http_requests_total"\,"description":"Total of API requests"\,"metric_type":"counter"\,"labels":["response_code"\,"api_name"\,"method"\,"api_key"\,"alias"\,"path"]}\,          {              "name":"tyk_http_latency"\,              "description":"Latency of API requests"\,              "metric_type":"histogram"\,              "labels":["type"\,"response_code"\,"api_name"\,"method"\,"api_key"\,"alias"\,"path"]          }]'
    
    helm upgrade tyk-redis oci://registry-1.docker.io/bitnamicharts/redis -n $NAMESPACE --create-namespace --install --version $REDIS_BITNAMI_CHART_VERSION
    
    helm upgrade tyk-oss tyk-helm/tyk-oss -n $NAMESPACE --create-namespace \
    --install \
    --set global.secrets.APISecret="$APISecret" \
    --set global.redis.addrs="{tyk-redis-master.$NAMESPACE.svc.cluster.local:6379}" \
    --set global.redis.passSecret.name=tyk-redis \
    --set global.redis.passSecret.keyName=redis-password \
    --set global.components.pump=true \
    --set "tyk-pump.pump.backend={prometheus}" \
    --set tyk-pump.pump.prometheusPump.customMetrics=$Prometheus_Custom_Metrics \
    --set tyk-pump.pump.prometheusPump.prometheusOperator.enabled=true \
    --set tyk-pump.pump.prometheusPump.prometheusOperator.podMonitorSelector.release=$PromOperator_Release
    

    Note

    Please make sure you are installing Redis versions that are supported by Tyk. Please refer to Tyk docs to get list of supported versions.

    Note

    For Custom Metrics, commas are escaped to be used in the helm --set command. You can remove the backslashes in front of the commas if you set it in values.yaml. We have included an example in the default values.yaml comments section.

  3. Verification

    When successfully configured, you should see the following messages in the pump log:

    │ time="Jun 26 13:11:01" level=info msg="Starting prometheus listener on::9090" prefix=prometheus-pump                                                  │
    │ time="Jun 26 13:11:01" level=info msg="Prometheus Pump Initialized" prefix=prometheus-pump                                                            │
    │ time="Jun 26 13:11:01" level=info msg="Init Pump: PROMETHEUS" prefix=main
    

    On the Prometheus Dashboard, you can see the Pump listed as one of the targets and that Prometheus is successfully scraping from it.

    pump-prometheus

    You can check our guide on Monitoring APIs with Prometheus for a list of useful queries you can set up and use.

    e.g. The custom metrics tyk_http_requests_total can be retrieved:

    pump-prometheus

    pump-prometheus

Integrate with Prometheus using annotations

Steps for Configuration:

  1. Setup Prometheus

    Using the prometheus-community/prometheus chart

    Alternatively, if you are not using the Prometheus Operator, please check how your Prometheus can support service discovery. Let’s say you’re using the prometheus-community/prometheus chart, which configures Prometheus to scrape from any Pods with the following annotations:

    metadata:
      annotations:
        prometheus.io/scrape: "true"
        prometheus.io/path: /metrics
        prometheus.io/port: "9090"
    

    To install Prometheus, run

    helm install prometheus prometheus-community/prometheus -n monitoring --create-namespace
    
  2. Install Tyk Pump with prometheus annotations

    NAMESPACE=tyk-oss
    APISecret=foo
    REDIS_BITNAMI_CHART_VERSION=19.0.2
    PromOperator_Release=prometheus-stack
    Prometheus_Custom_Metrics='[{"name":"tyk_http_requests_total"\,"description":"Total of API requests"\,"metric_type":"counter"\,"labels":["response_code"\,"api_name"\,"method"\,"api_key"\,"alias"\,"path"]}\,          {              "name":"tyk_http_latency"\,              "description":"Latency of API requests"\,              "metric_type":"histogram"\,              "labels":["type"\,"response_code"\,"api_name"\,"method"\,"api_key"\,"alias"\,"path"]          }]'
    
    helm upgrade tyk-redis oci://registry-1.docker.io/bitnamicharts/redis -n $NAMESPACE --create-namespace --install --version $REDIS_BITNAMI_CHART_VERSION
    
    helm upgrade tyk-oss tyk-helm/tyk-oss -n $NAMESPACE --create-namespace \
    --install \
    --set global.secrets.APISecret="$APISecret" \
    --set global.redis.addrs="{tyk-redis-master.$NAMESPACE.svc.cluster.local:6379}" \
    --set global.redis.passSecret.name=tyk-redis \
    --set global.redis.passSecret.keyName=redis-password \
    --set global.components.pump=true \
    --set "tyk-pump.pump.backend={prometheus}" \
    --set tyk-pump.pump.prometheusPump.customMetrics=$Prometheus_Custom_Metrics \
    --set-string tyk-pump.pump.podAnnotations."prometheus\.io/scrape"=true \
    --set-string tyk-pump.pump.podAnnotations."prometheus\.io/port"=9090 \
    --set-string tyk-pump.pump.podAnnotations."prometheus\.io/path"=/metrics
    

    Note

    Please make sure you are installing Redis versions that are supported by Tyk. Please refer to Tyk docs to get list of supported versions.

  3. Verification

    After some time, you can see that Prometheus is successfully scraping from Tyk Pump:

    pump-prometheus

Expose a service for Prometheus to scrape

You can expose Pump as a service so that Prometheus can access the /metrics endpoint for scraping. Just enable service in tyk-pump.pump.service:

    service:
      # Tyk Pump svc is disabled by default. Set it to true to enable it.
      enabled: true

Tyk Pump Capping Analytics Data Storage

Tyk Gateways can generate a lot of analytics data. A guideline is that for every 3 million requests that your Gateway processes it will generate roughly 1GB of data.

If you have Tyk Pump set up with the aggregate pump as well as the regular MongoDB pump, then you can make the tyk_analytics collection a capped collection. Capping a collection guarantees that analytics data is rolling within a size limit, acting like a FIFO buffer which means that when it reaches a specific size, instead of continuing to grow, it will replace old records with new ones.

Note

If you are using DocumentDB, capped collections are not supported. See here for more details.

The tyk_analytics collection contains granular log data, which is why it can grow rapidly. The aggregate pump will convert this data into an aggregated format and store it in a separate collection. The aggregate collection is used for processing reporting requests as it is much more efficient.

If you’ve got an existing collection which you want to convert to be capped you can use the convertToCapped MongoDB command.

If you wish to configure the pump to cap the collections for you upon creating the collection, you may add the following configurations to your uptime_pump_config and / or mongo.meta objects in pump.conf.

"collection_cap_max_size_bytes": 1048577,
"collection_cap_enable": true

collection_cap_max_size_bytes sets the maximum size of the capped collection. collection_cap_enable enables capped collections.

If capped collections are enabled and a max size is not set, a default cap size of 5GiB is applied. Existing collections will never be modified.

Note

An alternative to capped collections is MongoDB’s Time To Live indexing (TTL). TTL indexes are incompatible with capped collections. If you have set a capped collection, a TTL index will not get created, and you will see error messages in the MongoDB logs. See MongoDB TTL Docs for more details on TTL indexes.

Time Based Cap in single tenant environments

If you wish to reduce or manage the amount of data in your MongoDB, you can add a TTL expiry index to the collection, so older records will be evicted automatically.

Note

Time based caps (TTL indexes) are incompatible with already configured size based caps.

Run the following command in your preferred MongoDB tool (2592000 in our example is 30 days):

db.tyk_analytics.createIndex( { "timestamp": 1 }, { expireAfterSeconds: 2592000 } )

This command sets an expiration rule that evicts all records from the collection whose timestamp field is older than the specified expiration time.

Time Based Cap in multi-tenant environments

When you have multiple organizations, you can control analytics expiration on a per-organization basis. This technique also uses TTL indexes, as described above, but the index should look like this:

db.tyk_analytics.createIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } )

This command sets the value of expireAt to correspond to the time the document should expire. MongoDB will automatically delete documents from the tyk_analytics collection 0 seconds after the expireAt time in the document. The expireAt will be calculated and created by Tyk in the following step.

Create an Organization Quota

curl --header "x-tyk-authorization: {tyk-gateway-secret}" --header "content-type: application/json" --data @expiry.txt http://{tyk-gateway-ip}:{port}/tyk/org/keys/{org-id}

Where the content of expiry.txt is:

{
  "org_id": "{your-org-id}",
  "data_expires": 86400
}

data_expires - sets the time in seconds after which the data expires. Tyk will calculate the expiry date for you.

Size Based Cap

Add the Size Cap

Note

The size value should be in bytes, and we recommend using a value just under the amount of RAM on your machine.

Run this command in your MongoDB shell:

use tyk_analytics
db.runCommand({"convertToCapped": "tyk_analytics", size: 100000});

Adding the Size Cap if using a mongo_selective Pump

The mongo_selective pump stores data on a per organization basis. You will have to run the following command in your MongoDB shell for each individual organization.

db.runCommand({"convertToCapped": "z_tyk_analyticz_<org-id>", size: 100000});

Separated Analytics Storage

For high-traffic systems that make heavy use of analytics, it makes sense to separate out the Redis analytics server from the Redis configuration server that supplies auth tokens and handles rate limiting configuration.

To enable a separate analytics server, update your tyk.conf with the following section:

"enable_separate_analytics_store": true,
"analytics_storage": {
  "type": "redis",
  "host": "",
  "port": 0,
  "addrs": [
      "localhost:6379"
  ],
  "username": "",
  "password": "",
  "database": 0,
  "optimisation_max_idle": 3000,
  "optimisation_max_active": 5000,
  "enable_cluster": false
},

Note

addrs is new in v2.9.3, and replaces hosts which is now deprecated.

If you set enable_cluster to false, you only need to set one entry in addrs (a trimmed sketch of the block above):
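
"analytics_storage": {
  "type": "redis",
  "addrs": [
    "localhost:6379"
  ],
  "enable_cluster": false
}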

The configuration is the same (and uses the same underlying driver) as the regular configuration, so Redis Cluster is fully supported.


  1. Allowlist - explicitly allowing access to identified entities. Previously known as whitelist. ↩︎

  2. Blocklist - explicitly blocking access to identified entities. Previously known as blacklist. ↩︎