Tyk Pump - Ship Analytics Data to Persistent Datastore
Introduction
Traffic analytics are captured by the Gateway nodes and then temporarily stored in Redis. The Tyk Pump is responsible for moving those analytics into a persistent data store, such as MongoDB, where the traffic can be analyzed.
What is the Tyk Pump?
The Tyk Pump is our open source analytics purger that moves the data generated by your Tyk nodes to any back-end. It is primarily used to display your analytics data in the Tyk Dashboard.
Note
The Tyk Pump is not currently configurable in our Tyk Cloud solution.
Tyk Pump Data Flow
Here’s the architecture depending on your deployment model:
Tyk Pump is both extensible and flexible, meaning it is possible to configure Tyk Pump to send data to multiple different backends at the same time, as depicted by Pump Backends (i) and (ii), MongoDB and Elasticsearch respectively, in Figure 1. Tyk Pump is scalable, both horizontally and vertically, as indicated by Instances "1", "2", and "n". Additionally, it is possible to apply filters that dictate WHAT analytics go WHERE; please see the docs on sharded analytics configuration here.
Figure 1: An architecture diagram illustrating horizontal scaling of "n" instances of Tyk Pump, each with two different backends.
Other Supported Backend Services
We list our supported backends here.
Configuring your Tyk Pump
See Tyk Pump Configuration for more details on setting up your Tyk Pump.
Tyk Pump can be horizontally scaled without causing duplicate data; please see the following table for the supported permutations of Tyk Pump scaling.
| Supported | Summary |
|---|---|
| ✅ | Single Pump Instance, Single Backend |
| ✅ | Single Pump Instance, Multiple Backend(s) |
| ✅ | Multiple Pump Instances, Same Backend(s) |
| ❌ | Multiple Pump Instances, Different Backend(s) |
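For example, a minimal `pump.conf` sketch following the supported "Single Pump Instance, Multiple Backend(s)" pattern, writing to both MongoDB and Elasticsearch (it combines the pump examples shown later in this document; connection values are illustrative):

{
  "pumps": {
    "mongo": {
      "type": "mongo",
      "meta": {
        "collection_name": "tyk_analytics",
        "mongo_url": "mongodb://username:password@{hostname:port}/{db_name}"
      }
    },
    "elasticsearch": {
      "type": "elasticsearch",
      "meta": {
        "index_name": "tyk_analytics",
        "elasticsearch_url": "http://localhost:9200"
      }
    }
  }
}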
Getting Started
Tyk Pump Configuration
The Tyk Pump is our Open Source analytics purger that moves the data generated by your Tyk nodes to any back-end. By moving the analytics into your supported database, it allows the Tyk Dashboard to display traffic analytics across all your Tyk Gateways.
Tyk Dashboard
MongoDB
The Tyk Dashboard uses the `mongo-pump-aggregate` collection to display analytics. This is different from the standard `mongo` pump plugin that stores individual analytics items in MongoDB. The aggregate functionality was built to be fast, as querying raw analytics is expensive in large data sets. See Pump Dashboard Config for more details.
SQL
Note
Tyk no longer supports SQLite as of Tyk 5.7.0. To avoid disruption, please transition to PostgreSQL, MongoDB, or one of the listed compatible alternatives.
In v4.0 of the Tyk Dashboard, we added support for the following SQL platforms:
- PostgreSQL
- SQLite
Within your Dashboard configuration file (`tyk-analytics.conf`) there is now a `storage` section.
{
...
"storage": {
"main":{},
"analytics":{},
"logs":{},
"uptime": {}
}
}
Field description
- `main` - Main storage (APIs, Policies, Users, User Groups, etc.)
- `analytics` - Analytics storage (used to display all the charts and all analytics screens)
- `logs` - Logs storage (Log Browser page)
- `uptime` - Uptime tests analytics data
Common settings
For every `storage` section, you must populate the following fields:
{
...
"storage": {
...
"main": {
"type": "postgres",
"connection_string": "user=root password=admin database=tyk-demo-db host=tyk-db port=5432",
}
}
}
- `type` - Use this field to define your SQL platform (currently SQLite or PostgreSQL are supported)
- `connection_string` - The specific connection settings for your platform
The pump needed for storing log data in the database is very similar to other pumps, as well as the storage setting in your Tyk Dashboard config. It just requires the `sql` name and database-specific configuration options.
SQL example
"sql": {
"name": "sql",
"meta": {
"type": "postgres",
"connection_string": "user=laurentiughiur password=test123 database=tyk-demo-db host=127.0.0.1 port=5432"
}
},
Capping analytics data
Tyk Gateways can generate a lot of analytics data. Be sure to read about capping your Dashboard analytics.
Omitting the configuration file
From Tyk Pump 1.5.1+, you can omit the configuration file by setting the `TYK_PMP_OMITCONFIGFILE` environment variable. This is especially useful when using Docker, since by default the Tyk Pump ships with a default configuration file with pre-loaded pumps.
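For example, a Pump configured entirely through environment variables (a sketch combining `TYK_PMP_OMITCONFIGFILE` with the CSV pump variables shown later in this document; a real deployment would also need the analytics storage variables):

TYK_PMP_OMITCONFIGFILE=true
TYK_PMP_PUMPS_CSV_TYPE=csv
TYK_PMP_PUMPS_CSV_META_CSVDIR=./your_directory_here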
Sharding analytics to different data sinks
In a multi-organization deployment, each organization, team, or environment might have their preferred analytics tooling. This capability allows the Tyk Pump to send analytics for different organizations or various APIs to different destinations. E.g. Org A can send their analytics to MongoDB + DataDog while Org B can send their analytics to DataDog + expose the Prometheus metrics endpoint.
Configuring the sharded analytics
You can achieve the sharding by setting both an allowlist and a blocklist, meaning that some data sinks can receive information for all orgs, whereas other data sinks will not receive a blocklisted organization's analytics.
This feature makes use of the field called `filters`, which can be defined per pump. This is its structure:
"filters":{
"api_ids":[],
"org_ids":[],
"skip_api_ids":[],
"skip_org_ids":[]
}
- `api_ids` and `org_ids` work as an allow list (APIs and orgs whose analytics records we want to send).
- `skip_api_ids` and `skip_org_ids` work as a block list (APIs and orgs whose analytics records we want to filter out and not send).
A blocklist always takes priority over an allowlist.
An example of configuration would be:
"csv": {
"type": "csv",
"filters": {
"org_ids": ["org1","org2"]
},
"meta": {
"csv_dir": "./bar"
}
},
"elasticsearch": {
"type": "elasticsearch",
"filters": {
"skip_api_ids": ["api_id_1"],
},
"meta": {
"index_name": "tyk_analytics",
"elasticsearch_url": "https://elasticurl:9243",
"enable_sniffing": false,
"document_type": "tyk_analytics",
"rolling_index": false,
"extended_stats": false,
"version": "6"
}
}
With this configuration, all the analytics records related to `org1` or `org2` will go to the `csv` backend, and everything except analytics records from `api_id_1` will go to `elasticsearch`.
Setup Dashboard Analytics
To enable Dashboard analytics, you need to configure Tyk Pump to send analytics data to the Dashboard storage (MongoDB / SQL). These are the different pumps that handle different kinds of analytics data:
| Analytics | Activities Graph | Log Browser | Uptime Analytics |
|---|---|---|---|
| Mongo (Multi organization) | Mongo Aggregate Pump | Mongo Selective Pump | Uptime Pump |
| Mongo (Single organization) | Mongo Aggregate Pump | Mongo Pump | Uptime Pump |
| SQL | SQL Aggregate Pump | SQL Pump | Uptime Pump |
See below for details about these pumps, their configs, matching collections and the relevant Dashboard settings needed to view this data.
MongoDB
Mongo Pump
The `mongo` pump simply saves all individual requests across every organization to a collection called `tyk_analytics`. Each request will be stored as a single document.
Pump Config
{
...
"pumps": {
"mongo": {
"type": "mongo",
"meta": {
"collection_name": "tyk_analytics",
"mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}"
}
}
}
}
Capping
This collection should be capped due to the number of individual documents. This is especially important if `detailed_recording` in the Gateway is turned on, which means that the Gateway records the full payload of the request and response.
Omitting Indexes
From Pump 1.6+, the default index creation behavior of the Mongo pumps changed and the new configuration option `omit_index_creation` is available. This option is applicable to the following pumps: `Mongo Pump`, `Mongo Aggregate Pump` and `Mongo Selective Pump`.
The behavior now depends upon the value of `omit_index_creation` and the Pump in use, as follows:

- If `omit_index_creation` is set to `true`, tyk-pump will not create any indexes (for Mongo pumps).
- If `omit_index_creation` is set to `false` (default) and you are using `DocumentDB`, tyk-pump will create the Mongo indexes.
- If `omit_index_creation` is set to `false` (default) and you are using `MongoDB`, the behavior of tyk-pump depends upon whether the collection already exists:
  - If the collection exists, tyk-pump will not create the indexes again.
  - If the collection does not already exist, tyk-pump will create the indexes.
Dashboard Setting
In the API Usage Data > Log Browser screen you will see all the individual requests that the Gateway has recorded and saved in the `tyk_analytics` collection using the `mongo` pump.
Because you have the option to store and display analytics of every organization or separately per organization, you need to configure the Tyk Dashboard with the matching setting according to the way you set the pump to store the data in MongoDB. The field use_sharded_analytics controls the collection that the dashboard will query.
- If `use_sharded_analytics: false` - the dashboard will query the collection `tyk_analytics` that the mongo pump populated
- If `use_sharded_analytics: true` - the dashboard will query the collection that the `mongo-pump-selective` pump populated
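For example, a minimal sketch of the corresponding `tyk-analytics.conf` fragment (surrounding settings omitted):

{
  ...
  "use_sharded_analytics": true
}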
Mongo Aggregate Pump
The `mongo-pump-aggregate` pump stores data in a collection called `z_tyk_analyticz_aggregate_{ORG ID}`.
Pump Config
{
...
"pumps": {
"mongo-pump-aggregate": {
"name": "mongo-pump-aggregate",
"meta": {
"mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}",
"use_mixed_collection": true
}
}
}
}
- `use_mixed_collection: true` - will store analytics in both your organization-defined collections `z_tyk_analyticz_aggregate_{ORG ID}` and your org-less `tyk_analytics_aggregates` collection.
- `use_mixed_collection: false` - your pump will only store analytics in your org-defined collection.
The `tyk_analytics_aggregates` collection is used to query analytics across your whole Tyk setup. This can be used, for example, by a superuser role that is not attached to an organization. When set to `true`, you also need to set `use_sharded_analytics` to `true` in your Dashboard config.
Dashboard Setting
This pump supplies the data for the following subcategories of `API Usage Data`:
- Activity by API screen
- Activity by Key screen
- Errors screen
As with the regular analytics, because Tyk gives you the option to store and display aggregated analytics across all organizations or separately per organization, you need to configure the Tyk Dashboard with the matching setting according to the way you set the pump to store the data in MongoDB; otherwise, you won't see the data in the Dashboard.
- The `enable_aggregate_lookups: true` field must be set in the Dashboard configuration file in order for the Dashboard to query and display the aggregated data that `mongo-pump-aggregate` saved to MongoDB.
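A minimal sketch of the relevant `tyk-analytics.conf` fragment (surrounding settings omitted):

{
  ...
  "enable_aggregate_lookups": true
}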
Capping
As only a minimal number of documents is stored, you don't need to worry about capping this collection. The documents contain aggregated info for an individual API, such as total requests, errors, tags and more.
High Traffic Environment Settings
If you have a high-traffic environment and you want to skip some aggregations, to avoid overloading Mongo and/or to reduce the size of aggregation documents, you can do so using the `ignore_aggregations` configuration option. The possible values are:
- APIID
- Errors
- Versions
- APIKeys
- OauthIDs
- Geo
- Tags
- Endpoints
- KeyEndpoint
- OauthEndpoint
- ApiEndpoint
For example, if you want to ignore the API Keys aggregations:
pump.conf:
{
...
"pumps": {
"mongo-pump-aggregate": {
"name": "mongo-pump-aggregate",
"meta": {
"mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}",
"use_mixed_collection": true,
"ignore_aggregations": ["APIKeys"]
}
}
}
}
Unique Aggregation Points
If you set your API definition in the Tyk Gateway to tag unique headers (like `request_id` or timestamp), this collection can grow a lot, since aggregation of unique values simply creates a record/document for every single value with a counter of 1. To mitigate this, avoid tagging unique headers as the first option. If you can't change the API definition quickly, you can add the tag to the ignore list, e.g. `"ignore_aggregations": ["request_id"]`. This ensures that Tyk Pump does not aggregate per `request_id`.
Also, if you are not sure what’s causing the growth of the collection, you can also set time capping on these collections and monitor them.
Mongo Selective Pump
The `mongo-pump-selective` pump stores individual requests per organization in collections called `z_tyk_analyticz_{ORG ID}`.
Similar to the regular `mongo` pump, each request will be stored as a single document.
Pump Config
{
...
"pumps": {
"mongo-pump-selective": {
"name": "mongo-pump-selective",
"meta": {
"mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}",
"use_mixed_collection": true
}
}
}
}
Capping
This collection should be capped due to the number of individual documents.
Dashboard Setting
As with the regular analytics, if you are using the Selective pump, you need to set `use_sharded_keys: true` in the Dashboard config file so it will query the `z_tyk_analyticz_{ORG ID}` collections to populate the `Log Browser`.
Uptime Tests Analytics
Pump Configuration
"uptime_pump_config": {
"collection_name": "tyk_uptime_analytics",
"mongo_url": "mongodb://tyk-mongo:27017/tyk_analytics",
},
Tyk Dashboard Configuration
{
...
"storage" : {
...
"uptime": {
"type": "postgres",
"connection_string": "user=root password=admin database=tyk-demo-db host=tyk-db port=5432"
}
}
}
Tyk Gateway Setting
To enable the Uptime Pump, set `enable_uptime_analytics` to `true` in your Gateway configuration.
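A minimal sketch of the corresponding `tyk.conf` fragment (placement follows the text above; check your Gateway configuration reference for the exact location):

{
  ...
  "enable_uptime_analytics": true
}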
SQL
When using one of our supported SQL platforms, Tyk offers 3 types of SQL pumps:
- Aggregated Analytics:
sql_aggregate
- Raw Logs Analytics:
sql
- Uptime Tests Analytics
In a production environment, we recommend sharding. You can configure your analytics in the following ways:
- Sharding raw logs
- Sharding aggregated analytics
- Sharding uptime tests
SQL Pump
While aggregated analytics offer a decent level of detail, there are use cases when you'd like to have access to all request details in your analytics. For that you can generate analytics based on raw logs. This is especially helpful because, once you have analytics generated from raw logs stored in your SQL database, you can build your own custom metrics, charts, etc. outside of your Tyk Dashboard, which may bring more value to your product.
The pump needed for storing log data in the database is very similar to other pumps as well as the storage setting in the Tyk Dashboard config. It just requires the SQL name and database-specific configuration options.
SQL Pump Configuration
For storing logs into the `tyk_analytics` database table.
"sql": {
"name": "sql",
"meta": {
"type": "postgres",
"connection_string": "host=localhost port=5432 user=admin dbname=postgres_test password=test",
"table_sharding": false
}
}
- `type` - The supported types are `sqlite` and `postgres`.
- `connection_string` - Specifies the connection string to the database. For example, for `sqlite` it will be the path/name of the database, and for `postgres` it specifies the host, port, user, password, and dbname.
- `log_level` - Specifies the SQL log verbosity. The possible values are: `info`, `error` and `warning`. By default, the value is `silent`, which means that it won't log any SQL query.
- `table_sharding` - Specifies if all the analytics records are going to be stored in one table or in multiple tables (one per day). By default, it is set to `false`.

If `table_sharding` is `false`, all the records are going to be stored in the `tyk_analytics` table. If set to `true`, daily records are stored in a `tyk_analytics_YYYYMMDD` date-formatted table.
Dashboard Setting
In the API Usage Data > Log Browser screen you will see all the individual requests that the Gateway has recorded and saved in the `tyk_analytics` table using the `sql` pump.
Make sure you have configured the dashboard with your SQL database connection settings:
{
...
"storage" : {
...
"analytics": {
"type": "postgres",
"connection_string": "user=root password=admin host=tyk-db database=tyk-demo-db port=5432",
}
}
}
SQL Aggregate Pump
This is the default option offered by Tyk because it is configured to store the most important analytics details, which will satisfy the needs of most of our clients. This allows your system to save database space, and reporting is faster while consuming fewer resources.
SQL Aggregate Pump Configuration
For storing aggregated analytics into the `tyk_aggregated` database table.
"sql_aggregate": {
"name": "sql_aggregate",
"meta": {
"type": "postgres",
"connection_string": "host=localhost port=5432 user=admin dbname=postgres_test password=test",
"table_sharding": true
}
}
- `type` - The supported types are `sqlite` and `postgres`.
- `connection_string` - Specifies the connection string to the database. For example, for `sqlite` it will be the path/name of the database, and for `postgres` it specifies the host, port, user, password, and dbname.
- `log_level` - Specifies the SQL log verbosity. The possible values are: `info`, `error`, and `warning`. By default, the value is `silent`, which means that it won't log any SQL query.
- `track_all_paths` - Specifies if it should store aggregated data for all the endpoints. By default, it is set to `false`, which means that it only stores aggregated data for tracked endpoints.
- `ignore_tag_prefix_list` - Specifies prefixes of tags that should be ignored.
- `table_sharding` - Specifies if all the analytics records are going to be stored in one table or in multiple tables (one per day). By default, it is set to `false`.

If `table_sharding` is `false`, all the records are going to be stored in the `tyk_aggregated` table. If set to `true`, daily records are stored in a `tyk_aggregated_YYYYMMDD` date-formatted table.
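For illustration, a sketch of a `sql_aggregate` pump combining these options (the prefix values are examples based on the tag conventions shown later in this document):

"sql_aggregate": {
  "name": "sql_aggregate",
  "meta": {
    "type": "postgres",
    "connection_string": "host=localhost port=5432 user=admin dbname=postgres_test password=test",
    "track_all_paths": true,
    "ignore_tag_prefix_list": ["key-", "org-"],
    "table_sharding": true
  }
}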
Dashboard Setting
This pump supplies the data for the following subcategories of `API Usage Data`:
- Activity by API screen
- Activity by Key screen
- Errors screen
As with the regular analytics, because Tyk gives you the option to store and display aggregated analytics across all organizations or separately per organization, you need to configure the Tyk Dashboard with the matching setting according to the way you set the pump to store the data in SQL; otherwise, you won't see the data in the Dashboard.

- The `enable_aggregate_lookups: true` field must be set in the Dashboard configuration file in order for the Dashboard to query and display the aggregated data that `sql-aggregate` saved to the database.
- Make sure you have configured the dashboard with your SQL database connection settings:
{
...
"storage": {
...
"analytics": {
"type": "postgres",
"connection_string": "user=root password=admin host=tyk-db database=tyk-demo-db port=5432",
}
}
}
SQL Uptime Pump
In an `uptime_pump_config` section, you can configure a SQL uptime pump. To do that, you need to add the field `uptime_type` with the value `sql`.
"uptime_pump_config": {
"uptime_type": "sql",
"type": "postgres",
"connection_string": "host=sql_host port=sql_port user=sql_usr dbname=dbname password=sql_pw",
"table_sharding": false
},
- `type` - The supported types are `sqlite` and `postgres`.
- `connection_string` - Specifies the connection string to the database. For example, for `sqlite` it will be the path/name of the database, and for `postgres` it specifies the host, port, user, password, and dbname.
- `table_sharding` - Specifies if all the analytics records will be stored in one table or multiple tables (one per day). By default, it is set to `false`.

If `table_sharding` is `false`, all the records will be stored in the `tyk_analytics` table. If set to `true`, daily records are stored in a `tyk_analytics_YYYYMMDD` date-formatted table.
Tyk Dashboard Configuration
You need to set `enable_aggregate_lookups` to `false`. Then add your SQL database connection settings:
{
...
"storage" : {
...
"analytics": {
"type": "postgres",
"connection_string": "user=root password=admin host=tyk-db database=tyk-demo-db port=5432"
}
}
}
Uptime Tests Analytics
Tyk Pump Configuration
For storing uptime test analytics in the database:
"uptime_pump_config": {
"uptime_type": "sql",
"type": "postgres",
"connection_string": "host=sql_host port=sql_port user=sql_usr database=tyk-demo-db password=sql_pw"
},
Tyk Dashboard Configuration
{
...
"storage" : {
...
"uptime": {
"type": "postgres",
"connection_string": "user=root password=admin database=tyk-demo-db host=tyk-db port=5432"
}
}
}
Tyk Gateway Setting
To enable the Uptime Pump, set `enable_uptime_analytics` to `true` in your Gateway configuration, as shown in the MongoDB section above.
Sharding
In a production environment, we recommend the following setup:
By default, all logs/analytics are stored in one database table, which makes CRUD operations on the dataset harder and less performant as it grows significantly.
To improve data maintenance processes, since querying or removing data from one single table is slow, we have added a new option (`table_sharding`) so that data can be stored daily (one table of data per day). This automatically makes querying or removing sets of data easier, whether dropping tables to remove logs/analytics, or reading multiple tables based on the selected period.
Tyk Pump Configuration
"sql": {
...
"meta": {
...
"table_sharding": true
}
},
"sql_aggregate" : {
...
"meta": {
...
"table_sharding": true
}
},
"uptime_pump_config": {
...
"table_sharding": true
},
Tyk Dashboard Configuration
"storage": {
"main": {
...
"table_sharding": true
},
"analytics": {
...
"table_sharding": true
},
"logs": {
...
"table_sharding": true
},
"uptime": {
...
"table_sharding": true
}
},
Graph Pump setup
MongoDB
Starting with version 1.7.0 of Tyk Pump and version 4.3.0 of Tyk Gateway, it is possible to configure the Graph MongoDB Pump. Once configured, the pump enables support for GraphQL-specific metrics. The GraphQL-specific metrics currently supported include (more will be added in future versions):
- Types Requested.
- Fields requested for each type.
- Error Information (not limited to HTTP status codes).
Setting up Graph MongoDB Pump
- Set `enable_analytics` to `true` in your `tyk.conf`.
- Enable detailed recording by setting `enable_detailed_recording` in your `tyk.conf` to `true`. This is needed so that the GraphQL information can be parsed from the request body and response.
Note
This will enable detailed recording globally, across all APIs. This means that the behavior of individual APIs that have this configuration parameter set will be overridden. The Gateway must be restarted after updating this configuration parameter.
- Set up your Mongo `collection_name`.
- Add your Graph MongoDB Pump configuration to the list of pumps in your `pump.conf` (pump configuration file).
Sample setup:
{
...
"pumps": {
...
"mongo-graph": {
"meta": {
"collection_name": "tyk_graph_analytics",
"mongo_url": "mongodb://mongo/tyk_graph_analytics"
}
}
}
}
Current limitations
The Graph MongoDB Pump is being improved upon regularly, and as such there are a few things to note about its current behavior:
- Size of your records - because detailed recording is needed for this pump to function correctly, your records, and consequently your MongoDB storage, can increase in size rather quickly.
- Subgraph requests are not recorded - requests to Tyk-controlled subgraphs from supergraphs in a federation setup are currently not recorded; only the supergraph requests are handled by the Graph MongoDB Pump.
- UDG requests are recorded, but subsequent requests to data sources are currently ignored.
- Graph MongoDB Pump data cannot be used in the Tyk Dashboard yet; the data is only stored for recording purposes at the moment and can be exported to external tools for further analysis.
SQL
Starting with version 1.8.0 of Tyk Pump and version 5.0.0 of Tyk Gateway, it is possible to export GraphQL analytics to a SQL database.
Setting up Graph SQL Pump
The Graph SQL pump currently includes information (per request) such as:
- Types Requested
- Fields requested for each type
- Error Information
- Root Operations Requested
Setup steps include:
- Set `enable_analytics` to `true` in your `tyk.conf`.
- Enable detailed recording by setting `enable_detailed_recording` in your `tyk.conf` to `true`. This is needed so that the GraphQL information can be parsed from the request body and response.
Note
This will enable detailed recording globally, across all APIs. This means that the behavior of individual APIs that have this configuration parameter set will be overridden. The Gateway must be restarted after updating this configuration parameter.
- Configure your `pump.conf` using this sample configuration:
"sql-graph": {
"meta": {
"type": "postgres",
"table_name": "tyk_analytics_graph",
"connection_string": "host=localhost user=postgres password=password dbname=postgres",
"table_sharding": false
}
},
The Graph SQL pump currently supports `postgres`, `sqlite` and `mysql` databases. The `table_name` refers to the table that will be created for unsharded setups, and the prefix that will be used for sharded setups, e.g. `tyk_analytics_graph_20230327`.
The Graph SQL pump currently has the same limitations as the Graph Mongo Pump.
Setting up Graph SQL Aggregate Pump
The `sql-graph-aggregate` pump can be configured similarly to the Graph SQL pump:
"sql-graph-aggregate": {
"meta": {
"type": "postgres",
"connection_string": "host=localhost port=5432 user=postgres dbname=postgres password=password",
"table_sharding": false
}
}
External Data Stores
The Tyk Pump component takes all of the analytics in Tyk and moves the data from the Gateway into your Dashboard. It is possible to set it up to send the analytics data it finds to other data stores. Currently we support the following:
- MongoDB or SQL (Used by the Tyk Dashboard)
- CSV
- Elasticsearch (2.0 - 7.x)
- Graylog
- Resurface.io
- InfluxDB
- Moesif
- Splunk
- StatsD
- DogStatsD
- Hybrid (Tyk RPC)
- Prometheus
- Logz.io
- Kafka
- Syslog (FluentD)
See the Tyk Pump Configuration for more details.
CSV
Tyk Pump can be configured to create or modify a CSV file to track API Analytics.
JSON / Conf file
Add the following configuration fields to the pumps section within your `pump.conf` file:
{
"csv":
{
"type": "csv",
"meta": {
"csv_dir": "./your_directory_here"
}
}
}
Environment variables
TYK_PMP_PUMPS_CSV_TYPE=csv
TYK_PMP_PUMPS_CSV_META_CSVDIR=./your_directory_here
Datadog
The Tyk Pump can be configured to send your API traffic analytics to Datadog, with which you can build dashboards with various metrics based on your API traffic in Tyk.
Datadog dashboard example
We created a default Tyk dashboard canvas to give our users an easier starting point. You can find it in the Datadog portal under the Dashboards --> Lists section (https://app.datadoghq.com/dashboard/lists); it is called `Tyk Analytics Canvas`. To use this dashboard you will need to make sure that your Datadog agent deployment has the tag `env:tyk-demo-env` and that your Tyk Pump configuration has `dogstatsd.meta.namespace` set to `pump`. You can also import it from the official Datadog GitHub repo and change those values in the dashboard itself to visualize your analytics data as it flows into Datadog.
Prerequisites
- A working Datadog agent installed on your Environment. See the Datadog Tyk integration docs for more information.
- Either a Tyk Pro install or Tyk OSS Gateway install along with a Tyk Pump install.
How it works
When running the Datadog Agent, DogstatsD gets the request_time metric from your Tyk Pump in real time, per request, so you can understand the usage of your APIs and get the flexibility of aggregating by various parameters such as date, version, returned code, method etc.
Tyk Pump configuration
Below is a sample `dogstatsd` section from a Tyk `pump.conf` file:
"dogstatsd": {
"type": "dogstatsd",
"meta": {
"address": "dd-agent:8126",
"namespace": "tyk",
"async_uds": true,
"async_uds_write_timeout_seconds": 2,
"buffered": true,
"buffered_max_messages": 32,
"sample_rate": 0.9999999999,
"tags": [
"method",
"response_code",
"api_version",
"api_name",
"api_id",
"org_id",
"tracked",
"path",
"oauth_id"
]
}
},
Field descriptions
- `address`: address of the Datadog agent, including host & port
- `namespace`: prefix for your metrics to Datadog
- `async_uds`: enable async UDS over UDP
- `async_uds_write_timeout_seconds`: integer write timeout in seconds if `async_uds: true`
- `buffered`: enable buffering of messages
- `buffered_max_messages`: max messages in a single datagram if `buffered: true`. Default 16
- `sample_rate`: default 1, which equates to 100% of requests. To sample at 50%, set to 0.5
- `tags`: list of tags to be added to the metric. The possible options are listed below
If no tag is specified, the fallback behavior is to use the following tags:
- `path`
- `method`
- `response_code`
- `api_version`
- `api_name`
- `api_id`
- `org_id`
- `tracked`
- `oauth_id`
Note that this configuration can generate significant data due to the unbounded nature of the `path` tag.
On startup, you should see the loaded configs when initialising the DogstatsD pump:
[May 10 15:23:44] INFO dogstatsd: initializing pump
[May 10 15:23:44] INFO dogstatsd: namespace: pump.
[May 10 15:23:44] INFO dogstatsd: sample_rate: 50%
[May 10 15:23:44] INFO dogstatsd: buffered: true, max_messages: 32
[May 10 15:23:44] INFO dogstatsd: async_uds: true, write_timeout: 2s
Elasticsearch
Elasticsearch is a highly scalable and distributed search engine that is designed to handle large amounts of data.
JSON / Conf
Add the following configuration fields to the pumps section within your `pump.conf` file:
{
"pumps": {
"elasticsearch": {
"type": "elasticsearch",
"meta": {
"index_name": "tyk_analytics",
"elasticsearch_url": "http://localhost:9200",
"enable_sniffing": false,
"document_type": "tyk_analytics",
"rolling_index": false,
"extended_stats": false,
"version": "6"
}
}
}
}
Configuration fields
- `index_name`: The name of the index that all the analytics data will be placed in. Defaults to `tyk_analytics`.
- `elasticsearch_url`: If sniffing is disabled, the URL that all data will be sent to. Defaults to `http://localhost:9200`.
- `enable_sniffing`: If sniffing is enabled, the `elasticsearch_url` will be used to make a request to get a list of all the nodes in the cluster; the returned addresses will then be used. Defaults to `false`.
- `document_type`: The type of the document that is created in Elasticsearch. Defaults to `tyk_analytics`.
- `rolling_index`: Appends the date to the end of the index name, so each day's data is split into a different index name. For example, `tyk_analytics-2016.02.28`. Defaults to `false`.
- `extended_stats`: If set to `true`, will include the following additional fields: `Raw Request`, `Raw Response` and `User Agent`.
- `version`: Specifies the ES version. Use `3` for ES 3.X, `5` for ES 5.X, `6` for ES 6.X, `7` for ES 7.X. Defaults to `3`.
- `disable_bulk`: Disable batch writing. Defaults to `false`.
- `bulk_config`: Batch writing trigger configuration. Each option is an OR with each other:
  - `workers`: Number of workers. Defaults to `1`.
  - `flush_interval`: Specifies the time in seconds to flush the data and send it to ES. Default is disabled.
  - `bulk_actions`: Specifies the number of requests needed to flush the data and send it to ES. Defaults to 1000 requests. If needed, can be disabled with `-1`.
  - `bulk_size`: Specifies the size (in bytes) needed to flush the data and send it to ES. Defaults to 5MB. Can be disabled with `-1`.
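As an illustrative sketch (the values are examples, not recommendations), tuning batch writing might look like this:

"elasticsearch": {
  "type": "elasticsearch",
  "meta": {
    "index_name": "tyk_analytics",
    "elasticsearch_url": "http://localhost:9200",
    "version": "7",
    "disable_bulk": false,
    "bulk_config": {
      "workers": 2,
      "flush_interval": 60,
      "bulk_actions": 500
    }
  }
}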
Environment variables
TYK_PMP_PUMPS_ELASTICSEARCH_TYPE=elasticsearch
TYK_PMP_PUMPS_ELASTICSEARCH_META_INDEXNAME=tyk_analytics
TYK_PMP_PUMPS_ELASTICSEARCH_META_ELASTICSEARCHURL=http://localhost:9200
TYK_PMP_PUMPS_ELASTICSEARCH_META_ENABLESNIFFING=false
TYK_PMP_PUMPS_ELASTICSEARCH_META_DOCUMENTTYPE=tyk_analytics
TYK_PMP_PUMPS_ELASTICSEARCH_META_ROLLINGINDEX=false
TYK_PMP_PUMPS_ELASTICSEARCH_META_EXTENDEDSTATISTICS=false
TYK_PMP_PUMPS_ELASTICSEARCH_META_VERSION=5
TYK_PMP_PUMPS_ELASTICSEARCH_META_BULKCONFIG_WORKERS=2
TYK_PMP_PUMPS_ELASTICSEARCH_META_BULKCONFIG_FLUSHINTERVAL=60
Moesif
This is a step-by-step guide to setting up the Moesif API Analytics and Monetization platform to understand customer API usage and set up usage-based billing.
We also have a blog post which highlights how Tyk and Moesif work together.
The assumptions are that you have Docker installed and Tyk Self-Managed already running. See the Tyk Pump Configuration for more details.
Overview
With the Moesif Tyk plugin, your API logs are sent to Moesif asynchronously to provide analytics on customer API usage along with your API payloads like JSON and XML. This plugin also enables you to monetize your API with billing meters and provide a self-service onboarding experience. Moesif also collects information such as the authenticated user (AliasId or OAuthId) to identify customers using your API. An overview on how Moesif and Tyk works together is available here.
Steps for Configuration
- Get a Moesif Application Id

  Go to www.moesif.com and sign up for a free account. Application Ids are write-only API keys specific to an application in Moesif, such as "Development" or "Production". You can always create more applications in Moesif.

- Enable Moesif backend in Tyk Pump

  Add Moesif as an analytics backend, along with the Moesif Application Id you obtained in the last step, to your Tyk Pump Configuration.
JSON / Conf File
{
"pumps": {
"moesif": {
"name": "moesif",
"meta": {
"application_id": "Your Moesif Application Id"
}
}
}
}
Env Variables:
TYK_PMP_PUMPS_MOESIF_TYPE=moesif
TYK_PMP_PUMPS_MOESIF_META_APPLICATIONID=your_moesif_application_id
- Ensure analytics is enabled

  If you want to log HTTP headers and body, ensure the detailed analytics recording flag is set to `true` in your Tyk Gateway conf:
JSON / Conf File
{
"enable_analytics" : true,
"analytics_config": {
"enable_detailed_recording": true
}
}
Env Variables:
TYK_GW_ENABLEANALYTICS=true
TYK_GW_ANALYTICSCONFIG_ENABLEDETAILEDRECORDING=true
Note
This will enable detailed recording globally, across all APIs. This means that the behavior of individual APIs that have this configuration parameter set will be overridden. The Gateway must be restarted after updating this configuration parameter.
- Restart Tyk Pump to pick up the Moesif config

  Once your config changes are done, you need to restart your Tyk Pump and Tyk Gateway instances (if you've modified the Tyk Gateway config). If you are running Tyk Pump in Docker:
$ docker restart tyk-pump
- PROFIT!
You can now make a few API calls and verify they show up in Moesif.
$ curl localhost:8080
The Moesif Tyk integration automatically maps a Tyk Token Alias to a user id in Moesif. With a Moesif SDK, you can store additional customer demographics to break down API usage by customer email, company industry, and more.
Configuration options
The Tyk Pump for Moesif has a few configuration options that can be set in your pump.env
:
Parameter | Required | Description | Environment Variable |
---|---|---|---|
application_id | required | Moesif Application Id. Multiple Tyk api_id’s will be logged under the same app id. | TYK_PMP_PUMPS_MOESIF_META_APPLICATIONID |
request_header_masks | optional | Mask a specific request header field. Type: String Array [] string | TYK_PMP_PUMPS_MOESIF_META_REQUESTHEADERMASKS |
request_body_masks | optional | Mask a specific request body field. Type: String Array [] string | TYK_PMP_PUMPS_MOESIF_META_REQUESTBODYMASKS |
response_header_masks | optional | Mask a specific response header field. Type: String Array [] string | TYK_PMP_PUMPS_MOESIF_META_RESPONSEHEADERMASKS |
response_body_masks | optional | Mask a specific response body field. Type: String Array [] string | TYK_PMP_PUMPS_MOESIF_META_RESPONSEBODYMASKS |
disable_capture_request_body | optional | Disable logging of request body. Type: Boolean. Default value is false. | TYK_PMP_PUMPS_MOESIF_META_DISABLECAPTUREREQUESTBODY |
disable_capture_response_body | optional | Disable logging of response body. Type: Boolean. Default value is false. | TYK_PMP_PUMPS_MOESIF_META_DISABLECAPTURERESPONSEBODY |
user_id_header | optional | Field name to identify User from a request or response header. Type: String. Default maps to the token alias | TYK_PMP_PUMPS_MOESIF_META_USERIDHEADER |
company_id_header | optional | Field name to identify Company (Account) from a request or response header. Type: String | TYK_PMP_PUMPS_MOESIF_META_COMPANYIDHEADER |
Identifying users
By default, the plugin will collect the authenticated user (AliasId or OAuthId) to identify the customer. This can be overridden by setting the `user_id_header` to a header that contains your API user/consumer id, such as `X-Consumer-Id`. You can also set the `company_id_header`, which contains the company to link the user to. See the Moesif docs on identifying customers.
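For example, a sketch of a `moesif` pump that identifies users and companies from custom headers (`X-Company-Id` is a hypothetical header name; use whatever header your API issues):

{
  "pumps": {
    "moesif": {
      "name": "moesif",
      "meta": {
        "application_id": "Your Moesif Application Id",
        "user_id_header": "X-Consumer-Id",
        "company_id_header": "X-Company-Id"
      }
    }
  }
}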
Splunk
This is a step-by-step guide to setting up Splunk to receive logs from the Tyk Pump.
The assumptions are that you have Docker installed and Tyk Pro Self-Managed already running.
Steps for Configuration
- Run Splunk using Docker

  Assuming you have Docker installed locally, run the following from a terminal:

  $ docker run \
      -p 8000:8000 \
      -p 8088:8088 \
      -v splunk-data:/opt/splunk/var \
      -v splunk-data:/opt/splunk/etc \
      -e SPLUNK_START_ARGS=--accept-license \
      -e SPLUNK_PASSWORD=mypassword \
      splunk/splunk:latest
- Setup a collector in Splunk

  A) Visit http://localhost:8000 and log into the Splunk Dashboard using the username admin and the password we set in the Docker run command, mypassword

  B) Create a new Data input

  C) Select HTTP Event Collector -> Add New

  D) Set the name to "tyk" and then leave everything else as default

  Grab your token from the final page.
- Add the Splunk bit to pump.conf

  Edit your pump's `pump.conf` and add this to the "pumps" section, making sure to add the token from the previous step into the `collector_token` field:

  {
    "pumps": {
      "splunk": {
        "type": "splunk",
        "meta": {
          "collector_token": "<token>",
          "collector_url": "https://localhost:8088/services/collector/event",
          "ssl_insecure_skip_verify": true
        }
      }
    }
  }

  Note: Make sure that the `localhost` value matches your setup. Head over to our [community forum](https://community.tyk.io/) to ask for help if you are stuck here.
- Restart Tyk Pump to pick up the Splunk config

  If you are running Tyk Pump in Docker:

  $ docker restart tyk-pump
- PROFIT!

  Let's make a few API calls against Tyk and see if they flow into Splunk:

  $ curl localhost:8080/loan-service-api/
  { "error": "Key not authorized" }
Logzio
Logz.io is a cloud-based log management and analytics platform that provides log management built on Elasticsearch, Logstash and Kibana.
JSON / Conf file
Add the following configuration fields to the pumps section within your pump.conf
file:
{
"pumps"
{
"logzio": {
"type": "logzio",
"meta": {
"token": "<YOUR-LOGZ.IO-TOKEN>"
}
}
}
}
Environment variables
TYK_PMP_PUMPS_LOGZIO_TYPE=logzio
TYK_PMP_PUMPS_LOGZIO_META_TOKEN="{YOUR-LOGZIO-TOKEN}"
Advanced configuration fields
- `meta.url`: Use if you do not want to use the default Logz.io URL, for example when using a proxy. The default URL is `https://listener.logz.io:8071`.
- `meta.queue_dir`: The directory for the queue.
- `meta.drain_duration`: Sets the drain duration (when to flush logs on the disk). The default value is `3s`.
- `meta.disk_threshold`: Sets the disk queue threshold. Once the threshold is crossed the sender will not enqueue the received logs. The default value is `98` (percentage of disk).
- `meta.check_disk_space`: Sets the sender to check if it crosses the maximum allowed disk usage. The default value is `true`.
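A sketch combining these advanced fields (the values shown are the documented defaults; the token placeholder follows the example above):

"logzio": {
  "type": "logzio",
  "meta": {
    "token": "<YOUR-LOGZ.IO-TOKEN>",
    "url": "https://listener.logz.io:8071",
    "drain_duration": "3s",
    "disk_threshold": 98,
    "check_disk_space": true
  }
}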
Tyk Analytics Record Fields
Below is a detailed list of each field contained within our Tyk Analytics Record that is sent from Tyk Pump.
Method
Request method.
Example: `GET`, `POST`.
Host
Request `Host` header.
Remarks: Includes host and optional port number of the server to which the request was sent.
Example: `tyk.io`, or `tyk.io:8080` if port is included.
Path
Request path.
Remarks: Displayed in decoded form.
Example: `/foo/bar` for `/foo%2Fbar` or `/foo/bar`.
RawPath
Request path.
Remarks: Original request path without changes, just decoded.
Example: `/foo/bar` for `/foo%2Fbar` or `/foo/bar`.
ContentLength
Request `Content-Length` header.
Remarks: The number of bytes in the request body.
Example: `10` for request body `0123456789`.
UserAgent
Request `User-Agent` header.
Example: `curl/7.86.0`.
Day
Request day.
Remarks: Based on `TimeStamp` field.
Example: `16` for `2022-11-16T03:01:54Z`.
Month
Request month.
Remarks: Based on `TimeStamp` field.
Example: `11` for `2022-11-16T03:01:54Z`.
Year
Request year.
Remarks: Based on `TimeStamp` field.
Example: `2022` for `2022-11-16T03:01:54Z`.
Hour
Request hour.
Remarks: Based on `TimeStamp` field.
Example: `3` for `2022-11-16T03:01:54Z`.
ResponseCode
Response code.
Remarks: Only contains the integer element of the response code. Can be generated by either the gateway or upstream server, depending on how the request is handled.
Example: `200` for `200 OK`.
APIKey
Request authentication key.
Remarks: Authentication key, as provided in the request. If no API key is provided then the gateway will substitute a default value.
Example: Unhashed `auth_key`, hashed `6129dc1e8b64c6b4`, or `00000000` if no authentication provided.
TimeStamp
Request timestamp.
Remarks: Generated by the gateway, based on the time it receives the request from the client.
Example: `2022-11-16T03:01:54.648+00:00`.
APIVersion
Version of API Definition requested.
Remarks: Based on version configuration of context API definition. If API is unversioned then value is “Not Versioned”.
Example: Could be an alphanumeric value such as `1` or `b`. Is `Not Versioned` if not versioned.
APIName
Name of API Definition requested.
Example: `Foo API`.
APIID
Id of API Definition requested.
Example: `727dad853a8a45f64ab981154d1ffdad`.
OrgID
Organization Id of API Definition requested.
Example: `5e9d9544a1dcd60001d0ed20`.
OauthID
Id of OAuth client.
Remarks: Value is empty string if not using OAuth, or OAuth client not present.
Example: `my-oauth-client-id`.
RequestTime
Duration of upstream roundtrip.
Remarks: Equal to the value of the `Latency.Total` field.
Example: `3` for a 3ms roundtrip.
RawRequest
Raw HTTP request.
Remarks: Base64 encoded copy of the request sent from the gateway to the upstream server.
Example: `R0VUIC9nZXQgSFRUUC8xLjEKSG9zdDogdHlrLmlv`.
RawResponse
Raw HTTP response.
Remarks: Base64 encoded copy of the response sent from the gateway to the client.
Example: `SFRUUC8xLjEgMjAwIE9LCkNvbnRlbnQtTGVuZ3RoOiAxOQpEYXRlOiBXZWQsIDE2IE5vdiAyMDIyIDA2OjIxOjE2IEdNVApTZXJ2ZXI6IGd1bmljb3JuLzE5LjkuMAoKewogICJmb28iOiAiYmFyIgp9Cg==`.
IPAddress
Client IP address.
Remarks: Taken from either `X-Real-IP` or `X-Forwarded-For` request headers, if set. Otherwise, determined by the gateway based on the request.
Example: `172.18.0.1`.
Geo
Client geolocation data.
Remarks: Calculated using MaxMind database, based on client IP address.
Example: {"country":{"isocode":"SG"},"city":{"geonameid":0,"names":{}},"location":{"latitude":0,"longitude":0,"timezone":""}}
.
Network
Network statistics.
Remarks: Not currently used.
Latency
Latency statistics.
Remarks: Contains two fields; `upstream` is the roundtrip duration between the gateway sending the request to the upstream server and receiving a response. `total` is the `upstream` value plus additional gateway-side functionality such as processing analytics data.
Example: `{"total":3,"upstream":3}`.
Note
We record the round trip time of the call from the gateway's reverse proxy, so what you get is the sum of: leaving Tyk -> upstream -> response received back at Tyk.
Tags
Session context tags.
Remarks: Can contain many tags which refer to many things, such as the gateway, API key, organization, API definition etc.
Example: ["key-00000000","org-5e9d9544a1dcd60001d0ed20","api-accbdd1b89e84ec97f4f16d4e3197d5c"]
.
Alias
Session alias.
Remarks: Alias of the context authenticated identity. Blank if no alias set or request is unauthenticated.
Example: `my-key-alias`.
TrackPath
Tracked endpoint flag.
Remarks: Value is `true` if the requested endpoint is configured to be tracked, otherwise `false`.
Example: `true` or `false`.
ExpireAt
Future expiry date.
Remarks: Can be used to implement automated data expiry, if supported by storage.
Example: `2022-11-23T07:26:25.762+00:00`.
Monitor your APIs with Prometheus
Your Tyk Pump can expose Prometheus metrics for the requests served by your Tyk Gateway. This is helpful if you want to track how often your APIs are being called and how they are performing. Tyk collects latency data of how long your services take to respond to requests, how often your services are being called and what status code they return.
We have created a demo project in GitHub if you want to see this setup in action.
Prerequisites
- A Tyk installation (either Self-Managed or Open Source Gateway)
- Tyk Pump 1.6 or higher
Configure Tyk Pump to expose Prometheus metrics
Prometheus collects metrics from targets by scraping metrics HTTP endpoints. To expose Tyk's metrics in the Prometheus format, you need to add the following lines to your Tyk Pump configuration file `pump.conf`:
Host
"prometheus": {
"type": "prometheus",
"meta": {
"listen_address": "<tyk-pump>:9090",
"path": "/metrics",
"custom_metrics":[
{
"name":"tyk_http_requests_total",
"description":"Total of API requests",
"metric_type":"counter",
"labels":["response_code","api_name","method","api_key","alias","path"]
},
{
"name":"tyk_http_latency",
"description":"Latency of API requests",
"metric_type":"histogram",
"labels":["type","response_code","api_name","method","api_key","alias","path"]
}
]
}
}
Replace `<tyk-pump>` with your host name or IP address.
Docker
"prometheus": {
"type": "prometheus",
"meta": {
"listen_address": ":9090",
"path": "/metrics",
"custom_metrics":[
{
"name":"tyk_http_requests_total",
"description":"Total of API requests",
"metric_type":"counter",
"labels":["response_code","api_name","method","api_key","alias","path"]
},
{
"name":"tyk_http_latency",
"description":"Latency of API requests",
"metric_type":"histogram",
"labels":["type","response_code","api_name","method","api_key","alias","path"]
}
]
}
}
Port 9090 also needs to be exposed by Docker, in addition to the port used for the health check (here 8083), e.g. with Docker Compose:
tyk-pump:
image: tykio/tyk-pump-docker-pub:${PUMP_VERSION}
ports:
- 8083:8083
- 9090:9090
Restart your Pump to apply the configuration change.
Verify that the metrics are being exposed by calling the metrics endpoint `http://<tyk-pump>:9090` from your browser.
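For example (the output shape is illustrative, based on the custom metrics configured above; actual label values depend on your traffic):

$ curl http://<tyk-pump>:9090/metrics
# HELP tyk_http_requests_total Total of API requests
# TYPE tyk_http_requests_total counter
tyk_http_requests_total{alias="",api_key="00000000",api_name="httpbin",method="GET",path="/get",response_code="200"} 5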
Configure Prometheus to scrape the metrics endpoint
Prometheus is configured via a configuration file where you can define the metrics endpoint Prometheus will scrape periodically.
Here’s an example configuration scraping Tyk Pump metric endpoints:
Host
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: tyk
static_configs:
- targets: ['tyk-pump:9090']
Docker
global:
scrape_interval: 15s
evaluation_interval: 15s
scrape_configs:
- job_name: tyk
static_configs:
- targets: ['host.docker.internal:9090']
- Then restart your Prometheus instance after any configuration change.
- In Prometheus, under "Status" / "Targets", you can see that Prometheus is able to scrape the metrics successfully: the state is UP.
Exploring your metrics in Grafana
Before trying out, make sure to generate traffic by calling your APIs. You will find a couple of useful queries in our Tyk Pump GitHub repo based on the metrics exposed by Tyk. These will demonstrate which metric types are exported and how you can customize them.
You also need to make sure that Grafana is connected to your Prometheus server. This can be configured under Configuration / Data sources.
Useful queries
Here are some useful queries to help you monitor the health of your APIs:
Upstream time across all services
Tyk collects latency data of how long your upstream services take to respond to requests. This data can be used to configure an alert if the latency goes beyond a certain threshold. This query calculates the 95th percentile of the total request latency across all upstream services. To run the query:
histogram_quantile(0.95, sum(rate(tyk_http_latency_bucket[1m])) by (le))
Upstream time per API
This query calculates the 95th percentile of the request latency of upstream services for the selected API. To run this query:
histogram_quantile(0.95, sum(rate(tyk_http_latency_bucket{api_name="<api name>"}[1m])) by (le,api_name))
Replace <api name>
with the name of your API for this query.
Request rate
Track the request rate of your services:
sum (rate(tyk_http_requests_total[1m]))
Request Rate per API
Track the request rate of your services for the selected API:
sum (rate(tyk_http_requests_total{api_name="<api name>"}[1m]))
Replace <api name>
with the name of your API for this query.
Error Rates
Track the error rate your services are serving:
sum (rate(tyk_http_requests_total{response_code =~"5.."}[1m]))
Error rates per API
Track the error rate your services are serving for the selected API:
sum (rate(tyk_http_requests_total{response_code =~"5..", api_name="<api name>"}[1m]))
Replace <api name>
with the name of your API for this query.
Setup Prometheus Pump
We'll show you how to set up Tyk Pump for Prometheus Service Discovery.
Integrate with Prometheus using Prometheus Operator
Steps for Configuration:
- Setup Prometheus

  Using the prometheus-community/kube-prometheus-stack chart.

  In this example, we use kube-prometheus-stack, which installs a collection of Kubernetes manifests, Grafana dashboards, and Prometheus rules combined with documentation and scripts to provide easy to operate end-to-end Kubernetes cluster monitoring with Prometheus using the Prometheus Operator.

  helm install prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring --create-namespace

  This is a useful stack where you can get Prometheus, the Prometheus Operator, and Grafana all deployed and configured in one go.
- Install Tyk Pump with PodMonitor

  If you have Prometheus Operator enabled on the cluster, it looks for "PodMonitor" or "ServiceMonitor" resources and scrapes from the specified port. The only thing you need to modify here is the helm release name for Prometheus Operator.

  You can also customize the Prometheus custom metrics based on your analytics needs. We are using `tyk_http_requests_total` and `tyk_http_latency`, described here, for illustration:

  NAMESPACE=tyk-oss
  APISecret=foo
  REDIS_BITNAMI_CHART_VERSION=19.0.2
  PromOperator_Release=prometheus-stack
  Prometheus_Custom_Metrics='[{"name":"tyk_http_requests_total"\,"description":"Total of API requests"\,"metric_type":"counter"\,"labels":["response_code"\,"api_name"\,"method"\,"api_key"\,"alias"\,"path"]}\, { "name":"tyk_http_latency"\, "description":"Latency of API requests"\, "metric_type":"histogram"\, "labels":["type"\,"response_code"\,"api_name"\,"method"\,"api_key"\,"alias"\,"path"] }]'

  helm upgrade tyk-redis oci://registry-1.docker.io/bitnamicharts/redis -n $NAMESPACE --create-namespace --install --version $REDIS_BITNAMI_CHART_VERSION

  helm upgrade tyk-oss tyk-helm/tyk-oss -n $NAMESPACE --create-namespace \
    --install \
    --set global.secrets.APISecret="$APISecret" \
    --set global.redis.addrs="{tyk-redis-master.$NAMESPACE.svc.cluster.local:6379}" \
    --set global.redis.passSecret.name=tyk-redis \
    --set global.redis.passSecret.keyName=redis-password \
    --set global.components.pump=true \
    --set "tyk-pump.pump.backend={prometheus}" \
    --set tyk-pump.pump.prometheusPump.customMetrics=$Prometheus_Custom_Metrics \
    --set tyk-pump.pump.prometheusPump.prometheusOperator.enabled=true \
    --set tyk-pump.pump.prometheusPump.prometheusOperator.podMonitorSelector.release=$PromOperator_Release
Note
Please make sure you are installing Redis versions that are supported by Tyk. Please refer to Tyk docs to get list of supported versions.
Note
For Custom Metrics, commas are escaped so they can be used in the helm --set command. You can remove the backslashes in front of the commas if you set it in values.yaml. We have included an example in the default values.yaml comments section.
- Verification

  When successfully configured, you should see the following messages in the pump log:

  time="Jun 26 13:11:01" level=info msg="Starting prometheus listener on::9090" prefix=prometheus-pump
  time="Jun 26 13:11:01" level=info msg="Prometheus Pump Initialized" prefix=prometheus-pump
  time="Jun 26 13:11:01" level=info msg="Init Pump: PROMETHEUS" prefix=main

  On the Prometheus Dashboard, you can see the Pump listed as one of the targets, and that Prometheus is successfully scraping from it.

  You can check our Guide on Monitoring API with Prometheus for a list of useful queries you can set up and use. For example, the custom metric tyk_http_requests_total can be retrieved.
Integrate with Prometheus using annotations
Steps for Configuration:
- Setup Prometheus

  Using the prometheus-community/prometheus chart.

  Alternatively, if you are not using Prometheus Operator, check how your Prometheus can support service discovery. Let's say you're using the prometheus-community/prometheus chart, which configures Prometheus to scrape from any Pods with the following annotations:

  metadata:
    annotations:
      prometheus.io/scrape: "true"
      prometheus.io/path: /metrics
      prometheus.io/port: "9090"

  To install Prometheus, run:

  helm install prometheus prometheus-community/prometheus -n monitoring --create-namespace
- Install Tyk Pump with prometheus annotations

  NAMESPACE=tyk-oss
  APISecret=foo
  REDIS_BITNAMI_CHART_VERSION=19.0.2
  PromOperator_Release=prometheus-stack
  Prometheus_Custom_Metrics='[{"name":"tyk_http_requests_total"\,"description":"Total of API requests"\,"metric_type":"counter"\,"labels":["response_code"\,"api_name"\,"method"\,"api_key"\,"alias"\,"path"]}\, { "name":"tyk_http_latency"\, "description":"Latency of API requests"\, "metric_type":"histogram"\, "labels":["type"\,"response_code"\,"api_name"\,"method"\,"api_key"\,"alias"\,"path"] }]'

  helm upgrade tyk-redis oci://registry-1.docker.io/bitnamicharts/redis -n $NAMESPACE --create-namespace --install --version $REDIS_BITNAMI_CHART_VERSION

  helm upgrade tyk-oss tyk-helm/tyk-oss -n $NAMESPACE --create-namespace \
    --install \
    --set global.secrets.APISecret="$APISecret" \
    --set global.redis.addrs="{tyk-redis-master.$NAMESPACE.svc.cluster.local:6379}" \
    --set global.redis.passSecret.name=tyk-redis \
    --set global.redis.passSecret.keyName=redis-password \
    --set global.components.pump=true \
    --set "tyk-pump.pump.backend={prometheus}" \
    --set tyk-pump.pump.prometheusPump.customMetrics=$Prometheus_Custom_Metrics \
    --set-string tyk-pump.pump.podAnnotations."prometheus\.io/scrape"=true \
    --set-string tyk-pump.pump.podAnnotations."prometheus\.io/port"=9090 \
    --set-string tyk-pump.pump.podAnnotations."prometheus\.io/path"=/metrics
Note
Please make sure you are installing Redis versions that are supported by Tyk. Please refer to Tyk docs to get list of supported versions.
- Verification

  After some time, you can see that Prometheus is successfully scraping from Tyk Pump.
Expose a service for Prometheus to scrape
You can expose Pump as a service so that Prometheus can access the /metrics
endpoint for scraping. Just enable service in tyk-pump.pump.service
:
service:
# Tyk Pump svc is disabled by default. Set it to true to enable it.
enabled: true
Tyk Pump Capping Analytics Data Storage
Tyk Gateways can generate a lot of analytics data. A guideline is that for every 3 million requests that your Gateway processes it will generate roughly 1GB of data.
If you have Tyk Pump set up with the aggregate pump as well as the regular MongoDB pump, then you can make the `tyk_analytics` collection a capped collection. Capping a collection guarantees that analytics data is rolling within a size limit, acting like a FIFO buffer: when it reaches a specific size, instead of continuing to grow, it will replace old records with new ones.
Note
If you are using DocumentDB, capped collections are not supported. See here for more details.
The `tyk_analytics` collection contains granular log data, which is why it can grow rapidly. The aggregate pump will convert this data into an aggregated format and store it in a separate collection. The aggregate collection is used for processing reporting requests, as it is much more efficient.
If you have an existing collection which you want to convert to be capped, you can use the `convertToCapped` MongoDB command.
If you wish to configure the pump to cap the collections for you upon creating the collection, you may add the following configuration to your `uptime_pump_config` and/or `mongo.meta` objects in `pump.conf`:
"collection_cap_max_size_bytes": 1048577,
"collection_cap_enable": true
`collection_cap_max_size_bytes` sets the maximum size of the capped collection. `collection_cap_enable` enables capped collections.
If capped collections are enabled and a max size is not set, a default cap size of 5GiB is applied.
Existing collections will never be modified.
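For example, a sketch of a `mongo` pump `meta` section with capping enabled (the 1GiB size is illustrative):

"mongo": {
  "type": "mongo",
  "meta": {
    "collection_name": "tyk_analytics",
    "mongo_url": "mongodb://username:password@{hostname:port}/{db_name}",
    "collection_cap_enable": true,
    "collection_cap_max_size_bytes": 1073741824
  }
}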
Note
An alternative to capped collections is MongoDB’s Time To Live indexing (TTL). TTL indexes are incompatible with capped collections. If you have set a capped collection, a TTL index will not get created, and you will see error messages in the MongoDB logs. See MongoDB TTL Docs for more details on TTL indexes.
Time Based Cap in single tenant environments
If you wish to reduce or manage the amount of data in your MongoDB, you can add a TTL expire index to the collection so that older records are evicted automatically.
Note
Time based caps (TTL indexes) are incompatible with already configured size based caps.
Run the following command in your preferred MongoDB tool (2592000 in our example is 30 days):
db.tyk_analytics.createIndex( { "timestamp": 1 }, { expireAfterSeconds: 2592000 } )
This command sets an expiration rule that evicts all records from the collection whose `timestamp` field is older than the specified expiration time.
Time Based Cap in multi-tenant environments
When you have multiple organizations, you can control analytics expiration on a per-organization basis. This technique also uses TTL indexes, as described above, but the index should look like:
db.tyk_analytics.createIndex( { "expireAt": 1 }, { expireAfterSeconds: 0 } )
This command sets the value of `expireAt` to correspond to the time the document should expire. MongoDB will automatically delete documents from the `tyk_analytics` collection 0 seconds after the `expireAt` time in the document. The `expireAt` value will be calculated and created by Tyk in the following step.
Create an Organization Quota
curl --header "x-tyk-authorization: {tyk-gateway-secret}" --header "content-type: application/json" --data @expiry.txt http://{tyk-gateway-ip}:{port}/tyk/org/keys/{org-id}
Where the content of expiry.txt is:
{
"org_id": "{your-org-id}",
"data_expires": 86400
}
`data_expires` - Sets the time, in seconds, after which the data expires. Tyk will calculate the expiry date for you.
Size Based Cap
Add the Size Cap
Note
The size value should be in bytes, and we recommend using a value just under the amount of RAM on your machine.
Run this command in your MongoDB shell:
use tyk_analytics
db.runCommand({"convertToCapped": "tyk_analytics", size: 100000});
Adding the Size Cap if using a mongo_selective Pump
The `mongo_selective` pump stores data on a per-organization basis. You will have to run the following command in your MongoDB shell for each individual organization:
db.runCommand({"convertToCapped": "z_tyk_analyticz_<org-id>", size: 100000});
Separated Analytics Storage
For high-traffic systems that make heavy use of analytics, it makes sense to separate out the Redis analytics server from the Redis configuration server that supplies auth tokens and handles rate limiting configuration.
To enable a separate analytics server, update your `tyk.conf` with the following section:
"enable_separate_analytics_store": true,
"analytics_storage": {
"type": "redis",
"host": "",
"port": 0,
"addrs": [
"localhost:6379"
],
"username": "",
"password": "",
"database": 0,
"optimisation_max_idle": 3000,
"optimisation_max_active": 5000,
"enable_cluster": false
},
Note
`addrs` is new in v2.9.3, and replaces `hosts` which is now deprecated.
If you set `enable_cluster` to `false`, you only need to set one entry in `addrs`.
The configuration is the same (and uses the same underlying driver) as the regular configuration, so Redis Cluster is fully supported.