Tyk Pump Configuration

The Tyk Pump is our analytics purger that moves the data generated by your Tyk nodes to any back-end. It is primarily used to display your analytics data in the Tyk Dashboard.

Environment Variables

Environment variables can be used to override the settings defined in the configuration file. See Environment Variables for details. Where an environment variable is specified, its value will take precedence over the value in the configuration file.

Configuration

Configuring Tyk Pump is very simple.

Create a pump.conf file:

{
  "analytics_storage_type": "redis",
  "analytics_storage_config": {
    "type": "redis",
    "host": "localhost",
    "port": 6379,
    "hosts": null,
    "username": "",
    "password": "",
    "database": 0,
    "optimisation_max_idle": 100,
    "optimisation_max_active": 0,
    "enable_cluster": false
  },
  "purge_delay": 1,
  "pumps": {
    "dummy": {
      "type": "dummy",
      "meta": {
        
      }
    },
    "mongo": {
      "type": "mongo",
      "meta": {
        "collection_name": "tyk_analytics",
        "mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}"
      }
    },
    "mongo-pump-aggregate": {
      "name": "mongo-pump-aggregate",
      "meta": {
        "mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}",
        "use_mixed_collection": true
      }
    },
    "csv": {
      "type": "csv",
      "meta": {
        "csv_dir": "./"
      }
    },
    "elasticsearch": {
      "type": "elasticsearch",
      "meta": {
        "index_name": "tyk_analytics",
        "elasticsearch_url": "http://localhost:9200",
        "enable_sniffing": false,
        "document_type": "tyk_analytics",
        "rolling_index": false,
        "extended_stats": false,
        "version": "6"
      }
    },
    "influx": {
      "type": "influx",
      "meta": {
        "database_name": "tyk_analytics",
        "address": "http://localhost:8086",
        "username": "root",
        "password": "root",
        "fields": [
          "request_time"
        ],
        "tags": [
          "path",
          "response_code",
          "api_key",
          "api_version",
          "api_name",
          "api_id",
          "raw_request",
          "ip_address",
          "org_id",
          "oauth_id"
        ]
      }
    },
    "moesif": {
      "type": "moesif",
      "meta": {
        "application_id": ""
      }
    },
    "splunk": {
      "type": "splunk",
      "meta": {
        "collector_token": "<token>",
        "collector_url": "<url>",
        "ssl_insecure_skip_verify": false,
        "ssl_cert_file": "<cert-path>",
        "ssl_key_file": "<key-path>",
        "ssl_server_name": "<server-name>"
      }
    },
    "statsd": {
      "type": "statsd",
      "meta": {
        "address": "localhost:8125",
        "fields": [
          "request_time"
        ],
        "tags": [
          "path",
          "response_code",
          "api_key",
          "api_version",
          "api_name",
          "api_id",
          "raw_request",
          "ip_address",
          "org_id",
          "oauth_id"
        ]
      }
    },
    "dogstatsd": {
      "name": "dogstatsd",
      "meta": {
        "address": "localhost:8125",
        "namespace": "pump",
        "async_uds": true,
        "async_uds_write_timeout_seconds": 2,
        "buffered": true,
        "buffered_max_messages": 32
      }
    },
    "prometheus": {
      "type": "prometheus",
      "meta": {
        "listen_address": "localhost:9090",
        "path": "/metrics"
      }
    },
    "graylog": {
      "type": "graylog",
      "meta": {
        "host": "10.60.6.15",
        "port": 12216,
        "tags": [
          "method",
          "path",
          "response_code",
          "api_key",
          "api_version",
          "api_name",
          "api_id",
          "org_id",
          "oauth_id",
          "raw_request",
          "request_time",
          "raw_response"
        ]
      }
    },
    "hybrid": {
      "type": "hybrid",
      "meta": {
        "rpc_key": "<org-id>",
        "api_key": "<api-key>",
        "aggregated": false,
        "connection_string": "localhost:9090",
        "use_ssl": false,
        "ssl_insecure_skip_verify": false,
        "group_id": "",
        "call_timeout": 30,
        "ping_timeout": 60,
        "rpc_pool_size": 30
      }
    },
    "logzio": {
      "type": "logzio",
      "meta": {
        "token": "<YOUR-LOGZ.IO-TOKEN>"
      }
    }
  },
  "uptime_pump_config": {
    "collection_name": "tyk_uptime_analytics",
    "mongo_url": "mongodb://username:password@{hostname:port},{hostname:port}/{db_name}"
  },
  "dont_purge_uptime_data": false
}

Note: mongo_ssl_insecure_skip_verify and mongo_use_ssl are available from v1.3.6 onwards.

Pumps are then added to the pumps section. Each should represent a sink to purge the data into.

The Redis and MongoDB settings must match those in your original tyk.conf.

Tyk Dashboard

The Tyk Dashboard uses the mongo-pump-aggregate collection to display analytics. This differs from the standard mongo pump, which stores individual analytics records in MongoDB. The aggregate functionality was built to be fast, because querying raw analytics is expensive in large data sets.

Other Supported Backend Services

The following services are supported:

  • MongoDB (to replace built-in purging)
  • CSV
  • ElasticSearch (2.0+)
  • Graylog
  • InfluxDB
  • Moesif
  • Splunk
  • StatsD
  • DogStatsD
  • Hybrid (Tyk RPC)
  • Prometheus
  • Logz.io

Elasticsearch Config

index_name - The name of the index that all the analytics data will be placed in. Defaults to “tyk_analytics”

elasticsearch_url - If sniffing is disabled, the URL that all data will be sent to. Defaults to http://localhost:9200. The HTTP prefix must be included in the URL.

enable_sniffing - If sniffing is enabled, the elasticsearch_url will be used to make a request to get a list of all the nodes in the cluster. The returned addresses will then be used. Defaults to false.

document_type - The type of the document that is created in ES. Defaults to “tyk_analytics”

rolling_index - Appends the date to the end of the index name, so each day's data is stored in a separate index, e.g. tyk_analytics-2016.02.28. Defaults to false.

extended_stats - If set to true, the following additional fields are included: Raw Request, Raw Response and User Agent.

version - Specifies the ES version. Use “3” for ES 3.x, “5” for ES 5.0 and “6” for ES 6.0. Defaults to “3”.
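
As a rough illustration of two of the settings above, the sketch below derives a rolling index name in the tyk_analytics-2016.02.28 style and checks that a URL carries the required HTTP prefix. The function names are hypothetical and the date format is inferred only from the example above; which clock or time zone the pump uses is not covered here.

```python
from datetime import date


def rolling_index_name(index_name, day):
    """Append the day as YYYY.MM.DD, matching the example in the text."""
    return f"{index_name}-{day.strftime('%Y.%m.%d')}"


def has_http_prefix(url):
    """True if the URL starts with an explicit http:// or https:// scheme."""
    return url.startswith(("http://", "https://"))


print(rolling_index_name("tyk_analytics", date(2016, 2, 28)))
print(has_http_prefix("http://localhost:9200"))
print(has_http_prefix("localhost:9200"))
```

The last call returns False, which is the case the elasticsearch_url description warns about.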

Moesif Config

Moesif is a logging and analytics service for APIs. The Moesif pump will move analytics data from Tyk to Moesif.

application_id - Your Moesif Application Id (a JWT). Analytics for multiple api_ids are logged under the same Application Id.

DogStatsD

  • address: address of the datadog agent including host & port
  • namespace: prefix for your metrics to datadog
  • async_uds: Enable async UDS over UDP https://github.com/Datadog/datadog-go#unix-domain-sockets-client
  • async_uds_write_timeout_seconds: Integer write timeout in seconds if async_uds: true
  • buffered: Enable buffering of messages
  • buffered_max_messages: Max messages in single datagram if buffered: true. Default 16
  • sample_rate: Defaults to 1, which equates to 100% of requests. To sample at 50%, set to 0.5

"dogstatsd": {
  "name": "dogstatsd",
  "meta": {
    "address": "localhost:8125",
    "namespace": "pump",
    "async_uds": true,
    "async_uds_write_timeout_seconds": 2,
    "buffered": true,
    "buffered_max_messages": 32,
    "sample_rate": 0.5
  }
},

On startup, you should see the loaded configs when initialising the DogStatsD pump.

[May 10 15:23:44]  INFO dogstatsd: initializing pump
[May 10 15:23:44]  INFO dogstatsd: namespace: pump.
[May 10 15:23:44]  INFO dogstatsd: sample_rate: 50%
[May 10 15:23:44]  INFO dogstatsd: buffered: true, max_messages: 32
[May 10 15:23:44]  INFO dogstatsd: async_uds: true, write_timeout: 2s

Hybrid RPC Config

Hybrid Pump allows you to install Tyk Pump inside Multi-Cloud installations. You can configure Tyk Pump to send data to the sink of your choice (e.g. Elasticsearch) and, in parallel, forward analytics to the Tyk Cloud. Additionally, you can set the aggregated flag to send only aggregated analytics to MDCB or Tyk Cloud, in order to save network bandwidth between DCs.

NOTE: Make sure your tyk.conf has analytics_config.type set to an empty string.

rpc_key - Put your organization ID in this field.

api_key - This is the API key of a user used to authenticate and authorise the Gateway’s access through MDCB. The user should be a standard Dashboard user with minimal privileges, to reduce risk if the key is compromised. The suggested security settings are read for Real-time notifications, with the remaining options set to deny.

aggregated - Set this field to true to send only aggregated analytics to MDCB or Tyk Cloud.

connection_string - The MDCB instance or load balancer.

use_ssl - Set this field to true if you need a secured connection (default value is false).

ssl_insecure_skip_verify - Set this field to true if you use a self-signed certificate.

group_id - This is the “zone” that this instance inhabits, e.g. the DC it lives in. It must be unique to each slave cluster / DC.

call_timeout - This is the timeout (in milliseconds) for RPC calls.

rpc_pool_size - This is the maximum number of connections to MDCB.

Prometheus Config

Prometheus is an open-source monitoring system with a dimensional data model, flexible query language, efficient time series database and modern alerting approach.

Add the following section to expose the /metrics endpoint:

"prometheus": {
  "type": "prometheus",
  "meta": {
    "listen_address": "localhost:9090",
    "path": "/metrics"
  }
},

listen_address - this is the URL that Prometheus can pull data from.

NOTE: When running Prometheus as a Docker image, remove localhost from listen_address. For example: "listen_address": ":9090".

Multiple Pumps

From Tyk Pump v0.6.0 you can create multiple pumps of the same type by setting the top-level key to a custom value. For example:

"csv": {
  "type": "csv",
  "meta": {
    "csv_dir": "./"
  }
},
"csv_alt": {
  "type": "csv",
  "meta": {
    "csv_dir": "./"
  }
}
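
To see which backend each custom key resolves to, a small check like the following can help. This is a minimal sketch, not part of Tyk Pump: summarize_pumps is a hypothetical helper, and it assumes the backend is identified by "type" or, as in a couple of the sample entries, by "name".

```python
import json


def summarize_pumps(conf_text):
    """Return {pump_key: backend} for every entry in the "pumps" section."""
    conf = json.loads(conf_text)
    summary = {}
    for key, pump in conf.get("pumps", {}).items():
        # Most sample entries use "type"; a couple use "name", so accept either.
        backend = pump.get("type") or pump.get("name")
        if backend is None:
            raise ValueError(f"pump {key!r} has neither 'type' nor 'name'")
        if not isinstance(pump.get("meta"), dict):
            raise ValueError(f"pump {key!r} is missing its 'meta' object")
        summary[key] = backend
    return summary


sample = """
{
  "purge_delay": 1,
  "pumps": {
    "csv": {"type": "csv", "meta": {"csv_dir": "./"}},
    "csv_alt": {"type": "csv", "meta": {"csv_dir": "./"}}
  }
}
"""
print(summarize_pumps(sample))
```

Here both keys map to the csv backend, which is the multiple-pumps arrangement described above.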

Capping analytics data

Tyk Gateways can generate a lot of analytics data. A guideline is that for every 3 million requests that your Gateway processes it will generate roughly 1GB of data.
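
The guideline above can be turned into a back-of-the-envelope estimate. The sketch below simply applies the 3 million requests per 1GB ratio; the traffic and retention figures are hypothetical, and real sizes will vary with what your Gateways record.

```python
REQUESTS_PER_GB = 3_000_000  # rule of thumb quoted above, not a measured value


def estimated_analytics_gb(requests):
    """Rough analytics volume in GB for a given number of requests."""
    return requests / REQUESTS_PER_GB


daily_requests = 10_000_000  # hypothetical traffic figure
retention_days = 30          # hypothetical retention window
print(estimated_analytics_gb(daily_requests * retention_days))
```

With these assumed figures the estimate comes to 100 GB for the 30-day window.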

If you have Tyk Pump set up with the aggregate pump as well as the regular MongoDB pump, then you can make the tyk_analytics collection a capped collection. Capping a collection guarantees that analytics data is rolling within a size limit, acting like a FIFO buffer which means that when it reaches a specific size, instead of continuing to grow, it will replace old records with new ones.

The tyk_analytics collection contains granular log data, which is why it can grow rapidly. The aggregate pump converts this data into an aggregate format and stores it in a separate collection. The aggregate collection is used for processing reporting requests, as it is much more efficient.

If you have an existing collection that you want to convert to a capped collection, you can use the convertToCapped MongoDB command.
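
For reference, the convertToCapped command takes the collection name and a size in bytes. The sketch below only builds the command document; convert_to_capped_command is a hypothetical helper, and the 5 GiB size is just an example value. Sending it requires a MongoDB driver such as PyMongo, shown in comments with placeholder connection details.

```python
def convert_to_capped_command(collection, size_bytes):
    """Build the MongoDB convertToCapped command document."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    # Key order matters: the command name must be the first key.
    return {"convertToCapped": collection, "size": size_bytes}


cmd = convert_to_capped_command("tyk_analytics", 5 * 1024**3)
print(cmd)

# With PyMongo installed, the command could be sent like this
# (connection string and database name are placeholders):
#   from pymongo import MongoClient
#   db = MongoClient("mongodb://localhost:27017")["<db_name>"]
#   db.command(cmd)
```

Try this against a non-production database first, since converting a collection rewrites it.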

If you wish to configure the pump to cap the collections for you upon creating the collection, you may add the following configurations to your uptime_pump_config and / or mongo.meta objects in pump.conf.

"collection_cap_max_size_bytes": 1048577,
"collection_cap_enable": true

collection_cap_max_size_bytes sets the maximum size of the capped collection. collection_cap_enable enables capped collections.

If capped collections are enabled and a max size is not set, a default cap size of 5 GiB is applied. Existing collections will never be modified.
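
Because collection_cap_max_size_bytes is expressed in bytes, it can be convenient to compute it from a size in GiB. A minimal sketch, assuming a hypothetical cap_settings helper and an example cap of 2 GiB:

```python
import json


def cap_settings(max_size_gib):
    """Return the two capped-collection settings for a cap given in GiB."""
    return {
        "collection_cap_max_size_bytes": int(max_size_gib * 1024**3),
        "collection_cap_enable": True,
    }


print(json.dumps(cap_settings(2), indent=2))
```

The printed members can be pasted into uptime_pump_config or mongo.meta in pump.conf.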

NOTE: An alternative to capped collections is MongoDB’s Time To Live indexing (TTL). TTL indexes are incompatible with capped collections. If you have set a capped collection, a TTL index will not get created, and you will see error messages in the MongoDB logs. See MongoDB TTL Docs for more details on TTL indexes.

Environment Variables

Environment variables can be used to override settings defined in the configuration file. The Tyk Pump environment variables page shows how the JSON member keys map to environment variables. Where an environment variable is specified, its value will take precedence over the value in the configuration file.