Managing Webcrawlers

Configure robots.txt (Built-in Solution)

From version 1.14.0, the Tyk Developer Portal includes built-in support for customizing the robots.txt file, the standard mechanism for telling search engines and other well-behaved crawlers which parts of your site they should not access.

To configure this:

  1. Log in to the Admin Portal
  2. Navigate to Settings > General
  3. Scroll down to the robots.txt Settings section
  4. Edit the content to control crawler access

A restrictive robots.txt configuration looks like this:

User-agent: *
Disallow: /

This instructs all crawlers to avoid indexing any part of your site. By default, the Portal already uses a restrictive robots.txt configuration.
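
If you want crawlers to index public documentation while keeping the rest of the portal out of search results, you can loosen this selectively. The /docs/ path below is purely illustrative; substitute the paths your own portal exposes. Note that the Allow directive is an extension to the original robots.txt standard, although all major search engines honor it:

User-agent: *
Allow: /docs/
Disallow: /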

Implement Additional HTTP Headers

You can add custom response headers to further discourage crawling:

  • X-Robots-Tag: noindex, nofollow - Similar to robots.txt but as an HTTP header
  • Cache-Control: no-store, no-cache, must-revalidate - Prevents caching

These can be added in your proxy configuration or by customizing your portal theme.
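
As one illustration, here is a minimal sketch of a small Go reverse proxy that adds both headers in front of the portal. The listen address and the upstream URL (http://portal.internal:3001) are placeholders for your own deployment; the same effect can be achieved in whatever proxy or theme customization you already use.

package main

import (
	"log"
	"net/http"
	"net/http/httputil"
	"net/url"
)

func main() {
	// Placeholder upstream: replace with the address your portal listens on.
	upstream, err := url.Parse("http://portal.internal:3001")
	if err != nil {
		log.Fatal(err)
	}

	proxy := httputil.NewSingleHostReverseProxy(upstream)

	// Attach the anti-crawling headers to every response coming back
	// from the portal before it reaches the client.
	proxy.ModifyResponse = func(resp *http.Response) error {
		resp.Header.Set("X-Robots-Tag", "noindex, nofollow")
		resp.Header.Set("Cache-Control", "no-store, no-cache, must-revalidate")
		return nil
	}

	log.Fatal(http.ListenAndServe(":8080", proxy))
}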

Best Practices

  • Regularly check your server logs for unusual crawling patterns (a small example of summarizing requests per user agent follows this list)
  • Consider using a CAPTCHA for registration forms to prevent automated sign-ups (not supported natively by Tyk Developer Portal at this time)
  • Use JavaScript-based content rendering for sensitive information, as basic crawlers may not execute JavaScript
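
The following is a minimal sketch of that log check, assuming your proxy writes access logs in the common combined format with the user agent as the last quoted field; adjust the regular expression to match your own log layout.

package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"sort"
)

// Matches the last two quoted fields of a combined-format log line:
// "<referer>" "<user agent>"
var uaPattern = regexp.MustCompile(`"[^"]*" "([^"]*)"\s*$`)

func main() {
	counts := map[string]int{}
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		if m := uaPattern.FindStringSubmatch(scanner.Text()); m != nil {
			counts[m[1]]++
		}
	}

	// Rank user agents by request volume so the heaviest crawlers surface first.
	agents := make([]string, 0, len(counts))
	for ua := range counts {
		agents = append(agents, ua)
	}
	sort.Slice(agents, func(i, j int) bool { return counts[agents[i]] > counts[agents[j]] })

	for i, ua := range agents {
		if i == 10 {
			break
		}
		fmt.Printf("%6d  %s\n", counts[ua], ua)
	}
}

Run it against a log file with: go run crawlerreport.go < access.log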

Remember that while these methods can deter most crawlers, they cannot provide absolute protection against determined scrapers that deliberately ignore robots.txt rules or use sophisticated techniques to mimic human behavior.