Why enterprise teams succeed with policy as code using OPA


Open standards can be a key driver for enforcing or even defining organizations’ API governance strategies. Open Policy Agent (OPA) is one such standard, having originated at Styra before becoming a CNCF project.

To help enterprises understand and implement OPA at scale, Charlie Egan, Senior Developer Advocate at Styra, talked to delegates at the recent Tyk LEAP 2.0 API governance conference. We’ve rounded up some key insights from Charlie’s session below, including:

  • What policy as code means – and why it’s so useful
  • How policy as code tackles enterprises’ unique governance challenges
  • How to use OPA with an API gateway
  • Why OPA is so beneficial for enterprises with sprawling systems and strict auditing requirements

What does policy as code mean? 

Understanding the value of OPA starts with understanding policies. Policies lay ground rules, such as only admins being allowed to perform certain actions within a system or controlling access across different departments or different teams.

Then there are other types of policies. In the Western world, for example, there are no row 13 seats available on most planes – so this is a policy that applies to airline ticket booking systems. A retail business, on the other hand, might choose to limit sales of certain products to certain time periods.

Businesses are built on policies. Policies are not just about who can access which system; they codify what a business does.

What is policy as code? 

We looked above at a couple of policy examples expressed in natural language. A further example of this is: “Currently, only on-call members of the support team can trigger deployments to production at the weekend without review.”

Policy as code is about expressing a policy in a policy language instead of natural language. The above policy would look like this when expressed in Rego, OPA’s policy language:

[Image: the weekend deployment policy expressed in Rego]
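The Rego version appears only as an image above. As a rough illustration of what it means to turn that sentence into executable logic, here is a minimal Python sketch of the same rule. It is not Rego and not the code from the session; the field names (`environment`, `date`, `user.team`, `user.on_call`) are hypothetical and chosen only for this example.

```python
from datetime import date


def allow_unreviewed_deploy(request: dict) -> bool:
    """Return True only when the weekend rule grants an unreviewed production deploy.

    All field names on `request` are hypothetical, chosen for illustration.
    """
    user = request.get("user", {})
    # date.weekday(): Monday is 0, so 5 and 6 are Saturday and Sunday.
    is_weekend = date.fromisoformat(request["date"]).weekday() >= 5
    return (
        request.get("environment") == "production"
        and is_weekend
        and user.get("team") == "support"
        and user.get("on_call") is True
    )


# Example input: an on-call support engineer deploying on a Saturday.
example_request = {
    "environment": "production",
    "date": "2024-06-08",
    "user": {"team": "support", "on_call": True},
}
```

Calling `allow_unreviewed_deploy(example_request)` evaluates the rule against one concrete request. The function only answers the narrow question in the sentence above; weekday or reviewed deployments would be covered by other rules.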

What are the benefits of policy as code?

Many of us are familiar with infrastructure as code (where you use configuration and programming code to define your infrastructure state) and the benefits it provides. With policy as code, many of the same benefits apply. These include:

  • Versioned configurations that you can roll forward and back and that can be audited or code reviewed. You can create versioned policy artifacts that you can reason about, so you know which version of a policy was evaluated at a given point in time.
  • Policy abstraction, where you can use tools to enable developers to focus on their own applications by abstracting away organizational and domain-specific policies. The platform enforces the policies, rather than each team having to implement them, freeing up developers to focus on their applications and easing the cognitive load.
  • Central policy management whereby you can share organizational policies while controlling and auditing them centrally. Much as you might manage infrastructure or a larger platform, granting resources to specific teams, you can grant control of a given policy to one team while retaining central control.
  • Collaboration and testing, with tooling to ensure policy code is tested and that it complies with whatever rules you require.

Policy as code at enterprise scale 

Taking things to enterprise scale introduces some challenges when it comes to implementing policy as code. There are three particular issues that arise: polyglot systems, strict auditing requirements and optimization complexity. Let’s dive into each of these.

Polyglot systems 

Large enterprises have disparate systems used by different teams, with sprawl occurring for a wide range of reasons. When you get to a certain scale, there are often different teams with different areas of expertise, each using the best tools for the job. Your data scientists might be using Python while your API services are written in Go (for example). And while these have different priorities and use cases, there is often overlap, such as shared data or a similar way of authenticating and authorizing staff users.

The fact that these systems often don’t work in isolation can pose a challenge, compounded by siloed departments and acquisitions. All of this contributes to a sprawl of different technologies and integration challenges.

Organic business evolution complicates things further. It takes time to reach enterprise size, meaning that by the time you get there, you’ve got legacy code, systems, data centers and platforms to work with, as well as disparate hosting and network infrastructures.

Strict auditing requirements

Large enterprises tend to have very strict auditing and compliance requirements, often working with several different compliance frameworks, with the threat of hefty fines and reputational damage if they fail to comply with them.

Enterprises have to understand what each compliance framework means for each team and each application. They also have to deliver consistently on the expectations of those frameworks in a standardized way. That’s a tough ask. It’s about much more than knowing who accessed what and when. You also need to understand which policy granted access and how the data context of that access was evaluated.

This understanding also extends to policy change, with a need to know who created different policy versions, how the policies changed and what those changes mean for access and data. There needs to be a paper trail for all of this, which can be difficult when you have the disparate, sprawling systems we described above.

Optimization complexity 

From the need for low latencies to strict budgets, large datasets and globally distributed users and services, enterprise environments come with a range of optimization complexities and constraints.

Using OPA with an API gateway

OPA is a tool that meets teams where they’re at. It’s an open source, general purpose policy engine – essentially a tool to which you can offload policy decisions. You can use it to decouple policy logic that is resident in applications or systems and bring it into a standard mode of policy evaluation.

Open Policy Agent isn’t a replacement for an API gateway; it’s something you can integrate with an API gateway in order to provide policy functionality to that layer. An API gateway is a policy enforcement point, while OPA is a policy decision point.

[Diagram: a service sends a query to OPA; OPA evaluates it against loaded policy and data, returns a decision and emits audit logs]

The diagram above shows a typical example of how you can integrate with OPA. The service could be an API gateway or a team’s application. It makes a call out to OPA with information about a decision that needs to be made. OPA uses the loaded policy and data, then responds with the decision. That’s how any high-level integration with OPA works.
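The request and response flow described above can be sketched in a few lines of Python using OPA’s REST Data API, where a `POST` to `/v1/data/<path>` with an `{"input": ...}` body returns a `{"result": ...}` document. This is a minimal sketch under stated assumptions rather than a reference integration: the local URL assumes OPA is running as a server on port 8181, the policy path `deploy/allow` is hypothetical, and the mapping of a decision to an HTTP 200 or 403 is one simple way an enforcement point might act on it.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181"  # assumption: OPA running locally as a server
POLICY_PATH = "deploy/allow"       # hypothetical policy path, for illustration only


def build_opa_request(input_doc: dict, opa_url: str = OPA_URL,
                      path: str = POLICY_PATH) -> urllib.request.Request:
    """Build the query the service sends to OPA (the policy decision point)."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        f"{opa_url}/v1/data/{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def decision_from_response(raw: bytes) -> bool:
    """Read the decision from OPA's response body.

    If the rule is undefined, the response has no "result" key;
    this sketch treats that case as a deny.
    """
    return json.loads(raw).get("result", False) is True


def enforce(input_doc: dict) -> int:
    """Enforcement point: ask OPA, then turn the decision into an HTTP status."""
    with urllib.request.urlopen(build_opa_request(input_doc), timeout=2) as resp:
        allowed = decision_from_response(resp.read())
    return 200 if allowed else 403
```

Keeping `enforce` separate from the request and response helpers mirrors the split between enforcement and decision: the service only knows how to ask and how to act on the answer, while the rule itself lives in OPA.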

At the same time, OPA sends out audit logs covering the data passing through it, the decisions it has made, the policy version changes that have happened and the policy versions used for given decisions. This is the audit box in the diagram.

How OPA helps enterprises with polyglot systems 

There are lots of different ways to integrate with OPA: via REST API, using it in-process and with other language and framework integrations. On top of that, there are lots of different patterns you can use to integrate OPA: you can run it as a sidecar, next to API gateways, in-process, within the same container image as your applications and even as a centralized service. This flexibility is ideal for enterprises with the kind of polyglot systems we discussed above.

Zalando, a large fashion retailer in Europe, provides a helpful case study of using OPA. The business has around 3,000 applications, developed by more than 2,000 engineers. Prior to using OPA, Zalando realized that it was using many different programming languages and frameworks, configuring access in a wide range of different ways.

The business looked to OPA to externalize that authorization, moving to a common framework for all its different applications. With OPA, authorization was done in a standard way, with control delegated centrally. It delivered central team control over organization-wide policies, while policy logic that was relevant to a given domain or application could be owned by the teams themselves.

Zalando is also benefitting from embedding OPA directly within its API gateway instances. This enables it to deliver policy decisions with minimal latency. Zalando reports that running OPA within an existing instance also helps keep the cost of the platform feature down.

How OPA helps with auditing

Policy as code can be versioned, tested and rolled out in a controlled way. In addition, OPA’s request logs contain input and output data, policy version information, data version information and more. The logs provide a full picture of why any decision from thousands of applications was made.
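To make this concrete, here is a minimal Python sketch that pulls the audit-relevant fields out of a single decision log entry. The sample entry is made up for illustration; its field names (`decision_id`, `timestamp`, `path`, `input`, `result`, `bundles` with a `revision`) follow the shape of OPA decision log events as we understand it, so treat the exact schema as an assumption to verify against the OPA documentation for your version.

```python
import json

# Illustrative entry only: every value here is invented for this example.
SAMPLE_LOG_ENTRY = """
{
  "decision_id": "4ca636c1-55e4-417a-b1d8-4aceb67960d1",
  "timestamp": "2024-06-08T10:15:00Z",
  "path": "deploy/allow",
  "input": {"user": {"team": "support", "on_call": true}},
  "result": true,
  "bundles": {"authz": {"revision": "v1.4.2"}}
}
"""


def summarize(entry: dict) -> dict:
    """Collect what an auditor typically asks: what was decided, when,
    on which input, and under which policy version."""
    revisions = {
        name: bundle.get("revision")
        for name, bundle in entry.get("bundles", {}).items()
    }
    return {
        "decision_id": entry.get("decision_id"),
        "when": entry.get("timestamp"),
        "policy_path": entry.get("path"),
        "policy_revisions": revisions,
        "input": entry.get("input"),
        "result": entry.get("result"),
    }


summary = summarize(json.loads(SAMPLE_LOG_ENTRY))
```

Because each entry carries the input, the result and the policy revision together, a question such as "why was this deployment allowed?" can be answered from one record rather than by correlating logs from several systems.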

Capital One uses an OPA management framework for detailed audit logs and compliance records, covering both how policies are changing and how decisions are being made. Standardization was a key part of why it integrated OPA, as part of a wider effort to consolidate tooling while meeting a range of different auditing requirements. Capital One also uses OPA for central control of policies combined with delegation, as well as to enable peer review. Interestingly, the firm uses OPA not just for compliance but also for operational reasons, monitoring the production of audit data as part of its operational monitoring.

How OPA contributes to optimization

OPA is useful for very low latency use cases, as well as for enterprises that have very strict requirements around memory. Whiteboarding tool provider Miro is one business taking advantage of this. Miro needed to get the latency of its real-time system down, with even tens of milliseconds being too slow. By integrating with OPA in-process, leveraging the Rego SDK, Miro could deliver authorization decisions within a very low latency budget.

Gusto is also using OPA for optimization, using it to improve the performance of individual queries for an application, by working on a batch API for query evaluation. Doing so means Gusto can run many queries in parallel as part of a single request, while still getting all the same decision logging that it would if it made them individually.

Embracing open standards

OPA is a super example of the power that embracing open standards delivers. Want to find out more? Then check out this blog post on how to move beyond closely coupled integrations by embracing open standards.