After speaking with more than 150 companies of various sizes, across different verticals, about how they manage their external API integrations, we have a strong understanding of how this process evolves.
This blog post shares our observations on API middleware building, to help you understand which middleware capabilities your business needs as you scale and how to implement them.
When API middleware doesn’t exist yet.
This is the starting point for most companies we’ve spoken with. Developers integrate their chosen APIs, test, and hope for the best.
Problems creep in over time: API breaking changes, hitting rate limits, and unexpected API responses or errors.
Most companies getting started, and smaller teams, respond with ad-hoc solutions like these common examples:
- Visibility – Implementing basic visibility over API calls using either an APM (Datadog, NewRelic) or tracking them and building dashboards in Grafana.
- Error handling – Catching now and fixing later.
- Middleware capabilities – Basic throttling or caching. The application itself handles most API integrations at this point, using the programming language's built-in features, a framework, or an imported library.
- Authentication management – Handling token renewal, key rotation, and different authentication methods ad hoc (but slowly!)
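These ad-hoc fixes often start as a thin wrapper around the HTTP client. Here is a minimal sketch of that stage, combining basic throttling with lazy token renewal; the `fetch_token` and `request_fn` callables are illustrative assumptions, stand-ins for whatever client and auth flow a team actually uses:

```python
import threading
import time


class ApiClient:
    """Ad-hoc wrapper: crude throttling plus lazy token renewal.

    fetch_token is an assumed callable returning (token, expiry_timestamp).
    """

    def __init__(self, fetch_token, min_interval=0.2):
        self._fetch_token = fetch_token
        self._min_interval = min_interval  # minimum seconds between calls
        self._last_call = 0.0
        self._token, self._expiry = None, 0.0
        self._lock = threading.Lock()

    def _get_token(self):
        # Renew the token shortly before it expires (30 s safety margin).
        if time.time() >= self._expiry - 30:
            self._token, self._expiry = self._fetch_token()
        return self._token

    def call(self, request_fn):
        with self._lock:
            # Basic throttling: space calls out by min_interval seconds.
            wait = self._min_interval - (time.time() - self._last_call)
            if wait > 0:
                time.sleep(wait)
            self._last_call = time.time()
        return request_fn(self._get_token())
```

This is exactly the kind of logic that works fine for one or two integrations and becomes a maintenance burden once it is copy-pasted across teams.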
However, as we learned from the companies we’ve spoken with, these responses aren’t enough to satisfy the evolving complexities of API integrations over time, especially as companies scale and grow to depend on third-party API calls.
Next step – stateless middleware.
The next step in a company's architectural evolution is stateless middleware.
Here are some typical problematic scenarios for growing companies:
- They integrate with too many external APIs, across teams, and the same middleware logic is implemented repeatedly. Handling those integrations more efficiently requires a dedicated class or wrapper.
- They lose track of their external API usage and how efficient it is. This typically results in unexpected bills from API providers and SLA breaches – mainly because degraded service from the API provider impairs the customer experience, and detecting that depends on data about external API transactions.
At this point, companies realize that to analyze or anticipate the next problem, they need usage metrics and traces on all their API calls.
- Companies are getting lots of 429s, so they need more complex logic. It makes sense to implement auto-retry logic with exponential backoff (plus jitter) to avoid retry storms.
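That retry logic can be sketched in a few lines. This is a minimal, assumed implementation – `request_fn` stands in for whatever HTTP client is in use and is assumed to return a `(status_code, body)` pair:

```python
import random
import time


def call_with_retry(request_fn, max_attempts=5, base_delay=0.5):
    """Retry on HTTP 429 with exponential backoff and full jitter.

    request_fn() is an assumed callable returning (status_code, body).
    """
    for attempt in range(max_attempts):
        status, body = request_fn()
        if status != 429:
            return status, body
        # Exponential backoff with full jitter: a random delay in
        # [0, base_delay * 2^attempt) keeps clients from retrying in lockstep.
        time.sleep(random.uniform(0, base_delay * (2 ** attempt)))
    return status, body  # give up after max_attempts
```

The jitter matters: without it, many clients that were rate-limited at the same moment retry at the same moment, triggering the next wave of 429s.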
How do companies achieve this stateless middleware in their infrastructure?
Companies will either:
- Wire the API traffic from their backend to an internal gateway so they can edit API requests and responses.
- Build a wrapper for all API integrations.
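The wrapper approach usually means routing every external call through one class so that metrics accumulate in a single place. A minimal sketch under that assumption (the metric names and structure here are illustrative, not a real product's schema):

```python
import time
from collections import defaultdict


class IntegrationHub:
    """Single wrapper all external API calls pass through,
    so per-provider usage metrics are collected in one place."""

    def __init__(self):
        self.metrics = defaultdict(
            lambda: {"calls": 0, "errors": 0, "total_ms": 0.0}
        )

    def call(self, provider, request_fn):
        start = time.perf_counter()
        try:
            return request_fn()
        except Exception:
            self.metrics[provider]["errors"] += 1
            raise
        finally:
            # Record every attempt, successful or not.
            self.metrics[provider]["calls"] += 1
            self.metrics[provider]["total_ms"] += (time.perf_counter() - start) * 1000
```

From here, exporting `hub.metrics` to Grafana or an APM gives the basic observability layer described above without instrumenting each integration separately.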
We’ve seen many companies do both. This usually happens when their implementations have already given them a basic level of observability, and managing or maintaining the volume of API integrations requires constant attention from one or more integration teams. They will likely build many of the capabilities we’ve mentioned, either in-house or in collaboration with other teams.
Final destination – you have stateful middleware
AKA you finally get an egress API proxy (stateful middleware). For most companies, their self-built API middleware is adequate until they hit a growth phase. Scaling brings problems that affect their API integrations and the rest of their architecture, and once they have become reliant on third-party APIs, things start to break again.
Stateless middleware is no longer enough. Developers and architects need to build or buy a dedicated solution to handle all their needs (more about the build vs. buy dilemma here) that needs to be built on top of an API proxy (AKA egress proxy).
This proxy sits between the backend and the API providers a company integrates with, governing and orchestrating that egress traffic based on business logic. (If you’re interested in digging deeper into what an egress API proxy is, check out this read here).
What use cases require a stateful API proxy?
1. Quota management. Multiple services are consuming the same API provider and sharing the same quota. A dedicated service can manage quotas by balancing the fractional quota based on the prioritized needs of each consuming service or each environment. Check out this cool sandbox we created to learn more.
2. Production grade requirements. Relying on a service to do this heavy lifting requires redundancy and backup to be in place to avoid impacting API call latency and to handle many concurrent requests in parallel. These needs bring complexities and architectural considerations.
3. Complex consumption scenarios. Implementing throttling is simple, but throttling and triggering caching for specific API calls with a soft limit alert set at 80% of the rate limit is more involved. Building complex middleware scenarios based on specific business logic requires a dedicated service.
4. Unlocking API observability. A dedicated service that tracks API calls and samples and stores them for later analysis is empowering. Seeing usage patterns in depth, predicting problems and usage peaks before they occur, and detecting anomalies in consumption or API behaviors are superpowers.
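To make use case 3 concrete, here is a simplified sketch of throttling with a soft-limit alert at 80% of the rate limit. It is an assumed fixed-window model – a real stateful proxy would track sliding windows per provider and persist counters across instances:

```python
class RateLimitGovernor:
    """Throttle against a provider rate limit and fire a soft-limit
    alert at 80% of it. Fixed-window counting, for illustration only."""

    def __init__(self, limit_per_window, alert_fn, soft_ratio=0.8):
        self.limit = limit_per_window
        self.soft_limit = int(limit_per_window * soft_ratio)
        self.alert_fn = alert_fn  # assumed callback, e.g. pages the on-call channel
        self.count = 0
        self.alerted = False

    def allow(self):
        """Return True if the call may proceed, False if it must be throttled."""
        if self.count >= self.limit:
            return False  # hard throttle: at the provider's limit
        self.count += 1
        if self.count >= self.soft_limit and not self.alerted:
            self.alerted = True  # alert once per window, not on every call
            self.alert_fn(self.count, self.limit)
        return True

    def reset_window(self):
        self.count, self.alerted = 0, False
```

Layering caching triggers or per-service priorities on top of this is exactly the kind of business-logic orchestration that justifies a dedicated stateful service.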
This is the point where we’ve seen savvy companies dedicating a platform team to building their own API proxy.
To Sum it Up: The Breakdown of Middleware Stages
Here’s what we’ve learned from the companies we’ve spoken with about this evolution process.
Which stage are you at?