Design & Architecture

Third-Party Integrations & Resilience

Foundational

Every external dependency is a system you rely on but do not control. It will be slow, it will go down, it will change its responses, and one day it will be wrong. Integrate so that the provider's bad day does not become yours. Put the provider behind a layer you own, verify that responses are genuine, set sensible timeouts, and return a safe answer when the provider cannot answer.

We use third parties for things we should not build ourselves: identity (Veriff, Sumsub), payments (Stripe), storage, and more. This is the right choice. But each integration brings the provider's availability, latency, security, and correctness into our system. Resilient integration means you treat failures as normal and design for them on purpose. Do not just code the success path and hope.

Two concerns matter most for us. First, authenticity: verify inbound callbacks and webhooks before you trust them. The Finperiti unsigned-webhook finding shows what happens when you do not. Second, the fail direction: when a screening or KYC provider times out or errors, block-and-escalate, never auto-approve (see Designing for Failure). Resilience and safety are the same discipline here.
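The fail direction can be sketched as follows. This is a minimal illustration, assuming .NET 6 or later (for Task.WaitAsync); IScreeningProvider, ScreeningService, and ScreeningDecision are hypothetical names for a layer we own, not part of any provider SDK. The point it shows is that a timeout or provider error maps to an escalation and never to an approval.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public enum ScreeningDecision { Approved, Rejected, EscalatedForReview }

// Hypothetical port owned by our code; a provider-specific adapter implements it.
public interface IScreeningProvider
{
    Task<bool> IsClearAsync(string customerId, CancellationToken ct);
}

public sealed class ScreeningService
{
    private readonly IScreeningProvider _provider;
    private readonly TimeSpan _timeout;

    public ScreeningService(IScreeningProvider provider, TimeSpan timeout)
    {
        _provider = provider;
        _timeout = timeout;
    }

    public async Task<ScreeningDecision> ScreenAsync(string customerId)
    {
        // The token asks the adapter to stop; WaitAsync guarantees we stop waiting
        // even if the adapter ignores the token.
        using var cts = new CancellationTokenSource(_timeout);
        try
        {
            bool clear = await _provider
                .IsClearAsync(customerId, cts.Token)
                .WaitAsync(_timeout);
            return clear ? ScreeningDecision.Approved : ScreeningDecision.Rejected;
        }
        catch (Exception)
        {
            // Timeout, cancellation, or any provider error: fail closed.
            // Block and escalate to a human; never auto-approve.
            return ScreeningDecision.EscalatedForReview;
        }
    }
}
```

Catching every exception is deliberate in this sketch: for a screening decision, an unknown failure is treated the same as a known one. A real implementation would presumably also log the failure before escalating.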

Isolate and verify

Trust the callback, no timeout
[HttpPost("/stripe/webhook")]
public IActionResult OnPaid(Event e)
{
    // unverified, and forged events accepted
    order.MarkPaid();
    return Ok();
}

Anyone can POST a fake 'paid' event and get goods for free. And calls without timeouts let a slow provider hang every request thread. Both authenticity and resilience are missing.

Verify, then act; bounded calls elsewhere
// Stripe.net: EventUtility.ConstructEvent throws StripeException if the signature is invalid
var evt = EventUtility.ConstructEvent(rawBody, sigHeader, secret);
if (evt.Type == "payment_intent.succeeded") order.MarkPaid(evt.Id); // idempotent
// outbound: HttpClient with timeout + Polly retry/circuit-breaker

The webhook's signature is verified cryptographically before any action is taken. Marking the order paid is idempotent, so duplicate deliveries are safe. Outbound calls have timeouts and are isolated.
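For a provider that ships no SDK helper, the same two properties (authenticity and idempotency) can be sketched by hand. This assumes .NET 5 or later and a provider that sends a hex-encoded HMAC-SHA256 of the raw request body; that scheme is an assumption for illustration and is not Stripe's actual format, which also signs a timestamp. WebhookVerifier and ProcessedEvents are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Security.Cryptography;
using System.Text;

public static class WebhookVerifier
{
    // Assumed scheme: signatureHex = hex(HMAC-SHA256(secret, rawBody)).
    // Check the provider's documentation for the real scheme.
    public static bool IsAuthentic(string rawBody, string signatureHex, string secret)
    {
        if (string.IsNullOrEmpty(signatureHex)) return false;

        byte[] provided;
        try { provided = Convert.FromHexString(signatureHex); }
        catch (FormatException) { return false; } // not valid hex: reject

        using var hmac = new HMACSHA256(Encoding.UTF8.GetBytes(secret));
        byte[] expected = hmac.ComputeHash(Encoding.UTF8.GetBytes(rawBody));

        // Constant-time comparison avoids leaking how many bytes matched.
        return CryptographicOperations.FixedTimeEquals(expected, provided);
    }
}

// In-memory record of handled event IDs, for illustration only.
// A real service would need a durable store, such as a unique constraint in the database.
public sealed class ProcessedEvents
{
    private readonly ConcurrentDictionary<string, bool> _seen = new();

    // Returns true only the first time a given event ID is seen.
    public bool TryMarkProcessed(string eventId) => _seen.TryAdd(eventId, true);
}
```

A handler would call IsAuthentic on the raw body first, reject the request if it returns false, and act only when TryMarkProcessed returns true, so a redelivered event changes nothing.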

Survive their failures
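One way to bound outbound calls is sketched below, using only the base class library so that it stays self-contained. A library such as Polly provides retry and circuit-breaker policies and would normally replace the hand-rolled breaker. SimpleCircuitBreaker, ProviderStatusClient, the thresholds, and the URL are all hypothetical.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;

// Minimal circuit breaker for illustration: after N consecutive failures it
// stops calling the provider for a fixed period and answers from the fallback.
public sealed class SimpleCircuitBreaker
{
    private readonly int _failureThreshold;
    private readonly TimeSpan _openDuration;
    private readonly object _gate = new object();
    private int _consecutiveFailures;
    private DateTimeOffset _openUntil = DateTimeOffset.MinValue;

    public SimpleCircuitBreaker(int failureThreshold, TimeSpan openDuration)
    {
        _failureThreshold = failureThreshold;
        _openDuration = openDuration;
    }

    public async Task<T> ExecuteAsync<T>(Func<Task<T>> action, Func<T> fallback)
    {
        lock (_gate)
        {
            // Circuit open: skip the provider entirely.
            if (DateTimeOffset.UtcNow < _openUntil) return fallback();
        }
        try
        {
            T result = await action();
            lock (_gate) { _consecutiveFailures = 0; }
            return result;
        }
        catch (Exception)
        {
            lock (_gate)
            {
                _consecutiveFailures++;
                if (_consecutiveFailures >= _failureThreshold)
                {
                    _openUntil = DateTimeOffset.UtcNow + _openDuration;
                    _consecutiveFailures = 0;
                }
            }
            return fallback();
        }
    }
}

public sealed class ProviderStatusClient
{
    // The timeout bounds every call made through this client.
    private static readonly HttpClient Http =
        new HttpClient { Timeout = TimeSpan.FromSeconds(3) };

    private readonly SimpleCircuitBreaker _breaker =
        new SimpleCircuitBreaker(failureThreshold: 5, openDuration: TimeSpan.FromSeconds(30));

    // Returns "unavailable" when the provider is slow, failing, or the circuit is open.
    public Task<string> GetStatusAsync() =>
        _breaker.ExecuteAsync(
            () => Http.GetStringAsync("https://provider.example/status"), // placeholder URL
            () => "unavailable");
}
```

The fallback here is a harmless status string. For a screening or KYC call the fallback should be block-and-escalate, in line with the fail direction described earlier.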

Self-review checklist

Why it matters: Third parties are where availability and trust leave our control, so this is where outages and security holes most often start. A forged webhook can let a fraudster pass KYC, and a call with no timeout to a slow provider can take down the whole service. When we isolate the provider, verify its responses, and fail safe, their bad days become a contained, correct response on our side.