When a request fails transiently, the question is not “should I retry?” but “how do I retry without making things worse?” A naive retry loop turns a recoverable blip into a thundering herd. This page is the canonical retry recipe. For which errors are retryable, see Handling errors §Retry semantics.

The recipe

Three rules:
  1. Honor Retry-After. If the server tells you when to retry, do exactly that.
  2. Otherwise, exponential backoff with full jitter. Random in [0, min(cap, base × 2^attempt)).
  3. Bound attempts. Three is a reasonable default. After three failures, surface the error.

The formula

The “full jitter” formula from AWS guidance (not to be confused with “decorrelated jitter”, a different scheme that derives each delay from the previous one):
delay = random_between(0, min(cap, base * 2^attempt))
| Variable     | Recommended value                |
|--------------|----------------------------------|
| base         | 500 milliseconds                 |
| cap          | 30,000 milliseconds (30 seconds) |
| attempt      | 0 (first retry), 1, 2, …         |
| Max attempts | 3                                |
Sequence with base = 500ms, cap = 30s:
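| Attempt          | Window               | Typical wait |
|------------------|----------------------|--------------|
| 0 (first retry)  | [0, 500ms)           | ~250ms       |
| 1 (second retry) | [0, 1s)              | ~500ms       |
| 2 (third retry)  | [0, 2s)              | ~1s          |
| Subsequent       | grows to the 30s cap |              |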
The randomization is essential — without it, every retrying client hits the server at the same moments after a brief outage, prolonging recovery.
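
As a minimal sketch of the formula above, the helper below computes one full-jitter delay. The function name `fullJitterDelay` and the injectable `random` option are illustrative choices, not part of any BlooBank SDK.

```javascript
// Full-jitter delay in milliseconds for a zero-based retry attempt.
// `random` is injectable so the function can be tested deterministically.
function fullJitterDelay(attempt, { base = 500, cap = 30_000, random = Math.random } = {}) {
  const upper = Math.min(cap, base * 2 ** attempt); // exclusive upper bound of the window
  return random() * upper;                          // uniform in [0, upper)
}

// Upper bounds of the window for attempts 0, 1, 2 and 6 with the recommended values.
const bounds = [0, 1, 2, 6].map(a => Math.min(30_000, 500 * 2 ** a));
console.log(bounds); // [ 500, 1000, 2000, 30000 ]
```

With these values the window first hits the cap at attempt 6, since 500 × 2^6 = 32,000 ms exceeds 30,000 ms.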

Code

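async function withRetry(fn, { maxAttempts = 3, base = 500, cap = 30_000 } = {}) {
  const RETRYABLE = ['RESOURCE_EXHAUSTED', 'PROVIDER_UNAVAILABLE',
                     'DATABASE_UNAVAILABLE', 'UPSTREAM_UNAVAILABLE'];
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      const status = err.body?.error?.status;
      // Give up on non-retryable errors, or once the attempt budget is spent.
      if (!RETRYABLE.includes(status) || attempt >= maxAttempts - 1) throw err;

      // Server hints in priority order: Retry-After header, then error metadata.
      // Number() yields NaN for a missing or non-numeric value, so the
      // comparison fails and we fall through to jittered backoff instead
      // of sleeping for NaN milliseconds (which fires immediately).
      const headerSecs = Number(err.headers?.['retry-after']);
      const hintSecs = Number(err.body?.error?.details?.[0]?.metadata?.retry_after_seconds);
      const delay = headerSecs > 0 ? headerSecs * 1000
                  : hintSecs > 0 ? hintSecs * 1000
                  : Math.random() * Math.min(cap, base * 2 ** attempt); // full jitter

      await new Promise(r => setTimeout(r, delay));
    }
  }
}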

// Usage
const order = await withRetry(() => client.createPaymentOrder('production-main', body));

Which errors are retryable

Per Handling errors:
| Retryable                                      | Not retryable                                |
|------------------------------------------------|----------------------------------------------|
| RESOURCE_EXHAUSTED (rate limit)                | INVALID_ARGUMENT (caller mistake)            |
| PROVIDER_UNAVAILABLE (PIX network blip)        | WALLET_NOT_FOUND / PAYMENT_ORDER_NOT_FOUND   |
| DATABASE_UNAVAILABLE                           | WALLET_ALREADY_EXISTS                        |
| UPSTREAM_UNAVAILABLE                           | RBAC_DENY                                    |
| REPLAY_DETECTED (once, with fresh request id)  | IDEMPOTENCY_KEY_IN_USE_WITH_DIFFERENT_PARAMS |
|                                                | SIGNATURE_INVALID, TIMESTAMP_SKEW_EXCEEDED   |
|                                                | INTERNAL (do not retry blindly)              |
INTERNAL is a deliberate exception — it indicates a server-side defect the platform understands but cannot resolve for you. Retrying compounds the problem without changing the outcome. Capture the ERROR_RECORDED id and escalate instead.
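
The table can be expressed as a small lookup. This is a sketch only: the function name `retryPolicyFor` and the policy labels are hypothetical, and treating any unlisted status as non-retryable is an assumption made here as the conservative default.

```javascript
// Statuses that warrant bounded retries with backoff, per the table above.
const BACKOFF_STATUSES = new Set([
  'RESOURCE_EXHAUSTED', 'PROVIDER_UNAVAILABLE',
  'DATABASE_UNAVAILABLE', 'UPSTREAM_UNAVAILABLE',
]);

// Returns 'backoff' (bounded retries with jitter), 'once' (single retry
// with a fresh request id), or 'never'.
function retryPolicyFor(status) {
  if (BACKOFF_STATUSES.has(status)) return 'backoff';
  if (status === 'REPLAY_DETECTED') return 'once';
  return 'never'; // includes INTERNAL and, by assumption, any unknown status
}
```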

Per-error-class retry recipes

Rate limit (RESOURCE_EXHAUSTED)

1. Honor Retry-After header if present.
2. Otherwise honor details[0].metadata.retry_after_seconds.
3. Otherwise exponential backoff with full jitter.
4. Max 3 attempts.
5. Reuse idempotencyKey; mint fresh requestId.
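
Steps 1 to 3 might be sketched as follows. The HTTP specification allows Retry-After to carry either a number of seconds or an HTTP-date; whether this API ever sends the date form is not stated here, so handling it is a defensive assumption. The helper names `retryAfterMs` and `rateLimitDelayMs` are hypothetical, and the error shape follows the one used elsewhere on this page.

```javascript
// Parse a Retry-After header value into milliseconds, or null if unusable.
function retryAfterMs(value, now = Date.now()) {
  if (value == null) return null;
  const text = String(value).trim();
  if (/^\d+$/.test(text)) return Number(text) * 1000; // delay-seconds form
  const when = Date.parse(text);                       // HTTP-date form
  return Number.isNaN(when) ? null : Math.max(0, when - now);
}

// Delay for a RESOURCE_EXHAUSTED error: header, then metadata hint, then full jitter.
function rateLimitDelayMs(err, attempt, { base = 500, cap = 30_000 } = {}) {
  const fromHeader = retryAfterMs(err.headers?.['retry-after']);
  if (fromHeader !== null) return fromHeader;

  const hintSecs = Number(err.body?.error?.details?.[0]?.metadata?.retry_after_seconds);
  if (hintSecs > 0) return hintSecs * 1000;

  return Math.random() * Math.min(cap, base * 2 ** attempt);
}
```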

Transient infrastructure (*_UNAVAILABLE)

1. Exponential backoff with full jitter.
2. Max 3 attempts.
3. Reuse idempotencyKey; mint fresh requestId.

Replay (REPLAY_DETECTED)

1. One-shot retry.
2. Generate a fresh X-Access-Request-Id.
3. Re-read the clock; re-sign.
This error happens when your request id generator collides or persists state across attempts. The retry should succeed; if it fails again, fix the request-id generator.
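
A one-shot retry could look like the sketch below. It assumes a caller-supplied `send` function (hypothetical) that rebuilds the request from scratch on every call, so that each call carries a fresh X-Access-Request-Id, a current timestamp, and a new signature.

```javascript
// Retry exactly once on REPLAY_DETECTED; every other error propagates untouched.
// `send` must mint a new request id, re-read the clock, and re-sign on each call.
async function retryOnceOnReplay(send) {
  try {
    return await send();
  } catch (err) {
    if (err.body?.error?.status !== 'REPLAY_DETECTED') throw err;
    return send(); // single retry; a second failure reaches the caller
  }
}
```

If the second call also fails with REPLAY_DETECTED, the error surfaces to the caller, which is the signal to inspect the request-id generator rather than to retry again.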

Auth (SIGNATURE_INVALID, TIMESTAMP_SKEW_EXCEEDED)

Do not retry. Fix the root cause first: signature normalization (low-S), a mismatch between the signed and the sent body bytes, or clock drift. Retrying without a fix returns the same error.

Validation (INVALID_ARGUMENT)

Do not retry. The request shape is wrong. Surface the per-field errors from details[] to the user and let them resubmit.

Idempotency conflict (IDEMPOTENCY_KEY_IN_USE_WITH_DIFFERENT_PARAMS)

Do not retry with the same key. The key is permanently bound to the first body that used it. Either:
  • Pick a new key and retry with the new body, OR
  • Reconcile: fetch the original order via the local persistence layer (you stored the original idempotencyKey there, right?) and decide whether to keep it or create a separate operation.

Things that go wrong

Retry without backoff

// ✗ Wrong — thundering herd
for (let i = 0; i < 5; i++) {
  try { return await call(); }
  catch (e) { /* retry immediately */ }
}
A pile-up of clients all retrying with no delay turns a brief outage into a long one. Always back off.

Retry without bounds

// ✗ Wrong — runaway
while (true) {
  try { return await call(); }
  catch (e) { await sleep(1000); }
}
If the failure is persistent (not transient), an unbounded loop never surfaces it. Cap attempts.

Retry without keeping idempotencyKey

// ✗ Wrong — duplicate payment on retry
async function pay(amount) {
  for (let i = 0; i < 3; i++) {
    try {
      return await client.createPaymentOrder('main', {
        idempotencyKey: randomUUID(),   // ← new key on every retry
        amount, currency: 'BRL', /* ... */
      });
    } catch (e) { await sleep(1000 * (i + 1)); }
  }
}
If the first attempt succeeded but the response was lost, the second attempt with a different key creates a duplicate. The idempotencyKey must be persisted before the call and reused on every retry.
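
A corrected shape, as a sketch: the key is looked up or created per logical payment and persisted before the first call, so every retry sends the same key. The `store` (anything with `get`/`set`, durable in practice), `paymentId`, and `mintKey` parameters are hypothetical names introduced for illustration; `createPaymentOrder` is called as in the example above.

```javascript
// One logical payment maps to one idempotencyKey, stored before any network call.
async function pay(client, store, paymentId, amount, mintKey) {
  let idempotencyKey = await store.get(paymentId);
  if (!idempotencyKey) {
    idempotencyKey = mintKey();
    await store.set(paymentId, idempotencyKey); // persist BEFORE calling the API
  }
  return client.createPaymentOrder('main', { idempotencyKey, amount, currency: 'BRL' });
}

// Each retry re-enters pay() and finds the stored key, for example:
//   await withRetry(() => pay(client, store, 'payment-123', amount, randomUUID));
```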

Retry on RBAC_DENY

// ✗ Wrong — authorization will not change
catch (e) {
  if (e.status === 'RBAC_DENY') retry();   // never
}
If the credential lacks permission now, it lacks permission in five seconds. Surface the failure; request the role binding from your account team.

Beyond retries — circuit breaker

For client systems with bursty traffic, consider wrapping the BlooBank client in a circuit breaker:
  • After N consecutive failures, open the circuit — fail fast for a cooldown window without even calling.
  • Periodically allow one probe through (half-open).
  • On success, close the circuit and resume normal traffic.
Libraries to consider: opossum (Node), pybreaker (Python), hystrix-go (Go), Resilience4j (Java). A circuit breaker is not a substitute for retries; the two work together. Retries handle individual blips; the breaker handles sustained outages.
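
To make the three states concrete, here is a deliberately small hand-rolled sketch. It is not the API of opossum or any other library listed above; the class, its option names, and the default thresholds are all illustrative assumptions.

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 5, cooldownMs = 10_000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.cooldownMs = cooldownMs;
    this.now = now;          // injectable clock, useful in tests
    this.failures = 0;       // consecutive failures while closed
    this.openedAt = null;    // null means the circuit is closed
    this.probing = false;    // true while a half-open probe is in flight
  }

  async call(fn) {
    const isProbe = this.openedAt !== null;
    if (isProbe) {
      const coolingDown = this.now() - this.openedAt < this.cooldownMs;
      if (coolingDown || this.probing) throw new Error('circuit open'); // fail fast
      this.probing = true;   // half-open: let exactly one probe through
    }
    try {
      const result = await fn();
      this.failures = 0;
      this.openedAt = null;  // success closes the circuit
      return result;
    } catch (err) {
      this.failures++;
      if (isProbe || this.failures >= this.failureThreshold) {
        this.openedAt = this.now(); // open, or re-open after a failed probe
      }
      throw err;
    } finally {
      if (isProbe) this.probing = false;
    }
  }
}
```

One way to combine the two mechanisms is to put the breaker inside the retried function, so that an open circuit fails each attempt immediately instead of reaching the network.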

Next

Handling errors

The branching pattern in code.

Idempotency

Make every retry safe.